Learning from One Data Point

Sometimes I get annoyed by all the pious statistician-types I find all around me. They aren’t all statisticians, but there are a lot of people who raise analytics and “data-driven” to the level of a holy activity. It isn’t that I don’t like analytics. I use statistics whenever it is a “cost of doing business.” You’d be dumb not to take advantage of ideas like A/B testing for messy questions.

What bothers me is that there are a lot of people who use statistics as an excuse to avoid thinking. Why think about what ONE case means, when you can create 25 cases using brute force, and code, classify, cluster, correlate and regress your way to apparent insight?

This kind of thinking is tempting, but it is dangerous. I constantly remind myself of the value of the other approach to dealing with data: hard, break-out-in-a-sweat thinking about what ONE case means. No rules, no formulas. Just thinking. I call this “learning from one data point.” It is a crucially important skill, because by the time a statistically significant amount of data is in, the relevant window of opportunity might be gone.

Randomness and Determinism

The world is not a random place. Causality exists. Patterns exist. In grad school, I learned that there are two types of machine learning models in AI. Models based on reasoning, and models based on statistics and probability. This applies to both humans and machines. Both are driven by feedback, but one kind is driven mainly by statistical formulas, while the other kind is driven by thinking about the new information.

The probability models, like reinforcement or Bayesian learning, are very easy to understand. They involve a few variables and a lot of clever math, mostly already done by smart dead people from three centuries ago, and programmed into software packages.
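To make that concrete, here is a minimal sketch (not from the original post, all numbers invented) of a Bayesian update: one observation shifting a belief, with the “clever math” amounting to a single line of Bayes’ rule.

```python
# A minimal Bayesian update -- the "clever math" is one line.
# Prior belief: 50% chance a coin is biased (lands heads 80% of the time).
prior_biased = 0.5

# Observe ONE data point: the coin comes up heads.
likelihood_heads_if_biased = 0.8
likelihood_heads_if_fair = 0.5

# Bayes' rule: posterior is proportional to likelihood times prior.
evidence = (likelihood_heads_if_biased * prior_biased
            + likelihood_heads_if_fair * (1 - prior_biased))
posterior_biased = likelihood_heads_if_biased * prior_biased / evidence
print(round(posterior_biased, 3))
```

The variables and the coin scenario are mine, not the post’s; the point is only that the probabilistic machinery is mechanical once the numbers are plugged in.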

The reasoning models on the other hand, are complex, but largely qualitative, and most of the thinking is up to you, not Thomas Bayes.  Explanation-Based Learning is one type. A slightly looser form is Case-Based Reasoning. Both rely on what are known as rich “domain theories.” Most of the hard thinking in EBL and CBR is in the qualitative thinking involved in building good domain theories, not in the programming or the math.

The former kind requires lots of data involving a few variables. Do people buy more beer on Fridays? Easy. Collect beer sales data, and you get a correlation between time t and sales s. Gauss did most of the necessary thinking a couple of hundred years ago. You just need to push a button.

EBL, CBR and other similar models are different. A textbook example is learning endgames in chess. If I show you an endgame checkmate position involving a couple of castles and a king, you can think for a bit and figure out the general explanation of why the situation is a checkmate. You will be able to construct a correct theory of several other checkmate patterns that work by the same logic. One case has given you an explanation that covers many other cases. The cost: you need a rich “domain theory” — in this case a knowledge of the rules of chess. The benefit: you didn’t waste time doing statistical analyses of dozens of games to discover what a bit of simple reasoning revealed.

Looser case-based reasoning involves stories rather than 100% watertight logic. Military and business strategy is taught this way. Where the explanation of a chess endgame could potentially be extended perfectly to all applicable situations, it is harder to capture what might happen if a game starts with a “Sicilian defense.” You can still apply a lot of logic and figure out the patterns and types of game “stories” that might emerge, but unlike the 2-castles-and-king situation, you are working in too big a space to figure it all out with 100% certainty. But even this looser kind of thinking is vastly more efficient than pure “brute force” statistics-based thinking.

There’s a lot of data in the qualitative model-based kinds of learning as well, except it’s not two columns of x and y data. The data is a fuzzy set of hard and soft rules that interact in complex ways, and lots of information about the classes of objects in a domain. All of it deployed in the service of an analysis of ONE data point. ONE case.

Think about people for instance. Could you figure out, from talking to one hippie, how most hippies might respond to a question about drilling for oil in Alaska? Do you really need to ask hundreds of them at Burning Man? It is worth noting that “random” samples of people are extraordinarily hard to construct. And this is a good thing. It gives people willing to actually think a significant advantage over the unthinking data-driven types.

The more data you have about the structure of a domain, the more you can figure out from just one data point. In our examples, one chess position explains dozens. One hippie explains hundreds.

People often forget this elementary idea these days. I’ve met idiots (who shall remain unnamed) who run off energetically to do data collection and statistical analysis to answer questions that take me 5 minutes of careful qualitative thought with pen and paper, and no math. And yes, I can do and understand quite a bit of the math. I just think 90% of the applications are completely pointless. The statistics jocks come back and are surprised that I figured it out while sitting in my armchair.

The Real World

Forget toy AI problems. Think about a real world question: A/B testing to determine which subject lines get the best open rates in an email campaign. Without realizing it, you apply a lot of model-based logic and eliminate a lot of crud. You end up using statistical methods only for the uncertainties you cannot resolve through reasoning. That’s the key: statistics-based methods are the last-resort, brute-force tool for resolving questions you cannot resolve through analysis of a single prototypical case.
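And for that residual uncertainty, the statistical step itself is mechanical. A minimal sketch of a two-proportion z-test comparing two subject lines (the open counts are invented, and a real campaign tool would do this for you):

```python
from math import sqrt, erf

def two_proportion_z(opens_a, n_a, opens_b, n_b):
    """Two-proportion z-test: the last-resort statistical step,
    applied only after reasoning has narrowed the candidates."""
    p_a, p_b = opens_a / n_a, opens_b / n_b
    pooled = (opens_a + opens_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the normal CDF (via the error function).
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical campaign: subject line A opened 220/1000, B opened 180/1000.
z, p = two_proportion_z(opens_a=220, n_a=1000, opens_b=180, n_b=1000)
print(round(z, 2), round(p, 3))
```

With these made-up counts the difference clears the conventional 5% threshold; the reasoning about *which* subject lines were worth testing in the first place happened before any of this ran.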

Think about customer conversations. Should you talk to 25 customers about whether your product is good or bad? Or will one deep conversation yield more dividends?

Depends. If there is a lot of discoverable structure and causality in the domain, one in-depth customer conversation can reveal vastly more than 25 responses to a 3 question survey. You might find out enough to make the decision you need to make, and avoid 24 other conversations.

But it takes work.  A different kind of work. You can go have lunch with just ONE well-informed person in an organization and figure out everything important about it, by asking about the right stories, assessing that person’s personality, factoring out his/her biases, applying everything you know about management theory and human psychology, and spending a few hours putting your analysis together. You won’t produce pretty graphs and “hard evidence” of the sort certain idiots demand, but you will know. Through your stories and notes, you will know. And nine times out of ten, you’ll be right.

That’s the power of one data point. If you care to look, a single data point or case is an incredibly rich story. Just listen to the story, tease out the logic within it, and you’ll learn more than by attempting to listen to fifty stories and fitting them all into the same 10-variable codification scheme. Examples of statistical “insights” that I found incredibly stupid include:

  1. Beyond a point, more money doesn’t make people happier
  2. Religious people self-report higher levels of happiness than atheists

Duh. These and other “insights” are accessible much more easily if you just bother to think. Usually the thinking path gets you more than the statistics path in such cases. I cite such results to people who look for that kind of verification, but I personally don’t bother analyzing such statistical results deeply.

Sure, it is good to be humble and recognize when you don’t have enough modeling information from one case. Sure, data can prove you wrong. It doesn’t mean you stop thinking and start relying on statistics for everything. Look at the record of “statistics” based thinking. How often are you actually surprised by a “data-driven” insight? I bet you are like me. Nine out of ten times you ask “they needed a study to figure THAT out?”

And the 1/10 times you get actual insight? Well, consider the beer and diapers story. I don’t tell that story. Statistics-types do.

This means going with your gut-driven deep qualitative analysis of one anecdotal case will be fine 9 out of 10 times.

The Real Reason “Data Driven” is Valued

So why this huge emphasis on “quants” and “data driven” and “analytics?”  Could a good storyteller have figured out and explained (in an EBL/CBR sense) the subprime mortgage crisis created by the quants? I believe so (and I suspect several did and got out in time).

I think the emphasis is due to a few reasons.

First, if you can do stats, you can avoid thinking. You can plug and chug a lot of formulas and show off how smart you are because you can run a logistic regression and the Black-Scholes derivative pricing formula (sorry to disappoint you; no, you are not that smart. The people who discovered those formulas are the smart ones).

Second, numbers provide safety. If you tell a one-data-point story and you turn out to be wrong, you will get beaten up a LOT more badly than if your statistical model turns out to be based on an idiotic assumption. Running those numbers looks more like “real work” than spinning a qualitative just-so story. People resent it when you get to insights through armchair thinking. They think the “honest” way to get to those insights is through data collection and statistics.

Third: runaway “behavioral economics” thinking by people without the taste and competence to actually do statistics well. I’ll rant about that another day.

Don’t be brute-force statistics driven. Be feedback-driven. Be prepared to dive into one case with ethnographic fervor, and keep those analytics programs handy as well. Judge which tool is most appropriate given the richness of your domain model. Blend the two together: qualitative storytelling and reasoning, and statistics.

And if I were forced to choose, I’d go with the former any day. Human beings  survived and achieved amazing things for thousands of years before statistics ever existed. Their secret was thinking.


About Venkatesh Rao

Venkat is the founder and editor-in-chief of ribbonfarm. Follow him on Twitter

Comments

  1. “This means going with your gut-driven deep qualitative analysis of one anecdotal case will be fine 9 out of 10 times.”

    ROFL :)
    You should’ve mentioned a confidence interval as well.

  2. Reminds me of a quote:

    “If your result needs a statistician then you should design a better experiment”
    Ernest Rutherford

  3. This reminds me of the UI designer who left Google due to their “design by numbers” approach. I think you have already brought that one up, though, when discussing the decline of Google.

    Another example is illustrated by the search for a cure for Parkinson’s.
    http://www.wired.com/magazine/2010/06/ff_sergeys_search/all/1

  4. kiran nayak says:

    Hi there

    Very interesting analysis. It seems it’s a chicken-and-egg situation where one complements the other.

    Statistics help trigger/support/prove logic, and logic is based on experience, which is a kind of data.

    In the end what matters is selecting the best tool as per circumstances

    Kiran

    • In the end what matters is selecting the best tool as per circumstances

      Sure, sure, and…
      WHAT is the method for selecting the best tool?
      “Turtles all the way down” ?

  5. Venkat,

    I suspect your rant on statistics and statisticians may have been better understood if presented in whatever context initiated your emotive response.

    The appropriate use of statistics involves the development of a hypothesis to which statistics are applied to determine whether your single (or otherwise limited) interview really did get you any generalizable insights. (If done appropriately, statistical analysis can also be used to open further avenues of productive investigation).

    As implied by a famous phrase about statistics traditionally attributed to Mark Twain – statistics, just like any other tool, can be, will be, and is often misused – sometimes from ignorance and other times with malice aforethought – but that is not an indictment of the scientific process, but of the lack of rigor on the part of those who would accept statistics without first vetting the hypothesis on which the study was based.

    • Very perceptive of you to detect that there is a background story. Unfortunately, I can’t share any of the several stories here…

      Venkat

    • Rick, you take the words out of my mouth. When smart guys who know better take an extremist position, there’s usually context. I have spent 5 minutes in my armchair, and I deduce that the perpetrator is a tall, clean-shaven Caucasian male, probably a Vice President, likely responsible for something high-falutin’ like Corporate Strategy.

      Venkat: if I got you right, your problem is people using inductive methods to solve problems quite tractable by deductive reasoning. Fair enough.

      But where do these domain theories come from? In cases where we have cooked up the axioms and rules ourselves, like chess, or mathematics, deductive reasoning is king. But what about unknown systems? Unlike Deep Thought, which deduced the existence of rice pudding and income tax from cogito ergo sum, we are going to use inductive reasoning to derive axioms (“all swans are white, matter and energy are always conserved, etc.”) which can then drive deductive reasoning. Science.

      Consider the hippie question: your deduction of the hippie position from talking to a single hippie implies that you have, at some point long past, done the statistics to adequately support the inductive hypothesis that all hippies think alike on environmental issues.

      As you’ve mentioned, thinking is hard, and at the end of it, you might very easily be wrong. Not all people can think at a “high-enough” level, where they’ve accounted for enough factors to make good judgements. I’ve lost count of the number of times I’ve been so sure I was right, but the universe disagreed.

      One of the most important reasons for the rise of the data-driven approach is that it can be used to hammer home an argument and get to consensus much quicker. Unless you’re king of the castle, you need to persuade others that your reasoning is correct. This requires the other parties to share the same axiom base and the ability to walk through your chain of logic. Data-driven approaches reduce the amount of shared-axiom-state needed to achieve consensus; hence their great power.

      Say you have a complicated curve and you need to find the area under that curve. Having done a lot of integration in your misspent youth, you do the substitutions and carry the cosines and finally get an answer. Your audience, less skilled in the art of integration, may be impressed, but skeptical, since they couldn’t follow the steps. You could teach them calculus, of course; but how long would that take?

      Now show them a Monte Carlo simulation, where you throw dots randomly at a rectangle enclosing the curve and find the area by the ratio of the number of dots falling inside the curve to the total. Brute force, certainly. But everyone – even the ones who’re not mathematically inclined – will instantly agree on the validity of the approach and the result.
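      [Editor’s note: the dot-throwing estimator the commenter describes fits in a few lines. A minimal sketch, using y = x² as a stand-in curve, where the true area on [0, 1] is 1/3:]

```python
import random

def mc_area(f, x_lo, x_hi, y_hi, n=100_000, seed=42):
    """Estimate the area under f on [x_lo, x_hi] by throwing random
    dots at the enclosing rectangle and counting the ones that land
    under the curve -- the brute-force approach described above."""
    random.seed(seed)
    hits = 0
    for _ in range(n):
        x = random.uniform(x_lo, x_hi)
        y = random.uniform(0, y_hi)
        if y <= f(x):
            hits += 1
    box_area = (x_hi - x_lo) * y_hi
    return box_area * hits / n

# Area under y = x**2 on [0, 1]; the exact answer is 1/3.
est = mc_area(lambda x: x * x, 0.0, 1.0, 1.0)
print(round(est, 3))
```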

      Consider: I tell upper management that I had a long chat with one dude, and from my deep understanding of organizational structure and dynamics, I conclude that morale in the whole organization is severely down and we will see attrition levels of 10% per month.

      I may be right, but I will instantly get pushback – “That guy is a troublemaker”, “It’s a one-off case” and so on. If I conducted a survey and gave them the results, it’s less easy for them to wiggle away from the uncomfortable reality.

      How does this square with your previous piece on legibility? There is a legibility event horizon – lower for the shallower/lazy among us – beyond which any amount of trying to “be one” with the problem is doomed to failure. You’re better off cranking the stats machine, stat.

      A bit of an aside – I have been doing a lot of reading for the last couple of years, and often thinking, “Boy, this thing was within my intellectual grasp fifteen years ago, why on earth didn’t I read it then? My life might have been so different!”
      On further thought, this appears to be incorrect. I might well have understood the axioms and rules, but they would have been confined to a sandbox, not integrated with the whole thinking apparatus. I might have been able to repeat the arguments and apply them in a limited way, but they wouldn’t have been life or thought-changing. Without a mass of raw experience-data and tentative hypotheses based on statistical inference to act on, it’s useless junk. Without working and observing the corporate beast for years, the Gervais Principle is merely dark humour, and Yes, Minister is just a British comedy.

      • “Consider the hippie question: your deduction of the hippie position from talking to a single hippie implies that you have, at some point long past, done the statistics to adequately support the inductive hypothesis that all hippies think alike on environmental issues.”

        I think this is where you and I differ. I make a strong distinction between “intuitive statistics” (neural nets are universal pattern recognizers in the machine learning sense after all) and actually sitting down with an Excel spreadsheet or R or SPSS or whatever and using formal stats math constructs to think.

        Of COURSE there is an inductive-deductive loop inside our heads. Both the pieces of the loop can be formalized (with formal deduction models for the one, formal stats for the other). My assertion is that we are doing too much of the latter out of a misplaced sense of machismo and (as you point out) risk aversion and the need to lay paper trails (spread-sheet trails? CYA?) in orgs. We then end up only looking where the statistics floodlights can shine.

        Legibility: I’ll have to think about that. Both statistics and reasoning can be used to confuse or clarify.

        • OK, let me take it a bit further with my two current favourite hobby horses: Taleb’s Golden Rule and evolutionary psychology.

          Taleb’s Golden Rule: “We favor the visible, the embedded, the personal, the narrated, and the tangible; we scorn the abstract.”

          I think it’s true, but the question is, why? The answer is simple – so simple that I’m instantly suspicious. The brain is modular and layered; its evolutionary development is incremental (indeed, given the large number of holdovers from our piscine and reptilian ancestors in the mammalian body plan, it would be quite ridiculous to assume that human brain architecture emerged de novo). While we do have a facility for general intelligence, we also have older special-purpose modules which are faster for solving domain-specific problems. Engaging these deeper-seated modules in learning and computation helps enormously.

          Abstract problems are hard, because, well, they’re hard, and we have poor abstract computation facilities. We quickly need to reach for surrogate memory (pencil and paper) to make significant progress. The same problem reworked as a story with a social context (see Wason selection task) may be more easily solved, engaging the “hardware acceleration” provided by the social modules.

          A general purpose CPU can certainly do all the computations required for graphics, but a GPU is better at it. In fact, the current trend is to see what general-CPU problems can be put in graphics terms to utilize the GPU’s ferocious floating point capabilities.

          So I might rephrase what you’ve said: Problems, especially involving humans, are tractable in the face of thinking about them as narratives, giving enough material for your innate ape-modelling hardware to sink its teeth in.

          Of course, social abilities are far from uniform. At the extreme end of the spectrum, we have people who can barely model other humans and are unable to comprehend why they cheat. Such people, I suspect, will perform no differently in the two forms of the Wason task.

          Perhaps a more generic restatement would be: recast problems and solution approaches to play to your mental strengths.

          This sounds so banal that I almost deleted this comment. But what the hell, I have to dump my quota of half-assed speculation somewhere, and you’ve drawn the short straw :)

          • I have been gingerly tiptoeing around the broad-brush ‘narrative fallacy’ approach of people like Taleb and Tyler Cowen, but my position is basically in favor of narrative thinking when done intelligently. So you quoting Taleb… well, even though I don’t quite see the connection to this particular quote, I basically agree with it.

            We favor the tangible, yes. Teaching goes better with examples and metaphors than abstractions. Yes. But it is not entirely clear to me what the implications of that bald, almost tautological fact are.

            Let me try one though: abstract is harder than concrete, but in general abstract problems can be recast as concrete problems, and some concrete problems are better than others. For example, humans solve certain behavioral economics style reasoning problems better when they are embedded in some narratives (I forget the clever example, but people got the answer right when it was framed in terms of “I am getting cheated” as opposed to a more artificial narrative). Narratives can also deliberately confuse, as in the case of Marilyn vos Savant’s 3 doors/goat/tiger type problem.

            But one non-trivial implication is that there are problems (both deductive and inductive) that simply cannot be cast into familiar, tangible, visible forms. A trivial example is higher-dimensional physics. Anybody who claims to actually understand higher-dim math is basically lying. They are merely able to think in terms of the math abstractions. When they try to make it tangible (“in 11 dimensional superstring theory, the original symmetry breaking of the universe is like a fitted sheet on a mattress coming loose at one corner” … HUH??) the results are silly, as you find out if you go through the math and have the capacity to compare the math insight to the metaphor (I cannot in this case, but I have been able to in less esoteric domains).

            So you should recast to mental strengths, but some mental strengths have more scope than others. I remember a comparison of Julian Schwinger and Feynman (who shared the Nobel) in terms of thinking style. For the same problem, Schwinger ploughed through the abstract deductive math, while Feynman showed a clever, evocative visualization that helped get to the same place (the tangible Feynman diagrams). This was used to argue that Feynman was smarter. But in a way, people who are able to labor on through pure math without aid of cleverness can sometimes get places that are beyond the reach of metaphor and narrative.

            In other words there are things that are simply not expressible or explainable in terms of anything tangible. Actually, prime numbers are an example. I sometimes say they are ‘atoms’ of the number system but that is fairly meaningless.

            I don’t know where I was going with this either :) But interesting Saturday morning speculation.

  6. This is a decent post by your standards but I feel that it’s been unnecessarily weakened by using the antiquated term castle for the chess piece now known as the rook.

  7. As an attorney, which is just another way of saying I’m a story teller, I appreciate the defense of case-based reasoning. I’d much rather deal with a mental narrative than employ the dreaded mathematics. But one flaw in your logic is the idea that one case is sufficient. That is true if you are arguing by analogy or distinction between your test case and some comparative case. The difficulty arises when you need to identify your test case as the governing precedent. To use your example, you would have had to have several prior encounters with more than one hippie in order to reliably identify and qualify your ONE hippie as a reliable candidate for deep discussion from which to generalize the hippie stance on an issue. So, it really isn’t fair to talk about a single data point in splendid isolation.

    • Well, okay, if you are a lawyer, you understand that I was saying “learning from one data point” by way of rhetorical exaggeration. I am sure if you actually work out the hippie case, you’d find a few different sub-archetypes that would cover most cases, or (in stats terms), a clustering of points in the “hippie” feature vector space.

      Venkat

  8. That’s some fascinating stuff! I was reminded of Gary Klein’s case studies of decision making. I’ve thought, and I don’t know how reasonable this is, that if in many ways we’re average, then looking at one case is likely to give us many ideas about what is average among similar cases. I guess this doesn’t hold for power laws or bimodal distributions, but the idea seems like a cheap means of developing a model, and you can test it if it’s not too expensive.

  9. A very insightful article, thank you. I think a broader view (as others touched upon here also) is the hypothetico-deductive method used in science. You use thinking to come up with what you _think_ is right, your hypothesis, and then test it.

    The problem with all inductive reasoning, is that you often can’t KNOW if your theory is right. My thinking might not get the same results as your thinking. For example one could ASSUME that all religious people perceive themselves as more happy, but reality is often surprising and counter intuitive. So you need data to verify your theory, if you are going to rely on it. How much “security of knowledge” you need depends entirely of how you plan to use the results.

    I do fully agree that statistics is a last resort, and one should aim for more direct confirmation/disproof if possible. Also, as you say, blind usage of statistics is a terrible way to arrive at _new_ insights, especially since there are so many pitfalls with data-driven conclusions (statisticians are warned about these in their education but they go ahead and do it anyway.) That said, a surprising statistical result can of course inspire new thinking and new hypotheses. It’s just not reliable as a consistent source of such.

    – a physicist

  10. The problem with all inductive reasoning, is that you often can’t KNOW if your theory is right.

    There is no such thing as a “right theory”, only theories which have not yet been invalidated/superseded by another one or which are reasonable approximations or a more intricate/precise one.

  11. I know this is an older post, but I just had to post this link here:
    http://lesswrong.com/lw/dr/generalizing_from_one_example/

    Should make you think about how dangerously flawed the one-data-point-approach can be, especially when it is based on the assumption that it is ok to not gather more data because you already KNOW how it works.

    And regarding the hippie example: How do you know that he’s a hippie? You probably categorized him based on his appearance, behavior etc. If this categorization check results in a prototypical hippie (and you have read Lakoff) you can now safely infer that he will give you prototypical hippie answers to your questions. Not because you are inferring so smartly (what you call “thinking” here, and which statisticians obviously don’t do), but because your brain/mind has already gathered tons of information on hippies, a resource you can subconsciously tap. In reality we hardly ever deal with ONE data point – we are implicitly recycling previous “studies”. At least you didn’t call it intuition… ;)

    • My point was about formal empiricism. Of course zeitgeist/context matters. Of course my “brain/mind has already gathered tons of information on hippies.” That’s basically tautological. You can’t equate that with the formal exercise of random sampling 150 (or whatever) hippies with a data collection form.

      The “dangers of one data point” have been ridiculously over-stated. That’s what I was responding to. I respect the thinking that goes into lesswrong (and I know some of the people behind it very well), but there is an emerging broader cultural suspicion of narrative-based thought that is simply not warranted.

      To paraphrase Sarah Palin, there is a “gotcha empiricism” emerging.

  12. I can see why Sarah Palin wouldn’t want empiricism as the dominant paradigm… kind of contrasts uncomfortably with visceral appeals to people’s fears and prejudices. ;)

  13. “Religious people self-report higher levels of happiness than atheists”

    I’m curious, how does one conclude this by just thinking? It didn’t strike me as “duh”, so I’m curious what this thought process looks like.

    • Tom Bushell says:

      Jason, I’m with you – this is not something I could deduce either, although this correlation has been widely reported.

      In addition, I recently read an article on happiness in Scientific American Mind, and this reported “fact” may be incorrect, or at least incomplete as commonly stated.

      Apparently, this correlation between happiness and religious belief is only found in societies that place a high value on being religious (such as the US).

      In other countries that don’t place as much social value on being religious (e.g. some European countries, China, etc), this is not the case – religious people self report being LESS happy than non-religious people.

      The happiness results from conforming to social norms, not being religious per se – at least, that’s the theory as I understand it.

      “The Many Faces of Happiness” link (registration required):

      http://www.nature.com/scientificamericanmind/journal/v22/n4/full/scientificamericanmind0911-50.html

  14. I’m somewhat in the field of applying statistics to everything, so this hit a bit close to home.

    I agree that there is such a thing as an abuse of statistics. A professor once told our class a horror story: while reading through some industrial code, he noticed that the programmers had tried to find the area in the overlap of two circles by doing a Monte Carlo simulation — splatter dots across the image and count how many are in the overlap. Such stupidity does exist.

    Less overt, but still real, stupidity, I think, has to do with ignoring internal structure within a problem. I remember working on a research project related to reconstructing the 3d shape of a protein from many noisy 2-dimensional images taken from different viewpoints. The computer scientists working on this problem defined the images to be mere vectors, and “forgot,” in their model, everything about where they came from. The mathematicians (including me, as a student) based their model around the fact that viewing angles correspond to points on a sphere, which has group structure, symmetry, and other nice properties. The biologists went even farther in the direction of problem-specific structure and built into their models all kinds of “cheating” details about the specific nature of different possible proteins. By certain metrics, the mathematicians got the cleanest results. And I take this to be a sort of motivating parable about the dangers of over-abstraction (ignoring all the internal structure of the problem, and thus being able to prove only weak statements) and over-specificity (building a model so tightly adapted to the current problem that it can never generalize to future ones, or expose underlying principles). I think both directions are errors, but drawing the right balance in practice is tough.

    I’m more of an innocent in the business world, but I do still think that statistical low-hanging fruit is going to be available for a little while longer. The world outside of math/statistics/cs is full of examples that should *obviously* be automated or made statistical, but haven’t been yet. Example: every summer, another friend of mine does an internship in a new doctor’s lab, writing image recognition software to replace the process of sorting through images by hand to find the tumor/fracture/whatever. Doctors seem to discover the power of statistical automation *one lab at a time*, and apparently very slowly.

    Finding statistical low-hanging fruit is fundamentally an exploitative business — once you do it, it’s done, and you’ve put yourself out of a job. When most of the low-hanging fruit is gone, the industry becomes less interesting (this is what I’ve heard about high-frequency trading, that by now building “new trading algorithms” is a very incremental and imitative process, by necessity.) At some point, it’s no longer going to be easy to make a living by making more things statistical. I’d predict that (like other industries) it’s going to have a long period where there’s no chance to make it as a plain old R-cruncher, but the mathematically sophisticated few will have long careers.

    But: we’re not there now. The impression I get is that fairly *basic* machine learning skills are still highly valued in the most technical industries, and unheard of in many other industries. The quantitative revolution has a ways to go before it becomes mature, and a lot of the time statistics aren’t competing against deeper insights but against habit and prejudice.

  15. Senthil Gandhi says:

    Reminded me of this quote, I will just leave it here:

    Pure logical thinking cannot yield us any knowledge of the empirical world; all knowledge of reality starts from experience and ends in it. Propositions arrived at by purely logical means are completely empty of reality. -A.Einstein.

    • That’s why I said ONE data point :) And very rich/illegible/fertile singletons at that (=narratives).

      Agree with the Einstein quote, though in this case, philosophers say it better (eg. Gilbert Ryle vs. Descartes).