Seth Godin’s Dip and Multi-armed Bandits

Seth Godin, whom I first discovered through his bestselling Permission Marketing, has made something of a specialty of writing compact, focused books around single clear ideas. His latest, a tiny little book called The Dip, is his most abstract yet, but it still fits the mold and develops a single punchy idea. The idea is this: there is a transient dip in the effort-to-returns curve of any worthwhile project, and deciding which projects to quit, and when, in terms of this curve, is a critical skill. It is almost too complex an idea for him to handle, but he just makes it. It is interesting to watch him arrive at a fresh insight into a problem that has a nearly half-century-long history in decision science, in the context of an academic model called the multi-armed bandit. His original take provides an insight the bandit mathematicians never stumbled upon, but at the same time he seriously underestimates the complexity of the idea and the difficulty of operationalizing it.

Deciding how to invest resources among multiple projects is a key problem in decision science, and two sorts of trade-offs are generally recognized:

  • The trade-off between exploration and exploitation of options, otherwise understood as the trade-off between discovering new information and making use of the information you already have
  • The trade-off between getting the nonlinear returns that result when you concentrate your resources, and the risk of putting all your eggs in one basket (“critical mass” and “spread too thin” issues are special cases)

Not every situation features both trade-offs, but a vast variety of decision problems (including stock selection, research portfolio management, marketing, troop concentration in warfare, and job hunting) can be framed almost entirely in terms of these two.

For reasons I'll get to in a minute, it is very rare to see both addressed together. With The Dip, Godin rushes in where angels fear to tread, and actually gets somewhere. Let's look at his (qualitative) model first, and then at the (more quantitative and technical) reasons he might have overreached, despite bringing a crucial new element to the party.

The Dip

The entire book is driven by a single visualization, which I’ve sketched below, with an addition of my own. We’ll call the whole black curve a Dip (capital D) and the trough in the middle the dip.

[Figure: The Dip. The black curve shows effort vs. returns; the red line, my addition, shows a linear milestone framing.]

His idea: projects that go after true high-value opportunities follow a particular effort/returns profile: you get some easy "beginner's luck" returns, you go through a transient dark phase, and then you get back on an increasing-returns curve. His reasoning is largely phenomenological and domain-specific. In manufacturing, for instance, the dip might represent the costs of tooling and of building a sales and product-launch team around a successful prototype. In software, a successful piece of demo software might need to be refactored and extensively rewritten to be industrial-strength (or even go through entire terrible versions, as with Microsoft's famous Version 3 effect). For PhDs, it might be the extended ABD ("all but the dissertation") phase many go through, between the exciting and rewarding original research and the actual degree. In blogging, it might be persevering through the long slump between the exciting initial phase of writing online and getting popular (most new bloggers give up within a couple of months; I just crossed the two-month mark, and last month was horrible, so I am hoping my dip is over). Chances are, the dip profile repeats recursively, fractal-like, within each phase and continuing to the right, but let's not nit-pick. This is a good first-order abstraction.

The point he makes with the curve is this: the dip is the cost you must pay to dominate the opportunity the project is going after. If you power through and win, it becomes the barrier to entry (moat?) and your competitive advantage. So far, nothing new. Here is the new part: in life you (as an individual or organization) are going to play with a whole bunch of different projects, abandoning some and completing others. As critical as it is to quit or persevere with the right things, it is equally (perhaps more) important to do so at the right time. If you are going to quit, quit before the dip.

The pathological behavior patterns, in his view, are either never to power through the dip for anything, or to habitually descend into dips at enormous expense and quit just before hitting the rewards.

He makes a second, less powerful point about non-Dips, which he calls cul-de-sacs. These are opportunities with a return profile that flattens out and goes nowhere, or diminishes (there are obvious links to diminishing-marginal-utility models in economics, but let's not digress). These, he claims, should be avoided at all costs.

If you think this is all obvious, you'd be surprised how many planning processes fail to recognize such effects. Many management charts, for example, plot a linear effort/returns curve and mark milestones along it, like the red line in the picture (we do that where I work, at Xerox). There's a related diagram called the Takanaka diagram that encourages a similar framing error. Even basic nonlinear plots (such as, say, a logistic S-curve) are rare, barring the famous exponentially-increasing returns curve of the Internet-boom era. Economists would probably just shove an initial zero-returns phase before a growth phase to model startup costs, but the initial baby hump captures a crucial real-world effect: most real projects get off the ground through a proof case, pilot, or early win.
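To make the contrast concrete, here is a minimal sketch of the three framings: a Dip-shaped cubic, the naive linear milestone line, and a logistic S-curve. The parameterization is my own toy choice, not anything from the book.

```python
import numpy as np
import matplotlib.pyplot as plt

effort = np.linspace(0, 5, 200)

# Toy Dip: a shifted cubic with a "beginner's luck" hump at effort = 1,
# a trough (the dip) at effort = 3, and increasing returns after that.
dip = (effort - 2) ** 3 - 3 * (effort - 2) + 2

linear = 1.2 * effort                              # naive milestone-chart framing
s_curve = 10 / (1 + np.exp(-2 * (effort - 2.5)))   # logistic S-curve framing

plt.plot(effort, dip, "k-", label="the Dip")
plt.plot(effort, linear, "r--", label="linear milestones")
plt.plot(effort, s_curve, "b:", label="logistic S-curve")
plt.xlabel("effort")
plt.ylabel("returns")
plt.legend()
plt.show()
```

Any cubic with one local maximum followed by one local minimum will do; the point is just that neither the linear nor the S-curve framing has the beginner's hump.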

So the Dip is a surprisingly sophisticated curve. It meets Einstein's "as simple as possible, but not simpler" criterion. In a minute, I'll tell you why decision scientists haven't, to my knowledge, thought about this curve yet.

So Godin has definitely made a conceptual contribution. Where he errs is in asserting (without believable proof) that once you've framed the decision in terms of the dip, quit/persist decisions are actually easy. All you need to do, he says, is (a) quit cul-de-sacs the moment you recognize them as such, and (b) make a habit of quitting Dip curves before the dip phase, and of persisting beyond the dip in your chosen battles.

So what can go wrong with such seemingly sane advice?

The Bandit and the Lanchester Game

Let's get back to the two trade-offs. Decision theorists have typically modeled the first one (exploration/exploitation) using a metaphoric slot machine. A regular Vegas-style slot machine (a "one-armed bandit" in the lingo) is expanded, in this thought experiment, to have multiple arms. The trick: you don't know the return statistics (mean/variance) of each arm. You have a limited budget, say $100. How do you allocate it so that you develop a good estimate of the return statistics of each arm, yet still have enough left over to invest in the "best" one (highest mean payoff if variances are not an issue; your chosen risk/return poison if they are)?
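For the curious, here is a minimal simulation sketch of that allocation problem. The arm means, the Gaussian noise, and the epsilon-greedy rule are all illustrative assumptions of mine; epsilon-greedy is a common textbook heuristic, not the optimal policy discussed next.

```python
import random

TRUE_MEANS = [0.3, 0.5, 0.8, 0.4]   # hypothetical arm payoffs, unknown to the player
BUDGET = 100                        # a hundred one-dollar plays
EPSILON = 0.1                       # fraction of plays spent exploring at random

pulls = [0] * len(TRUE_MEANS)
totals = [0.0] * len(TRUE_MEANS)

def estimate(arm):
    """Current sample-mean estimate of an arm's payoff."""
    return totals[arm] / pulls[arm] if pulls[arm] else 0.0

for _ in range(BUDGET):
    if random.random() < EPSILON or 0 in pulls:
        arm = random.randrange(len(TRUE_MEANS))          # explore
    else:
        arm = max(range(len(TRUE_MEANS)), key=estimate)  # exploit the best-looking arm
    payoff = random.gauss(TRUE_MEANS[arm], 0.1)          # noisy return sample
    pulls[arm] += 1
    totals[arm] += payoff

print("plays per arm:  ", pulls)
print("estimated means:", [round(estimate(a), 2) for a in range(len(TRUE_MEANS))])
```

A typical run concentrates most of the budget on the 0.8 arm while spending a controlled fraction on exploration; tuning EPSILON trades discovery against exploitation directly.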

This is a non-trivial problem, but, surprisingly, the trade-off can be solved optimally in terms of something called the Gittins index. This holds, though, only for the most basic case, where all returns are uncorrelated and independent, and where you have the important advantage of being able to invest one token at a time and observe a return sample each time.

Techie note: that one-token-at-a-time structure is important, and is one of the reasons bandits have been used to model everything from stock-picking to job hunting; bandits are also the mathematical basis for genetic algorithms and for certain approaches to global optimization problems.

I love the bandit model, and have used it in my research, but here's the bad news: almost any additional real-world feature will break the nice behavior of the problem and make it computationally intractable. There are versions with infinitely many arms, with dynamics (payoff statistics that change over time), with coupling (what you do on one arm affects returns on the unplayed ones), and with delays before payoff, and most of them are essentially impossible to solve. Unfortunately, most real problems map to these bandit-plus versions (many to an important one called the restless bandit).

What about the other trade-off, the one about concentration? The best-known model there is Lanchester dynamics, which was developed to model attrition in air combat in World War I. The basic idea is not too tricky: if army A, with 10 soldiers, is exchanging fire with army B, with 5, then army B will face a per-soldier attrition rate four times as high as A's, because twice the firepower is being concentrated on half as many targets. The model doesn't actually work out of the box for any real situation, but it captures the idea of resource concentration well enough that you can do useful war-gaming with it. Lanchester is used for things like planning marketing campaigns, not just war-gaming. You can probably figure out how to use Lanchester-like arguments to model "critical mass" effects.
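A minimal Euler-integration sketch of the square law makes the four-times claim concrete; the per-soldier kill rates here are illustrative assumptions.

```python
# Lanchester's square law:  dA/dt = -beta * B,   dB/dt = -alpha * A
alpha = beta = 0.1    # assumed per-soldier kill rates (illustrative)
A, B = 10.0, 5.0      # army sizes
dt = 0.01             # Euler time step

# Per-soldier attrition at the start:
print("A's per-soldier attrition rate:", beta * B / A)   # 0.05
print("B's per-soldier attrition rate:", alpha * A / B)  # 0.20, i.e. four times higher

while A > 0 and B > 0:
    A, B = A - beta * B * dt, B - alpha * A * dt

print("survivors: A = %.1f, B = %.1f" % (max(A, 0.0), max(B, 0.0)))
```

The square law's invariant (A squared minus B squared, for equal effectiveness) predicts A wins with roughly 8.7 survivors out of 10, which the simulation confirms: doubling your concentration quadruples your advantage.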

Now, you could actually put the two together to make Lanchester bandits (I built a silly video game a few years ago involving moving troops from one front to another), but beyond providing some fun, the combination yields tractable, easily solvable models in only the very simplest cases.

So what does this have to do with the dip? Well, think in terms of expected-returns curves (actual returns would be scatter points around them). Bandit models so far have mostly assumed linear expected-return curves. A bandit problem posed in Godin's Dip terms (call it a Godin Dipping Bandit) would, I conjecture, be really, really hard to solve mathematically. Here's a picture:

[Figure: Godin Bandits. A multi-armed bandit whose arms have Dip-shaped expected-returns curves.]

I think the reason decision scientists haven't worked on the Dip until now is that it seems arbitrary. If you are going to abandon linear returns curves, why not just develop a model around arbitrary curves? Why this particular one-max, one-min curve? But if the dip really is a phenomenologically useful model, it is well worth special attention.
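To see why the Dip shape matters to an allocator, here is a deterministic toy of my own construction (not a real dipping-bandit formulation): one arm with a Dip-shaped cumulative-returns curve, one flattening cul-de-sac, and two policies playing effort tokens against them. There is no noise and the curves are known, so this illustrates only the dip structure, not the exploration problem.

```python
STEP = 0.05   # effort tokens are played in small increments

def dip_marginal(x):
    # Marginal return of the Dip arm: derivative of (x-2)^3 - 3(x-2) + 2,
    # which is negative for 1 < x < 3 (the dip itself).
    return 3 * (x - 2) ** 2 - 3

def cul_de_sac_marginal(x):
    # Marginal return of the cul-de-sac arm: always positive, but flattening.
    return 1.0 / (1 + x) ** 2

def play(policy, budget=120):
    invested = [0.0, 0.0]   # effort sunk into each arm so far
    total = 0.0
    for _ in range(budget):
        arm = policy(invested)
        m = dip_marginal(invested[0]) if arm == 0 else cul_de_sac_marginal(invested[1])
        total += m * STEP
        invested[arm] += STEP
    return total

# Myopic policy: always play whichever arm looks better right now.
myopic = lambda inv: 0 if dip_marginal(inv[0]) > cul_de_sac_marginal(inv[1]) else 1
# Persevering policy: commit to the Dip arm and power through the trough.
persevere = lambda inv: 0

print("myopic total returns:      %.2f" % play(myopic))
print("persevering total returns: %.2f" % play(persevere))
```

The myopic player stalls at the lip of the dip (where marginal returns turn negative) and drifts into the cul-de-sac, ending up with a small fraction of what the committed player earns. Add noise and unknown curve parameters and you have the dipping bandit, which is where I suspect the intractability bites.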

A critical feature: the Dip captures critical mass issues (“enough resources to get through the dip”) without needing to bring in the complexity of things like Lanchester models.

But despite the neatness of the idea, I think the Dip is still too complex to be solved mathematically (I suspect any reasonable mathematical statement of a dipping-bandit problem will be NP-hard or PSPACE-hard), which means domain-specific heuristics will be needed to make effective decisions around dip curves. And I don't think these heuristics will work well enough to write self-improvement books about.

So I recommend the book and the idea, and suggest you keep it in mind qualitatively, but don't expect it to make your life significantly easier. Here are seven ways I think you'll see Dip-based decision-making break down:

Failure Modes of Dip-Based Decision Making

  1. You will generally find it hard to tell how far away the end of the dip is; because the upswing is so sharp, you won't know for a long time whether to expect one year or ten years of grit, making the difference between a Dip and a cul-de-sac a matter of faith for practical purposes.
  2. You will find that practical issues (such as needing a paycheck, or synchronizing with an annual or quarterly planning cycle to switch, start or drop projects) will often prevent you from quitting a Dip at the right moment.
  3. The beginner-phase returns slope is no indicator of the final returns slope.
  4. The length of the Dip will depend strongly on the resources available, and the dependence will not be easily computable a priori.
  5. While the Dip is probably a common returns profile, there will be enough exceptions (with, say, multiple humps and dips) that operating with a Dip mental model will get you into trouble fairly often.
  6. Whether something is a cul-de-sac or a Dip will depend on who's asking the question. Your cul-de-sac may be another person's Dip.
  7. Finally, the Dip is likely to be very hard to tell apart from the cul-de-sac even for a single decision-maker. We are capable of massive amounts of self-delusion about the returns we are really experiencing, and this self-delusion will add enough noise that we'll often quit or persist far away from the optimal points.

I'd be delighted to be proved wrong. If Dip decision-making turns out to be practically or mathematically tractable, life will be much easier for us all.

But still, keep the Dip, the Bandit and Lanchester handy. All three are great decision-making metaphors.


Comments

  1. Matt Gershoff says

    Interesting. Would it make sense to think of the 'dip' as a POMDP? Perhaps the belief states would be over the cul-de-sac, linear, and nonlinear states of each arm? Just a thought.

    Cheers

    Matt

  2. James Sellers says

    This rise-crest-rise model is mirrored in the investment strategy books written by William O'Neil. He describes it as a "cup with handle" pattern in stock price charts, and suggests getting in at the low point of the dip to ride the back end. A slightly different way of looking at the curve.

  3. When Seth's ideas and yours met, the result was a mixture of great discernment laid out in a distinctive explanation. I like how you carefully dissected all the information about The Dip. I've been hearing a lot about Seth, and I think it's really good to read a concrete account of why so many people read his books.

  4. My inference from the book is to accept your ignorance of the cul-de-sacs, and choose projects with shorter dips, which amounts to something like failing fast. Also, the true chart is the average of many dip-like curves, so creating regular, small wins keeps the minimum from getting too low.

    @James nice take on it – if you can identify the low point of others' dips, you can buy low. You still have the cul-de-sac uncertainty problem, though.