Lies, Caffeinated Lies, and Operating Systems

This is a guest post by Tim Herd.

Computer science is not about computers. It’s about computation, a much wider subject. Creating abstractions, essential representations of things, be they objects, processes, ideas, and manipulating those representations. Manipulating these representations and letting their movements inform and power the outside world. These representations are organized in the computer, but there’s no law saying they have to be. The organizational principles and structures are more fundamental, and can be applied to anything. A cafe, perhaps?


Right now you’re reading this on a computer, and that computer is running an operating system. Windows 10, macOS, one of a billion different linuxes. But what is an operating system?

Modern operating systems do a million things, but their fundamental job is to lie to programs. Each and every program running on your computer thinks it is the only program running on the computer. Programmers like me write programs assuming that no other pesky programs will get in the way. It’s the Operating System’s job to make sure the farce is believable.

The operating system crafts a believable farce by deftly managing the various programs, matching up the resources available with the programs that need them, on the fly, so that the programs never notice there’s anything strange at all. The job of the operating system then, heavily abstracted, is: Given a set of resources I can use to fulfill service requests, and a queue of service requests (each of which may have radically different needs), how do you match up the former to the latter, to make sure that all of the service requests get fulfilled in the most efficient manner possible.

For starters, we need to define efficient. There are many different ways you could, given different priorities. But generally, a weighted measurement of total wait time is used. That is, we want to minimize the total amount of time that service requests spend waiting before they are served. The weighting is used to prevent any one request from having to wait too long. We would prefer two processes each wait a short time, than one wait a long time.

Talking about resources and service requests in this way shows that this is much more abstract and general than just programs and processors. These patterns apply generally to any structure where some clients are waiting on a service resource to be served. Consider the coffee shop.

Starbucks and Tim Hortons are proprietary software architectures for fulfilling caffeinated beverage requests. Each is capable of running in a multi-processor, or multi-barista in the case of coffee shops, environment, typically supporting up to 8 independent processing units. They take in requests using an input queue, and fulfill them efficiently. How do you think these operating systems should work?

Well, the naive solution is always readily available: round-robin servicing. Each barista selects a request off the top of the queue. They fulfill the request, from start to finish. For example, if there are twenty requests in the queue, and four processors, the first four will be processed in full, from taking the order to giving the drink, and the sixteen behind must wait until this finishes before they can be served.

This is a simple queueing model, and it comes into play everywhere in society. The line at the bank, at most grocers, and at Tim Hortons, functions this way. Customers want to get served, they enter the line, and the ‘next available attendant’ serves them until they’re done. It’s a simple system to set up, but it has it’s problems. Consider the following:

There are eight people in line, and three baristas on duty. The first, that one elderly woman who can’t figure out how to write a cheque for a black coffee. Behind her, the father with eight children in tow who is really going to regret giving them all that sugar. Finally, someone with 37 gift cards, each with 15 cents left. You check your watch repeatedly before phoning your wife and telling her you’re going to be half an hour late coming home. You’re stuck waiting behind long-running requests.

How can we do better on this? We can start by separating the coffee service into separate tasks. Specialization worked for capitalism, why can’t it work here? In serving a customer, a few discrete states exist, with well-defined transitions between them. A customer comes up. Their order is taken. Their payment is processed. Their beverage is prepared. Their completed product is delivered. Each of these steps is an independent phase of service that could theoretically, be partitioned. The Tim Hortons model bundles them up together, a simple, honest process. But what happens if we don’t?

Before we can improve on this process, we’ll need to figure out which lies to tell. Computers lie, after all, and whether they’re silicon-and-electron computers, or human-and-coffee computers, the principle is the same. These lies collectively make up the interface to our program. They instruct us as to how to interact with the system. They need not correlate with any specifics at all, so long as the lie is convincing enough that we can rely on it.

In the coffee system, there is only two lies we really have to maintain. One: serve people in the order in which they’ve arrived. Two: process orders as they come in, and don’t stop until they’re done. As you’ll see in a second, we can gain a lot of efficiency if we could violate these requirements, but due to human nature, they are non-negotiable. However, we have the next best thing. We can lie.

Going back to the long-running request example, there are five very quick requests stuck behind three very long requests. If only we could serve the five first, a lot of waiting time would be saved. Doing that, however, would re-order the line, and that’s not acceptable. Is there a way we can re-order the line in practice, while maintaining the lie?

There is, and it’s the first efficiency gain that Starbucks enjoys. The service in Tim Hortons is what’s known as a synchronous response. You order your coffee, and you wait there until it comes. You expect that your interaction starts and ends in one chunk. But Starbucks doesn’t do this. Starbucks operates asynchronously. At Starbucks, you enter the line, and you are processed in “order”. But processing, at Starbucks, only means taking your order and payment. At Starbucks, once you have paid for your drink, you exit the line. You do not wait for your drink to be made. Instead, you go do something else (say, getting a table) while the rest of your order is processed in the background. When it’s done, they alert you and you come get the drink.

This is an asynchronous process, in the sense that the cafe’s actions are not synchronized with your own. Crucially, this lets us re-order the orders without anyone feeling like they have been cut in line. Orders are taken in order, but are not necessarily fulfilled in order. This empowers staff to make point optimizations, such as pouring five black coffees before making three pumpkin spice lattes.

Additionally, it represents a pure improvement over wait times. In the synchronous mode, when you’re waiting in line, it is a busy-wait. You are not able to use that time for anything else. In the asynchronous mode, while you are waiting you are free to use your time for other things.

With our first lie told, it’s time to look at the second. Do we really need to process each order as a unit? Can we get more efficiency gains if we break this constraint? Can we convincingly lie about it?

The second question is the easy one: yes we can. Once someone has left the coffee line, they are not paying attention. In Tim Hortons, we could not easily tell this lie; the customer is interacting with their dedicated barista the entire time. At Starbucks, thanks to lie #1, the customer is safely distracted while waiting for her drink. Behind the counter is a black box to her.

The how is more complex. There are two main strategies that Starbucks uses for this. The first is what is known as ‘preemptive scheduling’. This is a catch-all term for task management strategies that allow interruptions. Going back to the five and three example: How do the baristas know to wait for your fast orders before starting the slow orders, when they don’t yet know what you are ordering. The preemptive scheduling strategy is that they start on the elaborate drinks, and when they get the order for a quick drink, they pause, context-switch, and make the black coffees. The employees can use their discretion to preempt long running drinks to get a few done quickly.

The second strategy Starbucks uses is a concept called pipelining. This is very similar to the idea of mass production, factory conveyor belts, etc. Say there are four baristas on shift at Starbucks. One may be dedicated to the cashier role. The other three may specialize at different stations within the coffee preparation process. Perhaps one is dedicated to brewing coffee, while the other two operate on the more elaborate elements of the process.

This gives us efficiency gains due to specialization. There are fewer context switches, one person can focus on doing one thing more effectively, and there is an efficiency gain as a result. This also creates well-defined break points. After all, it’s just an extension of the first lie. We’re bringing asynchrony behind the counter. As different processes take different amounts of time, baristas can be freed up to do other things while they wait. They can quickly pick this up, as the work needing to be done is discretely parcelled out into small, well-defined chunks, as would be necessary to reap the benefits of specializing on tasks in the first place.

The end result of all this lying is a much more efficient caffeination machine. Using our measurement of minimum total waiting time, we’ve done much better than Tim Hortons. But this has come at a cost. Minimizing that time required re-ordering the service requests. Just as computer OSs do all sorts of things while maintaining simple lies, we were able to re-order requests, while maintaining the fiction that we did not.

And this is an important gain. Machines are often better at accomplishing things than humans. But people often have great difficulty operating within the requirements of a machine. By lying our way into a human-friendly interface, we have improved everybody’s experience. People are served much more rapidly, and with fewer busy-waits. The cafe can prepare many more drinks, with fewer staff members, saving time and money.

Operating systems function over computer hardware, fabricating interfaces out of whole cloth to facilitate the smooth function of a complex machine. Social systems function over human hardware, fabricating narratives out of whole cloth to facilitate the smooth function of a complex society. This applies not just to cafés, but each and every human system, fractally, at every level. Let’s learn from the lies, cross-pollinating between systems, to build better, faster, stronger, more efficient systems.

Get Ribbonfarm in your inbox

Get new post updates by email

New post updates are sent out once a week

About Guest Contributor

Ribbonfarm is a longform blog featuring a variety of themes and perspectives. One-off contributions are published under this Guest Contributor account. Contributors with 2 or more posts have their own bylines, and are listed here


  1. I love this post. I’ve spent quite a bit of time recently thinking about and researching “technomimicry”, and your analogy here is superb.

    Do you think the concept could include personal growth and development? Specifically, can technomimicry help us understand how an individual might operate at higher and higher levels, sustainably?

    I think so. Technologically, the focus would be on boosting performance (the sustainability catch: do it optimally and without overheating).

    One way to boost performance is by “overclocking”; a term used to describe the act of resetting hardware to run at a higher speed than the factory settings. This line of thinking is a tempting locus for productivity gurus. They try to reset their hardware with productivity hacks, or habit-forming regimens. There can be benefits, but they are rarely optimal and technology explains why.

    On its own, overclocking provides little boost because it can only go as far as the software allows it. It doesn’t matter if your computer can process 3 billion calculations per second if your software only asks to process 2 billion calculations per second.

    Instead, overclocking is most effective as a response-to-stimulus. When updated software outperforms the hardware it’s run on, overclocking allows that hardware to adapt to the new requirements (up to a limit).

    With this understanding, a syntax begins to emerge. In order for overclocking to provide an optimal outcome, it must be in response to updated software.

    The software of our minds are things like our perceptions, biases, or psychological stages. Updating these areas can provide the foundation for overclocking optimality.

    But we still have a problem. If we’ve updated our software and overclocked our hardware we still run the risk of overheating. Colloquially, we call this burnout, and it’s the reason gyms are busier in January than they are in July.

    The technological answer is simple: cooling. Electric fans are efficient for heat dispersion and can be the final step in sustainably boosting performance.

    The answer is more difficult for individuals. Finding your fan may require eating your shadow. Not easy, but once you find it you are well on your way to sustainable personal growth.

    P.S. I’ve been to both cafes and while I’ve never been impressed with how long it takes to deliver my black coffee at Starbucks, I realize it’s probably a structural labor issue (college age baristas incapable of asynchronous action) not the operating system.

    • I relate this to personal development a lot. I think there’s value, both practically and philosophically here.

      Practically, I believe that I’m eg. more productive at work, and more resilient in my personal life, because I can engage in technomimicry to build skills and habits. For a random example, when I have to sort physical things in the real world, I typically do some kind of merge sort. Or, actually, insertion sort, because in the real world shifting a list is often O(1), making insertion sort super efficient.

      Philosophically, I’ve been thinking about the recent hubbub about fake news, and why it doesn’t bother me. My flippant response is “I’ve been telling you it’s all fake for five years now, this is old news” but that’s not quite right. I’ve been thinking about it now from an OO vs FP perspective. Pre-fake-news, our news came with all sorts of implicit context with it. This is the instance variables/etc. Things as mundane as “reporters are reliable” to as complex and abstract as “Fox News has a Republican bias”. But the demands of scaling mean we have to embrace functional programming, and that means that state is the root of all evil. Now, when we read the news, it’s a message to be handled, an argument to our function. It is what it is, only what it is, and nothing more than what it is. We no longer get reliable context to our news, _at all_. It’s up to me to contextualize it properly, because the only data I get is the data that was passed in when they called my function.

  2. Going the way you described above, viewing everything as abstract representations doesn’t we reach a break point? In case of machines, it would damage software/application and hardware in extreme cases. But what about humans we we apply this abstraction. Isn’t it going to be very psychologically depressive and physically exhaustive?

  3. This is a heheh moment. The mobile app ordering for Starbucks is screwing up the yield of their cheatin’ lyin’ OS