Sapir-Whorf, Lakoff, Metaphor and Thought

“What is thought?” is a question that is foundational by any reasonable measure. The best short answer I have found so far is “thought is conceptual metaphor,” and it is one of the enduring regrets of my life that it took me so long to encounter this answer. An undergraduate friend (hi there Max!) introduced me to George Lakoff and his notion of conceptual metaphor just as I was finishing up my PhD, and it radically altered my thinking (and my thinking about thinking, a.k.a. philosophy) from that point on. I can only wonder how different my life would have been if I’d read Metaphors We Live By as an undergraduate. So here is a discursive introduction to these ideas.


I was educated to believe that the fundamental historical schism in modern cognitive science was the one between behaviorism and the cognitive school, which came to a head with Chomsky’s review of Skinner’s Verbal Behavior (1959). That particular drama though, however you interpret it, is actually rather superficial.

The truly interesting yin-yang dynamic in the study of language (and as you will see, that implies the study of thought itself) is between the relative roles of universal and local components of thought. Chomsky’s “Universal Grammar” plays the former role. The role of “local component” is played, I will argue, by Lakoff’s idea of conceptual metaphor, which I’ll explain in a minute. The foundation of conceptual metaphor, though, is a somewhat controversial (and in a strict form, untrue) statement called the Sapir-Whorf hypothesis, one articulation of which (Whorf, 1940) reads as follows:

We dissect nature along lines laid down by our native language. The categories and types that we isolate from the world of phenomena we do not find there because they stare every observer in the face; on the contrary, the world is presented in a kaleidoscope flux of impressions which has to be organized by our minds — and this means largely by the linguistic systems of our minds. We cut nature up, organize it into concepts, and ascribe significances as we do, largely because we are parties to an agreement to organize it in this way — an agreement that holds throughout our speech community and is codified in the patterns of our language […] all observers are not led by the same physical evidence to the same picture of the universe, unless their linguistic backgrounds are similar, or can in some way be calibrated.

Stated as baldly as above, the hypothesis is actually not much deeper than the idea that language is a matter of social convention (an angle which has been studied by philosopher David Lewis in Convention: A Philosophical Study). In other words, arbitrary conventions which arise as a matter of historical accident and path-dependence determine language, which in turn shapes thought.

If this concept is new to you, it will take some thought to get past the superficial contradiction and develop the right mental model of this particular yin-yang. Roughly, the synthesis is similar to that of the interplay of nature and nurture in genetics (the former defines the potential, the latter the expression thereof in a particular environment). You could say that Chomsky tells us what sorts of thoughts we could think, while Sapir-Whorf hints at what types of thoughts we do think.

This level of resolution seems to satisfy many linguists and computer scientists (correct me if I am wrong, but roughly the current consensus in the NLP community is that human languages are describable by “mildly” context-sensitive grammars).

An Example

Let’s look at an example to highlight some issues. I’ll use English and Hindi, two languages I understand, to highlight some aspects of thought-as-language. I recently wondered about the Hindi equivalent of the common English word waste.

Despite racking my brain and consulting a Hindi-English dictionary, I could come up with nothing that means precisely what the word means in English. There are related words: kuda or kachda (garbage), banjar (wasteland), nasht (destruction), barbaadi (ruin), bekaar (useless), anavashyak (unnecessary) and fazool (superfluous).

Even adjusting for the fact that Hindi (and Urdu) have much smaller, slower-growing vocabularies, it seems curious that such a basic word does not exist. The clue lies in the fact that our modern sense of the term waste is something like “to not use/use suboptimally a valuable commodity or resource, and thereby allow it to degenerate.” If you check out the English etymology of “waste” though, you’ll find that older uses of the word “waste” were closer to the set of related Hindi words I’ve listed. Usages like to lay waste, or ruin or waste away comprise the origins of the term.

Though I can’t prove it, a reasonable hypothesis is this: the modern sense of “waste” relies on a Protestant/Puritan ethic and an industrial cultural context that has a general mental model of “manufactured, value-added resource” and “optimal resource utilization.” Older cultures have specific notions of “waste-able” resources like land, and their associated notions of loss involve natural or destructive-intent processes (such as ruin or decay). In an industrial context, there is an added sense of hard-won added value being lost. In other words, waste is mainly a waste of effort today.

The implication here is that a very complex environmental factor (an industrial culture) can affect the meanings of individual words.

But the basic linguistic analysis of this example is unsatisfactory. Chomskyan notions of grammars have nothing to say at all. Sapir-Whorf merely implies that perhaps modern English speakers can think thoughts involving the concept of “waste” (not the word) that Hindi speakers cannot. To this, I can testify; I cannot think certain thoughts in Hindi that I can in English. The converse also happens, but less often (a larger, faster-growing language does not necessarily subsume a smaller, slower-growing one, just as the set of all even numbers below 1000 does not subsume the set of multiples of 5 under 500).
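The non-subsumption remark is easy to check by brute force. Here is a minimal sketch in Python, with the two "vocabularies" modeled as sets of positive integers; the variable names are mine and purely illustrative.

```python
# Toy model: each "language" is just a set of positive integers.
evens_below_1000 = {n for n in range(1, 1000) if n % 2 == 0}
multiples_of_5_below_500 = {n for n in range(1, 500) if n % 5 == 0}

# Does the larger set contain every member of the smaller one?
larger_subsumes_smaller = multiples_of_5_below_500 <= evens_below_1000

# Members of the smaller set that the larger set lacks (the odd multiples of 5).
only_in_smaller = sorted(multiples_of_5_below_500 - evens_below_1000)

# Members common to both sets (the multiples of 10).
shared = multiples_of_5_below_500 & evens_below_1000

print(len(evens_below_1000), len(multiples_of_5_below_500))
print(larger_subsumes_smaller, only_in_smaller[:3], len(shared))
```

The larger set has roughly five times as many members, yet about half of the smaller set lies outside it. Size alone says nothing about containment.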

Lost in Translation

The problem of translation sheds further light on the relation between thought and language. Imagine a pair of languages to be represented by a dynamic Venn diagram with a large, fast-growing circle (English) intersecting a smaller, slower-growing one (Hindi). Given the non-subsumption remark, and the fact that the evolution of language is a creative-destructive rather than monotonic process, perhaps we should imagine growing, morphing fractal shapes, but we’ll let that go for now. Would it be fair to say that pieces of text involving only words in the intersection are perfectly translatable both ways? This is obviously silly. The effect of a piece of text is the collective impact of a set of words and their nearby connotations, within a cultural context. Here is an example, a verse from a popular 1950s Hindi song:

Sab kuch seekha humne, na seekhi hoshiyari
Yeh sach hai duniya waalon, ki hum hai anadi

which translates roughly to:

I learned everything, but I did not learn street-smartness
It is true, oh people of the world, that I am a naive bumpkin

This is what I would call a partially idiomatic translation — a subjectively judged mix of direct translation, and translation of the original’s inferred intent, that makes use of idiom and metaphor. Let’s look at a couple of pieces in detail.

Zooming in on hoshiyari (rendered above as street-smartness): the Hindi-Urdu word derives from the stem hosh, which means both “alert” (in the medical sense of “lucid” as well as the sense of “watchful”) and “conscious.” Behosh is “unconscious” and Hoshiyar! is equivalent to the Western military call Attention! Hoshiyari as an abstract noun denotes a mix of intelligence, pragmatism and cleverness. You’d apply it to a precocious child, or in the sense of “too clever by half” to an adult. A Hindi speaker, then, would find evoked mentally a conscious sense of “clever/worldly-wise” and a subconscious web of associations stretching to concepts like “conscious” and “lucid.” Clearly the closest direct translations (such as “clever”) wouldn’t work. You need an English idiom to capture the Hindi word, because you are trying to capture the impact of a web of connotations around a particularly leaky, ambiguous term.

Something similar holds for anadi (rendered above as naive bumpkin). The closest direct word I can think of is ingenue, but this explicitly refers to a female sufferer of naivete, and has the added disadvantage of a particular cultural prototype (a young French girl being introduced to Paris, say). Anadi, on the other hand, has the prototype of a bumbling male simpleton, often a rural migrant to the Big City. Something like “country mouse” would work very well, but the reference to a very specific Western folk tale has its own problems, hence the decision to go with a clumsy compound non-metaphoric phrase, naive bumpkin.

A word also has irrelevant connotations, via similar-sounding but unrelated words. Anadi (sometimes spelt anari) sounds similar to anar (pomegranate). A Hindi-speaking listener to the original, therefore, might have a subconscious sense of “pomegranate” which an English-speaking listener to the translation would not.

So, to go back to our original Venn diagram, given any set of words in the intersection, the webs of associations and non sequiturs radiating out from original and translated versions will inevitably leak all over, leading to very different effects in the two languages. Translation then, is a matter of keeping the webs as contained and close as possible. In my limited experience (I have done a couple of serious Hindi-English translations like this one), the balance between metaphoric/idiomatic and literal that best captures the original intent as perceived by a single individual is a delicate one that depends a LOT on the piece under consideration. Add the fact that you generally want to use syntactic structures of similar compactness in translation, to generate comparable rhythms and cadences on both sides, so nonverbal effects harmonize, and translation begins to appear impossible.

But it gets worse. Meaning, remember, is constructed by an individual listener with a history and a mental model of the world, not by an abstraction like “English Speaker.” We’ll get to that level of detail after reviewing conceptual metaphor.

Conceptual Metaphor

Ask yourself this: in the previous example, why did a more ambiguous construct (the metaphor street-smart) work better to contain the original meaning of hoshiyar than a nominally-close direct translation like “smart”? One key is that the metaphor anchors to a non-linguistic sensory experience (a swaggering person on a street — perhaps English speakers think of John Travolta in Grease, while Indians think of Aamir Khan in Rangeela) that constrains meaning more powerfully than abstract words could.

Lakoff and Johnson, in their seminal Metaphors We Live By, turned that sort of tiny, trivial observation into a powerful theory of how thought and language work. Their crucial first step was to distinguish visible, obvious metaphors, or figurative metaphors, from more systematic, large-scale and linguistically less-visible entities they named conceptual metaphors. “Inflation is up” is an example. “Up” is a geometric notion that relates to space and inflation is an abstract financial quantity. The statement involves not one but two conceptual metaphors: a spatial-orientational one (“Up”) and a material-expansion one (“inflate”). The breakthrough achieved by Lakoff and Johnson was in realizing that these are not a minor, subtle and exceptional third category beyond literal and figurative language. Their work showed that such conceptual metaphors account for practically all language, and also extend to vast realms of non-linguistic (such as mathematical) thought. In the Lakoff-Johnson approach, you describe a conceptual metaphor with an “X is Y” title. Here are parts of their opening examples (Chapters 1 and 2 of MWLB; you can find many more in this repository):


ARGUMENT IS WAR

Your claims are indefensible
He attacked every weak point in my argument
I’ve never won an argument with him
You disagree? Okay, shoot!


TIME IS MONEY

You’re wasting my time
I don’t have the time to give you
I’ve invested a lot of time in her
He is living on borrowed time

These are very different from figurative metaphors (“He was a lion in the battle” or “The architecture was very musical”). They are at once more subtle, broader and more systematic in scope, and fundamentally, not about language at all. MWLB points out that this is the normal way English speakers talk about arguments and time. These are not poetic or extraordinary uses. Moreover, these are not necessary ways of talking about arguments or time (though I believe that some conceptual metaphor is usually necessary, just not any specific one). MWLB offers up a quick analysis of the first one: “Imagine a culture where an argument is viewed as a dance, the participants are seen as performers, and the goal is to perform in a balanced and aesthetically pleasing way.” The fact that we can imagine an alternate possible world where arguments are understood differently tells us that ARGUMENT IS WAR is not the only way to structure thinking about arguments.

Conceptual metaphor is a very complex concept. MWLB covers very fundamental ones (such as spatio-temporal, ontological and causation metaphors) that appear in every language, as well as more localized, less fundamental ones. It is hard to define the idea, but here is one of the better articulations (Chapter 1):

The essence of metaphor is understanding and experiencing one kind of thing in terms of another.

The key attributes of conceptual metaphor are systematicity (concepts maintain their relationships in a gestalt sense), incompleteness (ARGUMENT IS WAR and ARGUMENT IS DANCE highlight different subsets of the framework of concepts and relations called ARGUMENT, and neither is complete), and sensory nature (the conceptual metaphor generally maps a more abstract domain to a more sensory one). There are also much deeper aspects, like the distinction between metaphor on the one hand and generalization and abstraction on the other.

It is not the intent of this piece to provide a tutorial on conceptual metaphor. You could try some online material, such as the Wikipedia entry, but there is really no substitute for reading Metaphors We Live By. But we can get a sense for the profundity of the idea simply by looking at the prolific output of Lakoff and his collaborators since MWLB. In books like Don’t Think of an Elephant and Moral Politics the ideas of metaphoric frames and narratives are explored in the context of political discourses. In Philosophy in the Flesh the ideas are ambitiously applied to the entire field of philosophy of mind.

But perhaps most interesting for people like me is Where Mathematics Comes From, which extends these ideas to an examination of mathematical thought. If you thought mathematics could be anchored in sensory thought only to the limit of 3D geometry, think again: all mathematics is driven metaphorically. Why do you “plug” things into equations? Why do you “move” terms in equations, or “group” them, or “crank through” a derivation, when all that is physically happening is repeated writing of related symbolic sentences? Even at a very superficial level, a MATHEMATICS IS MECHANICAL MANIPULATION metaphor works very well. You can get much deeper, all the way to a beautiful explanation of Euler’s very abstract, apparently far-beyond-intuitive-grasp equation,

e^(πi) + 1 = 0

using an appropriate application of conceptual metaphors at work (explaining the equation above metaphorically is the tour de force bit of WMCF).

Beyond Conceptual Metaphor

The idea of conceptual metaphor goes far beyond its linguistic roots, once you understand that it is a way of talking about mental models. We organize our sensory experience, starting with what William James called the “blooming, buzzing confusion” perceived in childhood, using some very basic pre-linguistic conceptual categories, relations and intuitions of causation and dynamicity. On this base is constructed layer after layer of inter-related models of parts of experience. While conceptual metaphor manifests itself most vividly in language, it also manifests itself in mathematics, geometric thinking, abstract visual thinking (every graph you ever drew or saw is a metaphor), narratives and storytelling, and modal/subjunctive reasoning about possible worlds. It is central to sophisticated decision-making. It is also foundational to the thinking style that is loosely referred to as “right brained,” which explains why Lakoff looms large in Dan Pink’s A Whole New Mind. Conceptual-metaphoric thought is also fundamental to ideation.

Not all these directions have been thoroughly explored, but there is definitely no shortage of evidence that conceptual metaphor is a foundational element of thought. Even my original reductive definition, thought is conceptual metaphor, doesn’t seem too hasty once you examine the full scope of ideas we are talking about.

You can even get beyond humans to computers. Kenneth Iverson’s seminal 1979 ACM Turing Award Lecture, “Notation as a Tool of Thought,” while not explicitly about metaphor, is essentially a Lakoff-and-Johnson for constructed symbol systems used as programming or mathematical languages. Another interesting and unusual domain where the ideas of conceptual metaphor (understood in the broader sense of “mental models”) shed light is the analysis of modern “silo” languages and their inter-relationships, and the role played by modeling and language in getting, for instance, accountants, managers, lawyers and marketers to understand each other in a modern corporation. This issue has been explored in the fascinating (and mostly non-technical) article, “On languages for dynamic resource scheduling problems.”

The Last Frontier: One-to-One Communication

Ultimately, languages viewed as larger cultural entities are relatively easy. I’ll wrap up by revisiting the question I raised earlier. What happens when you acknowledge that meaning is constructed by an individual listener with a history and a mental model of the world, not by an abstraction like “English Speaker”? Analyzing communication at its atomic, 1:1 level is so hard that it has led to statements such as Wiio’s law: communication usually fails, except by accident (a commentary on this is to be found here). This observation goes beyond the trite observation that all models (conceptual-mental or mathematical) are finite and therefore incomplete.

One-to-one communication is nothing less than the interplay of two dynamic, evolving entities. It is not about the transfer of specific intended bits of information from one locus to another. It is more like billiard balls influencing each other via collisions. Yes, bits get transferred, but the intended bits getting transferred mostly happens by accident. Even toy examples are subtle. Take a trivial example: if in December I tell you “It is snowing in Rochester” I am not transferring the one bit of information “SnowRochester=ON,” I am more likely provoking the thought “This guy is an idiot who states the obvious” or “Hmm…an invitation to a casual conversation/social ritual.” Even if the bit IS substantive (like saying, in December, “It is sunny and 80 degrees today in Rochester”) the sentence will likely provoke the thought, “Hmm, there is an unseasonable weather pattern…global warming?” rather than the overt predicate being transmitted.

Two speakers of the same language are very different information processing systems. They have different vocabularies (both conceptual and linguistic) arranged in very different mental models of the world.

Really, what happens when A speaks a sentence to B, is that B’s internal mental model of the universe, or “information state” gets updated. This is the idea underlying the relatively modern field of dynamic semantics, which I recently learned about through a friend completing a PhD in the philosophy of language. This closely parallels the model of cognition implicit in the belief-desire-intention (BDI) approach in the philosophy of action and AI, and also parallels the basic conceptual model of control theory.
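To make the “information state gets updated” idea concrete, here is a toy sketch in Python. It assumes one common textbook simplification: a listener’s information state is a set of possible worlds, and hearing a sentence eliminates the worlds where it is false. The atom names and helper functions are hypothetical, chosen only to echo the Rochester example; this is not a full dynamic-semantics system.

```python
from itertools import product

# Hypothetical atomic facts, for illustration only.
ATOMS = ("snowing_in_rochester", "speaker_wants_small_talk")


def all_worlds(atoms):
    """Return every truth assignment, each world being the set of atoms true in it."""
    return {
        frozenset(atom for atom, value in zip(atoms, values) if value)
        for values in product([False, True], repeat=len(atoms))
    }


def update(state, atom):
    """Eliminative update: keep only the worlds in which `atom` holds."""
    return {world for world in state if atom in world}


# Listener A knows nothing yet; listener B already knows it is snowing.
listener_a = all_worlds(ATOMS)
listener_b = update(all_worlds(ATOMS), "snowing_in_rochester")

# The same sentence, spoken to both listeners.
a_after = update(listener_a, "snowing_in_rochester")
b_after = update(listener_b, "snowing_in_rochester")

print(len(listener_a), "->", len(a_after))  # A's state shrinks
print(len(listener_b), "->", len(b_after))  # B's state is unchanged
```

Even in this crude model, the effect of a sentence depends on the state it lands in. For listener B the overt content changes nothing, so whatever B takes away from the exchange has to come from somewhere other than the literal bit.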

Which should explain why, if there is a single most foundational idea behind much of my writing on this blog, it is conceptual metaphor.


About Venkatesh Rao

Venkat is the founder and editor-in-chief of ribbonfarm.


  1. Very interesting !! Looks like MWLB will be the next book I’ll pick up.

    I think the same kind of concepts apply in music. I doubt if there is anything absolute about the experience that a certain piece of music can give to a listener. There might be a few obvious things like echo= large space, fast = energetic, slow = mellow, though I have seen even these break down. The emotional experience upon listening to a certain piece of music is a result of a large number of associations that are drilled into us through TV, movies, ads, the meaning of the lyrics in the song, etc. I doubt if a raaga or scale has any kind of absolute emotional content which is independent of the built-up associations in a particular cultural setting.

  2. I had a pretty interesting experience today related to this entry. I have been making an effort to learn German since I’m surrounded by German speakers. I had learned the German word for “slide” (Powerpoint slide) the night before and decided to use it. I had to tell the person in charge of all the slides that one of mine was misplaced. I started speaking in German, “My slide is…” but I froze because I could not think of a way to describe my problem while the sentence was booming in my head in English. My listener stared at me very expectantly because he knew I had started a metaphor of the form “X is Y”. In this case, “X = my slide” and “Y = something” which he was expecting to replace with whatever I had to say. The state of mind described by “something”, almost like allocation of memory, is a strange one.

  3. If I can speak two languages fluently that may imply I can think in two different “conceptual metaphor” systems. Meaning, I could potentially understand more of the world than a person whose conceptual metaphor (thought) is limited to one system. Extrapolating from that, if I have lived in two different societies my conceptual metaphors are more advanced than if I simply speak two different languages. Frankly, the more I think about it I realize that it is just a more fancy way of saying “I am the sum of my experiences”. I am not sure Immanuel Kant would agree though. It also raises another interesting question, what comes first? Thought or Experience? Does one have to experience something to be able to think? The idea of conceptual metaphors is interesting, however, it does not appear to explain thought that is not colored by metaphors. A very young child would still have thought, one that cannot be classified as Conceptual Metaphor. Milk is what it is, it becomes “Doodh” much later in the child’s life. A Rose by any other name still smells as sweet. No?

  4. Hi Venkat,
    very interesting post again! A lot of food for thought.

    Regarding “going beyond language”, have you ever tried reading “Finnegans Wake” by James Joyce? Definitely a big “buzzing confusion”! It seems like pure madness at first, impossible to read but then somehow it has a big effect on your subconscious (for example I recall all dreams when picking the book up again) and you realise what Joyce is trying to write down.

    Mathematics/music: many of the great mathematicians were also excellent (classical) musicians. There are links for sure (patterns, interactions, etc)


  5. If you like reading linguists feud, check out Pinker vs Lakoff.

  6. anupama says:

    your articles are both ”long” and ”deep” because of the way you hyperlink concepts, ideas and new information…also ”wide”, i guess, in terms of what all they can trigger…
