
Steven Pinker

How the Mind Works

Nonfiction | Book | Adult | Published in 1997

Chapter Summaries & Analyses

Chapter 2 Summary: “Thinking Machines”

At the beginning of Chapter 2, Pinker takes us through a thought exercise: If we met an alien, “what would it have to do to make us think it was intelligent?” (61). Pinker summarizes the criteria for intelligence as acting according to rules grounded in some relation to reality or logical inference, pursuing a goal or desire, and using those rational rules to pursue the goal in different ways depending on the obstacles in place. A being that does not meet these criteria is not intelligent. Magnets, for example, obey physical laws in moving toward one another, but they can’t be said to “want” to be touching, and they can only move in a straight line. If an obstacle appears, they remain on either side of it, as close as they can physically get; they do not attempt other means to get closer to each other.

 

These rules bring up the question of why people choose the goals they choose. Pinker points out that we still use common sense and intuitive psychology, our internal sense of how our mind and other people’s minds work, to understand why people behave the way they do. We do not look to models or algorithms. Many of our best theories of how the mind works still involve some leap of common sense—a step that isn’t backed by math or molecular movements but instead by what we intuitively know about how the system works.

 

The mind is often thought to be separate from the brain, sometimes linked to the soul or to a magical type of tissue that we haven’t found. However, the best theories of the mind come from the idea that the brain’s interactions produce the mind, almost like a substance in and of itself. It is not the physical structure of the brain that gives rise to the mind but its pattern of activity. You can build a very simple information-processing machine (a Turing machine) that takes a set of symbols (numbers, for example) and uses any well-defined set of rules or operations to produce a solution in the form of a new set of symbols. Pinker discusses how the mind can process information and even create new information. In one example, a machine is given a set of statements about a family tree. The statements list all parents and siblings, but there is no representation for aunt or uncle. The goal is to find out whether someone is an uncle. The machine knows that an uncle is a male sibling of a parent, so it must find the person’s parents and then check whether the candidate is a male sibling of one of them. It can then store the uncle fact, adding to the information available without further processing. The computational theory of mind takes this approach to intelligence: that it is a form of computation. However, human intelligence can also handle partial and probabilistic information, registering the likelihood of a variety of outcomes rather than handling only information that necessarily produces the same outcome from the same input.
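To make the “find the uncle” example concrete, here is a minimal sketch of that kind of rule-based processing. The family facts, names, and the is_uncle helper are illustrative assumptions, not details from the book.

```python
# A minimal sketch of rule-based inference over family-tree facts: the system
# stores only parent and sibling facts plus one rule that derives "uncle"
# from them, and it caches what it infers.

parents = {"alex": {"pat", "sam"}}                 # child -> parents
siblings = {"pat": {"bob", "eve"}, "sam": set()}   # person -> siblings
male = {"bob", "sam"}                              # who is male

uncles = {}                                        # cache of derived facts

def is_uncle(candidate, child):
    """An uncle is a male sibling of one of the child's parents."""
    if (candidate, child) in uncles:               # already derived, no processing needed
        return uncles[(candidate, child)]
    result = any(
        candidate in siblings.get(parent, set()) and candidate in male
        for parent in parents.get(child, set())
    )
    uncles[(candidate, child)] = result            # store the new fact
    return result

print(is_uncle("bob", "alex"))   # True: bob is a male sibling of pat
print(is_uncle("eve", "alex"))   # False: eve is not male
```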

 

There are many criticisms of the computational theory of mind, and there are still many areas of the mind that we don’t fully understand and that are difficult to explain even within that theory. The paradox of logic presented in Lewis Carroll’s “What the Tortoise Said to Achilles” shows that in any inference system, at some point an action must simply be executed; the system cannot follow rules all the way down. Warren McCulloch and Walter Pitts introduced the idea that a neuron simply responds to the sum of its inputs, either firing or not firing: If the sum of inputs exceeds a threshold, the neuron fires; if not, it stays silent. McCulloch and Pitts showed that this firing and not-firing behavior could be equated to logical statements. An AND neuron needs input from both A and B to reach its threshold and fire; an OR neuron fires if either A or B provides input; and a NOT neuron fires when it receives no input and stops firing when it does. Each neuron is connected to many other neurons, and it is the pattern of firing and not firing that produces specific responses or larger behaviors. Those three operations alone can produce many logical statements and actions, but they are still not enough to cover all of human consciousness.
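A short sketch of McCulloch-Pitts-style threshold units can make the AND/OR/NOT description concrete; the particular weights and thresholds below are illustrative choices rather than values from the book.

```python
# Each "neuron" just sums its weighted inputs and fires (returns 1) if the
# sum meets its threshold; different weights and thresholds yield AND, OR, NOT.

def neuron(inputs, weights, threshold):
    return 1 if sum(i * w for i, w in zip(inputs, weights)) >= threshold else 0

def AND(a, b):
    return neuron([a, b], [1, 1], threshold=2)   # needs both A and B to fire

def OR(a, b):
    return neuron([a, b], [1, 1], threshold=1)   # either input is enough

def NOT(a):
    return neuron([a], [-1], threshold=0)        # fires only when the input is absent

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))
print("NOT 0:", NOT(0), "NOT 1:", NOT(1))
```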

 

For the human brain, the situation is even more nuanced, as neurons can respond to gradations of input. They can fire at different intensities to indicate the probability that something is true. Those probabilities build to indicate how likely it is that we should engage in a certain action and allow probabilities for other options to be considered. The network of neurons is also highly interconnected, meaning neurons that fire prompt many other neurons to fire. If those neurons also reach threshold and fire back, the neurons are reinforcing each other and engaging even more neurons in the event. The volume and complexity of options for input and output starts to look large enough to encompass human experience.

 

These networks of neurons are called auto-associators. Auto-associators are content-addressable memory, meaning they can receive input for any property represented by a neuron in the network and start activating the other neurons in the group. If those neurons also receive input for their properties, the entire network is soon active, allowing rapid identification of something in the environment. Another property of auto-associators is their “graceful degradation,” meaning they do not require everything to match exactly to summon the correct inference. If someone has a typo in a presentation, we do not get stuck, unable to comprehend anything until the typo is fixed. We use contextual clues to recognize small errors, fix them automatically ourselves, and move on in a useful direction. As a similarly useful property, auto-associators can perform constraint satisfaction. When we hear a word that could be several words because of similar pronunciations, we use clues from logic and similarity to settle on one. The human brain can apply multiple constraints, and the correct option is the one that satisfies them all.
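The content-addressable, gracefully degrading behavior described above is often illustrated with a Hopfield-style network. The sketch below is one such illustration and is an assumption on my part; the book describes the idea rather than this specific implementation.

```python
# A tiny Hopfield-style auto-associator: patterns are stored in the connection
# weights, and a noisy or partial cue settles back to the nearest stored
# pattern (content-addressable memory with graceful degradation).

import numpy as np

def train(patterns):
    """Hebbian learning: strengthen connections between co-active units."""
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for p in patterns:
        w += np.outer(p, p)
    np.fill_diagonal(w, 0)   # no self-connections
    return w

def recall(w, cue, steps=10):
    """Repeatedly let each unit respond to the summed input from the others."""
    s = cue.copy()
    for _ in range(steps):
        s = np.where(w @ s >= 0, 1, -1)
    return s

# Two stored "memories" over 8 binary features (+1 = property present).
patterns = np.array([
    [ 1,  1,  1, -1, -1, -1,  1, -1],   # e.g. "dog": furry, barks, ...
    [-1, -1,  1,  1,  1, -1, -1,  1],   # e.g. "cat"
])
w = train(patterns)

noisy_dog = patterns[0].copy()
noisy_dog[0] = -1                        # a "typo" in the cue
print(np.array_equal(recall(w, noisy_dog), patterns[0]))   # True: degrades gracefully
```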

 

The auto-associators also use patterns of activation to indicate categories and to associate multiple individual objects in the same category. They use the basic principle that if two objects have at least one similar property, they likely have other similar properties as well. The more similar properties they have, the smaller the category they inhabit. The final property of auto-associators is that they can learn. Learning involves changing the connection weights to change the firing pattern based on new information. For instance, as children, we often learn that an initial assumption about an animal—for example, that cats are dogs—is wrong. When we learn that we are wrong, we adjust the weights to accommodate that new knowledge and separate dogs and cats.

 

These auto-associators that can learn are called perceptrons, and perceptrons have one problematic flaw: They can’t handle an exclusive-or statement. If a neuron should fire when A or B input is present but not when both A and B are present, we essentially need two thresholds. To handle this situation, the system can include intermediate, “hidden” units. One hidden unit processes A OR B, responding whenever either input is active. Another hidden unit processes A AND B, firing only when both are present, and its signal inhibits the next unit. These hidden units send their signals to the output unit: If it receives both the OR signal and the inhibitory AND signal, it doesn’t fire; if it receives only the OR signal, it fires. This arrangement creates an exclusive OR using the same principles already discussed.
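Using the same threshold units as before, the hidden-unit construction for exclusive-or can be sketched directly; the weights and thresholds are again illustrative assumptions.

```python
# One hidden unit detects "A or B", another detects "A and B"; the output unit
# is excited by the first and inhibited by the second, so it fires only when
# exactly one input is present.

def unit(inputs, weights, threshold):
    return 1 if sum(i * w for i, w in zip(inputs, weights)) >= threshold else 0

def XOR(a, b):
    or_unit  = unit([a, b], [1, 1], threshold=1)    # fires if A or B
    and_unit = unit([a, b], [1, 1], threshold=2)    # fires if A and B
    # Output: excited by the OR unit, inhibited by the AND unit.
    return unit([or_unit, and_unit], [1, -1], threshold=1)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", XOR(a, b))   # fires only for (0,1) and (1,0)
```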

 

To move from simple constructions to more complex mental operations, we need the network to have a few more elements. Connectionism, proposed by David Rumelhart and James McClelland, argues that the simple networks already described account for human intelligence. The reason we are more intelligent than other creatures is that we have more hidden layers, allowing exclusive-or-type propositions, and more finely tuned connection weights from interacting with the humans around us (social learning). In general, connectionism views the brain as creating connections by using properties of objects and experiences rather than creating an individual representation of each experience. Instead of having a representation of each dog you see, you have representations of “furry” and “barks,” and when you see enough of the representations that fall under the category “dog,” you say that what you are seeing is a dog.

 

Despite having some explanatory power, connectionism runs into trouble with some standard examples of everyday thought. The first is the idea of the individual. If we are primed to use only properties to determine what something is, then how do we recognize individual people or dogs? In the standard model, in which we hold representations of many properties and the more properties we encounter the more fine-tuned our judgment becomes (e.g., from animal to mammal to cat), we would need a representation of something unique to each individual. Ultimately, we would have representations of everyone we encountered, contrary to the view of connectionism. As humans, we can go even further: We can distinguish identical twins.

 

Compositionality is another problem facing connectionism. Compositionality is the ability to create a representation in which the parts have meaning and the way the parts are put together also has meaning. The two sentences “The baby ate the slug” and “The slug ate the baby” use the same words but have very different meanings. How could we represent both meanings from the same words? One way is to have one neuron for each possible sentence, but that is impossible because the number of possible sentences far exceeds estimates of the number of neurons in the human brain. Instead, we represent the concepts and their roles in the sentence separately, and then another set of neurons, representing the complete sentence as a whole unit, fires after the concepts and roles have been represented.

 

Surprisingly, quantification is another problem for connectionism. As humans, we can talk about a specific man performing an action or man in general performing an action, and we can tell the difference between those two statements and many other statements requiring knowledge about how many individuals are involved. Connectionism struggles to explain this ability. For instance, Neal Cohen and Michael McCloskey trained a network to add two digits. They first trained it to add “1” to other numbers, and then they trained it to add “2” to other numbers. However, once it was trained to add “2,” it stopped being able to add “1” and could only add “2.” Avoiding this kind of overwriting requires the brain to hold information in mind, or anchor it in place, while still developing new ways to do something: It needs to retain the ability to add “1” and then add the ability to add “2” using the same basic machinery. David Sherry and Dan Schacter argued that humans have specialized memory systems for this purpose. Endel Tulving referred to “episodic” or event-based memory for specific events that have occurred in our lives or in the world and “semantic” or generic-knowledge memory for stable facts that are true about the world around us.
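The overwriting problem can be illustrated with a deliberately tiny model (my own toy example, not Cohen and McCloskey’s actual network): a single connection weight that must serve both the “add 1” and “add 2” tasks loses the first skill as soon as it is retrained on the second.

```python
# Catastrophic interference in miniature: one shared weight is fit to "add 1",
# then retrained on "add 2", which overwrites the first skill.

def train(weight, examples, lr=0.1, epochs=200):
    """Fit y = x + weight to (x, y) pairs by gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in examples:
            error = (x + weight) - y
            weight -= lr * error
    return weight

add_one = [(x, x + 1) for x in range(5)]
add_two = [(x, x + 2) for x in range(5)]

w = train(0.0, add_one)
print("after add-1 training: 3 + 1 ->", round(3 + w, 2))   # about 4.0

w = train(w, add_two)
print("after add-2 training: 3 + 1 ->", round(3 + w, 2))   # about 5.0: the old skill is gone
```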

 

Another area that humans handle quite well but connectionism can’t explain is fuzzy logic. We can handle categories with clear and distinct boundaries and inclusion criteria as well as categories that are fuzzy at the edges, with lots of examples that don’t share traits. We know that penguins are birds despite not being able to fly or sing. We know that vegetables come in many different textures, colors, and tastes, but we still associate “green” and “leafy” with vegetables most often. We tend to think of green and leafy vegetables as better examples of vegetables, but that doesn’t mean we don’t know a carrot is a vegetable. We carry clear-and-distinct and fuzzy thoughts about the same category in our heads at the same time, meaning we are representing this information in some clear and organized way even though it would appear to be very disorganized. Humans can create rule systems that supersede the similarities used by the auto-associators and group things based on explanations. Humans can carry around multiple rule systems in their heads and apply them as appropriate, knowing that in some respects two items are very similar and in others they are very different.

 

Pinker discusses the various views of consciousness, including consciousness as a synonym for intelligence, as self-knowledge, as access to information, and as sentience. For self-knowledge, people typically define consciousness as “building an internal model of the world that contains the self” (134). Self-knowledge is also the easiest version of consciousness to model: Even programs report back on their status and function, and if we have representations of other people, we can have representations of ourselves.

 

For access to information, we can see that humans can report anything they sense and perceive and anything they can hold in memory, but they can’t report on everything happening internally, like stomach acid or neurotransmitter levels. There are, therefore, different pools of information and different levels of access to that information. These versions of consciousness can be extended to computers, which know about themselves and the functions of things to which they are connected. They can report information on a wide variety of self- and other-knowledge. Finally, sentience encompasses the subjective experience of the world, the idea of “if you have to ask, you’ll never know”. This version of consciousness is the part that can’t be explained in models and appears almost magical.

Chapter 2 Analysis

Chapter 2 starts by asking two straightforward but difficult questions: Can a mechanical device ever duplicate human intelligence, and would a human-like machine be considered conscious? These two fundamental questions are important both because we are still puzzling through them for humans (let alone robots) and because they are at the core of what it is to be human. Intelligence and consciousness are concepts we readily identify but struggle to put into words. In discussing intelligence, Pinker notes three elements that start with a base of processing information. This definition helps him build a picture of intelligence starting from a very basic ability: identifying things in the environment. The basic building block of the brain’s information-processing abilities, the neuron, is simple. Neurons fire when they receive stimulation that passes a threshold. They are “yes/no” homunculi that are on or off. Despite their individual simplicity, they can combine to create complex thoughts. For identifying things in the environment, a few neurons that fire when specific properties are perceived by the senses (fur, barking, four legs) will reinforce each other until you conclude that you are looking at a dog.

 

These basic building blocks of the human mind will show up in each chapter and represent how Pinker builds his arguments, moving from the most basic element of the process to the complex thoughts and experiences we have as humans. The mind is therefore a large network of very simple elements, and it is the connections between these elements and their patterns of activity that produce the complexity we take for granted.

 

As with all Pinker’s arguments, there are several limitations to explore. In the case of connectionism and the idea that the mind is really the pattern of activity of very simple, yes/no homunculi, there are still issues with ideas like compositionality, quantification, and fuzzy logic. These ideas aren’t explained perfectly by the connectionist view, and Pinker describes them but does not solve them in this chapter. Instead, he hints at important ideas that will come up later. First, the brain has developed specific systems to handle these issues. For quantification, we need to be able to hold things in mind and know that there are both individual people and aggregated people. For individuals, we need to be able to hold on to one individual who has their own thoughts, feelings, and actions, and another individual who has the same properties (walks upright, large head, symmetrical body) but separate thoughts, feelings, and actions, without losing either of them and while always being able to distinguish the two.

 

Connectionism may not be able to handle this idea on its own, but memory as an ability of the human mind can. Similarly, for compositionality and fuzzy logic, we have an interesting organization scheme in our heads that allows us to use the same words in different configurations to mean very different things and allows us to group objects in our environment in multiple categories based on similarity with other objects, explanations for how the objects behave, and cultural categories taught to us. These additional traits of the human mind are supposedly built on the basic structure already laid down: neuronal networks that can adapt to the current environment.

 

One important idea regarding the human mind is that it is “appropriately flexible.” A theme throughout the book is that the human mind is flexible in just the right ways. The example given in this chapter concerns reading in different typefaces. If you saw a new typeface that was very elaborate and a little difficult to read, you would have to spend some time figuring out what all the letters looked like in that typeface. However, once you figured it out, you’d be able to read all the words you already knew, and you’d already know their definitions, their relations to other words, and how to put them in a sentence. You wouldn’t have to relearn all those aspects of a word just because it was printed in a different typeface. Similarly, if you learned a new word, such as wapiti, another name for an elk, you would be able to immediately apply everything you know about elk to wapiti.

Knowing that this is how we experience our representations gives us clues about how those representations work. They are general enough to be applied flexibly, and at least some aspect of the physical experience of the information determines how the representation works. If we are looking at the word elk, we can read it in different colors or sizes and use another word (wapiti) to mean elk. The fact that we are interested in the word elk and not the large, docile being “elk” means that our representation is focused on writing, semantics, and the physical structure of the word. If we wanted to discuss the being “elk,” our representation would instead be flexible across all relevant spaces (e.g., an elk is an elk in Africa or in North America, on a hill or by a river). We would not try to apply the same flexibility used for the word elk to the being “elk,” but both representations are appropriately flexible.

 

This appropriate flexibility implies that at some point we learn, or somehow have inherent in the system, the ability to categorize things along an almost infinite number of dimensions. We can place animals in categories for what they eat, how they look, their evolutionary tree, the word used to reference them, and more. Categories are fully flexible and functional; we can call on them at any moment, and we can switch between them seamlessly. This flexibility is most interesting because it is appropriate: If we were too flexible, we would never access what we need in a timely manner. The idea of appropriate flexibility is a key piece of Pinker’s arguments about the mind in general.

 

The computational theory of mind runs into many criticisms that Pinker argues are not valid. The first is that when we describe a computer “talking to” a printer, we are simply being imprecise with our language. However, spoken language is simply a system of symbols we can use to convey an idea, and in the case of the computer and printer, the computer’s code is pinging the printer with input that should produce the output of the printer printing the requested document. If this process doesn’t work, then either the computer is sending the wrong input or the printer doesn’t have the code that is supposed to respond to that input. Either way, the system is operating like “talking,” and so we use that phrase.

 

A second argument is that the computational theory of mind would require homunculi, little men, to “look at” internal representations and process them appropriately. However, bits of code are essentially very simple homunculi that execute when the appropriate input comes through and remain dormant when it isn’t present. A series of these “yes/no” homunculi could trigger the reactions necessary to do many basic things, including the “find the uncle” example used in the chapter. How do we decide that a symbol or bit of code represents something? In humans, there could be a physical connection, such as your representation of your mother being tied to seeing and talking to her regularly. This connection is constantly reinforced, producing the basic knowledge of what and who your mother is. Another way to create meaning and representations is by collectively agreeing on what a mother is. Once society has determined what a mother is, we can use that inference to reason about our own mothers and build information about mothers from our experiences.

 

A more elaborate criticism of the computational theory of mind came from John Searle, who created the Chinese Room thought experiment to show that the theory is not accurate. In the Chinese Room, there is a man who does not know Chinese. Someone slips pieces of paper with squiggles on them under the door and instructs him to create other squiggles and slip them back under the door. The man does not realize it, but he is answering questions posed in Chinese, in Chinese. We can’t say that the man understands Chinese, and yet he gives the appearance of understanding Chinese. Searle’s thought experiment can be refuted because of the homunculi discussed above and the speed at which the translation happens. For anyone who speaks any language, your brain essentially takes language input and creates language output that satisfies the question or statement. At the very base of this process are neurons simply responding to stimulation from other neurons. Those neurons don’t understand Chinese, but they are why you take the input and produce the output. The process of understanding Chinese involves lots of smaller, dumber pieces that build up to the ultimate outcome: a statement made in Chinese. These smaller, dumber pieces don’t understand Chinese themselves, just like the man in the room.

 

Searle also slows down the process of receiving input and producing output to the scale of reading a sheet of paper and writing out a response, but if the man could do those things in a fraction of a second, would we still argue he didn’t understand Chinese? The little pieces that make up our language system can do their tiny parts on a nanosecond time scale such that when they all build up together, we have rapid responses that were generated almost as if a man in a box received the input and produced the output without any idea of the bigger picture.

 

These criticisms share a common thread of showing that connectionism and the computational theory of mind don’t fully explain every aspect of how the mind works. They do not show exactly how we go from yes/no neurons to full spoken sentences, and they don’t show exactly how we categorize information or create mental representations. What they offer is a way that these complex behaviors could exist from what we see in the human brain. We see neurons, we see that they are connected, and we see activity, usually through how the neurons use oxygen. Connectionism and the computational theory of mind describe how we can go from these connected neurons to the experience of the mind we have: thoughts, behaviors, feelings, memories, and knowledge. Additional research must still puzzle through how all these things happen, but using connectionism and the computational theory of mind as frameworks helps guide research and organize findings into coherent ideas.
