Take It Apart and See How It Runs
PAUL M. CHURCHLAND
Paul M. Churchland studied philosophy at the Universities of British Columbia
(Canada) and Pittsburgh, specializing in the philosophy of mind, perception, and
neuroscience as well as in epistemology and history of science. He earned his
Ph.D. in 1969 from the University of Pittsburgh. From 1966 onward he taught at
different universities in the U.S. and Canada and became full professor at the
University of Manitoba in 1979. In 1984, he moved to the University of California
at San Diego, where he has been Professor of Philosophy ever since.
Neural Networks and Commonsense
Why has San Diego become this center of cognitive science?
Partly it was coincidence, and partly it was design. It was coincidence that we had a very strong neuroscience community. That had been an underground research program for a long time, outside of mainstream Artificial Intelligence or psychology.
It was one of the respects in which good old-fashioned Artificial Intelligence was a failure. The contribution of standard architectures and standard programming Artificial Intelligence was a disappointment. Artificial Intelligence has a vital, central role to play in cognitive science, because we can experiment and explore the properties of neural networks more easily in artificial systems than we can in natural ones.
What, then, is the aim of Artificial Intelligence?
Its aim is to re-create and understand the cognitive abilities of living creatures generally. The old approach used the standard architectures and wrote programs. It was a wonderfully easy way to do interesting things very quickly, but it ran into trouble. We need massively parallel architectures if we want to re-create what biological creatures do simply and easily.
Now that we are attempting to do that, things are going to move very much faster. We've got to break through this wall, and in order to do so, we have to make artificial neural networks that are real networks, not simulated ones: VLSI chips that re-create parallel networks rather than trying to simulate them on standard machines. That is a breakthrough that needs to be made in order to continue to explore neural nets and their promise, because at the moment there is a size limit on how big a network can be simulated.
We need to get away from the idea that we are going to achieve Artificial Intelligence by writing clever programs. There is a sense in which neural networks manipulate symbols, but it is not the familiar sense. You can think of a neural network as manipulating its input vector. You can see the neural networks as symbol processors, but they are processing a very different kind of symbol from what we are familiar with, and they are processing them in very different ways from what we are familiar with.
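A minimal sketch of what "manipulating an input vector" can mean in practice; the weights, input, and layer size below are arbitrary values chosen only for illustration, not anything from the interview:

```python
# Illustrative sketch only: one way to read "a neural network manipulates its
# input vector" is that each layer maps a whole vector of activations to a new
# vector, rather than applying rules to discrete symbols one by one.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, W, b):
    """Map an input vector x to an output vector: weighted sum plus squashing."""
    return np.tanh(W @ x + b)

x = rng.normal(size=4)          # input vector (e.g., a coded stimulus)
W = rng.normal(size=(3, 4))     # connection weights, chosen arbitrarily here
b = np.zeros(3)

print("input vector :", np.round(x, 2))
print("output vector:", np.round(layer(x, W, b), 2))
```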
The old symbol processing paradigm will never go away entirely, and it should not, for it is useful for many things. But it is not going to be useful for understanding how natural intelligence does its job or for re-creating high-powered Artificial Intelligence.
In this case, it is so entertaining that it is going to keep the attention of philosophers, and there is so much methodological argument going back and forth that philosophers will play a useful role, but the interesting science will come from the theorists and the experimentalists. We can now learn things with our experimental techniques that we could not learn five or ten years ago. We are going to learn a great deal about how natural creatures do it, and we can import that into the artificial area. Another area of experiments will be to explore what neural networks can do by training them up on various kinds of problems and by using different kinds of learning algorithms.
Experiments of this kind have caused this explosion in the last few years within cognitive science.
Philosophers are supposed to learn the business of other disciplines. Some philosophers saw fundamental problems in the way old-fashioned Artificial Intelligence was working. They gave the traditional Artificial Intelligence lots of support. But just as traditional Artificial Intelligence has failed, much of traditional philosophy has failed, too.
People knew that Turing machines could, in principle, compute any function. Because even though in principle a Turing machine might be able to do what you or I or a squirrel or a mouse could do, it might be that when you are limited to that kind of architecture, it would take a Turing machine 10⁸⁹ years to do it. In principle it could, but it cannot do it in real time, it cannot come even remotely close to doing it in real time. The claim that architecture did not matter was a justified claim, because in principle it did not matter.
You could compute any function with an architecture like that or with a von Neumann architecture, to come a little closer to reality, but you could not do it in real time. If you want to do it in real time as mice and birds and creatures who have to live in the real world do it, then you have to go to massively parallel architectures. The architecture does matter profoundly, even to do the job at all. Part of the reason why the neural network paradigm went into eclipse is that it lost the early battle against standard program-writing Artificial Intelligence.
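The real-time point can be put as rough arithmetic. The figures below are common order-of-magnitude estimates, not numbers from the interview:

```python
# Rough, illustrative arithmetic for the real-time argument; the figures are
# textbook orders of magnitude, not data from the interview.
neuron_step_s = 1e-3      # a neuron takes on the order of a millisecond per "step"
reaction_time_s = 0.5     # many perceptual tasks finish in about half a second

serial_steps_available = reaction_time_s / neuron_step_s
print(f"Serial neural steps available: about {serial_steps_available:.0f}")
# Only a few hundred serial steps fit in the time budget, yet serial programs
# for perception need millions of instructions, so the brain must be doing
# enormous numbers of slow operations in parallel.
```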
The real breakthrough came when back propagation was invented, not because that is a particularly wonderful learning algorithm from either a biological or a technical point of view but because it was an easily applicable and ruthlessly efficient one. Once we had found an easy way to train neural networks, we soon discovered that they could do an amazing variety of things. It also allowed us to examine the networks once they were trained and to see how they were doing it. I am thinking here of the analysis of the hidden units, when we discovered the categorical structures that were spontaneously generated inside the network.
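As an illustration of both points, training with back propagation and then inspecting the hidden units, here is a small sketch using a standard textbook setup; the XOR task and the network size are assumptions made for the example, not the historical experiments:

```python
# Minimal sketch: a tiny network trained with backpropagation on XOR, after
# which we inspect the hidden units to see what internal structure the
# training produced. Training usually succeeds from a random start; rerun
# with a different seed if it stalls.
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4))            # input -> hidden weights
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))            # hidden -> output weights
b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(20000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: gradient of squared error, propagated layer by layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out
    b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h
    b1 -= 0.5 * d_h.sum(axis=0)

print("outputs:", out.ravel().round(2))       # should approach 0, 1, 1, 0
print("hidden-unit activations per input:")
print(sigmoid(X @ W1 + b1).round(2))          # the learned internal "categories"
```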
In the long run, back propagation will fade into the background and will be replaced by better learning algorithms, even in Artificial Intelligence. When we do "natural" cognitive science, back propagation will not be too important, but it was very important to make the breakthrough happen. It gave us a way to manipulate the networks to a degree we had never been able to before, and we learned exciting things very quickly. The claim that the architecture does not matter was connected with the claim that this was the way to overcome Cartesian dualism.
In your book Matter and Consciousness you propose eliminative materialism as a new theory of mind that is able to overcome the weaknesses of functionalism. They may be a completely false theory, or a superficial one. Maybe we can explain what beliefs are, and they may turn out to be a small part of our cognitive activity, but a real part. One argument against eliminative materialism is that it eliminates consciousness, too.
In the case of qualia, I am disinclined to be an eliminativist. Our current neural theory explains qualia surprisingly well. Qualia, namely the features of our sensations, can be explained in terms of these multidimensional spaces, and a particular quale is going to be identified with a particular vector. We will be able to explain qualia very nicely.
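A toy rendering of that state-space idea: each quale is treated as a point (vector) in an activation space, and similarity between qualia as distance between points. The three "channels" and the numbers are invented stand-ins, not real sensory coding data:

```python
# Illustrative sketch of the state-space picture of qualia; the channels and
# values are made up for the example and have no empirical standing.
import numpy as np

qualia = {
    "red":    np.array([0.9, 0.1, 0.2]),
    "orange": np.array([0.8, 0.4, 0.1]),
    "blue":   np.array([0.1, 0.2, 0.9]),
}

def quale_distance(a, b):
    """Smaller distance = more similar-seeming qualia on this coding."""
    return float(np.linalg.norm(qualia[a] - qualia[b]))

print("red vs orange:", round(quale_distance("red", "orange"), 2))
print("red vs blue  :", round(quale_distance("red", "blue"), 2))
```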
I don't think we are going to eliminate qualia. We keep qualia, but we have a deeper understanding now than before. In your book you mention that the mainstream is functionalism and the case for eliminative materialism is not so strong right now. What is happening is that resistance to the idea is fading.
When eliminative materialism first came up, we did not have a very clear conception of what an alternative framework would be. But neuroscience and connectionist Artificial Intelligence and cognitive psychology are starting to produce an alternative conceptual framework, and we can see what it might look like. So the idea is no longer so frightening.
You mentioned earlier that the first networks were beaten by traditional Artificial Intelligence.
I don't think the Turing Test is a very deep or useful idea. Of course we want to simulate what cognitive creatures do, and one can ask what a test for a successful simulation would be like. The Turing Test invites us to look at external things, to look at behavior. We should instead be looking at a test for adequate simulation that works inside the cognitive creature.
The Turing Test remained popular just as long as good old-fashioned
Artificial Intelligence could convince us of the idea that architecture does not matter, that it was external functional form that mattered. Of course the external functional form matters, but if you want to reproduce it effectively you have to look inside the machinery that is sustaining the cognitive structure. I am not impressed by the Turing Test. Another reason is that it does push aside an area of potential disagreement between people.
Asking directly whether a machine thinks immediately raises the question "Does it have thoughts inside?" People have religious views, metaphysical views, all sorts of views that make it impossible to discuss things effectively. The Turing Test has the advantage of locating the argument in an area where people can agree.
I think he is right in criticizing the view of traditional Artificial Intelligence, the idea that we can re-create intelligence by an abstract symbol manipulating machine. It is not a promising research strategy. If I were a good old-fashioned Artificial Intelligence researcher, I would not be much moved by Searle's argument.
What do you think about the so-called commonsense problem?
It is one of the main reasons why good old-fashioned Artificial Intelligence failed. It was not able to deal with the large databases that any normal human has, it was not able to deal with the problem of fast and relevant retrieval. It is dramatic that neural networks solve this problem at a blow. Because what you have in your head is a neural network, and what you draw on as a person is commonsense knowledge, and it is no accident that the two go together.
I think the projections that good old-fashioned Artificial Intelligence workers made are, in fact, going to be realized. They made them twenty years ago, and the discipline is famous for not having realized many of them, but the reason they were not realized was that they were tied to a hopeless architecture. Expert systems of the old-fashioned kind are going to be replaced by neural network expert systems. Now, all that data just begs to be put into a neural network, so that the neural network can be trained up on all these examples.
The input is a loan application, the output is a prediction about repayment behavior. Banks want to be able to predict repayment behavior. So somebody trained up a neural network on a large sample of past loan applications so that it would correctly predict repayment behavior. There is going to be an enormous market for neural networks.
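A toy version of that loan example, with invented feature names and synthetic data standing in for a bank's historical records; a real system would be trained on actual applications and outcomes:

```python
# Toy sketch of the loan example: synthetic applications in, a predicted
# probability of repayment out. Features and data are invented for illustration.
import numpy as np

rng = np.random.default_rng(2)

# columns: income (scaled), existing_debt (scaled), years_at_job (scaled)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 1.0])                 # hidden "ground truth" rule
y = ((X @ true_w) > 0).astype(float)                # 1 = repaid, 0 = defaulted

w, b = np.zeros(3), 0.0
for _ in range(2000):                               # single logistic unit, gradient descent
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= 0.1 * (X.T @ (p - y)) / len(y)
    b -= 0.1 * float(np.mean(p - y))

applicant = np.array([0.8, -0.5, 1.2])              # one new application
prob = 1 / (1 + np.exp(-(applicant @ w + b)))
print("predicted repayment probability:", round(float(prob), 2))
```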
This, however, is not very philosophical, and I am no better than anybody else in predicting these things. When we learn to build big neural networks, maybe as big as you and me, they will help us to automate the process of scientific exploration itself. I think that Artificial Intelligence will only become good enough in perhaps fifteen or twenty years to assist in thinking up new scientific theories, in helping us to do conceptual things. They will be able to do it 10⁷ times faster than we will, because if we make neural networks that work on electricity and not on slow-moving electrochemical impulses in neurons, then we will have a brain that thinks as we do, but 10⁷ times faster than we do.
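The 10⁷ figure can be checked with back-of-the-envelope arithmetic; the speeds below are standard order-of-magnitude values, not numbers from the interview:

```python
# Rough check of the "10^7 times faster" figure; both speeds are textbook
# orders of magnitude, used only for illustration.
import math

nerve_impulse_m_per_s = 30.0        # electrochemical conduction: tens of metres per second
electrical_signal_m_per_s = 2.0e8   # signals in wires: a sizable fraction of light speed

ratio = electrical_signal_m_per_s / nerve_impulse_m_per_s
print(f"speed ratio: roughly 10^{math.log10(ratio):.0f}")   # prints roughly 10^7
```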
AARON V. CICOUREL
Aaron V. Cicourel earned a B.A. degree in psychology from the University of
California at Los Angeles in 1951. He pursued his studies at UCLA (M.A.) and at
Cornell University, where he earned his Ph.D. in anthropology and sociology in
1957. His professional career was initiated with a postdoctoral fellowship at the
UCLA Medical Center and then led him to Northwestern University, the University of California at Riverside, Berkeley, and Santa Barbara, and finally UC San
Diego, where he has been Professor of Sociology in the School of Medicine and
the Department of Sociology since 1970 and Professor of Cognitive Science, Pediatrics, and Sociology since 1989. He has been a visiting lecturer, scholar, and
professor in numerous countries all over the world, including Argentina, Mexico,
France, Germany, Spain, Australia, Japan, and China.
Cognition and Cultural Belief
I first got interested in language and thought when I took my master's degree in 1955 at UCLA, where I was introduced to the work of Edward
These contacts stimulated me to go outside of sociology.
In the spring of 1954, I began a series of small research projects with
Harold Garfinkel that led to a serious interest in everyday reasoning and social interaction. My dissertation got a lot of people in sociology upset because I talked about a number of issues that were about language, social interaction, and reasoning among the aged. I found phenomenological ideas very useful, but I was bothered eventually by the fact that there was not much empirical research attached to that tradition. My preoccupation with the local ethnographically situated use of language and thought in natural settings was not always consistent with this tradition.
By 1958, at Northwestern University, I had already read Chomsky's book Syntactic Structures and George Miller's works on memory and language. I was also strongly interested in work by Roger Brown and his students on language acquisition, but I could not find anyone in sociology who cared about or gave any importance to these areas.
In 1969-70 I was attracted to work in linguistics and psychology in
In 1970-71, I received an NSF Senior Postdoctoral Fellowship to work on British Sign Language in England, where I replicated, with deaf subjects, work on hearing children's language competence and repeated some of this research in California. I moved to San Diego in 1971 and got in touch with the cognitive group in psychology and also with people in linguistics. At UCSD, an informal group interested in cognitive science began meeting, and this led to the development of a cognitive science doctoral program and the foundation of the first Department of Cognitive Science. It has been a bit strange for people in cognitive science to understand my social science background, but I sometimes remind them that my first degree was in psychology from UCLA, and this background made it easier to talk to colleagues in cognitive science than to many sociologists.
Following my early use of the work of Alfred Schutz on socially distributed knowledge, I became interested in Roy D'Andrade's and Edwin Hutchins's work on socially distributed cognition.
The two notions of socially distributed cognition and socially distributed knowledge overlapped considerably because cognition is always embedded in cultural beliefs about the world and in local social practices. These conditions tend to be ignored by social scientists.
What could be the reasons why social scientists ignore this fact and why they do not play an important part in cognitive science now?
In sociological social psychology, that is, primarily symbolic interactionism, a lot of emphasis is placed on what is situational without examining language use and information processing constraints during social interaction. It seems to me that this view also ignores the constraints that the brain has placed on the way you can use conscious and unconscious thought processes, emotions, and language, and the way that language places further restrictions on what you can or want to communicate about what you think you know or don't know. A central issue of human social interaction is the set of constraints that stem from cognitive processing. Looking at the role of social ecology and interaction requires a lot of longitudinal, labor-intensive research.
My point is that when you take for granted culture and the way it is reflected in a local social ecology you eliminate the contexts within which the development of human reasoning occurs. When culture, language use, and local interaction are part of cognitive studies, subjects are asked to imagine a sequence of events that have not been studied independently for their cultural basis or variation except for the self-evident variation assumed by the experimenter. But in everyday life, cognition and culture include locally contingent, emergent properties while exhibiting invariant patterns or regularities whose systematic study remains uneven. But what is complicated about cognitive science's focus on individual human information processing is what happens when two minds interact.
There are many constraints on how interaction can occur. For example, during exchanges in experimental settings and everyday life, problem solving can be influenced by how well subjects know each other as part of the local social ecology, how long they have known each other, the nature of the task, individual variation, and the socially distributed nature of cognitive tasks.
What about Suchman's Plans and Situated Actions?
She is obviously aware of environmental conditions and applies her knowledge of ethnomethodology and conversation analysis to issues in Artificial Intelligence and the situated nature of local task accomplishments. It needs to be linked to empirical research that people have done on children's cognitive and linguistic development. The most interesting parts for me are his intermediate reflections, where he brings in the cultural background as a horizon.
That tradition limits the extent to which the environment comes in. Dreyfus has been interested in the background notion because of his phenomenological perspective, but the relevance of a locally organized social ecology is not addressed as directly as can be seen in work by Alfred Schutz. Habermas, however, has been influenced by a broad range of theorists, including those interested in everyday interaction, such as G. H. Mead.
What I said earlier applies: sociologists avoid cognition.
What I have been trying to say is that the structure of social organiza-tion is important as a constraint, as well as how interaction is going to guide its reproduction. Social ecology puts limits on the kind of human cognition and information processing that can occur in a given physical space, at a given time for particular kinds of problem solving or decisions. On certain occasions, you can use only certain types of language, you can use only certain topics, you have to be seated in a certain place, and you have certain regularities about who gets to talk and when. This is the case in all social ecologies just as it applies to you and me sitting in this room.
Everything we do involves these constraints as well as the fact that social ecology or organization facilitates certain kinds of interaction and inhibits other kinds. Well, I see all of this as the interaction of different levels of analysis. I do not see why studying the brain is a problem. Researchers would like to know what neural substrates are relevant for cognitive functions while also examining and understanding these functions at their own level of analysis, despite not always having a clear picture of the neurobiology.
As for what the connectionists claim: do you think that it is an advance over Newell's and Simon's physical symbol system hypothesis?
I am interested in a more macrolevel bandwidth, such as decision making in such areas as caretaker-child interaction during the acquisition of cognitive, cultural, and linguistic knowledge and practice, medical diagnostic reasoning, and the role of expert systems that employ rules within a knowledge base but where the time frame can be highly variable. The problem of modeling an individual mind is so complicated right now that most modelers are not prepared to do more than that. But Hutchins and others are concerned with applying connectionist models to the cognitive interaction of two or more individuals. Many people make claims about Artificial Intelligence.
The key is that Artificial Intelligence modeling can help clarify basic mechanisms involved in cognition. The value of simulation, in my opinion, depends on the way it makes use of empirical research in a given domain. I think it is difficult to apply a Turing Test to sociology because the field is too diffuse and lacks adequate articulation between general theory and the concepts that research analysts explore empirically.
It is also difficult to apply a Turing Test because the notion of environment remains ambiguous and unexplored. The notion of commonsense suggests a contrasting term like logical thinking or rational choice. My colleague in this research is a clinician, and we look at the question of how you actually get the knowledge from the patient that is required to use an expert system.
He addresses the cognitive processes of users of artifacts that are not addressed adequately by their producers. The notion of knowledge can be examined as a process, a set of practices and mental models about the processes and what they are to accomplish. Let me assume that the physician is a knowledge engineer. The general point is that knowledge should be conceived as a process both internally and when instantiated in experimental or practical settings.
Is it right that you claim not only that cognition is underdeveloped in social theory but also that social scientists do not have experiments or ideas for experiments?
To my knowledge, none of them have examined everyday social interaction experimentally. The experimental work done by sociologists has been confined to what is called «small group research» by a subset of social psychologists at a few universities. The study of human affect, language, learning, and decision making has progressed primarily by its reliance on laboratory research and the use of sophisticated modeling programs. The laboratory and modeling research, however, seldom includes information about the behavioral ecology of humans in natural settings or the social ecology of the laboratory.
My recent concern with the acquisition of emotions and social interaction uses experimental data obtained by Judith Reilly, Doris Trauner, and Joan Stiles on children with and without focal brain lesions. One focus of this research is the notion that processing and expressing emotions requires an intact right hemisphere and that this hemispheric specialization may occur quite early in human development. In general terms, we seek a developmental explanation of human affect and facial expression and suggest that this development is a function of the role of neural substrates and the learning of pragmatic elements of verbal, paralinguistic, and nonverbal behavior. Yet, some caretakers seem to become sufficiently depressed in socializing such infants that this can eliminate the possibility of enhancing the infant's or child's ability to adapt to typical behavioral routines that occur in local social encounters.
My goal is to examine the extent to which experimental work on cognition presumes stable forms of social interaction the infant and child must acquire to be seen as «normal.» This social and communicative competence is thus necessary to be able to conduct experiments about the infant's or child's cognitive development.
It seems to me that any problems that emerge can be useful if they force us to question the way we use existing teaching systems, medical systems, legal systems, and so on.
Let me give you a sense of my own views about how current Artificial Intelligence is being applied.
My concern with medical diagnostic reasoning revolves around problems of actually implementing existing systems. I think traditional production systems and the use of neural network modeling capture the essential formal criteria used to arrive at an adequate diagnosis.
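For readers unfamiliar with the term, a traditional production system is a set of if-then rules fired against a working memory of findings. The sketch below is a generic toy illustration; the rules and findings are invented and have nothing to do with the diagnostic systems discussed here:

```python
# Toy illustration of a traditional production system: if-then rules fired
# against a working memory of findings. Rules and findings are invented for
# illustration and have no clinical standing.
findings = {"fever", "cough", "chest_pain"}

rules = [
    ({"fever", "cough"}, "suspect respiratory infection"),
    ({"chest_pain", "shortness_of_breath"}, "suspect cardiac problem"),
]

# a rule fires only when all of its conditions are present in working memory
conclusions = [then for conditions, then in rules if conditions <= findings]
print(conclusions)
```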
DANIEL C. DENNETT
Daniel C. Dennett was born in 1942. He earned his B.A. degree from Harvard
University in 1963 and his D.Phil. from Oxford in 1965. From 1965 to 1971, he
taught at the University of California at Irvine. He moved to Tufts University in
Medford, Massachusetts, in 1971, where he has been Professor since 1975. Since 1985, he
has been Distinguished Professor of Arts and Sciences and Director of the Center
for Cognitive Studies at Tufts University.
In Defense of AI
Before there was a field called cognitive science, I was already involved in it.
At that time, I knew no science at all. I was completely frustrated by the work that was being done by philosophers, because they did not know anything about the brain, and they did not seem to be interested. Just about that time, I learned about Artificial Intelligence. When I got my first job at the University of California at Irvine in 1965, there was a small Artificial Intelligence group there.
The paper he threw on my desk was Bert Dreyfus's first attack on Artificial Intelligence. Since it fit with my interests anyway, I wrote an article responding simultaneously to Dreyfus and to Allen Newell's view at that time.
I was very interested in the debate, and I was sure that Dreyfus was wrong. The people in Artificial Intelligence were glad to have a philosopher on their side.
From where did you get your knowledge about Artificial Intelligence and computer science?
I spent a year in Palo Alto working on philosophy and Artificial Intelligence. I still did not come out of that year very computer-literate, but I made some considerable progress. In the years following, I developed that much further, even to the point where, a few years ago, I taught a computer science course here.
It would be hard to find four more philosophical people in Artificial Intelligence than McCarthy, Pylyshyn, Hayes, and Moore. And it would be impossible to find two more Artificial Intelligence-minded philosophers than Haugeland and me.
Like most philosophers, Dreyfus tries to find something radical and absolute rather than a more moderate statement to defend. Some people in Artificial Intelligence, particularly people like John McCarthy, have preposterously overestimated what could be done with proof procedures, with explicit theorem-proving style inference. If that is what Dreyfus is saying, he is right. But then, a lot of people in Artificial Intelligence are saying that, too, and have been saying it for years and years.
Marvin Minsky, one of the founders of Artificial Intelligence, has always been a stern critic of that hyperrationalism of, say, McCarthy. Dreyfus does not really have a new argument if he is saying something moderate about rationalism. If, on the other hand, Dreyfus is making a claim that would have as its conclusion that all Artificial Intelligence, including connectionism and production systems, is bankrupt, he is wrong. There are things you cannot represent, like bodily movements, pattern recognition, and so on.
I understand that he says the human mind is not merely a problem solver, as the physical symbol systems hypothesis would imply.
So you think that one can combine the physical symbol systems hypothesis with the intuition that you cannot explain everything with a symbol?
What about connectionism? It criticizes the physical symbol systems hypothesis and seeks alternative explanations for the human mind. If you look at the nodes in a connectionist network, some of them appear to be symbols. Some of them seem to have careers, suggesting that they are particular symbols, say a cat node or a dog node. But it turns out that you can disable such a node, and the system can go right on thinking about cats.
Moreover, if you keep the cat and dog nodes going and disable some of the other nodes that seem to be just noisy, the system will not work. The competence of the whole system depends on the cooperation of all its elements, some of which are very much like symbols. At the same time, one can recognize that some of the things that happen to those symbols cannot be correctly and adequately described or predicted at the symbol level.
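The point about disabling nodes can be made concrete with a small sketch. The network below is random rather than trained, which is enough to show that no single hidden unit is indispensable while many units together are; the sizes and values are assumptions made for the example:

```python
# Illustrative sketch: in a distributed network, knocking out one hidden unit
# changes the output little, while knocking out many of them does real damage.
import numpy as np

rng = np.random.default_rng(3)
n_hidden = 50
W1 = rng.normal(size=(4, n_hidden)) / np.sqrt(4)
w2 = rng.normal(size=n_hidden) / np.sqrt(n_hidden)
x = rng.normal(size=4)

def output(disabled=()):
    h = np.tanh(x @ W1)
    h[list(disabled)] = 0.0           # "lesion" the chosen hidden units
    return float(h @ w2)

print("intact network      :", round(output(), 3))
print("one unit disabled   :", round(output(disabled=[0]), 3))
print("half units disabled :", round(output(disabled=range(n_hidden // 2)), 3))
```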
There is a big difference in theoretical outlook, but it is important to recognize that both outlooks count as Artificial Intelligence. They count as strong Artificial Intelligence. In John Searle's terms, connectionism is just as much an instance of strong Artificial Intelligence as McCarthy's or Schank's or Newell's work.
Both Searle and Dreyfus generally think that it could be a more promising direction. Let's follow Searle's criticism of strong Artificial Intelligence. He has refuted strong Artificial Intelligence with his Chinese Room Argument.
What do you think about that?
First of all, the Chinese Room is not an argument, it is a thought experiment. When I first pointed this out, Searle agreed that it is not an argument but a parable, a story. Searle's most recent reaction to all that is to make explicit what the argument is supposed to be. I have written a piece called «Fast Thinking,» which is in my book The Intentional Stance.
It examines that argument and shows that it is fallacious. Briefly, Searle claims that a program is «just syntax,» and because you can't derive semantics from mere syntax, strong Artificial Intelligence is impossible. But claiming that a program is just syntax is equivocal. One of the things that has fascinated me about the debate is that people are much happier with Searle's conclusion than with his path to the conclusion.
They did not care about the details of the argument, they just loved the conclusion. Finally I realized that the way to respond to Searle that met people's beliefs effectively was to look at the conclusion and ask what it actually is, to make sure that we understand that wonderful conclusion, Searle's conclusion.
I think Searle would classify that reply as the "systems reply," because you argue that nobody claims that the microprocessor is intelligent but that the whole system, the room, behaves in an intelligent way. Searle modified the argument in response to this. He says that the systems reply does not work because it is only one step further.
I had already been through that with Searle many times. Searle does not address the systems reply and the robot reply correctly. He suggests that if he incorporates the whole system in himself, the argument still goes through. But he never actually demonstrates that by retelling the story in any detail to see if the argument goes through.
I suggest in that piece that if you try it, your intuitions at least waver about whether there is anything there that understands Chinese. We have got Searle, a Chinese-speaking robot, who has incorporated the whole system in himself. If we tell the story in a way in which he is so engrossed in manipulating his symbols that he is completely oblivious to the external world, then of course I would certainly be astonished to discover that my friend John Searle has disappeared and has been replaced by a Chinese-speaking, Chinese-understanding person. If Searle had actually gone to the trouble of looking at that version of his thought experiment, it would not have been as obvious to people anyhow.
Sure, you could ask the robot questions, observe its behavior, and ascribe it intentionality, as you wrote in your paper. But to ascribe intentionality does not mean that it really has intentionality. There is not any original, intrinsic intentionality. The intentionality that gets ascribed to complex intentional systems is all there is.
When computers came along, we began to realize that there could be systems that did not have an infinite regress of homunculi but had a finite regress of homunculi. The homunculi were stupider and stupider and stupider, so finally you can discharge the homunculus, you can break it down into parts that are like homunculi in some regards, but replaceable by machines. They can get by with systems that are only a little bit like homunculi. If we try to do it with rules forever, we have an infinite regress of rules.
This is not an infinite regress.
HUBERT L. DREYFUS
Hubert L. Dreyfus was born in 1929. He studied at Harvard and held fellowships
for research in Europe. He taught at the Massachusetts Institute of Technology
from 1960 to 1968. Currently he is Professor of Philosophy at the University of
California at Berkeley.
Cognitivism Abandoned
Which brings me, by the way, to cognitivism, because representationalism is not just a thesis of Artificial Intelligence, it is a thesis of cognitivism as I understand it. It seems to me that Artificial Intelligence and then cognitivism as a generalization of Artificial Intelligence is the heir to traditional, intellectualist, rationalist, idealist philosophy. Wittgenstein's Philosophical Investigations was first published in 1953, shortly before Newell and Simon started to believe in symbolic representations.
Why do you think they took this approach in cognitive science and Artificial Intelligence?
So that, by the time Artificial Intelligence and cognitivists come along, it doesn't even seem to be an assumption anymore. It just seems to be a fact that the mind has in it representations of the world and manipulates those representations according to rules. Their idea was that you could treat a computer as «a physical symbol system,» that you could use the bits in a computer to represent features in the world, organize these features so as to represent situations in the world, and then manipulate all that according to strict rules so as to solve problems, play games, and so forth. The computer made it possible to make the intellectualist tradition into a research program just as connectionism has now made it possible to make the empiricist, associationist tradition in philosophy, which goes from Hume to the behaviorists, into a research program.
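A toy rendering of that "physical symbol system" picture, with an invented blocks-world fact base and one strict rule that derives new facts from old; the predicates and block names are assumptions made for the example:

```python
# Toy rendering of the physical symbol system picture: discrete symbols stand
# for features of a situation and strict rules manipulate them. The blocks-world
# facts and the single rule are invented examples.
facts = {("on", "A", "B"), ("on", "B", "table")}

def derive_above(facts):
    """Rule: if x is on y and y is on z, then x is above z."""
    ons = [f for f in facts if f[0] == "on"]
    new = {("above", x, z) for (_, x, y1) in ons for (_, y2, z) in ons if y1 == y2}
    return facts | new

print(sorted(derive_above(facts)))   # adds ("above", "A", "table")
```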
You mentioned several times the notions of cognitivism and cognitive science.
I want to distinguish cognitive science from cognitivism. All those disciplines are naturally converging on this question, and what they do is cognitive science. But cognitivism is a special view. As behaviorism is the view that all you can study is behavior and maybe all that there is is behavior, so cognitivism is the special view that all mental phenomena are fundamentally cognitive, that is, involve thinking.
Cognitivism is the peculiar view that even perception is really thinking, is really problem solving, is really manipulation of representations by rules. I think that there isn't any philosophical argument or empirical evidence that supports it. But that would simply leave cognitive science channeled in different directions.
What role does Artificial Intelligence play in these notions?
Artificial Intelligence is a good example of cognitivism, and it was the first version of cognitivism. It is an example of trying to make a model of the mind, of making something that does what the mind sometimes does, using rules and representations and cognitive operations such as matching, serially searching, and so on. So here is Artificial Intelligence, trying to do what the mind does, using computers in a special way, namely as physical symbol systems performing cognitive operations. So cognitivism is a development out of a certain subpart of Artificial Intelligence.
Do you think that this problem with commonsense knowledge has already been realized in the Artificial Intelligence community?
It totally evades the commonsense knowledge problem and does not gain any followers outside the Carnegie group. Then there is Lenat and his wonderful but, I think, hopelessly misguided attempt to program the whole of human knowledge into the computer. Then there are the people at Stanford who had «commonsense summer.» They were trying to formalize commonsense physics.
Artificial Intelligence has said that there has been a failure in symbolic
Artificial Intelligence. But there is nothing like a unified research program.
Either you have to go back to microworlds and take a domain like the blocks world or the domains of expert systems, which are so limited that they are cut off from commonsense knowledge and have only a very special domain of knowledge, or else you have to put in a vast amount of commonsense knowledge. For the time being, the problem is how you could get this vast amount of knowledge into that superbig memory. I would doubt that you could, because the most important part of commonsense knowledge is not "knowledge" at all but a skill for coping with things and people, which human beings get as they grow up.
Since that point, cognitivism and Artificial Intelligence look new, because now there are two competing research programs within cognitivism and Artificial Intelligence. That becomes another research strategy in Artificial Intelligence. That means now that cognitive science has two tracks. It has cognitivism, which is the old, failed attempt to use rules and representations, which takes up the rationalist-intellectualist tradition.
This looked like a discredited view while rationalists were running the show, but now it looks to many people like a very promising approach to intelligence. Already in 1957 Rosenblatt was working on neural nets. So «neural net Artificial Intelligence» might and, I think, will succeed in getting computers to do something intelligent like voice recognition and handwriting recognition, but no one would think of that as a contribution to cognitive science, because it will clearly not be the way people do it. Then there will be another part of neural network simulation that is a contribution to cognitive science and neuroscience, because it will have different algorithms, the sort of algorithms that could be realized in the brain.
Only now, when symbolic information processing has been given a chance and shows that it is not the way to go, when we have computers big enough to try to do neural nets again, will all sorts of associationist models come back into cognitive science.
I think Artificial Intelligence is a great failure, but I think also there are a few useful applications. When you ask an expert for his rules, he regresses to what he remembers from when he was not an expert and still had to use rules. That is why expert systems are never as good as intuitive experts. That leaves only two areas where expert systems will be interesting, and these are relatively restricted areas.
There is the PUFF system, but it turns out you can algorithmize the whole thing, and you do not need any rules from experts.
It turns out that the people doing the job of customizing mainframe computers had to make a lot of calculations about the capacities of components on the basis of the latest manuals, and so, naturally, machines can do that better. There were no intuitive experts who had a holistic view of configuration. Another example is the system that the Defense Department uses, called ALLPS, which was developed by SRI and was not meant to be an expert system. It was not developed by Artificial Intelligence people.
It is an expert system, however, in that it uses rules obtained from people who load transport planes. Two experts had to calculate for many hours to do it. Now, one program can do it in a few minutes. That's a real success, and it is real Artificial Intelligence, a real expert system.
Feigenbaum once said that expert systems would be the second and the real computer revolution. Feigenbaum has a complete misunderstanding of expertise, again a rationalist misunderstanding, as if experts used rules to operate on a vast body of facts that they acquire. Then expert systems would eventually take over expertise and wisdom, vastly changing our culture. Lots of people said, «Well, it will fail to produce intelligent systems in ten years, but it will produce lots of other interesting things.» I haven't heard any interesting things that have come out of the Fifth Generation, and it's been around for five or six years, I think.
JERRY A. FODOR
Jerry A. Fodor was born in 1935 and attended Columbia College, Princeton
University, and Oxford University. He holds a Ph.D. in philosophy from Princeton
University (1960). From 1959 to 1986 he taught at MIT, first as Instructor and
then as Assistant Professor (1961-63) in the Department of Humanities and as
Associate Professor in the Departments of Philosophy and Psychology (1963-69).
From 1969 onward he has been Professor in the Departments of Philosophy and
Psychology. In 1986, he became Distinguished Professor and in 1988 Adjunct Professor at CUNY Graduate Center. Since 1988, he has been State of New Jersey
Professor of Philosophy at Rutgers University.
The Folly of Simulation
I talked a lot with graduate students about psychology, and so I got involved in doing some experiments. For fairly fortuitous reasons, I picked up some information about philosophy, linguistics, and psychology. It turned out that my own career had elements of the various fields, which converged to become cognitive science, except computer science, about which I know very little.
In your opinion, which sciences are part of the enterprise of cognitive science?
If you are a cognitive psychologist and know a little bit about philosophy of mind, linguistics, and computer theory, that makes you a cognitive scientist. If you are a cognitive psychologist but know nothing about these fields, you are a cognitive psychologist but not a cognitive scientist.
So cognitive science is dominated by cognitive psychology?
Traditionally, cognitive psychologists have had the view of the world that experimental psychologists are inclined to. Cognitive science is cognitive psychology done the way that cognitive psychology naturally would have been done except for the history of academic psychology in America, with its dominant behaviorist and empiricist background.
The relation of Artificial Intelligence to cognitive science is actually fairly marginal, except insofar as what you learn from trying to build intelligent mechanisms, to program systems to do things, may bear on the analysis of human intelligence. You have to distinguish between that very profound idea, the central idea of modern psychology, which is endorsed by most cognitive science people and by most Artificial Intelligence people, and any particular purposes to which that idea might be put. Artificial Intelligence attempts to exploit that picture of what mental processes are like for the purpose of building intelligent artifacts. It shares the theory with cognitive science viewed as an attempt to understand human thought processes, but they are really different undertakings.
The project of simulating intelligence as such seems to me to be of very little scientific interest.
You don't do physics by trying to build a simulation of the universe. It may be in principle possible, but in practice it's out of the question to actually try to reconstruct those interactions. What you do when you do science is not to simulate observational variance but to build experimental environments in which you can strip off as many of the interacting factors as possible and study them one by one. Simulation is not a goal of physics.
What you want to know is what the variables are that interact in, say, the production of intelligent behavior, and what the law of their interaction is. Once you know that, the question of actually simulating a piece of behavior passing the Turing Test is of marginal interest, perhaps of no interest at all if the interactions are complicated enough. The old story that you don't try to simulate falling leaves in physics because you just could not know enough, and anyway the project would not be of any interest, applies similarly to psychology. The latter is the strategy of cognitive psychology.
You mentioned the Turing Test. If you don't think that simulation is the goal of cognitive science, then you are not likely to be impressed by the Turing Test.
I don't think there is any interesting answer to the question of what we should do in cognitive science that would be different from the answer to the question of what we should do in geology. The central idea is that mental processes are transformations of mental representations.
Do you think the question of intentionality does not matter at all?
The Turing idea does not answer that question and is not supposed to. That does not answer the question of what gives our thoughts truth conditions, and that is the problem of intentionality, basically. Turing's answer seems to me to be the right one, namely that the states that have truth conditions must have syntactical properties, and preservation of their semantic properties is determined by the character of the operations on the syntactical properties. Turing talks about intelligence, not about intentionality.
Artificial Intelligence is about intelligence, not about intentionality. I think people in Artificial Intelligence are very confused about this. But Searle claims that you cannot say anything about intelligence without intentionality. The main idea is to exploit the fact that syntactical properties can preserve semantic properties.
That is the basic idea of proof theory. Searle is perfectly right when he says that the Turing enterprise does not tell you what it is for a state to have semantic properties. Whatever the right theory about intentionality is, it is not what cognitive science is going on about. I think you can have a theory of intentionality, but this is not what cognitive science is trying to give.
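The proof-theoretic point can be shown in miniature: the inference rule below looks only at the shape of formulas, yet applying it to true premises yields only true conclusions under the chosen interpretation. The atoms and truth values are arbitrary examples, not anything from the interview:

```python
# Miniature version of the point that syntactic operations can preserve
# semantic properties: modus ponens is defined purely by the shape of the
# formulas, yet it takes true premises to true conclusions.
truth = {"rain": True, "wet_streets": True}          # the interpretation (semantics)

def holds(formula):
    """Evaluate a formula under the interpretation."""
    if formula[0] == "atom":
        return truth[formula[1]]
    if formula[0] == "implies":
        return (not holds(formula[1])) or holds(formula[2])

def modus_ponens(f1, f2):
    """Purely syntactic rule: from X and (X implies Y), produce Y."""
    if f2[0] == "implies" and f2[1] == f1:
        return f2[2]

premise = ("atom", "rain")
conditional = ("implies", ("atom", "rain"), ("atom", "wet_streets"))

conclusion = modus_ponens(premise, conditional)
print("derived:", conclusion, "| true under the interpretation:", holds(conclusion))
```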
If we assume intentionality, the question becomes: Can we have a syntactical theory of intelligence?
In their criticism, they mainly referred to people like Minsky and
We are at the very margins of exploring a very complicated idea about how the mind might work. The fact that the Artificial Intelligence project has failed does not show anything about the syntactical account of cognitive processes. What Bert and John have to explain away are all the predictions, all the detailed experimental results that we have been able to account for in the last twenty years. Simulation failed, but simulation was never an appropriate goal.
People like Dennett or Hofstadter3 believe in simulation, and they have a very strong view of it. They say that a good simulation is like reality. When you deal with cases where the surface phenomena are plausibly viewed as interactions, you don't try to simulate them. As I said, Turing had the one idea in the field, namely that the way to account for the coherence of mental processes is proof-theoretic.
Connectionism consists of the thought that the next thing to do is to give up the one good idea in the field.
Why it has become so fashionable is an interesting question to which
One of the things connectionism really has shown is that a lot of people who were converted to the Turing model did not understand the arguments that converted them because the same arguments apply in the case of connectionism. It seemed to me that they have a series of powerful arguments, for instance, that the human brain, on the basic level, is a connectionist network. The question of what the causal structure of the brain is like at the neurological level might be settled in a connectionist direction and leave entirely open what the causal structure of the brain is like at the intentional level.
If the hardware that implements these discrete states is diffuse, then to that extent the device will exhibit graceful degradation. Nobody, in fact, has a machine, either the classical or the network kind, that degrades in anything like the way people do. In the sense in which graceful degradation is built into a network machine, it has to do with the way the network is implemented. It is not going to be built into a network machine that is implemented in a Turing machine, for example.
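A minimal sketch of the contrast being drawn, under assumptions of my own (the array sizes and the tanh squashing function are arbitrary choices, not anyone's actual model): when the knowledge is spread over many connection strengths, severing one connection nudges the output instead of crashing the system, which is the sense of graceful degradation at issue here.

import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(50, 20))       # knowledge spread across 1000 connection strengths
pattern = rng.normal(size=20)             # an input pattern

def respond(w, x):
    # one layer of units: each takes a weighted sum of the input and squashes it
    return np.tanh(w @ x)

intact = respond(weights, pattern)

damaged_weights = weights.copy()
damaged_weights[3, 7] = 0.0               # sever a single connection

damaged = respond(damaged_weights, pattern)
print(np.mean(np.abs(intact - damaged)))  # a small shift in the output, not a crash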
In the only interesting cases, learning is not statistical inference but a kind of theory formation. And that is as mysterious from the point of view of network theory as it is from the point of view of the classical theory. I think connectionist networks will just disappear the way expert systems have disappeared. As for the future of cognitive science, we are about two thousand years from having a serious theory of how the mind works.
Did you say that expert systems will disappear?
No one takes them seriously in psychology. People in computer science have a different view. I said in psychology. In psychology, this program has never been very lively and is now conclusively dead.
My guess is that connectionist models will have the same fate. Connectionist models do have some success with pattern recognition. Up to now, we have been thinking about these devices as inference devices, devices that try to simulate the coherence of thought. What you have is not a new theory of learning but a standard statistical theory of learning, only it is embedded in an analog machine rather than done on a classical machine.
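A toy rendering of that claim, constructed by me for illustration rather than taken from the interview: a single linear unit trained by gradient descent on squared error ends up at exactly the ordinary least-squares estimate, that is, at a standard piece of statistical inference carried out in network form.

import numpy as np

rng = np.random.default_rng(1)
inputs = rng.normal(size=(200, 3))                        # training inputs
true_weights = np.array([0.5, -1.0, 2.0])
targets = inputs @ true_weights + 0.1 * rng.normal(size=200)

w = np.zeros(3)                                           # connection weights of one linear unit
for _ in range(5000):
    error = inputs @ w - targets
    w -= 0.1 * inputs.T @ error / len(targets)            # gradient step on mean squared error

w_ols = np.linalg.lstsq(inputs, targets, rcond=None)[0]   # the classical statistical estimate
print(np.allclose(w, w_ols, atol=1e-4))                   # True: the network settles on the same answer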
It will explain learning only insofar as learning is a process of statistical inference. The problem is that learning is no process of statistical inference. Roughly speaking, you construct a theory of gases for gases that don't have all the properties that real gases have, and insofar as the model fails, you say it is because of interactions between the properties the theory talks about and the properties that the theory treats as noise. Now, that has always been the rational strategy in science, because the interesting question is not whether to do it but which idealization to work with.
I assume that cognitive science is going to work in exactly the same way. You want to do psychology for ideal conditions and then explain actual behavior as ideal conditions plus noise. In the case of perceptual psychology, the advance has been extraordinarily rich. In logic, we idealize again to the theory of an ideally rational agent and explain errors as interactions.
These idealizations have to be tested against actual practice, and so far
I don't know any reason at all to suppose that the idealizations that have been attempted in cognitive theory are not appropriate or not producing empirical success. I would have thought that the empirical results suggest exactly the opposite inference. Let me just say a word about background information. To begin with, I think that Bert is absolutely right that we have no general theory of relevance, that we don't know how problems are solved when their solution depends on the intelligent organism's ability to access relevant information from the vast amount we know about the world.
That is correct, and it is one of the problems that will eventually test the plausibility of the whole syntactical picture. On the other hand, it is an empirical question how much of that kind of capacity is actually brought to bear in any given cognitive task. Bert is inclined to take it for granted that perception is arbitrarily saturated by background information. That is a standard position in phenomenological psychology.
That means that you can construct computational models for those processes without encountering the general question of relevance, of how background information is exploited. If you look at the successes and failures of cognitive science over the last twenty or thirty years, they really have been in those areas where this isolation probably applies. Searle uses the example "Do doctors wear underwear?" He is claiming that the answer is not found by an inference process. To go from the fact that it is a serious question to the conclusion that there is no inferential answer to it seems a little quick.
If there really is an argument that shows that no inferential process will work, then the computational theory is wrong. I could tell you experimental data for a week that look like it required an inferential solution. If you think that the fact that it is stereotyped shows that it is noninferential, you have to explain away the experimental data. Even Bert thinks that the fact that after you have become skilled you have no phenomenological, introspective access to an inferential process establishes only a prima facie case that it is not inferential.
The trouble is, there are all these data that suggest that it is, that a process like speech recognition, which is phenomenologically instantaneous, is nevertheless spread out over very short periods of time, goes through stages, involves mental operations, and so on. There are a lot of data that suggest that introspection is not accurate, that people are wrong about whether or not these tasks are inferential. That does not show that perception or action are not inferential. What it shows is that some of the things that you know are not accessible to the computational processes that underlie perception and action. That shows that if the processes are inferential, they are also encapsulated. Take visual illusions.
There is a pretty good argument that inferential processes are involved in generating the illusions. But knowing about the illusions does not make them go away. What that shows is, among other things, that what is consciously available to you and what is inferential don't turn out to be the same thing.
JOHN HAUGELAND
John Haugeland attended Harvey Mudd College from 1962 to 1966 and studied
at the University of California at Berkeley from 1966 to 1967 and again, after
working as a teacher with the Peace Corps in Tonga, from 1970 to 1973. He
earned his Ph.D. degree from Berkeley in 1976 and began his career at the University of Pittsburgh in 1974 as an instructor. He became Professor of Philosophy
at Pittsburgh in 1986.
Farewell to GOFAI?
What are the main ideas of cognitive science, in contradistinction to
I certainly think that Artificial Intelligence is a part of cognitive science. There is one way of thinking about cognitive science that takes the word cognitive narrowly and takes Artificial Intelligence as the essence of it. There are other conceptions of cognition in which ideas from Artificial Intelligence play a smaller role or none at all.
People talk about cognitive linguistics,1 although this is again a disputed characterization. We still don't know just how cognition is relevant to linguistics. One importance of both linguistics and psychology for Artificial Intelligence is setting the problems, determining what an Artificial Intelligence system must be able to achieve. Some people also include cognitive anthropology, but that is secondary.
And, of course, philosophy is often included. I believe that philosophy belongs in cognitive science only because the «cognitive sciences» have not got their act together yet. If and when the cognitive sciences succeed in scientifically understanding cognition and the mind, then philosophy's role will recede. Some people, especially in the computer science community, don't understand what role philosophy has to play in this enterprise.
Once you have that view, once you have what Kuhn calls a paradigm,2 then all that remains is normal science. I think that philosophy is still pertinent and relevant, however, because I don't think that the field has arrived at that point yet. Indeed, we are quite possibly in a revolutionary period in this decade. You coined the term Good Old-Fashioned Artificial Intelligence, so you seem to think of connectionism as the new direction.
Which periods would you distinguish?
First, there is the period of pre-Artificial Intelligence, before it really got going, from the middle forties to the middle fifties, which was characterized by machine translation, cybernetics, and self-organizing systems. Starting in the middle to late fifties, the physical symbol systems hypothesis began to take over. Minsky's and Papert's book3 did have an important effect on network research, but I think the symbol processing approach took over because it was, on the face of it, more powerful and more flexible.
It is not a simple question, because there are different ways in which to understand what is meant by «for» cognitive science and Artificial
Intelligence. Simon tells me that it is not, that they are trying to incorporate it in their models. In his opinion, the commonsense problem is one proof that Artificial Intelligence with the physical symbol systems hypothesis will never succeed. Nobody is in a position to claim that anything has been proved, neither Dreyfus nor Simon.
But I do believe that, in the first place, the problem of building machines that have commonsense is nowhere near solved. And I believe that it is absolutely central, in order to have intelligence at all, for a system to have flexible, nonbrittle commonsense that can handle unexpected situations in a fluid, natural way, bringing to bear whatever it happens to know.
My conviction that the physical symbol systems hypothesis cannot work is mostly an extrapolation from the history of it. I don't think that you could produce an in-principle argument that it could not work, either like Searle or like Dreyfus. I am mostly sympathetic to Dreyfus's outlook. I think it is right to focus on background, skills and practice, intuitive knowledge, and so on as the area where the symbol processing hypothesis fails.
What about the problem of skills that you mentioned?
What you call cognitive or not depends on how broadly or how narrowly you use the term cognitive. This is the difference in the conception of cognition I mentioned at the beginning of the interview. I think a psychologist or an Artificial Intelligence professor is quite right to be exasperated by any suggestion that «it just does it» meaning «so don't try to figure it out.» That is crazy. However, none of that means that the way it does it will ultimately be intelligible as symbol processing.
Newell and Simon would accept and Rumelhart and McClelland would not, nor would I, nor would Dreyfus. Either affective phenomena, including moods, can be partitioned off from something that should more properly be called «cognitive,» so that separate theoretical approaches would be required for each and there would be no conflict between them. Or we cannot partition off moods and affect, in which case symbol processing psychology cannot cope with it.
Do you think that the physical symbol systems hypothesis could cope with the problem of qualia?
I don't know what to think about the problem of qualia. I am inclined to suspect that that problem can be partitioned off. Moods and affect, by contrast, are integral to intelligence and skills. Let's turn to connectionism.
Connectionism has become a power in the field since then. But the idea that intelligence might be implemented holistically in networks, in ways that do not involve the processing of symbols, has been around, in a less specific form, for a long time. There are three kinds of attraction for connectionism. The hardware response time of neural systems is on the order of a few milliseconds.
If you had all kinds of time, you could implement any kind of system in that hardware. So the one-hundred-steps constraint ties you more closely to the hardware. It is one attraction of connectionism that it is an approach fitting this hardware apparently much more naturally.
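The rough arithmetic behind that constraint, using the usual order-of-magnitude figures rather than anything quoted in the interview:

neuron_step_ms = 5.0      # a neuron needs on the order of a few milliseconds to respond
task_time_ms = 500.0      # many recognition tasks are finished within a few hundred milliseconds

print(task_time_ms / neuron_step_ms)   # about 100 sequential steps at most,
                                       # so the rest of the work must happen in parallel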
That would be the difference between parallel architecture and von
Neumann architecture. But people like Newell and Simon say that they now use parallel processing. They claim that on a certain level, connectionism, too, has to introduce symbols and to explain the relation to the real world. At first I thought the main difference was between serial and parallel architecture.
Not a von Neumann machine, but a LISP machine. The deeper way to put the difference is that in a connectionist network, you have very simple processors that, indeed, work in parallel. The way in which those strengths vary on all those connections encodes the knowledge, the skill that the system has. So that, if one component in a symbol processing machine goes bad, everything crashes, whereas one wrong connection in a network can be compensated, which is similar to the brain.
The architecture is not at all a hardware issue. The hardware itself will have an architecture, of course, but the notion of architecture is not tied to the hardware. The question of whether it is hardware or software is not a deep question. The difference between a von Neumann and a LISP machine is that a LISP machine is a function evaluator.
A von Neumann machine is a machine built around operations performed on specified locations in a fixed memory. The important thing that von Neumann added that makes the von Neumann machine distinctive is the ability to operate on values, especially addresses, in the memory where the program is located. Thus, the program memory and the data memory are uniform. It has nothing to do with the hardware, except that the one-hundred-steps constraint keeps it close to the brain.
In fact, almost all connectionist research is currently done on virtual machines, that is, machines that are software implemented in some other hardware. They do use parallel processors with connections when they are available. The trouble is that machines that are connectionist machines in the hardware are fairly recent and quite expensive. In a production system, even if it is parallel, you do have somebody in charge, at least at higher levels.
Just as you can get your PC to be a LISP machine, you can also get it to be a virtual connectionist machine.
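For concreteness, a minimal virtual connectionist machine of the kind just mentioned, written as ordinary software; the unit counts and the logistic squashing function are placeholder choices of mine.

import numpy as np

def update(activity, weights, bias):
    # every unit forms a weighted sum of its inputs and squashes the result;
    # conceptually all units do this at once, here collapsed into one matrix operation
    return 1.0 / (1.0 + np.exp(-(weights @ activity + bias)))

rng = np.random.default_rng(0)
weights = rng.normal(scale=0.5, size=(4, 8))   # the knowledge of the machine lives here
bias = np.zeros(4)

input_pattern = rng.random(8)
print(update(input_pattern, weights, bias))    # the resulting pattern of activity over four units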
The first attraction of connectionism you mentioned is that the hardware is closer to the human brain. If you have a noisy or defective signal, the system treats that signal as if it were a noisy or deformed variant of the perfect prototype. And this really is the basis of a great deal of what the systems do. You take the pattern to be one part of the signal, and this ability to identify a pattern from part of the signal addresses a lack, or apparent failure, of classical Artificial Intelligence systems.
So this is the second point of attraction, of which hardware is one aspect. The third attraction is that classical Artificial Intelligence has more or less gone stale for a dozen years. It could be that when our children look back on this period fifty years from now, they will say that only the faint of heart were discouraged in this period of stagnation, which was a consolidation period with successes continuing afterward. Or it could be that this is the beginning of the end of classical Artificial Intelligence.
The symbol processing hypothesis cannot incorporate connectionism. So the «partially» would be «almost completely,» except maybe for some peculiarities like reflexes, and physical abilities like walking, and so on. There is symbol processing, and it has to be implemented in the brain.
GEORGE LAKOFF
George Lakoff received a Bachelor of Science in mathematics and English literature from MIT in 1962 and a Ph.D. in Linguistics from Indiana University in 1966.
He taught at Harvard and the University of Michigan before his appointment as
Professor of Linguistics at the University of California at Berkeley in 1972. In the
early stages of his career, he was one of the developers of transformational grammar and one of the founders of the Generative Semantics movement of the
1960s. Since 1975, after giving up on formal logic as an adequate way to represent
conceptual systems, he has been one of the major developers of cognitive linguistics, which integrates discoveries about conceptual systems from the cognitive
sciences into the theory of language.
Embodied Minds and Meanings
Probably I should start with my earliest work and explain how I got from there to cognitive science. Jakobson on the relationship between language and literature. My undergraduate thesis at MIT was a literary criticism thesis, but it contained the first story grammar. That led to the study of story grammar in general.
Out of that came the idea of generative semantics. I asked the question of how one could begin with a story grammar and then perhaps generate the actual sentences that express the content of the story. In order to do this, one would have to characterize the output of the story grammar in some semantic terms. So I asked myself whether it would be possible for logical forms of a classical sort to be deep structures, actually underlying structures of sentences.
«Be hot.» We eventually found evidence for things of this sort.
I had a commitment to the study of linguistic generalizations in all aspects of language. In semantics, I assumed that there are generalizations governing inferential patterns of the meaning of words, of semantic fields, and so on. So I took the study of generalizations as a primary commitment. I also assumed that grammar has to be cognitive and real.
These are systems in which arbitrary symbols are manipulated without regard to their interpretation. So I had a secondary commitment to the study of formal grammar. I also had a «Fregean commitment,» that is, that meaning is based on truth and reference. Chomskian-Fregean commitment on the other hand.
The history of my work as a generative linguist has to do with the discovery of phenomena where you could not describe language in fully general, cognitively real terms using the Chomskian-Fregean commitments. Jim McCawley, in 1968, came up with an interesting sentence that could not be analyzed within classical logical forms. In 1968, I suggested moving away from the description of logical form to model theory.
You were still at MIT at that time?
This was about the same time that Richard Montague was suggesting that possible world semantics ought to be used for natural language semantics, as was Ed Keenan from the University of Pennsylvania. Partee sat in on Montague's lectures at UCLA, and the three of us, Partee, Keenan, and myself, were the first ones to suggest that possible world semantics might be useful for doing natural language semantics. One could not use possible world semantics so easily for natural language semantics. By 1974, it became clear that the Chomskian theory of grammar simply could not do the job.
Around 1975, a great many things came to my attention about semantics that made me give up on model theoretic semantics. One of these was Charles Fillmore's work on frame semantics,8 in which he showed that you could not get a truth-conditional account of meaning and still account correctly for the distribution of lexical items. Eleanor Rosch's work on prototype theory came to my attention in 1972, and in 1975 I got to know her and Brent Berlin's work on basic level categorization. I attended a lecture at which Rosch revealed fundamental facts about basic level categorization, namely that the psychologically basic categories are in the middle of the category hierarchy, that they depend on things like perception and motor movement and memory.
Rosch had shown that the human body was involved in determining the nature of categorization. This is very important, because within classical semantics of the sort taken for granted by people both in generative grammar and in Fregean semantics, meaning, logic is disembodied, independent of the peculiarities of the human mind. That suggested that neurophysiology entered into semantics. This was completely opposed to the objectivist tradition and Fregean semantics.
The thing that really moved me forever away from doing linguistics of that sort was the discovery of conceptual metaphor in 1978. Ortony's Metaphor and Thought11 showed basically the role of metaphor in everyday language. This meant that semantics cannot be truth conditional; it could not have to do with the relationship between words and the world, or symbols and the world. It had to do with understanding the world and experiences by human beings and with a kind of metaphorical projection from primary spatial and physical experience to more abstract experience.
Around the same time, Len Talmy and Ron Langacker began discovering that natural language semantics require mental imagery. They showed that semantic regularities required an account of image schemata or schematic mental imagery. Reading their work in the late seventies, I found out that it fit in very well with all the other things I had discovered myself or found out through other people. That entailed that you could not use the kind of mathematics that Chomsky had used in characterizing grammar in order to characterize semantics.
The reason was, as we had first shown in generative semantics, that semantics had an effect on grammar, and we tried to use combinatorial mathematics to characterize logical form. We thought that the use of formal grammars plus model theory would enable us to do syntax and semantics and the model theoretic interpretation. However, if meaning is embodied, and the mechanisms include not just arbitrary symbols that could be interpreted in terms of the world but things like basic level categories, mental images, image schemas, metaphors, and so on, then there simply would be no way to use this kind of mathematics to explain syntax and semantics. Our work in cognitive linguistics since the late seventies has been an attempt to work out the details of these discoveries, and it changed our idea not only of what semantics is but of what syntax is.
This view is attacked as relativistic by people holding an objective semantics position. If you take the idea of total relativism, which is the idea that a concept can be anything at all, that there are no constraints, that «anything goes» in the realm of conceptual structure, I think that is utterly wrong. Rather, there are intermediate positions, which say that meaning comes out of the nature of the body and the way we interact with the world as it really is, assuming that there is a reality in the world. We don't just assume that the world comes with objectively given categories.
We impose the categories through our interactions, and our conceptual system is not arbitrary at all. Those are very strong constraints, but they do not constrain things completely. They allow for the real cases of relativism that do exist, but they do not permit total relativism. So, in your view, there is a reality outside the human mind, but this reality is perceived by the human mind.
Therefore you cannot establish reality only by logical means. The conceptual system, the terms on which you understand the world, comes out of interaction with the world. That does not mean that there is a God's-eye view that describes the world in terms of objects, properties, and relations. Objects, properties, and relations are human concepts.
We impose them on the world through our interaction with whatever is real.
What happened when you got these ideas about embodiment?
We now have some very solid research on non-Western languages showing that conceptual systems are not universal. Even the concept of space can vary from language to language, although it does not vary without limits but in certain constrained ways. We have been looking at image schemas. There seems to be a fixed body of image schemas that turns up in language after language.
We are trying to figure out what they are and what their properties are. I noticed that they have topological properties and that each image schema carries its own logic as a result of its topological properties, so that one can reason in terms of image schemas. The spatial inference patterns that one finds in image schemas when they apply to space are carried over by metaphor to abstract inference patterns. There have been two generations of cognitive science.
They were internal representations of some external reality. That second generation of cognitive science is now being worked out. It fits very well with work done in connectionism. The physical symbol system hypothesis was basically an adaptation of traditional logic.
We are trying to put the results of cognitive linguistics together with connectionist modeling. One of the things I am now working on is an attempt to show that image schemas that essentially give rise to spatial reasoning and, by metaphor, to abstract reasoning, can be characterized in neural terms. I am trying to build neural models for image schemas. So far, we have been successful for a few basic schemas.
Traditional generative phonology assumed the symbol manipulation view.
All over the country?
Those are two different things. In terms of cognitive science, the applications have been much more modest, and only a small number of people are working in this field. So, although connectionism has emerged around the world as an engineering device, it has merely begun to be applied to cognitive science. Some people in connectionism claim that it could be used to model the lower level processing of the human mind, like vision, pattern recognition, or perception, but that for more abstract processes like language the symbol manipulation approach could still be useful.
The people who are claiming that are those who come out of Artificial
Intelligence, raised in the symbol system tradition. They don't consider connectionism to be biological, they consider it to be another computer science view. What we are trying to do is show how meaning can be grounded in the sensory-motor system, in the body, in the way we interact with the world, in the way human beings actually function. People who try to adapt the physical symbol system view by adding a little bit of connectionism to it are not seriously engaged in that research.
One of the things shown in the PDP book18 is that radial categories arise naturally from connectionist networks. Technically this gives rise to radial categories. In the models we have, we begin to see that the brain is structured in such a way that modules of retinal maps could naturally give rise to cognitive topology. At least, now we can make a guess as to why we are finding that cognitive topology exists.
Moreover, one of the things that Rumelhart discovered is that analogical structuring is a natural component of neural networks. That could explain why there should be metaphor, why the abstract reasoning is a metaphorical version of spatial reasoning. Computer models are secondary, but they may allow us to begin to understand why language is the way it is.
In connectionism, we need to get good models of how image schemas and metaphors work, and good models of semantics, fitting together the kind of work that has come out of cognitive semantics. We also need to put together a much more serious theory of types of syntax. Langacker's work on cognitive grammar is one kind of a beginning, as is my work in grammatical construction theory and Fillmore's work in construction grammar.
In Women, Fire, and Dangerous Things,20 we have both pointed out that the entire notion of epistemology has to be changed within philosophy in such a way that it completely changes most philosophical views. Not only the analysis of language but also the relationship between ontology and epistemology. These systems fit our experiences of the world in different ways, as there is more than one way to fit our experiences perfectly. One consequence is that conceptual systems turn out not to be self-consistent.
In general, human conceptual systems must be thought of as having inherently inconsistent aspects. In law, there is a commitment to objective categories, to objective reality, and to classical logic as the correct mode of human reasoning. In general, the political and social sciences are framed in terms of the old cognitive science, the old objectivist philosophy.
What do you mean exactly by »objectivist philosophy«?
It says that the world is made up of objects, these have objectively given properties, and they stand in objective relations to one another, independent of anybody's knowledge or conceptual system. Categories are collections of objects that share the same properties. Not only objects and properties but categories, too, are out there in the world, independent of the mind. That means that reason has to do with the structure of the world.
If all that is false, if reason is different, if categories are not based on common properties, then our view of reality has to change. Moreover, our view of thought and language has to change. The idea that our mental categories mirror the world, that they fit what is out there, turns out to be wrong. Correspondingly, our names of mental categories are supposed to be thereby names of categories in the world, and our words can thereby fit the world, and sentences can objectively be true or false.
If meaning depends instead on understanding, which in turn is constructed by various cognitive means, then there is no direct way for language to fit the world objectively. It must go through human cognition. Now, human cognition may be similar enough around the world that in many cases no problems arise.
JAMES L. MCCLELLAND
Toward a Pragmatic Connectionism
When I finished graduate school and went to San Diego, which was in
This was where I became a cognitive scientist as opposed to a cognitive psychologist, I would say. I was no longer doing experiments simply to make inferences from them, but I was also trying to formulate really explicit computational theories. I had new conceptions, and I needed to formulate a different theoretical framework for them.
Typical members of a cognitive psychology department stay very close to the data. They might come up with a model explaining the data from this or that experiment, but mostly they stay close to the facts. Given where I come from, I don't want to lose track of the data. But the notion that one needs an explicit, computationally adequate framework for thinking about the problem at hand, and that one then tries to use this framework to understand the data, is really the essence.
Computational model means that you have to be able to articulate your conception of the nature of the process you are interested in in such a way as to specify a procedure for actually carrying out the process. Now, the word computational process gets associated with this, as opposed to just process, because it is ordinarily understood, in cognitive circles, that these activities are being done removed from the actual external objects about which they are being done. This is about the whole business of the physical symbol system hypothesis. You are doing some mental activity with a certain result not on physical objects, although there is a physical substrate for the process, but on things that represent those objects, in the same way that we use numbers to calculate bushels of wheat instead of piling up the wheat.
In a computational model, we try to figure out what representations we should have, based on what information, and what process applies to these representations. This is our conception of a computational model, but it also carries with it, for me, the notion of explicitness, of specifying something that allows you to understand in detail how the process actually goes, as opposed to articulating a rather vague, general, overarching conceptualization such as a Piagetian theory, where you have some very interesting but often exceedingly vague concepts that are difficult to bring into contact with facts. Your definition of a computational model and of the intermediate abstraction seems to contradict the new connectionism, which goes down to the neurophysiological level. One idea of the computational model is that you can abstract from the hardware, that only the intermediate level of symbol manipulation matters.
The result of this mental process can be new representations. At some level, this sounds like the physical symbol system hypothesis. Representationalism is essential in connectionism. So the main difference is not about the view that there are representations, that there is symbol manipulation going on.
A symbolic relation to objects in the world simply means that the representation is not the thing itself but something that functions as the object for computational purposes. It can be used as the basis for further processes, which might result in the formation of new representations, just as we have the number 2 to represent the two bushels of wheat from one field and the number 5 to represent the five bushels from another field. The things that come out of it are new representations. But what those representations are, whether they are symbols in the sense of Newell's notion of symbols or not, that is where there is some disagreement.
My feeling is that the attempt to understand thinking may progress with the conventional notion of symbol, but that it is limited in how far it can go and that it is only an approximation of the real right way to think about representations. If we are able to understand more clearly how we can keep what the people who think about symbols have been able to achieve, while at the same time incorporating new ideas about what mental representations might be like, we would be able to make more progress in understanding human thought. A symbol is generally considered to be a token of an abstract type that gains power from pointing to other knowledge. The symbol itself is not the information but only a pointer to it.
When you process the symbol, you have to search down through some complex tree to find that information. That limits the ability of a symbol-oriented system to exploit the contents of the objects of thought. If the objects of thought in the representations we have of them could give us somehow more information about those contents, this would increase the flexibility and the potential context sensitivity of the thought processes. We want to capture what we can with these representations, but we think that we need to go beyond them to understand how people can be insightful and creative, as opposed to mechanical symbol processors.
One thing I share with the people who formulated the physical symbol system hypothesis is a commitment to sitting down and seeing if something can actually work. I myself am not so expert in the kinds of arguments those people propose, but it is actually my suspicion that many of the things Dreyfus points to are things that result from the limitations of symbols as they are currently conceived of by people like Newell, as opposed to some more general notion of thought as manipulation of representations. Once we achieve a superior notion of what representations are like we will be in a much better position to have a theory that won't founder on that particular kind of criticism. One point of that criticism is the so-called commonsense problem.
Symbol manipulation cannot solve the problem of this huge amount of commonsense knowledge that people have as background. The basic notion of connectionism is that a representation is not a static object that needs to be inspected by an executive mechanism. If you have a bunch of processing units, and these units have connections to other processing units, and a representation is a pattern of activity over the processing units, then the relations of the current pattern of activity to patterns of activity that might be formed in other sets of units are in the process themselves, in the connections among the units. The notion that a mental representation is not just a location in the data structure but a pattern of activity that is interacting with the knowledge by virtue of the fact that the knowledge is in the connections between units gives us the hope that we will be able to overcome this particular limitation.
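One tiny, concrete way to picture knowledge being in the connections rather than in a separate data structure (the Hebbian storage rule and the eight-unit size are my illustrative choices, not anything specified here): a weight matrix restores a degraded pattern of activity simply by letting the units settle.

import numpy as np

stored = np.array([1, -1, 1, 1, -1, -1, 1, -1])    # a pattern of activity over +1/-1 units
weights = np.outer(stored, stored).astype(float)   # Hebbian storage: the knowledge goes into the connections
np.fill_diagonal(weights, 0.0)                     # no unit connects to itself

state = stored.astype(float)
state[:3] *= -1                                    # degrade the pattern by flipping three units

for _ in range(5):                                 # let the units settle under the connections
    state = np.sign(weights @ state)
    state[state == 0] = 1

print(np.array_equal(state, stored))               # True: the pattern is completed from the weights alone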
I would say that we do not yet know the limits of what can be achieved by the kinds of extensions to statistical methods that are embodied in connectionist models.
What
Second, connectionist models are now being applied extensively to the task of extracting structure from sequences. In connectionism, we have a similar kind of situation. We have a basic notion about a set of primitives, a notion of what the general class of primitives might be like. Then we formulate specific frameworks within this general «framework generator,» if you like, which can then be used for creating specific models.
At the most abstract level, the idea is that mental processes take place in a system consisting of a large number of simple computational units that are massively interconnected with each other. The knowledge that the system has is in the strength of the connections between the units. The activity of the system occurs through the propagation of signals among these units. An example of such a specific framework is the Boltzmann machine, which is a very specific statement about the properties of the units, the properties of the connections, and maybe about the way knowledge is encoded into the connections.
The Boltzmann machine can then be instantiated in a computer program, which can be configured to apply to particular, specific computational problems. What we did in the handbook that Rumelhart and I published5 was collect together a few of the fairly classical kinds of frameworks for doing connectionist information processing. They are statements about the details of the computational mechanisms, which can then be instantiated in computer programs. It is not the computer programs that are essential but the statement of the principles.
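As a sketch of what instantiating such a framework in a program might look like, here is a Boltzmann-machine-style stochastic update rule; the network size, temperature, and random weights are arbitrary placeholders, and no learning rule is shown.

import numpy as np

rng = np.random.default_rng(0)
n = 6
weights = rng.normal(scale=0.5, size=(n, n))
weights = (weights + weights.T) / 2.0          # symmetric connections
np.fill_diagonal(weights, 0.0)                 # no self-connections
bias = np.zeros(n)
temperature = 1.0

state = rng.integers(0, 2, size=n).astype(float)

def energy(s):
    return -0.5 * s @ weights @ s - bias @ s

for _ in range(200):                           # resample one unit at a time
    i = rng.integers(n)
    gap = weights[i] @ state + bias[i]         # how much turning unit i on lowers the energy
    p_on = 1.0 / (1.0 + np.exp(-gap / temperature))
    state[i] = 1.0 if rng.random() < p_on else 0.0

print(state, energy(state))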
We do feel that giving people some firsthand experience maybe gets something into their connections.
They try to get the model to exhibit behavior that captures other known properties of that physical circuit, perhaps in order to study particular neurons in a particular area with the model incorporating facts about the distribution of connections and so on. Other people are building connectionist networks because they want to solve a problem in Artificial Intelligence. They think that the connectionist framework will be a successful one for doing this. In that case, there is absolutely no claim made about the status of these things vis-à-vis human cognition.
There is often the implicit claim that humans are probably doing something like this if you believe that this is the way to build an artificial system, because humans are so much better than most existing artificial systems in things like speech recognition. Of course, when you adopt a particular framework, you make an implicit statement that it will be productive to pursue the search for an understanding within the context of this framework. «Therefore we can reject the whole framework out of which it was built». A similar thing happened to symbol processing.
I really don't subscribe to the notion that all you need to do is to take a back propagation network and throw some data at it and you will end up with a remarkable system. But we are at the point where several state-of-the-art methods for doing things like phoneme or printed character recognition are successfully drawing on connectionist insights.
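For concreteness, a minimal back-propagation network of the kind being referred to, trained here on the toy XOR problem rather than on phonemes or printed characters; the layer sizes, learning rate, and iteration count are my own arbitrary choices.

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)      # input-to-hidden connections
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)      # hidden-to-output connections
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(20000):
    hidden = sigmoid(X @ W1 + b1)                   # forward pass
    out = sigmoid(hidden @ W2 + b2)
    d_out = (out - y) * out * (1 - out)             # backward pass: propagate the error
    d_hidden = (d_out @ W2.T) * hidden * (1 - hidden)
    W2 -= 0.5 * hidden.T @ d_out;  b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_hidden;    b1 -= 0.5 * d_hidden.sum(axis=0)

print(np.round(out, 2).ravel())                     # typically settles near [0, 1, 1, 0]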
What is the reason for the rise of connectionism in the last few years?
The main ideas go back to the 1960s. He said something like, «Well, I got six hours of computer time between midnight and 6 a.m.» The equation might lead to a breakthrough in what kind of training can occur in connectionist networks. The fact that people have workstations on their desks has created the opportunity to explore these ideas and has led to key mathematical insights.
The computational resources contribute to the development of a critical mass. You have to have knowledge about psychology, mathematics, computer science, and philosophy as well. What we have to do is create environments that bring together just the right combinations of things to allow people to make progress. At another institution, they may have a much better chance to make a connection with linguistics, because there are linguists who have the right kinds of ideas about language and connectionists who have the right kinds of tools.
Rumelhart had incredible mathematical modeling skills, and I brought an awareness of the details of the data in a particular area. I learned how to model from working with him, how to get this whole thing to work, how to account for the data I knew something about. Elman said, «Hey, I can see how this relates to language,» and he and I started to work together on language.
ALLEN NEWELL
Allen Newell, one of the founders of the fields of Artificial Intelligence and cognitive science, died on 19 July 1992 at the age of sixty-five.
Newell's career spanned the entire computer era, which began in the early
1950s. The fields of Artificial Intelligence and cognitive science grew in part from
his idea that computers could process symbols as well as numbers and, if programmed properly, could solve problems in the same way humans do. In cognitive
science, he focused on problem solving and the cognitive architecture that supports intelligent action in humans and machines. In computer science, he worked
on areas as diverse as list processing, computer description languages, hypertext
systems, and psychologically based models of human-computer interaction.
The Serial Imperative
Computer science as a discipline does not show up until the early or midsixties. Before that, computers were viewed as engineering devices, put together by engineers who made calculators, where the programming was done by mathematicians who wanted to put in mathematical algorithms. As the number of scientists was fairly small, lots of things that are regarded today almost as separate fields were thrown together.
How did this lead to Artificial Intelligence?
Back in the late forties and fifties, there were already speculations about computers, cybernetics, and feedback, about purpose in machines and about how computers might be related to how humans think.
The military is in fact the civilian population, so you get this immense turbulence, this mixing effect that changes science.
The shift in psychology that is now called the «cognitive revolution» really happens in the midfifties. First came the war, then came this turbulence, and then big things begin to happen.
Was it difficult for you to work together with people from other fields?
I went to the Rand Corporation in Santa Monica. It was only five or six years later that Herb convinced me to go back to graduate school. I had considered myself a physicist for a long time, but at Rand I worked on experiments in organizational theory. It makes everyone appreciative of things going on in other areas.
A place like Rand is completely interdisciplinary. Operations research did not exist before the war. At the time I was there, in the fifties, the idea was born that you could look at entire social systems, build quantitative models of them, and make calculations upon which to base policy decisions. Rand is a nonprofit corporation that got its money from the Air Force.
In the early days, the only place where military people were allowed in was the director's office. It exists independently from the government, and its purpose is to provide competitive salaries for scientists, because the government does not pay enough money.
How did you get involved in computer science?
We were deeply immersed in using computers as a simulated environment. In the middle of this emerges the computer as a device that can engage in more complex processes than we had ever imagined. Herb was an economic consultant, but he became very interested in organizations. At this meeting in November 1954, Herb was present.
As soon as I hear us talk, I absolutely know that it is possible to make computers behave in an essentially intelligent way.
The so-called cognitive revolution came later?
The cognitive revolution starts in the second half of the fifties. It was a direct transfer from the analysis being done for radar into models for humans. In the same year, we wrote about the Logic Theorist.
So you would see the cognitive revolution as a break with behaviorism?
Right, except that some of us, such as Herb and I, were never behaviorists. All this happened in the second half of the fifties. The business schools, which had been totally focused on business, underwent a revolution in which they hired social scientists and psychologists and economists. This revolution was, in large part, due to Herb.
When did you start your work on the General Problem Solver (GPS)?
At first, I think, you thought that one could isolate the problem-solving mechanism from knowledge about the world. We had this notion that one could build systems that were complex and could do complex intelligent tasks. The first system we built was called Logic Theorist and was for theorem proving in logic. The reason it was not in geometry was that the diagrams would have increased the momentary burden on the system.
Herb and I have always had this deep interest in human behavior. The issue of always asking how humans do things and trying to understand humans has never been separate. Although we wrote an article in which we talked about it as a theory of human behavior, the connections drawn to the brain are rather high-level nevertheless. So we used the protocols of how humans do tasks in logic.
We thought, Let's write a program that does the tasks in exactly the way humans do it according to the protocol. The paper in Feigenbaum's and Feldman's Computers and Thought is the first paper on GPS that has to do with human behavior. We had this evidence of human behavior, we could see the mechanism, and, given what we then knew about the Logic Theorist, we could build a new version. Later on, one would get concerned that the symbols were formalized things and not commonsense reasoning or language.
The criticism concerning commonsense came later?
One of the things that are true for every science is that discriminations that are very important later on simply have no place in the beginning. They cannot have a place, because only when you know something about the problem can you ask these questions. On the one hand, you write that problem solving is one of the major intellectual tasks of humans, and we think about intelligence as the ability to solve problems. On the other hand, we succeeded in building machines that could do pretty well in problem solving, but we did not succeed so well in pattern recognition, walking, vision, and so on.
Think of cognitive reasoning in terms of computational demand, and then of speech, which is one-dimensional with noise, and of vision, which is two-dimensional. There are still serious problems with respect to actually extracting the signals, but we do have speech recognition systems. We also have visual recognition systems, but there the computation is still just marginal. Today, people try to get around this bottleneck by parallel processing.
I wonder how much your physical symbol system hypothesis is bound up with serial computation. You once wrote that one of the problems with humans is that they have to do one step after the other, with the exception of the built-in parallelism of the eye. Connectionism is absolutely part of cognitive science. There is a need for seriality in human behavior in order to gain control.
If you worked totally in parallel, what you think in one part of your system could not stop what you are thinking in another part. There are functional reasons why a system has to be serial that are not related to limitations of computation. The physical symbol systems hypothesis, the nature of symbolic systems, is not related to parallelism. You can easily have all kinds of parallel symbol systems.
You can think of connectionist systems as not containing any notion of symbols. So you can, in one sense, say that connectionist systems are systems that are nonsymbolic and see how far you can go when you push them to do the same tasks that have been done with symbol manipulation without their becoming symbolic systems. There is certainly an opposition between the physical symbol system hypothesis and connectionism.
What the connectionists are doing is not just trying to deal with speech and hearing. The usual way of putting them together is to say, Clearly, a symbolic system must be realized in the brain. So it has to be realized by one of these parallel networks. Therefore, symbolic systems must be implemented in systems that are like parallel networks.
A lot of connectionists would not buy that, in part because they believe that connectionist systems do higher-level activities in ways that would not look at all like what we have seen in symbolic systems. Yet production systems are massively parallel systems, not serial systems. This distinction, which is also meant to be a distinction between serial and parallel, vanishes.
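A schematic rendering of that last point, with invented rules and working-memory elements: every production is matched against working memory at once, but conflict resolution lets only one fire per cycle, which is where the serial control comes from.

working_memory = {"goal: make tea", "have: kettle", "have: teabag"}

productions = [
    ({"goal: make tea", "have: kettle"}, "boil water"),
    ({"goal: make tea", "have: teabag"}, "put teabag in cup"),
    ({"goal: make coffee"}, "grind beans"),
]

# Match phase: conceptually, all rules test their conditions against working memory in parallel.
matched = [action for conditions, action in productions if conditions <= working_memory]

# Conflict resolution: only one matched production fires, giving the cycle its serial character.
if matched:
    print("fire:", matched[0])   # fire: boil water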
STEPHEN E. PALMER
Stephen E. Palmer was born in 1948 in Plainfield, New Jersey. After attending the
public schools in his hometown of Westfield, New Jersey, he entered Princeton
University, where he received a B.A. degree in psychology with highest honors in
1970. He did graduate work at the University of California, San Diego, where he
worked with Donald A. Norman and David E. Rumelhart, and received a Ph.D.
in 1975. Upon graduation, he took a position in the Psychology Department at
the University of California at Berkeley, where he has remained ever since. His
field of specialization is visual perception, where he takes an information processing approach to understanding how people perceive the structure of objects and
scenes. He is currently Director of the Institute of Cognitive Studies at Berkeley
and head of the undergraduate degree program in cognitive science.
Gestalt Psychology Redux
My training in cognitive science began as a graduate student in psychology at UC San Diego. With Donald Norman and David Rumelhart, we began doing work in cognitive science before there was actually any field by that name, in that we, as psychologists, were constructing a large-scale computer model of how people understand language and retrieve information from their memories.
We went up to
We were doing two pieces of cognitive science, in the sense that we were worried not only about human psychology but also about how to do computer simulation models. If it were done today, it would be recognized immediately as cognitive science, but at that time it was just some strange psychologists doing what computer scientists normally do.
As I said, I think that cognitive science probably got started at UC San
The connectionist days came later, actually after I was gone. Back then the major project that would be called cognitive science was the memory model project of Lindsay, Norman, and Rumelhart. By the time I came to Berkeley I had already started to become interested in perception more than in language. I had started out in the memory project, and I came to perception through a kind of back door, via language.
We had a nice theory of the structure of the meanings of many of these verbs, but at that time, nouns had no interesting cognitive structure in the theory. I decided we needed a better representation for nouns, and that was how I got interested in perception. Because once you start to worry about nouns and concrete objects, you start to think about the representation of their physical structure, what they look like, and how you identify them. That brought me to the kinds of phenomena Gestalt psychologists had been interested in.
I did not have any training in Gestalt psychology because most of the psychologists at UC San Diego were relatively young at the time and did not know the Gestalt literature. Dreyfus, for example, criticizes cognitive science from a Gestalt approach. Real human cognition is more fluid, more flexible, more context-sensitive than classical cognitive science has been able to capture in the kinds of models they use and the kind of paradigm they developed. I see it as a very central and interesting part of what is going on in cognitive science.
Is the difference between the East and West Coasts a coincidence?
The East is where things are very busy and uptight and crammed together. That formal style is exemplified by Simon's production systems. The family resemblance between what we were doing and the symbolic systems approach that characterizes the more formal East Coast approach to cognitive science was very strong. It was only later, when McClelland, Rumelhart, and Hinton started the connectionist work, that it got a particularly Gestalt flavor.
How would you outline your critique of the more formal approach and the advantage of a Gestalt approach?
I got interested in Gestalt theory because I was interested in phenomena that Gestalt psychologists had studied, especially contextual phenomena. The idea of holism is essentially that there is global interaction among all the parts and that the whole has characteristics that are quite different from the characteristics of the parts that make it up. You can think of Gestalt psychology as having discovered these phenomena and thus having identified a very important set of problems that need to be solved theoretically. The Gestalt psychologists managed to overthrow the prior prevailing view of perceptual theory, which was the structuralist theory promoted primarily by Wundt7 and his followers.
Eventually, they proposed their own way of theorizing about those things, but these theories were not well accepted. When Kohler8 proposed that electrical fields in the brain were responsible for the holistic interactions, it allowed people to do physiological experiments to try to determine whether these fields in fact existed and whether they had causal properties in perception. Because their physiological hypothesis was wrong, all Gestalt theoretical ideas were discounted. They thought that the brain was a physical Gestalt, that it somehow was a dynamical system that would relax into a state of minimum energy.
The electrical fields they proposed were one way in which this theory could be implemented in the brain, and it turned out to be wrong. Gestalt theory was more or less banished from psychology on the basis of those experiments. There are no electromagnetic fields in the brain that cause perceptions, but there might be neural networks that are working in the same way, which are responsible for the very same kinds of phenomena arising in visual perception. That does not say that this is going to be the right theory, but it has given Gestalt theory a new lease on life.
In one of your articles,10 you define five underlying assumptions of information processing, one of which is the recursive decomposition assumption.
The recursive decomposition assumption is the assumption that you can characterize the human mind in terms of a functional black box, in terms of a mapping between inputs and outputs, and that then you can recursively decompose that black box into a set of smaller and simpler black boxes that are connected to one another by a flow diagram of some sort. The mind is probably a nearly decomposable system in the sense that Simon defined, down to some level. The decomposition is smooth only for a few levels, and then you get these extremely strongly interacting sets of neuronlike elements that define that module, and it does not make sense to define intermediate levels or structures within that module.
This is the point where things get very Gestaltlike. It might be that you have a module for recognizing faces that works as a Gestalt, that is, a single black box in there that just takes its input, does something mysterious with it, and in this massively interactive and parallel way produces an output. But those modules are probably put together in ways that are understandable, so that you can in fact do a certain amount of decomposition. This is close to Smolensky's notion that you can build an information processing system out of connectionist modules that would look approximately like symbol manipulation if you look at it on a grand scale.
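To make the picture of a decomposable system with an opaque module at the bottom concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than anything described in the interview: the stage names, the sizes, and the fixed random two-layer network that stands in for the face module.

    import numpy as np

    rng = np.random.default_rng(0)

    def preprocess(image):
        # Decomposable stage: flatten and normalize the input.
        v = image.flatten().astype(float)
        return v / (np.linalg.norm(v) + 1e-9)

    # The opaque "Gestalt" module: a fixed random two-layer network treated as
    # one black box; only its input-output mapping matters at this level.
    W1 = rng.normal(size=(64, 16))
    W2 = rng.normal(size=(16, 4))

    def face_module(v):
        hidden = np.tanh(v @ W1)      # massively interactive internal state
        return np.tanh(hidden @ W2)   # e.g. a four-unit identity code

    def decide(code):
        # Decomposable stage: pick the most active identity unit.
        return int(np.argmax(code))

    image = rng.random((8, 8))        # an 8-by-8 toy "image"
    print(decide(face_module(preprocess(image))))

The flow diagram between preprocess, face_module, and decide is easy to state, while nothing inside face_module corresponds to a meaningful intermediate level; that contrast is all the sketch is meant to show.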
What do you think about the physical symbol systems hypothesis?
The fact is that we have never been able to get very good models of human perception, pattern recognition, or human learning. We have never been able to get contextual effects to arise naturally out of physical symbol system models.
Do you think an algorithm like back propagation is a satisfying model?
Unlike most people, I am less enthralled with the learning aspects of the connectionist paradigm than I am with the dynamic system aspects.
In one of your articles12 you deal with cognitive representation.
One of the attractive things in the connectionist paradigm for me is the idea that information can be represented in a system in a very opaque way. The standard view of representation is always that you have something in the system that is a symbol, and it corresponds in a fairly clean way to something, a relation, a property, or an object in the world. The relationship between the element inside the system and the external elements is conceptually transparent.
This has had a very liberating effect on the way we think about what representations might be like, because there is not necessarily any transparent relationship between the elements in the systems and what is out there in the world. This suggests that some of the problems involved in commonsense knowledge, like ideas we have about causality or about the physical world in terms of «If you push against something, how hard is it going to push back?» may be encoded in these networks in ways that are conceptually opaque. That may be fine for talking about how a physics student learns something about a domain of formal academic physics, but it probably does not capture anything like the kind of knowledge we have about the world we live in. Maybe the connectionist paradigm will provide an answer for how commonsense knowledge might be represented, but we don't know that yet.
The idea is that there may be some level down to which you can decompose, and what is below that is the background, which can only be captured by some very messy neural network or connectionist kind of scheme. But I still think that psychologists and cognitive scientists have to be concerned with it, because it ends up being the foundation on which so much higher-level cognition is built. There may be certain things you can do without it, but at some point you have to understand what it is.
You write13 that the broadest basis of cognitive science is the information processing paradigm.
Searle says that even if you could describe a river in terms of an information processor, it does not lead to new psychological insights. We can describe what a thermostat does as information processing, but we would not call it cognition. The central issue is the nature of consciousness. What people are really talking about when they say that people and animals have cognitive systems but rivers do not is the issue of consciousness.
A good example would be the human immune system.
What the argument is about is consciousness. The deep reasons have to do with the extent to which you want to make science a third-person as opposed to a first-person endeavor. What the information processing approach can do is divide a cognitive system into a part that can be accounted for by information processing and a part that cannot. The part that can be accounted for is all the stuff an automaton could do if it were doing the same information processing.
The relation between the first- and third-person view of science is that to say what these experiences are like is a first-person statement, and as long as the science we are doing is restricted to third-person statements, we cannot get into that. It is methodologically closed to the kind of science we do currently. We might come to some extended notion of science that would allow those things to be part of it.
When I read your articles, I thought that you believed the information processing paradigm to be the most promising approach in cognitive science. But now you say that there is no notion of consciousness in it.
I do think that information processing is the most promising approach in cognitive science, because there is not any approach I know of that would give us an account of consciousness. I just do not see how information processing is going to give us a complete account of consciousness. I do not see how information processing can ever tell the difference between a person who has conscious experience and a complete computer simulation that may or may not.
It has no features for saying whether a person or a creature or a computer is conscious or not.
Let us return to the enterprise of cognitive science.
The problem with cognitive science as a discipline is that the methods of the component disciplines are so different. Linguistics does not enter into it, except maybe indirectly if you want to study the interface between language and perception.
Some people think linguistics has to be a major part of cognitive science. Historically, linguistics has been a very important part of cognitive science, largely because of the relationships between modern linguistics and computers through computer languages and automaton theories. This was one of the foundations of cognitive science. But it is not necessarily an important part of cognitive science for every cognitive scientist.
For example, I do not think it is important at all for me to know government and binding theory15 in order to do interesting work in perception. Actually, I think that vision is probably the best example in which cognitive science is really working, because we know more about the biology of the system than we do in the case of language. We can actually look at the relationship between biological information processing and the computational models that have been constructed to do vision. Within vision, there is this one very restricted domain, which I think is the very best example of cognitive science anywhere in the field, and that is color perception.
You can tell a very interesting and extended story about color vision that goes all the way from the biological aspects, like the three different cone types that we have in the retina, and how that representation is transformed into an opponent red/green, black/white, and blue/yellow representation and so forth, through the psychological structure of color space and how the colors relate to one another, all the way up into anthropology, where you can tell the Berlin and Kay story about basic color terms in different cultures16 and how those basic color categories are grounded biologically in terms of the underlying processes and structures of the visual system. It is a beautiful story that connects all the way from biology through psychology and up to anthropology. It is the best example we have of how the different disciplines of cognitive science can all contribute to the understanding of a single topic in a way in which each one illuminates the others.
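As a rough illustration of the first step of that story, the following sketch recodes three cone activations into opponent channels. The weights are placeholders chosen for readability, not physiologically measured values, and the function name is invented for the example.

    import numpy as np

    def cone_to_opponent(l, m, s):
        red_green   = l - m             # positive: reddish, negative: greenish
        blue_yellow = s - (l + m) / 2   # positive: bluish, negative: yellowish
        luminance   = l + m             # achromatic black/white channel
        return np.array([red_green, blue_yellow, luminance])

    # A stimulus exciting L cones more than M comes out reddish in this coding.
    print(cone_to_opponent(l=0.8, m=0.3, s=0.2))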
It is a real problem. It is the reason why we at Berkeley have gone very slowly in the direction of a cognitive science program. They started to move into the first-order approximation to cognitive science at places like UC San Diego, where they are designing a graduate program in cognitive science from the ground up. By this I mean that the students coming out of this program will probably know how to write LISP code, how to do computer simulation, and how to do experiments and analyze data so that they can work as psychologists as well as computer scientists. And they probably will have some knowledge of other fields as well.
In psychology, the principal methods are empirical. The techniques a psychologist learns are very different from those a computer scientist learns. You can talk about Artificial Intelligence research as being empirical in the sense that they get an idea, they build a system to see whether it can do what they think it can do, and if it cannot, they revise it and go back to do something else. You can do psychological experiments without using a computer at all or even knowing what a computer is.
The knowledge you get in graduate training in philosophy is about logic and logical argument. This is training we do not get in psychology, for example.
Is this an advantage or a difficulty?
One way is to be master of one methodology, and to have a working knowledge of other methodologies so that you can understand what is going on and be informed by what is going on in those disciplines, at least enough to influence the work you do in your own methodology. I might try in my experiment to test some theory that was constructed by a computer scientist in vision. That certainly constitutes cognitive science research, in my view. In psychology there are no such experiments where the results send a whole field into a quandary over how to solve some very deep problem.
It is because the underlying mechanisms of cognition and the human mind are much more complicated. There is always more than one way to test any given idea, and when you test it in different ways, there is always the possibility that you get different answers, because there are some factors that you have not taken into account.
Methodologically, this is a behaviorist viewpoint, appropriately aligned with the information processing approach, in the sense that if you get a machine to behave like a person, doing the same kind of things in the same amount of time and with the same kinds of errors, then you can claim that you capture the critical information processing capabilities of the organism that you study. We can do the very same experiment on a computer simulation that we do on a person. If the answer comes out the same way, we make the inference that the computer may have the same information processing structure as the person. The way we make the decision that they have the same internal structures is by doing methodologically behavioral experiments, but it is entirely compatible with cognitive science in that sense.
On the basis of all these data we can construct a spatial representation of what your color experiences are like. Maybe there we will be able to say what conscious experiences are about. We may be able to say whether there is some biological property that corresponds to something being conscious or not. But I do not see how we can get beyond information processing constraints, without changing something about the way we do objective science.
I, for one, would be loath to give up this kind of science. We had problems with this in psychology long ago, when introspectionist views were predominant. If one assumes that, then in order to be intelligent it is necessary to have consciousness, intentionality. There is no evidence that for something to be intelligent it is necessary for it to be conscious.
There are programs now that apparently can play chess at the level of masters or even grand masters. It is clear to me that, as far as behavior goes, this is intelligent. One of the most interesting things that we have found out so far in cognitive science with computer simulations is that the high-level processes seem to be a lot easier to do than the low level ones. We are much closer to understanding how people play chess than we are to understanding how people recognize objects from looking at them.
Suppose we have a machine that not only plays chess but can make breakfast. Let's further assume, with John Searle, that it is not conscious, although I do not think we have any reason to say either way. My view is that intelligent behavior should be defined without reference to consciousness. Some people may claim that it is an empirical fact that only conscious beings achieve those criteria, but I do not think so.
We already have convincing examples of intelligence in nonliving things, like the chess computers. The reason why we do not have general purpose intelligence is that so much of what is required for this is background knowledge, the low-level stuff that seems so difficult to capture.
Do you think that the connectionist approach will continue to fit into cognitive science?
The people who say that connectionism is not really cognitive science because «they are only building biological analogues» are the hard-line symbolic systems people. They seem to take a very narrow view of what cognition is, and if it turns out to be wrong, they will have to end up saying that there is not any cognition. There are phenomena of cognition, and there is, therefore, a field of cognitive science defined as the interdisciplinary study of these phenomena. It was for this very reason that I wrote the article about the information processing approach to cognition.
The symbolic systems hypothesis is a stronger view than the information processing approach. The physical symbol systems hypothesis is fairly well characterized now, but we are just beginning to understand what connectionism is about and what its relationship is to symbolic systems. The biggest breakthrough was the breakthrough that allowed cognitive science to happen.
Why did it happen so late?
They made some very well founded arguments against a certain class of perceptron-type theories, showing that these theories could not do certain things. They did not prove that those same things could not be done by a more complex class of perceptrons, which was the principal basis of the more recent advances. I doubt whether there will be another breakthrough soon, because you usually get breakthroughs under conditions where there is dissatisfaction with the way things are. Part of the underlying conditions for the connectionist movement to happen was that there was a bunch of people who were not terribly happy with certain shortcomings of the symbolic systems approach.
They found a way around some of these problems. The symbolic systems people are happy with the standard view and will continue to work within it. Most of the people who had problems with that view are either neutral or going to the connectionist camp. These people are busy just working now, because there is so much to be done within the connectionist framework.
It will keep them busy for a long time, and it will be a long time before they get to the point where they become dissatisfied in a fundamental way with the connectionist paradigm. And only under the conditions of deep-seated dissatisfaction is another revolutionary breakthrough likely. As far as the current situation goes, the next decade is going to be spent working on what the connectionist paradigm is and how it relates to the symbolic systems approach.
HILARY PUTNAM
Hilary Putnam was born in 1926 in Chicago. He studied at the University of
Pennsylvania and at the University of California at Los Angeles, where he earned
a Ph.D. in 1951. He taught at Northwestern University, Princeton University, and
MIT until he became Walter Beverly Pearson Professor of Modern Mathematics
and Mathematical Logic at Harvard University in 1976, a position he still holds.
Against the New Associationism
I got involved in cognitive science in two ways, because I worked for many years in mathematics as well as in philosophy. I am a recursion theorist as well as a philosopher. So the theory of Turing machines is one that I have known, as it seems to me, for all my adult life. In the late fifties I suggested a philosophical position that I named «functionalism».
The question of whether the brain can be modeled as a computer is still wide open. But what I don't think anymore is that our mental states are simply isomorphic to computational states. If you take a simple mental state, for example, believing that there are roses in Vienna, this mental state might correspond to one mode of functioning in my brain and to a quite different mode of functioning in someone else's brain. The idea that the program or some feature of the program must be exactly the same for us to have the same thought now seems to me untenable.
What was the reason for changing your view?
The mind may be a computer in the sense that a computer may be the «body» of the mind, but the states that we call mental states are not individuated in the way computational states are individuated. The way we tell whether someone is or is not in the same mental state is not the way we tell whether a computer is or is not in the same computational state. One point that made functionalism so attractive was that we have to study only the structures and not the underlying matter, and therefore the implementation does not matter. Functionalism is, after all, still very strong.
I still agree with the point of functionalism you just mentioned, that is, that at least in principle the mental states don't have any necessary connection to one particular form of chemical, physical, or metaphysical organization. I like to say that our mental states are compositionally plastic, by which I mean that a creature, for example, a Martian or a disembodied spirit, could have a very different matter from what I have and be in the same mental state. That point of functionalism I have not given up. What I now say is that our mental states are also computationally plastic.
Just as two creatures can have different chemistry and physics and be in the same mental state, so it must be allowed that two computers can have different programs and be in the same mental state. When you describe a program by saying what it does, for example, that this program integrates elliptic functions, that statement is not in the language of computer science.
What could be the reason why people did not follow you in this abandonment of functionalism?
Our philosophy of language has moved in an antisolipsistic direction. In 1928, Carnap called his position «methodological solipsism.» And by and large, solipsism has become increasingly unpopular. Jerry Fodor still defends it, but even he is now looking mainly for a theory of what he calls «broad content,» which is to say, a theory of the nonsolipsistic aspect of meaning and reference. His next book is going to be an attempt to solve the great Kantian problem of how language relates to the world.
That is not a solipsistic problem. The other great tendency in contemporary philosophy of language, which appears in German as well as American philosophy, is a tendency toward what has been called holism. In many ways, from the viewpoint of a philosopher of science like Quine or a philosopher of language like Wittgenstein, that looks retrograde, like a move away from their insights.
What do you think about attempts like Searle's to find a way for intrinsic intentionality?''
Searle seems to be saying, Since functionalism did not work, let's go back to the old kind of brain-mind identity theory that we had in the fifties, before functionalism, with authors like J. J. C. Smart. I think that Searle is just going back to identity theory without having answered any of the objections to it. He has not explained either philosophically how this makes sense or scientifically how he would meet the well-known objections to identity theory. He says that the problem of intentionality is just like the liquidity of water.
If he thinks the property of referring is something we can define in terms of fundamental physics plus chemistry plus DNA theory, let him do it. We cannot give necessary and sufficient conditions for something to refer to something in the language of fundamental physics. It seems to me that his Chinese Room Argument is a very strong argument against strong Artificial Intelligence.
I have certain problems with the details of the argument, but I share its conclusion.
If a certain property is not sufficient for consciousness, it does not follow that it is not necessary.
The main problem with Artificial Intelligence at the moment is the enormous gap between what has actually been accomplished and the sort of talk about it. We like to speculate about computers that talk to us, and we may do that even if functionalism is false. It may be that one day we shall produce computers we can talk to, and it is very likely that we won't understand their mental states better than we understand our own.
So you don't think that Artificial Intelligence is a useful approach to understanding the human mind?
Expert systems seem to me the greatest achievement of Artificial Intelligence. But I don't believe that intelligence is just a matter of having a huge library of already-preset repertoires and some very fast methods of searching it to come up with the right repertoire. The problem that stopped philosophers when they were trying to reduce intelligence to a set of rules was the so-called problem of inductive logic.
We have not yet had a genius who came up with at least a first idea of how to tackle it.
No other science in the twentieth century has engaged in the kind of salesmanship that Artificial Intelligence has engaged in. A scientist in any other field who tried to sell his accomplishments in the way Artificial Intelligence people have consistently done since the sixties would be thrown out.
In a sense, to me, it simply is a way of recognizing that the department structure that we have inherited from the Middle Ages is not sufficient.
What is good about cognitive science is that debates like this one can take place beneath the general umbrella of cognitive science. One reason cognitive science has become strong, I think, is that it is an attack against behaviorism. But the other reason could be that the computer is a very powerful metaphor to study the human mind. Cognitive science is like philosophy.
Philosophy includes its opponents. Minsky's speculation about the mind as a society14 does not make essential use of the computer metaphor. A nineteenth-century philosopher could equally have said, Look, the mind is like a society. I want people to decide for themselves whether they are encountering any new ideas or whether they are constantly encountering the sort of informed, optimistic scientist who hopes for what he may be able to do some day.
What do you think about connectionism?
There are two things to say about the so-called new approach of connectionism. As far as the connectionist algorithm goes, it is important to realize that about 98 percent of what that algorithm does has been replicated using paper and pencil and ordinary desk calculators, using only regression analysis, that is, standard statistical tests for regression. In other words, what that algorithm does is find correlations between sets of variables that it is given. Things that cannot be expressed in terms of correlations between given variables cannot be found by the algorithm.
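Putnam's point can be checked on one concrete case: a single linear unit trained by gradient descent ends up with essentially the weights that ordinary least-squares regression finds on the same data. The sketch below uses made-up data and shows only that much; it says nothing about networks with hidden units.

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))                        # three given variables
    y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

    # Ordinary least-squares regression.
    w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

    # A single linear "connectionist" unit trained by the delta rule.
    w = np.zeros(3)
    for _ in range(2000):
        grad = X.T @ (X @ w - y) / len(y)                # gradient of squared error
        w -= 0.1 * grad

    print(np.allclose(w, w_ols, atol=1e-3))              # True: same correlations found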
Artificial Intelligence is never going to say anything about pattern recognition. We have no real theory of learning.
DAVID E. RUMELHART
David E. Rumelhart received a Ph.D. in mathematical psychology from Stanford
University in 1967. He was Assistant Professor, Associate Professor, and then Full
Professor of Psychology from 1967 to 1987 at the University of California at San
Diego, until he became Professor of Psychology at Stanford University. While at
San Diego he was a co-founder of the Institute of Cognitive Science. During the
last ten years Rumelhart has concentrated his work on the development of "neurally inspired" computational architectures.
From Searching to Seeing
I did my graduate work in the study of memory and discovered that there were too few constraints on what we do. People can remember things that are meaningful and will forget things that are not. As a graduate student I worked in the area of mathematical psychology. I thought that the tools of mathematical psychology were by and large too weak.
One thing I saw was the work of people in Artificial Intelligence, people like Ross Quillian, who invented the semantic network.3 I attempted to put all those things together. This led to a kind of computer simulation approach that had many elements from Artificial Intelligence work, elements taken from Fillmore's ideas, and also from some of the philosophical work, in what Norman and I called an «active semantic network».
The three of us, Peter Lindsay, Don Norman, and I, began work on a general project trying to characterize what is stored in long-term memory and what knowledge is like. It grew naturally, in my view, into the ideas that have since come to be called cognitive science. In the book that Norman and I wrote in 1975 we commented on the emerging field of cognitive science. The term also appeared in another book, by Bobrow and Collins, published in 1975 as well, where cognitive science was used in the subtitle.7 I think I began working in this area when I tried to characterize the nature of memory and knowledge.
Another important influence was the HEARSAY speech recognition system,8 where the idea of cooperative computation was central. I wrote a paper in 1975 on an interactive model of reading,9 as I called it, which was an application of the cooperative computation idea to characterize the nature of the reading process. These ideas haunted me for a really long time. I attempted to model them using conventional parallel processing ideas.
For the next several years, I tried to build computer models that were appropriate for the kind of semantic information processing I was interested in. I found that it was very difficult, using the formalisms that were available to me, to create a cooperative computational account that I thought was right. He had a background in the area of associative memory. He had worked with Longuet-Higgins and Willshaw in Britain and had an understanding of associative memory systems, which were also neurallike memory systems.
As I began to see the relationship between associative memory and our activation models, we tried to connect them. The idea that human beings really are parallel processors originally stems from the work of Reddy and his colleagues on HEARSAY.
A digital computer can overcome this by doing things very fast and serially and pretending it is doing them all at once. We have to imagine that the serial part of human processing is quite slow. And this slow serial part, the part that dominates our consciousness, is an important part of human cognition. From a biological and psychological point of view, I believe that this serial processing is dependent, ultimately, on an underlying parallel processing system, namely the brain.
When we look at things that happen at a very slow time range, we see things that look familiar from a symbol processing type of perspective. When we look at things that happen at a faster time range, we see things that look more parallel, less symbol-oriented. In one of the chapters of our book we try to sketch how it might be that this longer-term serial processing results from fundamental parallel processing systems. But if you don't want to bother talking about where it comes from, you can probably talk about it in a serial way.
That is what is right about the serial processing accounts.
Given time enough, the human being can act like almost any machine. But the faster we get, the more we press the system, the more important it becomes what kind of machine we have. When we do serial processing, we have to pay the price of going slowly, very slowly, by the standards of conventional computers.
Obviously, this includes fields like Artificial Intelligence, cognitive psychology, linguistics, computer science, philosophy, cognitive anthropology, cognitive sociology, neurobiology. These people are all interested in cognition. I should think that not all people in each of those areas would view themselves as doing cognitive science. So some linguists could be considered or consider themselves cognitive scientists, whereas others might not.
Similarly in philosophy or, clearly, in computer science. Those are areas where some are cognitive scientists and some are not. At least in philosophy, it is not clear at all what «computational account» means. These things remain to be sorted out.
It involves the development of computer simulations as a mode of expressing or exploring theories, or sometimes more mathematical forms of expression. The questions of what is computation and what is not, and so forth, are semantical questions that eventually need to be resolved, but they are not the key. If it turns out that computation is not what we intuitively think it is, then we have just used the wrong word. These are issues that don't worry the working cognitive scientist very much.
Some people thought we weren't cognitive psychologists at all but simply Artificial Intelligence people. At the same time, we had made efforts to interact with linguists, with philosophers, with anthropologists, with cognitive sociologists. We already had a good deal of experience in interdisciplinary work. One of the things that came out of the Institute was a series of inter-disciplinary workshops.
Could you explain in more detail how an interdisciplinary workshop works?
The hardest part in interdisciplinary work, in my view, is the differing goals that people have. Each discipline has its goals, its questions, and also its methods. So we think we are talking about the same things, but it turns out that our idea of what a right answer would be like is different. One important thing in any interdisciplinary effort is to sort out what you are trying to find out, what the goals are.
The second important issue is that of methods. Cognitive science, I think, is largely methodologically eclectic. What it has is a collection of methods that have been developed, some uniquely in cognitive science, but some in related disciplines. The difficult part here is learning to have respect for the disciplines of other fields.
Sometimes we use computer simulation, sometimes theorems, sometimes philosophical argumentation. Whereas philosophers feel that by argument you can actually resolve things. These are things you have to learn. We work very hard to get these ideas across to one another.
When we started our interdisciplinary postdoctoral program, we also had a philosopher. The five postdocs and Norman and I met as much as ten hours a week. They will recognize the validity of alternative methods.
What role does Artificial Intelligence play in cognitive science?
Experimental Artificial Intelligence is when you test these algorithms out by writing programs and systems to evaluate them. And applied Artificial Intelligence is when you try to take them and use them for some purpose. The techniques of experimental Artificial Intelligence are the ones that are important for cognitive science. I see theoretical Artificial Intelligence as really a branch of applied mathematics.
When they do this, the algorithms become an empirical area. Theoretical Artificial Intelligence by itself is not empirical. It is much like mathematics in this respect, which is not an empirical science but a formal science. But cognitive science itself has both the formal part, which is Artificial Intelligence, and the empirical part, in which assertions can be correct or incorrect descriptions of the world.
Those are the cognitive science theories of how things might work in actual cognitive systems.
Artificial Intelligence?
Some claim that the conventional approach will never solve the commonsense problem. A lot of what we do when we approach a problem is to make an attempt to understand it. There is this movement from an abstract problem that we can solve maybe only by trial and error, or by some formulaic method like logic or algebra, to the point where we can understand it. At that point we have moved from the problem solving activity as search to problem solving as comprehension.
Then we move on from problem solving as comprehension to problem solving as seeing or perceiving. Most of what we know about symbolic Artificial Intelligence techniques is about search. Mostly this is not relevant to understanding, which is what we usually do, or perceiving, which is even farther from what these search methods are about.
JOHN R. SEARLE
John R. Searle was born in Denver, Colorado, in 1932. He attended the University of Wisconsin from 1949 to 1952 and studied at Oxford University, where he
received his B.A., M.A., and Ph.D. He taught as Lecturer in Philosophy at Christ
Church in Oxford from 1957 to 1959 and since then has been Professor of Philosophy at the University of California at Berkeley. He has also been a visiting
professor at many universities both in the United States and abroad, including
Venice, Frankfurt, Toronto, Oslo, and Oxford.
Ontology Is the Question
The Sloan Foundation decided to fund major research projects in the new area of cognitive science, and they asked various people in related disciplines if they would be willing to come to meetings and give lectures. That way I met a lot of other people, and I was asked to participate in the formation of a cognitive science group in Berkeley. I asked the vice president, who was in charge of donating the money, why they would give it to those two institutions and in particular to Berkeley, where there was not a typical cognitive science group.
I am not sure whether there is such a thing as cognitive science. People change their opinions, so I would not worry too much about how to categorize cognitive science. However, much of what is interesting about cognitive science is that it is partly a reaction against behaviorism in American intellectual life. It expresses an urge to try to get inside the mind to study cognitive processes instead of just studying behavioral responses to stimuli.
It grew out of cognitive psychology, but it needed more resources than were available in psychology departments. It needed resources from computer science, linguistics, philosophy, and anthropology, to mention just the most obvious cases. The first is that, in fact, many cognitive scientists repeat the same mistake of behaviorism. They continue to think that cognition is not a matter of subjective mental processes but a matter of objective, third-person computational operations.
I think that cognitive science suffers from its obsession with the computer metaphor. Instead of looking at the brain on its own terms, many people tend to look at the brain as essentially a computer. And they think that we are studying the computational powers of the brain.
What about Artificial Intelligence in cognitive science?
In fact, I would say that without the digital computer, there would be no cognitive science. And the reason for that is that when people reacted against behaviorism and started doing cognitive psychology, they needed a model for cognitive processes. And the temptation was to think, «Well, we already have a model of cognitive processes. Cognition is a matter of information processing in the implementation of a computer program.» So the idea was, as you know, to treat the brain as a digital computer and to treat the mind as a computer program.
Well, first, of course, strong Artificial Intelligence, which is the view that all there is to having a mind is implementing a computer program, is demonstrably false. I demonstrated that in my Chinese Room Argument. But even weaker versions of Artificial Intelligence, such as the view that says that minds are partly constituted by unconscious computer programs, are also confused. Being a computer is like being a bathtub or a chair in that it is only relative to some observer or user that something can be said to be a chair, a bathtub, or a computer.
The consequence of this is that there is no way you could discover unconscious computational processes going on in the brain.
I think, in fact, that today very few people defend strong Artificial Intelligence. I do not hear as many extreme versions of strong Artificial Intelligence as I used to. But in the hard core of cognitive science, the biggest change has been the advent of connectionism, parallel distributed processing, neural net models of human cognition. I am more sympathetic to these than I am to traditional Artificial Intelligence, because they are trying to answer the question of how a system that functions like the brain might produce intentional and intelligent behavior.
The idea of PDP is an old idea that comes from the sixties, from the work of Rosenblatt.
What about Artificial Intelligence?
I am now talking just about Artificial Intelligence efforts at simulating human cognition. Within traditional Artificial Intelligence, you have to distinguish between those programs that are just for commercial applications and those that are supposed to give us information about psychology. They are useful as tools, but they do not even attempt to give you real insights into human psychology, and the methods of information processing that they use are, in general, rather simple retrieval methods. Maybe there are some programs I do not know about, but as far as I know, the so-called expert systems are not very interesting from a psychological point of view.
When we talk about Artificial Intelligence here, we are talking about those programs that are supposed to give us psychological insights. Occasionally, the problems become apparent within the Artificial Intelligence community. For example, discussions of the «frame problem» exhibit the limitations of traditional Artificial Intelligence. I think that the frame problem is just an instance of what I call the «background problem,» namely that all intentionality functions against a background of commonsense knowledge, practices, behavior, ways of doing things, know-how, and such preintentional capacities, both cultural and biological.
So there is no way that their essential features can be captured in representations, but by definition, traditional Artificial Intelligence uses representations. You might get the same results using representational methods for something that a human brain does without representational methods, but it does not follow that you have gained any insight into how the human brain does it. The basic failure of traditional Artificial Intelligence is the failure to come to terms with the nonrepresentational character of the background.
Let's go back to the strong Artificial Intelligence position.
Similarly, people have built their professional lives on the assumption that strong Artificial Intelligence is true. In some quarters, the faith that the mind is just a computer program is like a religious faith. Perhaps there is some sort of mainstream thinking in our society about how science should function. You do not have to worry about subjective, conscious mental states, because reality is physical, and physical reality is objective.
The beauty of strong Artificial Intelligence is that it enables you to recognize the special character of the mind while still remaining a materialist. You do not have to say that there is some special mental substance. And, on this view, the key to solving the mindbody problem is the computer program. Because a computer program can be defined purely formally or abstractly, it satisfies the urge to think of the mind as nonphysical, but at the same time, the solution is completely materialistic.
The program, when implemented, is always implemented in a physical system.
If the Turing Test is supposed to be philosophically significant and conclusive to show that behavior constitutes cognition and intelligence, then it is obviously false. The Chinese Room refutes it. You can have a system that passes the Turing Test for understanding Chinese, but it does not understand Chinese.
Let us reexamine the Turing Test and the Chinese Room. One could argue that the Chinese Room Argument is the Turing Test from the inside.
The Chinese Room is the first-person view. The philosophical depth of the argument, however, derives from the fact that it reminds us not just that syntax is not sufficient for semantics but that actual semantics in a human mind, its actual mental contents, are ontologically subjective.
So the facts are accessible only from the first-person point of view. Or, rather, I should say that the facts are accessible from the first-person point of view in a way that they are not accessible from the third person. Let me add something to this discussion of the Chinese Room parable. From my point of view, as I see it now, I must say that I was conceding too much in my earlier statements of this argument.
I admitted that the computational theory of the mind was at least a coherent thesis, and argued only that it was false. Now, computation is defined in terms of symbol manipulation, but «symbol» is not a notion of natural science. Something is a symbol only if it is used or regarded as a symbol. There are no physical properties that symbols have that determine that they are symbols.
Only an observer, user, or agent can assign a symbolic interpretation to something, which becomes a symbol only by the act of interpreting it as a symbol. They thought that it was beyond the reach of science to explain why warm things feel warm to us or why red things look red to us. Maybe a kind of residual dualism prevented people from treating consciousness as a biological phenomenon like any other. By consciousness, I simply mean those subjective states of awareness or sentience that begin when you awake in the morning and continue throughout the day until you go to sleep or become «unconscious,» as we would say.
Consciousness is a biological phenomenon, and we should think of it in the same terms as we think of digestion, or growth, or aging. What is particular about consciousness that distinguishes it from these other phenomena is its subjectivity. There is a sense in which each person's consciousness is private to that person. This is the famous mind-body problem.
As far as we know anything now about how the brain works, variable rates of neuron firings in the brain cause all the enormous variety of our conscious life. But to say that conscious states are caused by neuronal processes does not commit us to a dualism of physical and mental things. The consciousness that is caused by brain processes is not an extra substance or entity. It is just a higher-level feature of the whole system.
Now let's go back to the Turing Test. The Turing Test inclines us to make two very common mistakes in the study of consciousness. These mistakes are to think either that consciousness is a certain set of dispositions to behavior or that it is a computer program. I call these the behavioristic mistake and the computational mistake.
Now, behaviorism has already been refuted, because a system could behave as if it were conscious without actually being conscious. There is no necessary connection between inner mental states and observable behavior. The same mistake is repeated by computational accounts of consciousness. Computational models of consciousness are not sufficient for consciousness, just as behavior by itself is not sufficient for it.
There is no doubt that for the next decade or so cognitive science will continue. What I think will happen to Artificial Intelligence is that it will be immensely useful commercially. Computers are generally useful commercially, so various efforts to simulate the processes and results of human intelligence will be useful commercially. But my guess is that traditional Artificial Intelligence, as a scientific research attempt to understand human cognition as opposed to being just a straight commercial attempt, has run its course. Until we get a science of the brain, we will always have these desperate and pathetic metaphors. And I just do not know how close we are to having a science of the brain.
TERRENCE J. SEJNOWSKI
Terrence J. Sejnowski studied physics at Case Western Reserve University (B.S.)
and Princeton University (M.A., Ph.D. in 1978). As a research fellow at Princeton
University and Harvard Medical School he specialized in biology and neurobiology.
His professional career began at Johns Hopkins University, where he served as
Assistant Professor and later Associate Professor from 1982 to 1987. In 1988, he
was appointed Professor of Biophysics at Johns Hopkins, before joining the University of California at San Diego in the same year, as Professor of Biology and
Physics. Since 1988, he has also been a senior member of the Salk Institute, San
Diego.
The Hardware Really Matters
In the process of trying to understand how the brain works, I have come to appreciate the complexity not only of the structure of the brain but also of its cognitive abilities.
Which sciences would you include?
One of the most influential statements was made by Allen Newell, when he outlined a research program that would carry Artificial Intelligence into the next decade. The key is the realization that knowledge about the brain was potentially an important resource and constraint for theories of human cognition. In fact, in many respects the techniques are so developed that neuroscience may, over the next decade, become one of the leading sources of constraints and information. I am not saying here that neuroscience by itself is going to provide all the constraints that are needed, only some constraints that would be unavailable with any other approach.
«Whether the brain is made out of neurons or of chips or of tin cans really is not relevant to the level I want to address.» That the mind should be studied independently from the brain has become almost a dogma with certain people, but I don't think it was a dogma with the founders of Artificial Intelligence, certainly not with Allen Newell. One dramatic change is that we use the term computer now in a very different way from how it was used, let's say, before 1950. They used log tables, they used mechanical devices, they used analog computers. These were all computers, but now the prototype of the computer has become the digital computer.
In fact, our abstract definitions almost preclude what formerly was called a computer. If you assume that computation necessarily means sequential operations and instructions, you exclude analog computation. I want to go back to a much broader definition and try to imagine what sorts of computation might be going on in the brain, if we all assume that computation is a term that extends to brains as well. Digital computers also have the property that software is separated from hardware.
The very concept of software has to be modified, I think, when you come to the question of how the brain is programmed. There is a point where hardware and software come so close that it does not make sense any more to separate them.
We have to go back to basic psychology and ask whether there may be a more fundamental way of looking at human cognition, of which the traditional view is only an approximation. Hardware has been largely ignored both in cognitive science and in Artificial Intelligence. Most digital computers would not survive long without care and feeding by humans. The essential part is that it is not enough simply to show in a formal sense that a program can solve a problem.
Most Artificial Intelligence programs require millions of steps before coming to even the simplest conclusions. The only Artificial Intelligence program I know that uses this as a constraint is Allen Newell's Soar. If a program is meant to be a model for cognition, this is something that really has to be taken into account. Computers today are a million times faster than the first generation, and new machines will be faster, but they are still orders of magnitude away from real-time solutions.
A LISP machine will never «see» in real time, even if such a program could be written.
You claim that even with bigger and faster machines they won't succeed?
If they are following the traditional view of computation as being one operation at a time, then it will be very difficult. In the case of biology, the difficulty is to solve problems fast. If you have to solve a practical problem, you are almost forced to do that. In retrospect, a great part of the reason why cognitive science and Artificial Intelligence have focused on symbol processing was that there were machines available that were very efficient at symbol processing.
That is what digital computers are good at, but they are not good at sensory processing, motor control, or a number of other things that humans do very easily. In a sense, the programs in Artificial Intelligence have been optimized for digital computers. I would argue that, far from being machine-independent, the research in the field has been entirely dependent on that architecture. Having this architecture has strongly influenced the realm of computation where people have searched for algorithms.
The hardware really influences your thought process. The effort of finding new algorithms is now focused on a different part of computational space. This is ironic, because one of the original motivations for Artificial Intelligence and cognitive science was the belief that this way of approaching computation was machine-independent. I think most human beings use concrete images or exemplars as ways of getting new ideas, even abstract ideas.
This is certainly one type of parallel architecture. The connectionist architecture is somewhat different. The processing units are much simpler than those of von Neumann architectures. The connectivity between the processing units accounts for much of the computational power.
Ideally, these processing units are connected to hundreds or thousands of other units, whereas typical circuit elements on chips have two or three fan-outs, and even contacts between boards are ten or less. This directly affects the speed, how well the whole computer can synchronize, and so forth. We are dealing here with fundamental hardware constraints. In a traditional Artificial Intelligence program you would have to represent «glass» by a symbol and all its properties by other symbols, and these representations would be stored at different locations.
The advantage of the distribution is that a large fraction of the units have a piece of the representation that they can combine locally with other pieces of other representations, because the other representations are being stored in the same population in a distributed way. Much of this can be done locally, so that the information need not travel through the whole system. In only a few steps or iterations it may be possible to come up with a solution to the problem, because the computation has been distributed over many processing units, and the information is being combined everywhere in parallel.
What would a representation distributed over thousands of parts look like?
We are coming up with many counterintuitive answers that would never have been guessed based on the model of memory found in a digital computer, that is, a very specific and precise location for each fact. If you have to distribute this fact over a million units, you have to do it in a way that is very different. In a sense, the representation of «glass» contains partially the information you need to solve the problem, because the representation already contains information about its properties and uses. I should mention here that the field is spreading and has a very strong component from the engineering side that is outside Artificial Intelligence.
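One hypothetical picture of such a representation is sketched below: each concept and property is a high-dimensional random vector, several object-property facts are superimposed in a single weight matrix shared by everything stored, and a property is recovered by probing with the object and checking similarity. The dimensions, the names, and the storage scheme are illustrative assumptions, not a description of any particular model mentioned here.

    import numpy as np

    rng = np.random.default_rng(2)
    D = 1000                                     # units that share every representation

    def vec():
        return rng.choice([-1.0, 1.0], size=D)   # a random distributed pattern

    glass, cup = vec(), vec()
    fragile, graspable, heavy = vec(), vec(), vec()

    # Superimpose several object -> property associations in one weight matrix.
    W = (np.outer(fragile, glass) + np.outer(graspable, glass)
         + np.outer(graspable, cup) + np.outer(heavy, cup))

    def similarity(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

    retrieved = W @ glass                        # probe the store with "glass"
    print(round(similarity(retrieved, fragile), 2))   # high: glass is fragile
    print(round(similarity(retrieved, heavy), 2))     # near zero: glass is not heavy

No single unit stands for «glass»; the same million weights carry every stored fact at once, which is the sense in which the representation is distributed and conceptually opaque.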
One sign of that spread is a new journal published by MIT Press and entitled Neural Computation. Computer scientists have come to realize that problems like vision and speech recognition are so computation-intensive that in order to solve them in real time you need to design special chips that can solve those problems in parallel. They have designed a chip that converts sound waves into frequency-filtered analog signals, which are then combined in a way that is similar to the way they are combined in the brain of a barn owl. I would not say that the hardware duplicates their brain, because the degree of microminiaturization in the brain exceeds anything that can be done today with VLSI.
It would have an enormous impact on interfaces if a cheap, practical, efficient system could be designed that allowed us to feed information into the computer through the voice. So far, there has only been progress with limited vocabulary and single speaker systems, which are still very expensive and not nearly as good in their performance as humans. One way to deliver that enormous computing power is to go through a massively parallel system and by using the connectionist approaches. One approach that has been successful and looks promising is the idea of learning algorithms applied to speech data.
The idea here is, rather than to program a solution to the speech problem, to teach a computer to recognize individual speakers one by one. I want to emphasize the factor of real time, because it really concentrates the mind. You have to solve the problem, because the signal is there now, and it is not going to be there anymore in a second. Different disciplines have used different words to describe similar approaches.
Within psychology, the application of massively parallel computation of networks and of learning procedures has come to be known as PDP, or Parallel Distributed Processing, primarily because of an influential book that was published in 1986 by Rumelhart and McClelland. The emphasis in computer science is on solving practical problems, whereas in psychology the problem is to understand humans. Again, very similar, often identical algorithms are being applied now to the problem of understanding specific brain circuits. What is common to all these approaches and what unifies the research workers who are exploring these problems from a mathematical, psychological, engineering, or biological angle, is the focus on this new architecture.
This new architecture has, in a sense, offered an alternative to the von Neumann architecture as a basic model. It changes some of the fundamental assumptions about the way you approach the problem. As I mentioned earlier, the hardware, far from being irrelevant, could be the driving force. The explorations we are engaged in, looking at massively parallel architectures from all these different disciplines, are going to change the way we think not only about computation but even about the organization of other distributed complex systems at the level of societies or economies.
It used to be rare for one person to know enough about both areas. But today, because of the existence of the neuroscience society, neuroscience programs, and neuroscience departments, it is possible for students to become experts and to receive instruction in anatomy, physiology, psychology, and pharmacology. All these areas are important if you want to study the brain. We may know, within the next ten or twenty years, the fundamental neural mechanisms that form the basis of some aspects of human memory.
Having that knowledge will have major implications not just for medical treatment of memory disorders but also for designing massively parallel computers that are built along similar principles. There is a whole list of outstanding problems that no one can solve with conventional computers. How to do it is a research problem. We can build computers with sixty-four thousand processors today, maybe with a million processors in a few years.
There is a technological force that is pushing us toward massively parallel architectures. The second reason is that neuroscience was still in its infancy. In fact, over the last twenty years, almost all neural network models that incorporate memory functions have used one or the other variation of Hebb's synapse as the fundamental local learning rule, without any experimental evidence whatsoever. The focus of my own lab is to try to combine the facts and insights from neurobiology with the models that are being investigated by the PDP and connectionist community, and to couple neuroscience into the research program in a way that was not possible in 1960.
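For readers who have not seen it, the Hebb rule referred to here is, in its simplest textbook form, delta_w = eta * pre * post: a connection is strengthened when the units on both sides of it are active together. The sketch below stores one made-up input-output association with an outer product; the sizes and learning rate are arbitrary.

    import numpy as np

    rng = np.random.default_rng(3)
    pre  = rng.choice([0.0, 1.0], size=50)       # presynaptic activity pattern
    post = rng.choice([0.0, 1.0], size=30)       # postsynaptic activity pattern

    eta = 0.5
    W = np.zeros((30, 50))
    W += eta * np.outer(post, pre)               # Hebb: strengthen co-active pairs

    # Presenting the stored input re-evokes a scaled copy of the stored output.
    recalled = W @ pre
    print(np.allclose(recalled > 0, post > 0))   # True for this single stored pattern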
The answer to your question is that there has been a gradual increase in knowledge but that you need to reach a threshold before you can really succeed to solve practical problems. In the last few years, we have computers that are fast enough to do the simulations, and we have hardware like analog VLSI chips that can deliver the architecture to the field in an efficient package. That may account for what appears, from the outside, as a major change.
To go from there to a million units will require some new research strategies. Scaling means also temporal scaling, that is, being able to design systems or multiple networks that are able to integrate information over time and, in the limit, approach the processing capacity that sequential von Neumann architectures are so good at. There is going to be a gradual fusion toward the center as the problems of the approaches start to merge, so that the insights and strengths of both approaches can be taken advantage of and combined.
There will be both practical and intellectual outcomes. First, it will be possible to build new generations of computers based on these principles. The higher levels of cognition will take longer to emulate with this architecture. Let's look at this from the biological perspective of evolution.
The focus of the research in Artificial Intelligence has been on that pinprick, on a tiny part of biological evolution. Those might be the most computation-intensive problems, where we have to wait the longest to come up with the right answers. The second factor is the bias toward language. In Artificial Intelligence there has been a focus on symbol processing, language, and logic, and people have looked for intelligence in those terms.
Everyone knows children can be incredibly intelligent without being able to speak, in terms of solving problems and dealing with the world in a very sophisticated way. One of the things you come away with after thinking about the problem of intelligence from a biological perspective is that we may be myopic in our insistence on identifying intelligence with language. Not everybody has that opinion, but nonetheless it has biased the way we think about the problem. Perhaps with a new type of architecture we will gradually come up with a new perspective on intelligence that has more to do with the interaction of internal representations with the real world in real time.
Real time is a part of the equation that is essential on the biological side. This constraint on intelligence has not yet been fully considered.
Will this new research and technology lead to new social impacts?
That illustrates that human beings are not Turing machines but creatures created by genes and experience. One of the strengths in the new research area is that the circuits we are dealing with here are fundamentally adaptable. In principle, they should be able to adapt to human beings, and not the other way around. In the past, you had to train human beings to adapt to the machine.
Similarly, there are many problems in manufacturing and control, in running large businesses, that would look very different if the process, the organization itself, were adaptable.
HERBERT A. SIMON
Herbert A. Simon received his Ph.D. from the University of Chicago in 1943. He
has also received numerous honorary doctorates from universities all over the
world. He was Assistant Professor and then Associate Professor of Political Science at the Illinois Institute of Technology, before serving as Professor of Political
Science at that institution from 1947 to 1949. From 1949 to date, he has been
at the Carnegie Institute of Technology, as the Richard King Mellon University
Professor of Computer Science and Psychology at Carnegie-Mellon University.
The importance of his work was acknowledged by the receipt of the Nobel Prize
in Economics 1978.
Technology Is Not the Problem
That led me into trying to understand human problem solving, because it is part of what is involved in decision making. This research started back in 1939 and went on through the forties and fifties. I used punched card equipment for various research activities. I heard about those wartime computers, and by 1952 I had learned how to program a computer.
I wrote a little paper about chess programs, about how to do combinations. Newell wrote a more elaborate paper published in 1955 on chess playing programs, but we had never programmed one. At first we thought of building a chess machine, but for a variety of reasons it turned out that the first thing to do was build a program that would discover proofs in symbolic logic. By Christmas 1955, we had designed a program called LT, the Logic Theorist, which was capable of discovering proofs for theorems by heuristic search.
By August of the next year, 1956, we had a running program. When we actually began in 1957 to compare the way this program did its business with protocols we had been gathering from human beings, we saw that human beings use means-end analysis.
About the same time, people who, out of wartime experience, were doing human factors research were going back to higher mental functions.
So behaviorism was still the leading school?
In our first psychological paper in Psychological Review, in 1958, we spent a lot of time arguing that what we were doing was just a natural continuation of behaviorism and Gestalt psychology.
Most of us now make a fairly consistent distinction between Artificial Intelligence and cognitive science.
I have done both, but I don't really care whether we have more efficient computers in the world. On the other hand, I do care about understanding what the human mind is like.
What I took from Searle's Gedankenexperiment is that you cannot explain it very well from a third-person view but that you should imagine you are inside the machine.
What Searle's machine lacked was a semantics. But to say that it is impossible for a machine to have semantics is ridiculous, because human beings who are machines have semantics. Semantics simply means that you have definitions connecting things that come in through your senses with the stuff that is going on inside. I cannot take it as a first premise that computers don't have intentions.
I have a program that allows me to recognize this thing as a tape recorder, then I have intentions, and so does my computer.
What do you think about the so-called commonsense problem and the notion of a nonrepresentational background?
There is a lot of evidence that the kind of thinking observed in human beings can be done with the kinds of mechanisms I observe in computers. This is one reason why we worked a lot on systems that do their thinking by recognition, because all the evidence I see about human intuition is precisely the phenomenon of recognition. We can reproduce recognition in computers without any trouble at all. A computer that does medical diagnosis does it by recognition of symptoms and by associating these symptoms with what it has learned about disease entities.
The question is what it would take for a computer to do that, so that if you apply the same criteria to the computer you would have to agree that the computer is behaving intuitively or creatively. If you are saying that we have not progressed nearly as far with computer motion as with human motion, I would have to agree with you. I had a problem with my knee last year and went to see a physician. In a program simulating human behavior of this kind, we would have a set of productions.
So tacit knowledge or knowledge that you could not make explicit is not an argument against your theory?
My own model of that is a program called EPAM, which picks up the features my eyes present to me, searches them down a discrimination net, and arrives at Joe. There is nothing about our EPAM program that says that what it does is conscious. What our program does assert is that what is going on is a series of differential tests on features that allow different things to be discriminated. An awful lot of things are going on here that I cannot tell you about.
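As a rough illustration of what a discrimination net does, and not the actual EPAM program, here is a small sketch with invented features and names: a stimulus is sorted down a tree of feature tests until a leaf names what has been recognized.

```python
# A toy discrimination net in the spirit of EPAM (hypothetical features and
# people): each internal node tests one feature, the stimulus's value on that
# feature selects the branch, and a leaf names what has been recognized.

def make_leaf(label):
    return {"label": label}

def make_test(feature, branches):
    return {"feature": feature, "branches": branches}

net = make_test("has_beard", {
    True:  make_test("wears_glasses", {True: make_leaf("Joe"),
                                       False: make_leaf("Sam")}),
    False: make_leaf("Alice"),
})

def discriminate(node, stimulus):
    """Sort a stimulus (a dict of feature values) down the net to a label."""
    while "feature" in node:
        node = node["branches"][stimulus[node["feature"]]]
    return node["label"]

print(discriminate(net, {"has_beard": True, "wears_glasses": True}))  # -> Joe
```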
Until recently, there has not been the manpower or the large computers to do it. Douglas Lenat has a ten-year program of building a huge semantic memory. When people start to build programs at that magnitude and they still cannot do what they are supposed to do, then we will start worrying. That is like saying that we won't have a theory of DNA until we have deciphered the whole human genome.
We need people like Lenat who invest ten years of their life to build this semantic memory.
You don't have to have that sentence in memory. All you have to have in memory is that most people in our culture wear underwear. Most of the programs that try to simulate semantic memory have in them a so-called inheritance property, so that you can make special inferences from the more general ones. Searle uses these questions to express his disbelief that you do inferences every time you see a station wagon, sit down on a chair, step through a door, and so on.
He says, «Please sit down.» I notice an object that satisfies the recognition conditions for «chair.» Nothing about the object evokes the production «If a sitting place looks fragile, test it before you sit on it,» because there is no cue in the situation that causes recognition of «fragility». Hence the condition of the production is not satisfied, and it is not fired. None of the productions for recognition of the nonexistent items was evoked.
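A minimal sketch of the production-system account of this episode, with invented cue and rule names: a production fires only when its condition matches the cues actually recognized in the situation, so the fragile-chair rule stays silent unless something evokes «fragile».

```python
# Each production is a (condition, action) pair; the first production whose
# condition matches the recognized cues fires. Nothing fires on cues that were
# never recognized.

productions = [
    (lambda cues: "sitting_place" in cues and "fragile" in cues,
     "test the chair before sitting on it"),
    (lambda cues: "sitting_place" in cues,
     "sit down"),
]

def respond(cues):
    for condition, action in productions:
        if condition(cues):   # first matching production fires
            return action
    return "do nothing"

print(respond({"sitting_place"}))             # -> sit down
print(respond({"sitting_place", "fragile"}))  # -> test the chair before sitting on it
```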
Psychologists are tremendously empirical, and, by and large, even cognitive psychologists are suspicious of theory. In Artificial Intelligence the dominant tradition has been an experimental tradition, a tradition of building big systems and seeing what they can do. It occurs to me that even in a highly developed science like physics, experimentation plays a tremendous role. Most of my work in Artificial Intelligence and cognitive science has been not experimental in that sense but closely tied to experiments.
My idea of a theory is a model of the phenomena, not a bunch of mathematical theorems. Mathematics is very powerful for finding out about highly abstracted, conceptualized systems and has to be supplemented with a lot of messier techniques to deal with the real world.
You might almost say it is the method of science. If I have a set of differential equations, and if I think that those differential equations are a good theory of what the planets are doing, then I better record the movements of the planets and compare them with the movements predicted by my differential equations. A computer program, formally speaking, is simply a set of difference equations, in the sense that the next state of the computer will be a function of its state right now, just one cycle later. So we build our theories as computer programs, as difference equations.
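To illustrate the point that a program is, formally, a set of difference equations, here is a small sketch; the falling-body example is invented for illustration and is not one of the models discussed here. The next state is computed from the current state, one cycle at a time, and the resulting trajectory is what one would compare against observed data.

```python
# A program as difference equations: state[t+1] = f(state[t]).

def next_state(state, dt=0.1, g=-9.8):
    position, velocity = state
    return (position + velocity * dt,   # x[t+1] = x[t] + v[t] * dt
            velocity + g * dt)          # v[t+1] = v[t] + g * dt

state = (100.0, 0.0)          # initial position (m) and velocity (m/s)
trajectory = [state]
for _ in range(20):           # iterate the difference equations
    state = next_state(state)
    trajectory.append(state)

print(trajectory[-1])         # predicted state after two simulated seconds
```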
Then we compare them with human data. Some people argue that you cannot take the Turing Test seriously, because it comes from the behaviorist point of view, and cognitive science has to look at the underlying process and not at the interpretation from the outside. So all I can observe is behavior. We have the same problem of inferring what goes on inside the box from what is going on outside.
That is not behaviorism. Behaviorism was a theory saying that we should satisfy ourselves with laws of the form «the following input leads to the following output» and that we should not inquire about what is inside the box. The people who are skeptical about Artificial Intelligence flop from one side to the other. On the one hand, they say that the human being is so complicated that nobody could build a machine capable of doing commonsense reasoning and of exhibiting the behavior of commonsense reasoning.
In the next breath, they tell us, Well, if all you have to do is match the behavior, you just write a simple program. Again, people who object to this have a completely unrealistic view of what science is about. They somehow seem to imagine that physics, for example, is a field with a great theory and that theory predicts everything.
When you think you understand that class of human tasks, you take some tougher ones. Novak's ISAAC program, which takes problems out of a physics book written in English, constructs an internal representation, draws a picture of it on the screen, writes the equations for the problems, and solves them. We have just completed some work on the «mutilated checkerboard» problem, an old puzzle about how to cover part of a checkerboard with dominoes. People think it is a hard problem, because in order to solve it, you have to change the representation.
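For readers unfamiliar with the puzzle, the changed representation that solves it can be shown in a few lines: color the squares, note that every domino must cover one light and one dark square, and count. This sketch simply does the counting for a board with two opposite corners removed.

```python
# Count light and dark squares on a checkerboard with some squares removed.

def color_counts(removed):
    counts = {"light": 0, "dark": 0}
    for row in range(8):
        for col in range(8):
            if (row, col) in removed:
                continue
            counts["light" if (row + col) % 2 == 0 else "dark"] += 1
    return counts

removed_corners = {(0, 0), (7, 7)}      # two opposite corners (same color)
print(color_counts(removed_corners))    # {'light': 30, 'dark': 32}
# 31 dominoes would have to cover 31 squares of each color, so no tiling exists.
```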
A lot of people are also working on the related problem of what diagrammatic and pictorial representations are about. You said that when you want to solve a task or problem, the most important thing is to try to program it, and you will understand the model. Now, one could imagine a machine that wins every game of chess by pure brute force. But we know that human beings don't think that way.
So that machine would not pass the Turing Test, because it would not fit the behavioral evidence, which shows that no human being looks at more than a hundred branches of the tree before making a move. If we want to have a theory of human chess playing, we would need a machine that also matches that piece of evidence. If a computer is to be a model of human chess playing, it has to behave in such a way that it is not looking at more than a hundred branches. In fact, there is such a program.
It plays chess only in tactical situations.
So the program with brute force would not be Artificial Intelligence?
It would be Artificial Intelligence, but not cognitive science. It does not tell anything about how a chess grandmaster thinks, or very little. Expert systems became a sudden fad, and of course lots of people tried lots of silly things that would not work. Since 1956, there have been programs at Westinghouse that will take a customer's specifications for an electric motor, look into its memory to find a design that is similar to that motor, pull out these designs, pull out a lot of data about heating, transmission, and so on, and then redesign that motor for the customer's order, following almost the same procedure that engineers would follow.
So in your opinion there are machines doing Artificial Intelligence, but not in a spectacular way. In every architecture or construction firm today there is computer hardware that does most of the mechanical drawings of the buildings and stores various kinds of information about the buildings in its memory. One reason that it is done by machines and not by people any more is that computers know where they have put a wire through a pipe, whereas people make the drawings all the time without knowing it. Not only that, when you change from a number 2 bolt to a number 3 bolt between two I beams, the program will remember that and change the purchase requirements for bolts. Sometimes, these gradually change things around, the way DNA research did. The basic ideas of Artificial Intelligence have been around for only thirty years, so there are still areas to exploit.
So you are optimistic?
I have heard many times that the early rate of progress in Artificial Intelligence has at least slowed down, if it has not stopped altogether. Look at the programs in the year 1990, compare them with 1985 and with 1980, compare them with 1975, and you will see that there are new things every five years that we could not do before. A number of social scientists are concerned about the social issues implied by Artificial Intelligence. Artificial Intelligence is a lot like the invention of writing, or the invention of the steam engine, in the sense that it does not have just one particular narrow application but penetrates everything, just as energy does.
Everything can make use of intelligence, everything can make use of writing. For that reason it is obvious that Artificial Intelligence, like writing and like the steam engine, can be used for good and bad purposes. Therefore the task is not to predict whether it will be good or bad but to design and modify our institutions so that we get as much of the good uses as we can and avoid as much of the bad uses as we can. Let's take the question of privacy as an example.
Artificial Intelligence gives some opportunities, even at a rather stupid level, for invasion of privacy. So I can use the same thing either to invade your privacy or to save myself from being killed by the New York Times. Well, we have to design our institutions so that they use it one way but not the other. The answer to that is what you think about the Industrial Revolution.
The main thing computing does, if it is successful, is raise the level of productivity. So you have to ask if you can have full employment with a higher level of productivity. One of the richest countries in the world feels poor. That means that we could use much more productivity than we have.
The idea that increased productivity is a menace is a wrong idea. The idea that we have to worry about who gets hurt while these changes take place is another matter. If we keep employment fairly high, then most people are not going to be hurt. What we should be worrying about is introducing not technology but institutions that make it possible for people to adapt to change.
If people think they are making the change themselves, so that they say «I'm going to Texas» or «I'm going to California,» there is no social problem.
JOSEPH WEIZENBAUM
Joseph Weizenbaum was born in Berlin in 1923 and emigrated to the United States in 1936. He studied mathematics at Wayne University, Detroit, served in the United States Army Air Corps from 1942, and then resumed his study after
the war. At Wayne State he was involved in designing and producing a computer
before moving into industry in 1953. In 1963, he joined MIT as Professor of Computer Science, a position he held until his retirement in 1988.
The Myth of the Last Metaphor
I was introduced to computers in the early fifties, at Wayne University in Detroit. A mathematics professor there decided to build a computer for the university. After a long while, I participated in the design of a computer system for the Bank of America.
When the design of the bank system was over, the General Electric
I was very interested in the use of the computer in socially significant areas. I also got to know Kenneth Colby, a psychoanalyst in the Palo Alto area. I did develop that list processing system, and I got an honorary appointment at the Stanford Computation Center.
Each user could sit at the console as if they had the whole computer to themselves. That was the basic idea.
Under the influence of this conversational computing, I became interested in actual conversation, in natural language conversation with the computer. I began to work on that project, and it got me right into one of the key areas of Artificial Intelligence, which was then and is still the processing of natural language by the computer.
That was the time that you developed ELIZA?
I had developed a lot of machinery for looking at written text, and I was now ready to program some sort of conversational mechanism. It was clear to me that when two people talk to each other, they both share some common knowledge. I was confronted with the problem that today might be called knowledge representation, which was not what I was interested in at that moment. I tried to think of a kind of conversation where one of the partners does not necessarily have to know anything.
So I thought of cocktail party conversations, or the conversations people have with bartenders who simply say «Yes, yes» or «Oh, that's the saddest story I've ever heard» and things like that.
I look with a critical eye at the whole computer field and, in fact, at modern science and technology in general. One thing I have learned in my life, which is now getting longer, is that I, as well as many other people, am very much tempted by binary thinking. If my wife is angry with me, for example, it's the end of our marriage, the end of the world. I had lived as a Jewish boy in Hitler's Germany for three years, and I have remained, to this day, very interested in the history of fascism in Germany and throughout the world and in the political history of Europe.
I knew the role, I might say the miserable role, that German scientists, including physicians, played in German military history, including during the First World War. I saw myself in the same situation as these German scientists before and during World War II.
Why did you undertake to program ELIZA at all? What was the idea of simulating a conversation with a psychiatrist?
I have already explained that it was a way to avoid the difficult problem of knowledge representation. I had the idea, perhaps wrongly, that I understood their position better than kids who had been brought up here.
I did not think that I was making a direct contribution to social science or to psychology or psychiatry. Technically, ELIZA was an interpreter. In one script, ELIZA plays the psychiatrist, but there were other scripts. Specifically, I remember one script in which one could define things that the system would then «remember,» for example, things like «The diameter of a circle is twice the radius of the circle.» This was typed in English, and the system stored it.
And «The area of the circle is π times the square of the radius of the circle.» Then one could type in «A plate is a circle» and «The diameter of the plate is two inches» and ask later, «What is the area of the plate?» The thing would see that the plate is a circle, that it did not have a definition of the area in terms of the diameter, but that there was a relationship between the diameter and the radius. It would look that up and finally tell you what the area of the plate was.
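A small sketch of the kind of inference chain being described, with the facts stored as simple Python data rather than typed English sentences: the general relations about circles are combined with the specific facts about the plate to answer the question.

```python
import math

# Stored facts: the plate is a circle, and its diameter is two inches.
facts = {"is_a": {"plate": "circle"}, "diameter": {"plate": 2.0}}

def area(thing):
    """Chain the general circle relations with the specific stored facts."""
    if facts["is_a"].get(thing) == "circle":
        diameter = facts["diameter"][thing]
        radius = diameter / 2.0          # "the diameter is twice the radius"
        return math.pi * radius ** 2     # "the area is pi times the radius squared"
    raise ValueError("don't know how to compute the area of " + thing)

print(area("plate"))   # about 3.14 square inches
```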
Intelligence community seriously?
One very critical component of such programs is that they formulate a picture of what it means to be a human being. It describes, in my view, a very small part of what it means to be a human being and says that this is the whole. That idea made the Holocaust possible. It would have been impossible without such an idea.
We, people in general, Americans included, find it necessary to have an image of the enemy in an animallike form, like the «Japs» in World War II.
Not long ago, at a public meeting at Harvard University at which both
I consider myself a member of the human family, and I owe a loyalty to this family that I don't share, to the same extent, with other families. Well, I have special values for the human family, and by extension for life, carbon-based life. «What I am trying to say is that with these world-encompassing projects that strive for superhuman intelligence and for the ultimate elimination of the human race and so on, there emerges a subtly induced world picture, expressed, for example, in Dan Dennett's statement that we have to give up our awe of living things.» That enters our language and the public consciousness, and that makes possible a treatment of human beings vastly below the respect and dignity that they ought to have.
«Intelligence» community, as Searle would say.
I do think that the idea of computation has served psychology very, very well. Initially, in the late fifties, it was the idea of computation that almost dealt a death blow to behaviorism. Now comes the computer, and we can actually see internal logical operations take place and result in things that resemble thinking. Some people claim that it will be the last metaphor. As it happens, the latest technology has always claimed to be the last metaphor. So that, to take a current example, connectionism is a modification, an enlargement of the computational metaphor. And computing being such a powerful metaphor for thinking about thinking, this enlargement becomes immediately suggestive.
Now this is the last metaphor. Perhaps machines with biological components. The beginning could be largely silicon-based machines with a few biological components, then with more and more biological components. Finally somebody could get the idea that a man and a woman should come together and produce a thinking machine which is entirely biological, which is a baby.
In the debate between connectionism and the physical symbol systems hypothesis, the connectionists say that the underlying «hardware» is relevant, the biological structure. They claim that they are more concerned with the human brain. The machine need not consist of neurons, but the neurons have to be simulated. Finally, another stage is the neuronal machine, which is a kind of connection machine, but whose connectivity is to be imitative of how the neurons in the brain are connected to one another.
The second of these is still a Turing machine, in my view. What I am suggesting is that this machine is imitable by any other machine, so that the physical symbol system hypothesis is not falsified by the appearance of the connection machine. That, in turn, means that whatever limitation one sees in the physical symbol system hypothesis continues to exist for the connection machine. Let me now think of a neuronal connection machine, which, at the moment, is merely a fantasy.
I don't think that, in principle, it avoids the problems of this hypothesis and that it is a step ahead. One of the things that has entered our language as a result of computers is the term «real time.» If someone had said «real time» as recently as fifty years ago, nobody would have known what he could possibly mean. It would be very hard to build a simulated frog out of a machine that operates at a very low speed. From a philosophical, as opposed to a practical, point of view, real time and speed are not necessary.
That sort of consideration, among many others, enters into Dennett's argument against the Chinese Room.
If one considers quantum mechanics and relativity, for example, to be finally the real representation of the real world, the final metaphor, then it would not be useful or correct to damn all these physicists who have done physics «on the wrong basis» all these years before it came along.
ROBERT WILENSKY
Robert Wilensky was born in 1951. He studied mathematics and computer science at Yale College and Yale University. Thereafter he worked in the Department of Medical Computer Science at Yale University, where he developed a
medical record system. As a research assistant at Yale, he began working in Artificial Intelligence. In 1978, he received his Ph.D. and moved to Berkeley, where
he was Director of the Artificial Intelligence Research Project and of the cognitive science program from its start. Today he is Professor in the Department of
Electrical Engineering and Computer Science at the University of California at
Berkeley
Why Play the Philosophy Game?
What is interesting about what the brain does is that it does some interesting stuff with information. And as soon as we start talking about doing things with information we are talking about information processing, and a very good way to talk about information processing is to use the kind of languages and methods that computer scientists have been developing for a long time. For me, that is what cognitive science is about, but there are obviously many ways to get at the kind of information processing that people do.
If I put it in information processing terms, Artificial Intelligence should seem central. The idea of looking at things in terms of how we describe information processes has got to be central.
Superficially, there is a problem, but no bigger problem than the one we have in trying to be a computer scientist or a linguist. I find in my own work that I have to know some things about computer science, some things about logic, some things about linguistics, and some things about philosophy. There are certainly things I don't know anything about.
Is cognitive science a unified discipline or just a mixture?
The way we divide things into fields is fairly arbitrary to begin with. Once we've divided them up we tend to assume that it is a natural division and that there are good reasons, for example, to have a psychology department.
The media got very excited about expert systems, which represented very little in terms of Artificial Intelligence technology, and they became important in the business world. My feeling is that there is a certain consensus within Artificial Intelligence about some problems and certain general ideas about how to solve them. What most Artificial Intelligence researchers today would call the biggest current problem is the representation and application of large amounts of world knowledge. It took a long time for people to recognize this as the main problem.
Philosophers are a funny group. I was interviewed by someone from a newspaper about what the philosophers at Berkeley said. «It is probably fair to say that some of my colleagues in the field have been unduly optimistic about how hard it was to do things.» Years ago Hubert Dreyfus said that a computer would never beat him at chess.
Then a computer beat him fairly shortly afterward. He said that computers would never play master-level chess. Now computers do this routinely. Then he said that a computer would never be world champion of chess.
What is interesting is that some of these same philosophers are very excited about neural networks. Consider Quillian's early semantic memory model, which was an intrinsically parallel system. The notion that certain things have to be done in a massively parallel way is partly not distinct from some of traditional Artificial Intelligence. It is also the case that if you look at low-level vision, the mainstream Artificial Intelligence doctrine of how to do it is by massively parallel processing.
It has been known for years that perceptron-type systems were potentially very good at pattern recognition, and today's extension is not very surprising. Some people are intrigued by the learning aspect of these systems. People often like neural networks for the wrong reason or play an unsavory game. The game is «Since I am working with neural networks, my work is real science, because obviously the brain works with neurons.
Since you are not doing neural networks, your work is not science, but some sort of engineering.» When you ask these persons if they make an empirical claim that neurons in the brain work this way, they will say, «No, no, no.» What the right way is to think about these systems has not quite emerged yet. When you have a neural network that does certain tasks in a certain way, you find that you can characterize it without recourse to neural networks. Artificial Intelligence has been saying, among other things, Here is a description of what the mind is doing in terms of information processing, of how it is behaving, that is independent of physiological details.
If the answer is that the mind is computing, in parallel, a minimization of errors by gradient descent, that is a nice computer science/mathematical dissertation. That is what computer science has been doing all along.
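For concreteness, here is a minimal sketch of error minimization by gradient descent, the characterization mentioned above: a linear unit's weights are adjusted repeatedly against the gradient of the mean squared error. The data and step size are invented for illustration.

```python
import numpy as np

# Fit a linear unit y = X @ w to synthetic data by gradient descent.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
true_w = np.array([2.0, -1.0])
y = X @ true_w

w = np.zeros(2)
for _ in range(200):
    error = X @ w - y                     # prediction error on all examples
    gradient = 2 * X.T @ error / len(y)   # gradient of the mean squared error
    w -= 0.1 * gradient                   # step downhill

print(w)   # converges close to [2.0, -1.0]
```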
I find it somewhat infelicitous to say computers are «good» at this or «bad» at that, and with people it is the other way around. In some sense, this is not surprising, because we created computers to do very specific things. It is almost miraculous that they can do all those other things at all. The reason why computers are not good at certain things is that we have not thought for a long time about how to make them good.
There is no in-principle problem with commonsense?
There are other areas of science where people have proven in principle that something cannot be done. If there is an in-principle problem with getting machines to do certain things, no one has even suggested what it is, let alone if it is true. To say that such statements are in some sense a possible challenge to the traditional Artificial Intelligence shows such a profound ignorance of what the enterprise is like that it is hard even to respond to them seriously. The problem is that in our everyday practice we rely on a lot of background knowledge.
When did Artificial Intelligence people start to use the term background knowledge? About twenty years ago. The knowledge that people have about a restaurant is exactly background knowledge. So the whole idea of representing knowledge in Artificial Intelligence has basically been the problem of how to represent background knowledge. Searle says that the background is a mysterious thing and that it is not represented.
I don't know a single Artificial Intelligence researcher who would not believe that. Positions differ in how it should be encoded and what exactly it should look like, but the idea that you are operating by virtue of background information is axiomatic in Artificial Intelligence. But Searle's argument is not that the background is not represented but rather that it is nonrepresentational. What they seem to mean by it is that a person can become consciously aware of this knowledge or can explicitly articulate it.
But you will find that the Artificial Intelligence literature never makes such a claim; it is not a claim Artificial Intelligence researchers are making. Clearly, the form in which such knowledge is encoded is different from that of knowledge that we can readily articulate as English sentences.
What is the reason for this ongoing argument?
There are philosophers who work on time and space, but you would not ask them every time a physicist says something. There are some things in Artificial Intelligence, I think, that are interesting for the question of how we think about people. Most of them are not even discussed by philosophers, unfortunately. The philosophers see that people in Artificial Intelligence touch upon topics that are important to them.
«Well, I don't know...» Biologists don't worry about that too much, and the same thing is very much true in Artificial Intelligence. There are some researchers who would ask themselves whether they would attribute a certain property to a system with certain characteristics. If you say things that are sensible, no one will pay attention to you.
You said that information processing is central to Artificial Intelligence.
The whole question of whether these systems actually have symbols or not is a minor argument. The neurophysiologist talks in terms of information processing, because this is in a sense what the mind is doing. If I say that the mind does error minimization by gradient descent, that is a description in terms of information processing, and this is exactly how the connectionists talk about it.
You have to say what the visual system accomplishes independent of how it is done. The answer is, presumably, that the brain does things that are very abstract. So, I would not even know what to say about the mind that is interesting if I did not talk in terms of information processing. I think it was McCarthy who said that a thermostat, too, is an information processing device, and we could ascribe beliefs to it.
I don't find anything problematic in saying that there are lots of things doing information processing. Searle said that if you think so, you must also assume that a thermostat has beliefs and so on. This is basically a silly argument. Why should we be at all uncomfortable if I said that my mind is doing information processing and your tape recorder is doing information processing, if the mind's information processing is so much more complex than what the tape recorder is doing? So you've got all those biological systems that can have experiences, and there you draw the line. You would like to understand that what a bee experiences as pain is sort of like what you experience as pain, but it is far less subtle. The further down you go in the biological world, the more problematic is the question. It is not clear whether it is a whole lot more sensible to ask if a thermostat has beliefs than to ask if a bee has them.
What do you think about the Turing Test?
Building a system that duplicates a person in every detail is such a gargantuan task that the possibility is even beyond my own fairly optimistic expectations. It seems too hard.
The usual story is that people claim it was mainly the influence of the early critique of perceptrons that stopped work on neural networks.
In vision they never stopped thinking about neural networks, because they always knew that the only way to talk about vision in a sensible way was to talk in terms of parallel processing on the neural level. If you pick up an Artificial Intelligence textbook, the one picture of a neuron you will find is in the chapter on vision. The early attempts to put a neural network in a corner and let it learn just did not work. There really was a basis being built for doing neural networks properly, on top of the understanding that we have gained.
The other thing is simply a computational problem. The criticism was that certain styles of systems have limited processing capabilities. It is well understood that if you have systems that are more complicated than early perceptrons, they would not necessarily be limited in that way. Most of the neural networks now are more complicated than perceptrons.
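The limitation being referred to is the classic one: a single threshold unit cannot compute XOR, while a network with one hidden layer can. The following sketch sets the weights by hand (OR and NAND hidden units feeding an AND unit) just to show that the two-layer structure suffices; it is an illustration, not a claim about how any particular network learned the task.

```python
# A two-layer threshold network computing XOR with hand-set weights.

def threshold(x):
    return 1 if x > 0 else 0

def xor_net(a, b):
    h1 = threshold(a + b - 0.5)      # hidden unit 1: OR
    h2 = threshold(-a - b + 1.5)     # hidden unit 2: NAND
    return threshold(h1 + h2 - 1.5)  # output unit: AND of the hidden units

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))  # prints 0, 1, 1, 0: the XOR function
```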
Again, it is important not to underestimate the amount of effort that went into thinking about how you would do these things at all, before you could think about how to do them with parallel computing systems. My conviction is that none of them will work without some subtle interaction between them. The problem was always how to compute textures. There were theories for years, nobody got it really right, and the classical theory was recently disproved.
It refers to neurons at some level, but you can talk about the computations independently of them. We can speculate that in the brain it is done by neurons like this and neurons like that, but we don't know. If I want even to describe what the neurophysiological level is doing, I have to describe it in these terms of computation.
TERRY A. WINOGRAD
Terry A. Winograd was born in 1946 in Maryland. He studied mathematics at
Colorado College (B.A.) and linguistics at University College (London). He received his Ph.D. in applied mathematics from MIT in 1970. He was Assistant Professor of Electrical Engineering at MIT from 1971 to 1974, before joining Stanford
University, where he has been first Assistant Professor and then Associate Professor of Computer Science and Linguistics since 1974.
Social Values
LOTFI A. ZADEH
Lotfi A. Zadeh studied electrical engineering at the University of Tehran as an
undergraduate. He continued his studies in the United States, at MIT (S.M.E.E.,
1946) and at Columbia University, where he received a Ph.D. in 1949. He taught
at Columbia University and joined the Department of Electrical Engineering at the
University of California at Berkeley in 1959, serving as the department's chairman
from 1963 to 1968. He has also been a visiting member of the Institute for Advanced Study in Princeton, New Jersey; a visiting professor at MIT; a visiting scientist at IBM Research Laboratory in San Jose, California; and a visiting scholar at the
AI Center, SRI International, and the Center for the Study of Language and Information, Stanford University. Currently he is Professor Emeritus of Electrical Engineering and Computer Science at Berkeley and the director of BISC (Berkeley
Initiative in Soft Computing)
The Albatross of Classical Logic
That was six years before the term Artificial Intelligence was coined. It shows how easy it was and still is to overestimate the ability of machines to simulate human reasoning.
When you talk about cognitive science you are not talking about something that represents a well-organized body of concepts and techniques. We are talking not about a discipline but about something in its embryonic form at this point, because human reasoning turns out to be much more complex than we thought in the past. And the more we learn about human reasoning, the more complex it appears. In my own experience, at some point I came to the conclusion that classical logic is not the right kind of logic for modeling human reasoning.
These are some of the principal components of what is called cognitive science today. The meaning of the term cognitive science is changing with time. In the near future, there might be some more programs and even some more departments. I do not anticipate that cognitive science as a field will become an important field numerically.
It will be difficult for people with a degree in cognitive science to get a job. Whether they apply to psychology, linguistics, or computer science departments, the answer will be that they are at the periphery and not in the center of the field.
Intelligence? Are there stages?
The term Artificial Intelligence was not in existence at that time yet, but still there was already this discussion going on. It was at that conference that the field was launched and Artificial Intelligence came into existence. In the late fifties and early sixties, the concerns of Artificial Intelligence were centered on game playing and to a lesser extent on pattern recognition, problem solving, and so on. I would say that in the late sixties and early seventies the attention was shifted to knowledge representation, meaning representation, and, in connection with that, natural language processing.
In the midseventies, some expert systems made their appearance.
So a definite shift took place, as problems like game playing and pattern recognition receded in importance. Artificial Intelligence has become a big field. Some of the big conferences have more than five or six thousand participants. It is a field that is undergoing rapid change, but right at this point there is a feeling that the expectations were exaggerated.
Many of the expectations did not materialize.
When did this feeling appear?
There were some articles critical of Artificial Intelligence that point to all sorts of things that were promised but not delivered. I see as a reason for the lack of accomplishments the almost total commitment of Artificial Intelligence to classical logic. Artificial Intelligence has become identified with symbol manipulation. I think that symbol manipulation is not sufficient when it comes to finding the solutions for the tasks Artificial Intelligence has set for itself in the past, like natural language understanding, summarization of stories, speech recognition, and so forth.
Artificial Intelligence people are likely to disagree with me there, but I am convinced that in the end they will be proved to be wrong.
You said that three or four years ago there was some criticism of
Artificial Intelligence. Dreyfus was right as to the fact that some of the expectations that were built up by the «artificial intelligentsia» were exaggerated. Predictions of this kind soured many people on Artificial Intelligence. I part company with Dreyfus when it comes to the use of computers to solve many problems that can be solved effectively by humans.
No, not like human beings.
Airlines flight 325 going to arrive at Los Angeles?» There will be no human operator on the other side, but a machine that understands and analyzes your question, looks through its database, and, with a synthesized voice, gives you the answer, something like «Flight 325 has been delayed.» Systems like that will be in wide use. You will be able to ask questions like «If I want to do this and that, how do I do it?» The VCR may be able to tell you certain things. I think the voice input-voice output systems will have an enormous impact, because the most natural way of communicating with a machine may well be voice input-voice output.
Jules Verne said, at the turn of the century, that scientific progress is driven by exaggerated expectations. There was a debate between Donald Michie and Sir John Lighthill about Artificial Intelligence and robotics. In any case, it was a mistake for the British government to stop its support of Artificial Intelligence.
Artificial Intelligence people tend to have a superiority complex with regard to cognitive science. The Artificial Intelligence community views the cognitive science community as comprising for the most part people who are not capable of doing serious work insofar as Artificial Intelligence is concerned. I do not think that in the cognitive science community there is a similar superiority complex in relation to Artificial Intelligence. Artificial Intelligence people think that they have nothing to learn from the others, so they don't go to their conferences.
I think the cognitive science community would probably like to have more interaction with the Artificial Intelligence community, but, as I said, this feeling is not mutual. Within the cognitive science community, there is more interest in neural networks, in connectionism at this point. The attitude toward neural networks and connectionism within the Artificial Intelligence community is somewhat mixed, because the Artificial Intelligence community is, as I said earlier, committed to symbol manipulation. Connectionism is not symbol manipulation.
That attitude is characteristic of the American Artificial Intelligence community. In Europe, things might be different. Artificial Intelligence people there have perhaps a closer relationship with cognitive science, and that relationship is likely to grow. In the United States you have to sell research, and therefore you have to be a salesperson.
In Europe this is less pronounced because of the tradition of research funded by the government.
But there are some programs that claim to do summarizing?
These programs do not have, at this point, the capability to summarize nonstereotypical stories of a length of one thousand words or so, and they are not likely to have it in the foreseeable future. We cannot implement this knowledge on our machines, and at this point we do not have the slightest idea how it could be done.
What are the major problems that Artificial Intelligence has to overcome to make progress?
One of the most important problems in Artificial Intelligence is the problem of commonsense reasoning. Many people realize that a number of other problems like speech recognition, meaning understanding, and so forth depend on commonsense reasoning. In trying to come to terms with commonsense reasoning, Artificial Intelligence uses classical logic or variations of it, like circumscription, nonmonotonic reasoning, default reasoning, and so on. But these techniques make no provision for imprecision, for fuzzy probabilities and various other things.
I think it is impossible to come to grips with commonsense reasoning within the framework of traditional Artificial Intelligence. We need fuzzy logic for this. Human nature does not work that way. As long as you merely manipulate symbols, you have a limited capability of understanding what goes on in the world.
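As a minimal sketch of what fuzzy logic adds beyond two-valued logic, the example below (with invented numbers and a piecewise-linear membership function) treats «tall» as holding to a degree between 0 and 1 and combines degrees with min and max, one common choice of connectives.

```python
def tall(height_cm):
    """Degree of membership in 'tall': 0 below 160 cm, 1 above 190 cm, linear between."""
    return max(0.0, min(1.0, (height_cm - 160) / 30))

def fuzzy_and(a, b):
    return min(a, b)

def fuzzy_or(a, b):
    return max(a, b)

h = 178
print(tall(h))                  # 0.6: partly tall, rather than simply true or false
print(fuzzy_and(tall(h), 0.8))  # degree to which "tall and athletic" holds
print(fuzzy_or(tall(h), 0.2))   # degree to which "tall or athletic" holds
```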
Database systems can answer all kinds of questions, not because they understand but because they manipulate symbols, and this is sufficient to answer questions. What Searle basically points to is the limitation associated with symbol manipulating systems. But as I said earlier, it is not necessary to go into all these things that Searle has done to explain it. To people in Artificial Intelligence, these are not really deep issues, although there have been debates going on for maybe forty years.
There are certain issues, one of which is concerned not with Artificial Intelligence but with the capability of computers to store large volumes of information so that our lives can be monitored more closely. I have a collection of horrible stories about humans who got stuck with errors made by computers.
What could be done about these future problems?
Even now, many of these intelligence agencies monitor telephone conversations, and if they have the capability of processing huge amounts of information, they can monitor everybody's telephone conversations. I don't think this will happen, but the capability is there. And once the capability is there, there will be the temptation to misuse it.
Intelligence?
Artificial Intelligence may not maintain its unity. It is very possible that within the next decade we will see a fragmentation of Artificial Intelligence. The field of knowledge-based systems started within Artificial Intelligence, but it is like a child that has grown up and is leaving its parents. There are many people working in these fields who are not members of the Artificial Intelligence community.
The two currently most important fields within Artificial Intelligence are going to leave Artificial Intelligence. Then there is the field of voice input-voice output, and that too is leaving Artificial Intelligence. If you look at the programs of Artificial Intelligence conferences and subtract these things, you are left with things like game playing, and some Artificial Intelligence-oriented languages. These things will not be nearly as important as Artificial Intelligence was when knowledge-based systems were a part of it.
Artificial Intelligence will be like parents left by their children, left to themselves. Artificial Intelligence, to me, is not going to remain a unified field.
What about connectionism?
I think there are exaggerated expectations at this point with regard to connectionism. People think that connectionism and neural networks will solve all kinds of problems that, in reality, will not be solved.
I first got interested in language and thought when I took my master's degree in 1955 at UCLA, where I was introduced to the work of Edward
These contacts stimulated me to go outside of sociology.
In the spring of 1954, I began a series of small research projects with
Harold Garfinkel3 that led to a serious interest in everyday reasoning and social interaction. My dissertation got a lot of people in sociology upset because I talked about a number of issues that were about language, social interaction, and reasoning among the aged. I found phenomenological ideas very useful, but I was bothered eventually by the fact that there was not much empirical research attached to that tradition. My preoccupation with the local ethnographically situated use of language and thought in natural settings was not always consistent with this tradition.
By 1958, at Northwestern University, I had already read Chomsky's book Syntactic Structures4 and George Miller's5 works on memory and language. I was also strongly interested in work by Roger Brown6 and his students on language acquisition, but I could not find anyone in sociology who cared about or gave any importance to these areas.
In 1969-70 I was attracted to work in linguistics and psychology in
In 1970-71, I received an NSF Senior Postdoctoral Fellowship to work on British Sign Language in England, where I replicated, with deaf subjects, work on hearing children's language competence and repeated some of this research in California. I moved to San Diego in 1971 and got in touch with the cognitive group in psychology and also with people in linguistics. At UCSD, an informal group interested in cognitive science began meeting, and this led to the development of a cognitive science doctoral program and the foundation of the first Department of Cognitive Science. It has been a bit strange for people in cognitive science to understand my social science background, but I sometimes remind them that my first degree was in psychology from UCLA, and this background made it easier to talk to colleagues in cognitive science than to many sociologists.
Following my early use of the work of Alfred Schutz on socially distributed knowledge, I became interested in Roy D'Andrade's and Edwin Hutchins's work on socially distributed cognition.
The two notions of socially distributed cognition and socially distributed knowledge overlapped considerably because cognition is always embedded in cultural beliefs about the world and in local social practices. These conditions tend to be ignored by social scientists.
What could be the reasons why social scientists ignore this fact and why they do not play an important part in cognitive science now?
In sociological social psychology, that is, primarily symbolic interactionism, a lot of emphasis is placed on what is situational without examining language use and information processing constraints during social interaction. It seems to me that this view also ignores the constraints that the brain has placed on the way you can use conscious and unconscious thought processes, emotions, and language, and the way that language places further restrictions on what you can or want to communicate about what you think you know or don't know. A central issue of human social interaction is the set of constraints that stem from cognitive processing. Looking at the role of social ecology and interaction requires a lot of longitudinal, labor-intensive research.
My point is that when you take for granted culture and the way it is reflected in a local social ecology, you eliminate the contexts within which the development of human reasoning occurs. When culture, language use, and local interaction are part of cognitive studies, subjects are asked to imagine a sequence of events that have not been studied independently for their cultural basis or variation except for the self-evident variation assumed by the experimenter. But in everyday life, cognition and culture include locally contingent, emergent properties while exhibiting invariant patterns or regularities whose systematic study remains uneven. But what is complicated about cognitive science's focus on individual human information processing is what happens when two minds interact.
There are many constraints on how interaction can occur. For example, during exchanges in experimental settings and everyday life, problem solving can be influenced by how well subjects know each other as part of the local social ecology, how long they have known each other, the nature of the task, individual variation, and the socially distributed nature of cognitive tasks.
What about Suchman's Plans and Situated Actions?20
She is obviously aware of environmental conditions and applies her knowledge of ethnomethodology and conversation analysis to issues in Artificial Intelligence and the situated nature of local task accomplishments. It needs to be linked to empirical research that people have done on children's cognitive and linguistic development. The most interesting parts for me are his intermediate reflections, where he brings in the cultural background as a horizon.
That tradition limits the extent to which the environment comes in. Dreyfus has been interested in the background notion because of his phenomenological perspective, but the relevance of a locally organized social ecology is not addressed as directly as it is in the work of Alfred Schutz. Habermas, however, has been influenced by a broad range of theorists, including those interested in everyday interaction, such as G. H. Mead.
What I said earlier applies: sociologists avoid cognition because from
What I have been trying to say is that the structure of social organization is important as a constraint, as well as how interaction is going to guide its reproduction. Social ecology puts limits on the kind of human cognition and information processing that can occur in a given physical space, at a given time for particular kinds of problem solving or decisions. On certain occasions, you can use only certain types of language, you can use only certain topics, you have to be seated in a certain place, and you have certain regularities about who gets to talk and when. This is the case in all social ecologies just as it applies to you and me sitting in this room.
Everything we do involves these constraints as well as the fact that social ecology or organization facilitates certain kinds of interaction and inhibits other kinds. Well, I see all of this as the interaction of different levels of analysis. I do not see why studying the brain is a problem. Researchers would like to know what neural substrates are relevant for cognitive functions while also examining and understanding these functions at their own level of analysis, despite not always having a clear picture of the neurobiology.
As for what the connectionists claim, do you think that it is an advance over Newell's and Simon's physical symbol system hypothesis?
I am interested in a more macrolevel bandwidth, such as decision making in such areas as caretaker-child interaction during the acquisition of cognitive, cultural, and linguistic knowledge and practice, medical diagnostic reasoning, and the role of expert systems that employ rules within a knowledge base but where the time frame can be highly variable. The problem of modeling an individual mind is so complicated right now that most modelers are not prepared to do more than that. But Hutchins and others are concerned with applying connectionist models to the cognitive interaction of two or more individuals. Many people make claims about Artificial Intelligence.
The key is that Artificial Intelligence modeling can help clarify basic mechanisms involved in cognition. The value of simulation, in my opinion, depends on the way it makes use of empirical research in a given domain. I think it is difficult to apply a Turing Test to sociology because the field is too diffuse and lacks adequate articulation between general theory and the concepts that research analysts explore empirically.
It is also difficult to apply a Turing Test because the notion of environment remains ambiguous and unexplored. The notion of commonsense suggests a contrasting term like logical thinking or rational choice. My colleague in this research is a clinician, and we look at the question of how you actually get the knowledge from the patient that is required to use an expert system.
He addresses the cognitive processes of users of artifacts that are not addressed adequately by their producers. The notion of knowledge can be examined as a process, a set of practices and mental models about the processes and what they are to accomplish. Let me assume that the physician is a knowledge engineer. The general point is that knowledge should be conceived as a process both internally and when instantiated in experimental or practical settings.
Is it right that you claim not only that cognition is underdeveloped in social theory but also that social scientists do not have experiments or ideas for experiments?
To my knowledge, none of them have examined everyday social interaction experimentally. The experimental work done by sociologists has been confined to what is called «small group research» by a subset of social psychologists at a few universities. The study of human affect, language, learning, and decision making has progressed primarily by its reliance on laboratory research and the use of sophisticated modeling programs. The laboratory and modeling research, however, seldom includes information about the behavioral ecology of humans in natural settings or the social ecology of the laboratory.
My recent concern with the acquisition of emotions and social interaction uses experimental data obtained by Judith Reilly, Doris Trauner, and Joan Stiles on children with and without focal brain lesions. One focus of this research is the notion that processing and expressing emotions requires an intact right hemisphere and that this hemispheric specialization may occur quite early in human development. In general terms, we seek a developmental explanation of human affect and facial expression and suggest that this development is a function of the role of neural substrates and the learning of pragmatic elements of verbal, paralinguistic, and nonverbal behavior. Yet, some caretakers seem to become sufficiently depressed in socializing such infants that this can eliminate the possibility of enhancing the infant's or child's ability to adapt to typical behavioral routines that occur in local social encounters.
My goal is to examine the extent to which experimental work on cognition presumes stable forms of social interaction the infant and child must acquire to be seen as «normal.» This social and communicative competence is thus necessary to be able to conduct experiments about the infant's or child's cognitive development.
It seems to me that any problems that emerge can be useful if they force us to question the way we use existing teaching systems, medical systems, legal systems, and so on.
Let me give you a sense of my own views about how current Artificial
My concern with medical diagnostic reasoning revolves around problems of actually implementing existing systems. I think traditional production systems and the use of neural network modeling capture the essential formal criteria used to arrive at an adequate diagnosis.
DANIEL C. DENNETT
In Defense of AI
Before there was a field called cognitive science, I was involved in it when
At that time, I knew no science at all. I was completely frustrated by the work that was being done by philosophers, because they did not know anything about the brain, and they did not seem to be interested. Just about that time, I learned about Artificial Intelligence. When I got my first job at the University of California at Irvine in 1965, there was a small Artificial Intelligence group there.
The paper he threw on my desk was Bert Dreyfus's first attack on Artificial Intelligence. Since it fit with my interests anyway, I wrote an article responding simultaneously to Dreyfus and to Allen Newell's view at that time.
From 1965 to 1971, he taught at the University of California at Irvine, and since 1985 he has been Distinguished Professor of Arts and Sciences and Director of the Center for Cognitive Studies at Tufts University. I was very interested in the debate between Dreyfus and Newell, and I was sure that Dreyfus was wrong. The people in Artificial Intelligence were glad to have a philosopher on their side.
From where did you get your knowledge about Artificial Intelligence and computer science?
I spent a year in Palo Alto working on philosophy and Artificial Intelligence. I still did not come out of that year very computer-literate, but I made some considerable progress. In the years following, I developed that much further, even to the point where, a few years ago, I taught a computer science course here.
It would be hard to find four more philosophical people in Artificial Intelligence than McCarthy, Pylyshyn, Hayes, and Moore. And it would be impossible to find two more Artificial Intelligence-minded philosophers than Haugeland and me.
Like most philosophers, Dreyfus tries to find something radical and absolute rather than a more moderate statement to defend. Some people in Artificial Intelligence, particularly people like John McCarthy, have preposterously overestimated what could be done with proof procedures, with explicit theorem-proving style inference. If that is what Dreyfus is saying, he is right. But then, a lot of people in Artificial Intelligence are saying that, too, and have been saying it for years and years.
Marvin Minsky, one of the founders of Artificial Intelligence, has always been a stern critic of that hyperrationalism of, say, McCarthy. Dreyfus does not really have a new argument if he is saying something moderate about rationalism. If, on the other hand, Dreyfus is making a claim that would have as its conclusion that all Artificial Intelligence, including connectionism and production systems, is bankrupt, he is wrong. There are things you cannot represent, like bodily movements, pattern recognition, and so on.
I understand that he says the human mind is not merely a problem solver, as the physical symbol systems hypothesis would imply.
So you think that one can combine the physical symbol systems hypothesis with the intuition that you cannot explain everything with a symbol?
What about connectionism? It criticizes the physical symbol systems hypothesis and seeks alternative explanations for the human mind. If you look at the nodes in a connectionist network, some of them appear to be symbols. Some of them seem to have careers as particular symbols, a cat node, say. But it turns out that you can disable that node, and the system can go right on thinking about cats.
Moreover, if you keep the cat and dog nodes going and disable some of the other nodes that seem to be just noisy, the system will not work. The competence of the whole system depends on the cooperation of all its elements, some of which are very much like symbols. At the same time, one can recognize that some of the things that happen to those symbols cannot be correctly and adequately described or predicted at the symbol level.
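As a rough illustration of this point, my own and not Dennett's: if the content "cat" is carried by a pattern of activation spread over many units, deleting any single unit barely changes the pattern, while wiping out most of the units destroys it. The unit count and the similarity measure below are arbitrary choices made only for the sketch.

# A toy illustration of distributed representation: "cat" is a pattern of
# activation over many units, so lesioning one unit leaves the pattern
# essentially intact, while lesioning most of the units destroys it.

import random

random.seed(0)
N_UNITS = 100

# A distributed representation: "cat" as a pattern over 100 units.
cat_pattern = [random.uniform(0.0, 1.0) for _ in range(N_UNITS)]

def similarity(a, b):
    """Normalized dot product between two activation patterns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

# Disable ("lesion") a single unit.
lesioned = list(cat_pattern)
lesioned[17] = 0.0
print(similarity(cat_pattern, lesioned))  # stays close to 1.0: the pattern survives

# Disable most of the units and the representation collapses.
mostly_lesioned = [0.0] * 90 + cat_pattern[90:]
print(similarity(cat_pattern, mostly_lesioned))  # drops sharply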
There is a big difference in theoretical outlook, but it is important to recognize that both outlooks count as Artificial Intelligence. They count as strong Artificial Intelligence. In John Searle's terms, connectionism is just as much an instance of strong Artificial Intelligence as McCarthy's or Schank's7 or Newell's work in Artificial Intelligence.
Both Searle and Dreyfus generally think that it could be a more promising direction. Let's follow Searle's criticism of strong Artificial Intelligence. He has refuted strong Artificial Intelligence with his Chinese Room Argument.
What do you think about that?
First of all, the Chinese Room is not an argument, it is a thought experiment. When I first pointed this out, Searle agreed that it is not an argument but a parable, a story. Searle's most recent reaction to all that is to make explicit what the argument is supposed to be. I have written a piece called «Fast Thinking,» which is in my book The Intentional Stance.
It examines that argument and shows that it is fallacious. Briefly, Searle claims that a program is «just syntax,» and because you can't derive semantics from mere syntax, strong Artificial Intelligence is impossible. But claiming that a program is just syntax is equivocal. One of the things that has fascinated me about the debate is that people are much happier with Searle's conclusion than with his path to the conclusion.
They did not care about the details of the argument, they just loved the conclusion. Finally I realized that the way to respond to Searle that met people's beliefs effectively was to look at the conclusion and ask what it actually is, to make sure that we understand that wonderful conclusion, Searle's conclusion.8
I think Searle would classify that reply as the «systems reply,» because you argue that nobody claims that the microprocessor is intelligent but that the whole system, the room, behaves in an intelligent way. Searle modified the argument in response to this. He says that the systems reply does not work because it is only one step further.
I had already been through that with Searle many times. Searle does not address the systems reply and the robot reply correctly. He suggests that if he incorporates the whole system in himself, the argument still goes through. But he never actually demonstrates that by retelling the story in any detail to see if the argument goes through.
I suggest in that piece that if you try it, your intuitions at least waver about whether or not there is anything there that understands Chinese. We have got Searle, a Chinese-speaking robot, who has incorporated the whole system in himself. If we tell the story in a way in which he is so engrossed in manipulating his symbols that he is completely oblivious to the external world, then of course I would certainly be astonished to discover that my friend John Searle has disappeared and has been replaced by a Chinese-speaking, Chinese-understanding person. If Searle had actually gone to the trouble of looking at that version of his thought experiment, it would not have been as obvious to people anyhow.
Sure, you could ask the robot questions, observe its behavior, and ascribe intentionality to it, as you wrote in your paper. But to ascribe intentionality does not mean that it really has intentionality. There is not any original, intrinsic intentionality. The intentionality that gets ascribed to complex intentional systems is all there is.
When computers came along, we began to realize that there could be systems that did not have an infinite regress of homunculi but a finite regress of homunculi. The homunculi get stupider and stupider, so finally you can discharge the homunculus: you can break it down into parts that are like homunculi in some regards but are replaceable by machines. Such systems can get by with components that are only a little bit like homunculi. If we try to do it with rules forever, we have an infinite regress of rules.
This is not an infinite regress.
HUBERT L. DREYFUS
Cognitivism Abandoned
Which brings me, by the way, to cognitivism, because representationalism is not just a thesis of Artificial Intelligence, it is a thesis of cognitivism as I understand it. It seems to me that Artificial Intelligence and then cognitivism as a generalization of Artificial Intelligence is the heir to traditional, intellectualist, rationalist, idealist philosophy. Wittgenstein's Philosophical Investigations were first published in 1953,4 shortly before Newell and Simon started to believe in symbolic representations.
Why do you think they took this approach in cognitive science and Artificial Intelligence?
So that, by the time Artificial Intelligence and cognitivists come along, it doesn't even seem to be an assumption anymore. It just seems to be a fact that the mind has in it representations of the world and manipulates those representations according to rules. Their idea was that you could treat a computer as «a physical symbol system,» that you could use the bits in a computer to represent features in the world, organize these features so as to represent situations in the world, and then manipulate all that according to strict rules so as to solve problems, play games, and so forth. The computer made it possible to make the intellectualist tradition into a research program just as connectionism has now made it possible to make the empiricist, associationist tradition in philosophy, which goes from Hume to the behaviorists, into a research program.
You mentioned several times the notions of cognitivism and cognitive science.
I want to distinguish cognitive science from cognitivism. All those disciplines are naturally converging on this question, and what they do is cognitive science. But cognitivism is a special view. As behaviorism is the view that all you can study is behavior and maybe all that there is is behavior, so cognitivism is the special view that all mental phenomena are fundamentally cognitive, that is, involve thinking.
Cognitivism is the peculiar view that even perception is really thinking, is really problem solving, is really manipulation of representations by rules. I think that there isn't any philosophical argument or empirical evidence that supports it. But that would simply leave cognitive science channeled in different directions.
What role does Artificial Intelligence play in these notions?
Artificial Intelligence is a good example of cognitivism, and it was the first version of cognitivism. It is an example of trying to make a model of the mind, of making something that does what the mind sometimes does, using rules and representations and cognitive operations such as matching, serially searching, and so on. So here is Artificial Intelligence, trying to do what the mind does, using computers in a special way, namely as physical symbol systems performing cognitive operations. So cognitivism is a development out of a certain subpart of Artificial Intelligence.
Do you think that this problem with commonsense knowledge has already been realized in the Artificial Intelligence community?
It totally evades the commonsense knowledge problem and does not gain any followers outside the Carnegie group. Then there is Lenat and his wonderful but, I think, hopelessly misguided attempt to program the whole of human knowledge into the computer. Then there are the people at Stanford who had «commonsense summer.» They were trying to formalize commonsense physics.
People in Artificial Intelligence have said that there has been a failure in symbolic Artificial Intelligence. But there is nothing like a unified research program.
Either you have to go back to microworlds and take a domain like the blocks world or the domains of expert systems, which are so limited that they are cut off from commonsense knowledge and have only a very special domain of knowledge, or else you have to put in a vast amount of commonsense knowledge. For the time being, the problem is how you could get this vast amount of knowledge into that superbig memory. I would doubt that you could, because the most important part of commonsense knowledge is not «knowledge» at all but a skill for coping with things and people, which human beings get as they grow up.
Since that point, cognitivism and Artificial Intelligence look new, because now there are two competing research programs within them. That becomes another research strategy in Artificial Intelligence. That means that cognitive science now has two tracks. It has cognitivism, which is the old, failed attempt to use rules and representations, which takes up the rationalist-intellectualist tradition.
This looked like a discredited view while rationalists were running the show, but now it looks to many people like a very promising approach to intelligence. Already in 1957 Rosenblatt was working on neural nets. So «neural net Artificial Intelligence» might and, I think, will succeed in getting computers to do something intelligent like voice recognition and handwriting recognition, but no one would think of that as a contribution to cognitive science, because it will clearly not be the way people do it. Then there will be another part of neural network simulation that is a contribution to cognitive science and neuroscience, because it will have different algorithms, the sort of algorithms that could be realized in the brain.
Only now, when symbolic information processing has been given a chance and shows that it is not the way to go, when we have computers big enough to try to do neural nets again, will all sorts of associationist models come back into cognitive science.
I think Artificial Intelligence is a great failure, but I think also there are a few useful applications. When you ask an expert for his rules, he regresses to what he remembers from when he was not an expert and still had to use rules. That is why expert systems are never as good as intuitive experts. That leaves only two areas where expert systems will be interesting, and these are relatively restricted areas.
Take the PUFF system: it turns out you can reduce the whole thing to an algorithm, and you do not need any rules from experts.
It turns out that the people doing the job of customizing mainframe computers had to make a lot of calculations about the capacities of components on the basis of the latest manuals, and so, naturally, machines can do that better. There were no intuitive experts who had a holistic view of configuration. Another example is the system that the Defense Department uses, called ALLPS, which was developed by SRI and was not meant to be an expert system. It was not developed by Artificial Intelligence people.
It is an expert system, however, in that it uses rules obtained from people who load transport planes. Two experts had to calculate for many hours to do it. Now, one program can do it in a few minutes. That's a real success, and it is real Artificial Intelligence, a real expert system.
Feigenbaum once said that expert systems would be the second and the real computer revolution. Feigenbaum has a complete misunderstanding of expertise, again a rationalist misunderstanding, as if experts used rules to operate on a vast body of facts that they acquire. On that view, expert systems would eventually take over expertise and wisdom, vastly changing our culture. Lots of people said, «Well, it will fail to produce intelligent systems in ten years, but it will produce lots of other interesting things.» I haven't heard any interesting things that have come out of the Fifth Generation, and it's been around for five or six years, I think.
JERRY A. FODOR
The Folly of Simulation
I talked a lot with graduate students about psychology, and so I got involved in doing some experiments. For fairly fortuitous reasons, I picked up some information about philosophy, linguistics, and psychology. It turned out that my own career had elements of the various fields, which converged to become cognitive science, except computer science, about which I know very little.
In your opinion, which sciences are part of the enterprise of cognitive science?
If you are a cognitive psychologist and know a little bit about philosophy of mind, linguistics, and computer theory, that makes you a cognitive scientist. If you are a cognitive psychologist but know nothing about these fields, you are a cognitive psychologist but not a cognitive scientist.
So cognitive science is dominated by cognitive psychology?
Traditionally, cognitive psychologists have had the view of the world that experimental psychologists are inclined to. Cognitive science is cognitive psychology done the way that cognitive psychology naturally would have been done except for the history of academic psychology in America, with its dominant behaviorist and empiricist background.
The relation of Artificial Intelligence to cognitive science is actually fairly marginal, except insofar as what you learn from trying to build intelligent mechanisms, to program systems to do things, may bear on the analysis of human intelligence. You have to distinguish between that very profound idea, the central idea of modern psychology, which is endorsed by most cognitive science people and by most Artificial Intelligence people, and any particular purposes to which that idea might be put. Artificial Intelligence attempts to exploit that picture of what mental processes are like for the purpose of building intelligent artifacts. It shares the theory with cognitive science viewed as an attempt to understand human thought processes, but they are really different undertakings.
The project of simulating intelligence as such seems to me to be of very little scientific interest.
You don't do physics by trying to build a simulation of the universe. It may be in principle possible, but in practice it's out of the question to actually try to reconstruct those interactions. What you do when you do science is not to simulate observational variance but to build experimental environments in which you can strip off as many of the interacting factors as possible and study them one by one. Simulation is not a goal of physics.
What you want to know is what the variables are that interact in, say, the production of intelligent behavior, and what the law of their interaction is. Once you know that, the question of actually simulating a piece of behavior passing the Turing Test is of marginal interest, perhaps of no interest at all if the interactions are complicated enough. The old story that you don't try to simulate falling leaves in physics because you just could not know enough, and anyway the project would not be of any interest, applies similarly to psychology. The latter is the strategy of cognitive psychology.
You mentioned the Turing Test. If you don't think that simulation is the goal of cognitive science, then you are not likely to be impressed by the Turing Test.
I don't think there is any interesting answer to the question of what we should do in cognitive science that would be different from the answer to the question of what we should do in geology. The basic idea is that mental processes are transformations of mental representations.
Do you think the question of intentionality does not matter at all?
The Turing idea does not answer that question and is not supposed to. That does not answer the question of what gives our thoughts truth conditions, and that is the problem of intentionality, basically. Turing's answer seems to me to be the right one, namely that the states that have truth conditions must have syntactical properties, and the preservation of their semantic properties is determined by the character of the operations on the syntactical properties. Turing talks about intelligence, not about intentionality.
Artificial Intelligence is about intelligence, not about intentionality. I think people in Artificial Intelligence are very confused about this. But Searle claims that you cannot say anything about intelligence without intentionality. The main idea is to exploit the fact that syntactical properties can preserve semantic properties.
That is the basic idea of proof theory. Searle is perfectly right when he says that the Turing enterprise does not tell you what it is for a state to have semantic properties. Whatever the right theory about intentionality is, it is not what cognitive science is going on about. I think you can have a theory of intentionality, but this is not what cognitive science is trying to give.
If we assume intentionality, the question becomes, Can we have a syntactical theory of intelligence?
In their criticism, they mainly referred to people like Minsky and
We are at the very margins of exploring a very complicated idea about how the mind might work. The fact that the Artificial Intelligence project has failed does not show anything about the syntactical account of cognitive processes. What Bert and John have to explain away are all the predictions, all the detailed experimental results that we have been able to account for in the last twenty years. Simulation failed, but simulation was never an appropriate goal.
People like Dennett or Hofstadter3 believe in simulation, and they have a very strong view of it. They say that a good simulation is like reality. When you deal with cases where the surface phenomena are plausibly viewed as interactions, you don't try to simulate them. As I said, Turing had the one idea in the field, namely that the way to account for the coherence of mental processes is proof-theoretic.
Connectionism consists of the thought that the next thing to do is to give up the one good idea in the field.
Why it has become so fashionable is an interesting question to which
One of the things connectionism really has shown is that a lot of people who were converted to the Turing model did not understand the arguments that converted them because the same arguments apply in the case of connectionism. It seemed to me that they have a series of powerful arguments, for instance, that the human brain, on the basic level, is a connectionist network. The question of what the causal structure of the brain is like at the neurological level might be settled in a connectionist direction and leave entirely open what the causal structure of the brain is like at the intentional level.
If the hardware that implements these discrete states is diffuse, then to that extent the device will exhibit graceful degradation. Nobody, in fact, has a machine, either the classical or the network kind, that degrades in anything like the way people do. In the sense in which graceful degradation is built into a network machine, it has to do with the way the network is implemented. It is not going to be built into a network machine that is implemented in a Turing machine, for example.
In the only interesting cases, learning is not statistical inference but a kind of theory formation. And that is as mysterious from the point of view of network theory as it is from the point of view of the classical theory. I think connectionist networks will just disappear the way expert systems have disappeared. As for the future of cognitive science, we are about two thousand years from having a serious theory of how the mind works.
Did you say that expert systems will disappear?
No one takes them seriously in psychology. People in computer science have a different view, but I said in psychology. In psychology, this program has never been very lively and is now conclusively dead.
My guess is that connectionist models will have the same fate. Connectionist models do have some success with pattern recognition. Up to now, we have been thinking about these devices as inference devices, devices that try to simulate the coherence of thought. What you have is not a new theory of learning but a standard statistical theory of learning, only it is embedded in an analog machine rather than done on a classical machine.
It will explain learning only insofar as learning is a process of statistical inference. The problem is that learning is not a process of statistical inference. Roughly speaking, you construct a theory of gases for gases that don't have all the properties that real gases have, and insofar as the model fails, you say it is because of interactions between the properties the theory talks about and the properties that the theory treats as noise. Now, that has always been the rational strategy in science, because the interesting question is not whether to do it but which idealization to work with.
I assume that cognitive science is going to work in exactly the same way. You want to do psychology for ideal conditions and then explain actual behavior as ideal conditions plus noise. In the case of perceptual psychology, the advance has been extraordinarily rich. In logic, we idealize again to the theory of an ideally rational agent and explain errors as interactions.
These idealizations have to be tested against actual practice, and so far
I don't know any reason at all to suppose that the idealizations that have been attempted in cognitive theory are not appropriate or not producing empirical success. I would have thought that the empirical results suggest exactly the opposite inference. Let me just say a word about background information. To begin with, I think that Bert is absolutely right that we have no general theory of relevance, that we don't know how problems are solved when their solution depends on the intelligent organism's ability to access relevant information from the vast amount we know about the world.
That is correct, and it is one of the problems that will eventually test the plausibility of the whole syntactical picture. On the other hand, it is an empirical question how much of that kind of capacity is actually brought to bear in any given cognitive task. Bert is inclined to take it for granted that perception is arbitrarily saturated by background information. That is a standard position in phenomenological psychology.
That means that you can construct computational models for those processes without encountering the general question of relevance, of how background information is exploited. If you look at the successes and failures of cognitive science over the last twenty or thirty years, they really have been in those areas where this isolation probably applies. Searle uses the example "Do doctors wear underwear?" He is claiming that the answer is not found by an inference process. To go from the fact that it is a serious question to the conclusion that there is no inferential answer to it seems a little quick.
If there really is an argument that shows that no inferential process will work, then the computational theory is wrong. I could tell you about experimental data for a week that look as if they required an inferential solution. If you think that the fact that it is stereotyped shows that it is noninferential, you have to explain away the experimental data. Even Bert thinks that the fact that after you have become skilled you have no phenomenological, introspective access to an inferential process establishes only a prima facie case that it is not inferential.
The trouble is, there are all these data that suggest that it is, that a process like speech recognition, which is phenomenologically instantaneous, is nevertheless spread out over very short periods of time, goes through stages, involves mental operations, and so on. There are a lot of data that suggest that introspection is not accurate, that people are wrong about whether or not these tasks are inferential. That does not show that perception or action are not inferential. What it shows is that some of the things that you know are not accessible to the computational processes that underlie perception and action. That shows that if the processes are inferential, they are also encapsulated. Take visual illusions.
There is a pretty good argument that inferential processes are involved in generating the illusions. But knowing about the illusions does not make them go away. What that shows is, among other things, that what is consciously available to you and what is inferential don't turn out to be the same thing.
JOHN HAUGELAND
Farewell to GOFAI?
What are the main ideas of cognitive science, in contradistinction to Artificial Intelligence?
I certainly think that Artificial Intelligence is a part of cognitive science. There is one way of thinking about cognitive science that takes the word cognitive narrowly and takes Artificial Intelligence as the essence of it. There are other conceptions of cognition in which ideas from Artificial Intelligence play a smaller role or none at all.
People talk about cognitive linguistics,1 although this is again a disputed characterization. We still don't know just how cognition is relevant to linguistics. One importance of both linguistics and psychology for Artificial Intelligence is setting the problems, determining what an Artificial Intelligence system must be able to achieve. Some people also include cognitive anthropology, but that is secondary.
And, of course, philosophy is often included. I believe that philosophy belongs in cognitive science only because the «cognitive sciences» have not got their act together yet. If and when the cognitive sciences succeed in scientifically understanding cognition and the mind, then philosophy's role will recede. Some people, especially in the computer science community, don't understand what role philosophy has to play in this enterprise.
Once you have that view, once you have what Kuhn calls a paradigm,2 then all that remains is normal science. I think that philosophy is still pertinent and relevant, however, because I don't think that the field has arrived at that point yet. Indeed, we are quite possibly in a revolutionary period in this decade. You coined the term Good Old-Fashioned Artificial Intelligence, or GOFAI, so you seem to think of connectionism as the new direction.
Which periods would you distinguish?
First, there is the period of pre-Artificial Intelligence, before it really got going, from the middle forties to the middle fifties, which was characterized by machine translation, cybernetics, and self-organizing systems. Starting in the middle to late fifties, the physical symbol systems hypothesis began to take over. Minsky's and Papert's book3 did have an important effect on network research, but I think the symbol processing approach took over because it was, on the face of it, more powerful and more flexible.
It is not a simple question, because there are different ways in which to understand what is meant by «for» cognitive science and Artificial Intelligence.
Simon tells me that it is not, that they are trying to incorporate it in their models. In Dreyfus's opinion, the commonsense problem is one proof that Artificial Intelligence with the physical symbol systems hypothesis will never succeed. Nobody is in a position to claim that anything has been proved, neither Dreyfus nor Simon.
But I do believe that, in the first place, the problem of building machines that have commonsense is nowhere near solved. And I believe that it is absolutely central, in order to have intelligence at all, for a system to have flexible, nonbrittle commonsense that can handle unexpected situations in a fluid, natural way, bringing to bear whatever it happens to know.
My conviction that the physical symbol systems hypothesis cannot work is mostly an extrapolation from the history of it. I don't think that you could produce an in-principle argument that it could not work, either like Searle or like Dreyfus. I am mostly sympathetic to Dreyfus's outlook. I think it is right to focus on background, skills and practice, intuitive knowledge, and so on as the area where the symbol processing hypothesis fails.
What about the problem of skills that you mentioned?
What you call cognitive or not depends on how broadly or how narrowly you use the term cognitive. This is the difference in the conception of cognition I mentioned at the beginning of the interview. I think a psychologist or an Artificial Intelligence professor is quite right to be exasperated by any suggestion that «it just does it» meaning «so don't try to figure it out.» That is crazy. However, none of that means that the way it does it will ultimately be intelligible as symbol processing.
Newell and Simon would accept and Rumelhart and McClelland would not, nor would I, nor would Dreyfus. Either affective phenomena, including moods, can be partitioned off from something that should be called more properly «cognitive,» so that separate theory approaches would be required for each and there would be no conflict between them. Or we cannot partition off moods and affect, in which case symbol processing psychology cannot cope with it.
Do you think that the physical symbol systems hypothesis could cope with the problem of qualia?
I don't know what to think about the problem of qualia. I am inclined to suspect that that problem can be partitioned off. Those are integral to intelligence and skills. Let's turn to connectionism.
Connectionism has become a power in the field since then. But the idea that intelligence might be implemented holistically in networks, in ways that do not involve the processing of symbols, has been around, in a less specific form, for a long time. There are three kinds of attraction for connectionism. The hardware response time of neural systems is on the order of a few milliseconds.
If you had all kinds of time, you could implement any kind of system in that hardware. So the one-hundred-steps constraint ties you more closely to the hardware. It is one attraction of connectionism that it is an approach fitting this hardware apparently much more naturally.
That would be the difference between parallel architecture and von Neumann architecture. But people like Newell and Simon say that they now use parallel processing. They claim that on a certain level, connectionism, too, has to introduce symbols and to explain the relation to the real world. At first I thought the main difference was between serial and parallel architecture.
Not a von Neumann machine, but a LISP machine. The deeper way to put the difference is that in a connectionist network, you have very simple processors that, indeed, work in parallel. The way in which those strengths vary on all those connections encodes the knowledge, the skill that the system has. So that, if one component in a symbol processing machine goes bad, everything crashes, whereas one wrong connection in a network can be compensated, which is similar to the brain.
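A minimal sketch of the mechanism being described, assuming nothing beyond the standard picture of a connectionist unit: each unit forms a weighted sum of its inputs and squashes it through a nonlinearity, the knowledge lives in the connection weights, and corrupting one connection shifts the output gradually rather than crashing the system. The particular numbers are illustrative.

# One connectionist unit: a weighted sum of inputs passed through a
# squashing function. The "knowledge" of the network is in the weights.

import math

def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs, then a nonlinearity."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return logistic(total)

# Three input units feeding one output unit.
inputs = [0.9, 0.1, 0.8]
weights = [1.5, -2.0, 1.2]           # the connection strengths
print(unit_output(inputs, weights))  # about 0.89

# Corrupt one connection: the output degrades gradually (about 0.76),
# unlike a symbol processor, where one bad instruction crashes everything.
weights[2] = 0.0
print(unit_output(inputs, weights))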
The architecture is not at all a hardware issue. The hardware itself will have an architecture, of course, but the notion of architecture is not tied to the hardware. The question of whether it is hardware or software is not a deep question. The difference between a von Neumann and a LISP machine is that a LISP machine is a function evaluator.
A von Neumann machine is a machine built around operations performed on specified locations in a fixed memory. The important thing that von Neumann added that makes the von Neumann machine distinctive is the ability to operate on values, especially addresses, in the memory where the program is located. Thus, the program memory and the data memory are uniform. It has nothing to do with the hardware, except that the one-hundred-steps constraint keeps it close to the brain.
In fact, almost all connectionist research is currently done on virtual machines, that is, machines that are software implemented in some other hardware. They do use parallel processors with connections when they are available. The trouble is that machines that are connectionist machines in the hardware are fairly recent and quite expensive. In a production system, even if it is parallel, you do have somebody in charge, at least at higher levels.
Just as you can get your PC to be a LISP machine, you can also get it to be a virtual connectionist machine.
The first attraction of connectionism you mentioned is that the hardware is closer to the human brain. If you have a noisy or defective signal, the system treats that signal as if it were a noisy or deformed variant of the perfect prototype. And this really is the basis of a great deal of what these systems do. You take the noisy signal to be a variant of a stored pattern, and this kind of pattern completion addresses a lack or apparent failure of classical Artificial Intelligence systems.
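A bare-bones sketch of prototype completion, again my illustration rather than Haugeland's: the noisy input is mapped to the stored pattern it most resembles. A real connectionist network does this through attractor dynamics over activations rather than an explicit distance computation, but the functional idea is the same; the two five-element «letter» patterns are made up for the example.

# Prototype completion: a noisy input is treated as a deformed variant of
# the stored pattern it most resembles. The patterns are purely illustrative.

PROTOTYPES = {
    "T": [1, 1, 1, 0, 0],
    "L": [1, 0, 0, 1, 1],
}

def hamming(a, b):
    """Number of positions where two patterns differ."""
    return sum(x != y for x, y in zip(a, b))

def complete(noisy):
    """Return the name of the prototype closest to the noisy input."""
    return min(PROTOTYPES, key=lambda name: hamming(noisy, PROTOTYPES[name]))

noisy_signal = [1, 1, 0, 0, 0]   # a degraded "T": one element flipped
print(complete(noisy_signal))    # -> "T"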
So this is the second point of attraction, of which hardware is one aspect. The third attraction is that classical Artificial Intelligence has more or less gone stale for a dozen years. It could be that when our children look back on this period fifty years from now, they will say that only the faint of heart were discouraged in this period of stagnation, which was a consolidation period with successes continuing afterward. Or it could be that this is the beginning of the end of classical Artificial Intelligence.
The symbol processing hypothesis cannot incorporate connectionism. So the «partially» would be «almost completely,» except maybe for some peculiarities like reflexes, and physical abilities like walking, and so on. There is symbol processing, and it has to be implemented in the brain.
GEORGE LAKOFF
Embodied Minds and Meanings
Probably I should start with my earliest work and explain how I got from there to cognitive science. As an undergraduate I worked with Jakobson on the relationship between language and literature. My undergraduate thesis at MIT was a literary criticism thesis, but it contained the first story grammar. That led to the study of story grammar in general.
Out of that came the idea of generative semantics. I asked the question of how one could begin with a story grammar and then perhaps generate the actual sentences that express the content of the story. In order to do this, one would have to characterize the output of the story grammar in some semantic terms. So I asked myself whether it would be possible for logical forms of a classical sort to be deep structures, actually underlying structures of sentences.
He received his Ph.D. in Linguistics from Indiana University in 1966 and taught at Harvard and the University of Michigan before his appointment as Professor of Linguistics at the University of California at Berkeley in 1972. Since 1975, after giving up on formal logic as an adequate way to represent conceptual systems, he has been one of the major developers of cognitive linguistics, which integrates discoveries about conceptual systems from the cognitive sciences into the theory of language.
«Be hot.» We eventually found evidence for things of this sort.
I had a commitment to the study of linguistic generalizations in all aspects of language. In semantics, I assumed that there are generalizations governing inferential patterns of the meaning of words, of semantic fields, and so on. So I took the study of generalizations as a primary commitment. I also assumed that grammar has to be cognitive and real.
Formal grammars, in the Chomskian sense, are systems in which arbitrary symbols are manipulated without regard to their interpretation. So I had a secondary commitment to the study of formal grammar. I also had a «Fregean commitment,» that is, that meaning is based on truth and reference. So there was the commitment to generalizations on the one hand and the Chomskian-Fregean commitment on the other.
The history of my work as a generative linguist has to do with the discovery of phenomena where you could not describe language in fully general, cognitively real terms using the Chomskian-Fregean commitments. Jim McCawley, in 1968, came up with an interesting sentence that could not be analyzed within classical logical forms. In 1968, I suggested moving away from the description of logical form to model theory.
You were still at MIT at that time?
This was about the same time that Richard Montague was suggesting that possible world semantics ought to be used for natural language semantics, as was Ed Keenan from the University of Pennsylvania. Partee sat in on Montague's lectures at UCLA, and the three of us, Partee, Keenan, and myself, were the first ones to suggest that possible world semantics might be useful for doing natural language semantics. One could not use possible world semantics so easily for natural language semantics. By 1974, it became clear that the Chomskian theory of grammar simply could not do the job.
Around 1975, a great many things came to my attention about semantics that made me give up on model theoretic semantics. One of these was Charles Fillmore's work on frame semantics,8 in which he showed that you could not get a truth-conditional account of meaning and still account correctly for the distribution of lexical items. Eleanor Rosch's work on prototype theory came to my attention in 1972, and in 1975 I got to know her and Brent Berlin's work on basic level categorization. I attended a lecture at which Rosch revealed fundamental facts about basic level categorization, namely that the psychologically basic categories are in the middle of the category hierarchy, that they depend on things like perception and motor movement and memory.
Rosch had shown that the human body was involved in determining the nature of categorization. This is very important, because within classical semantics of the sort taken for granted by people both in generative grammar and in Fregean semantics, meaning and logic are disembodied, independent of the peculiarities of the human mind. That suggested that neurophysiology entered into semantics. This was completely opposed to the objectivist tradition and Fregean semantics.
The thing that really moved me forever away from doing linguistics of that sort was the discovery of conceptual metaphor in 1978. Ortony's Metaphor and Thought11 showed basically the role of metaphor in everyday language. This meant that semantics cannot be truth conditional, it could not have to do with the relationship between words and the world, or symbols and the world. It had to do with understanding the world and experiences by human beings and with a kind of metaphorical projection from primary spatial and physical experience to more abstract experience.
Around the same time, Len Talmy and Ron Langacker began discovering that natural language semantics require mental imagery. They showed that semantic regularities required an account of image schemata or schematic mental imagery. Reading their work in the late seventies, I found out that it fit in very well with all the other things I had discovered myself or found out through other people. That entailed that you could not use the kind of mathematics that Chomsky had used in characterizing grammar in order to characterize semantics.
The reason was, as we had first shown in generative semantics, that semantics had an effect on grammar, and we tried to use combinatorial mathematics to characterize logical form. We thought that the use of formal grammars plus model theory would enable us to do syntax and semantics and the model theoretic interpretation. However, if meaning is embodied, and the mechanisms include not just arbitrary symbols that could be interpreted in terms of the world but things like basic level categories, mental images, image schemas, metaphors, and so on, then there simply would be no way to use this kind of mathematics to explain syntax and semantics. Our work in cognitive linguistics since the late seventies has been an attempt to work out the details of these discoveries, and it changed our idea not only of what semantics is but of what syntax is.
This view is attacked as relativistic by people holding an objective semantics position. If you take the idea of total relativism, which is the idea that a concept can be anything at all, that there are no constraints, that «anything goes» in the realm of conceptual structure, I think that is utterly wrong. Rather, there are intermediate positions, which say that meaning comes out of the nature of the body and the way we interact with the world as it really is, assuming that there is a reality in the world. We don't just assume that the world comes with objectively given categories.
We impose the categories through our interactions, and our conceptual system is not arbitrary at all. Those are very strong constraints, but they do not constrain things completely. They allow for the real cases of relativism that do exist, but they do not permit total relativism. So, in your view, there is a reality outside the human mind, but this reality is perceived by the human mind.
Therefore you cannot establish reality only by logical means. The conceptual system, the terms on which you understand the world, comes out of interaction with the world. That does not mean that there is a God's-eye view that describes the world in terms of objects, properties, and relations. Objects, properties, and relations are human concepts.
We impose them on the world through our interaction with whatever is real.
What happened when you got these ideas about embodiment?
We now have some very solid research on non-Western languages showing that conceptual systems are not universal. Even the concept of space can vary from language to language, although it does not vary without limits but in certain constrained ways. We have been looking at image schemas. There seems to be a fixed body of image schemas that turns up in language after language.
We are trying to figure out what they are and what their properties are. I noticed that they have topological properties and that each image schema carries its own logic as a result of its topological properties, so that one can reason in terms of image schemas. The spatial inference patterns that one finds in image schemas when they apply to space are carried over by metaphor to abstract inference patterns. There have been two generations of cognitive science.
They were internal representations of some external reality. That second generation of cognitive science is now being worked out. It fits very well with work done in connectionism. The physical symbol system hypothesis was basically an adaptation of traditional logic.
We are trying to put the results of cognitive linguistics together with connectionist modeling. One of the things I am now working on is an attempt to show that image schemas that essentially give rise to spatial reasoning and, by metaphor, to abstract reasoning, can be characterized in neural terms. I am trying to build neural models for image schemas. So far, we have been successful for a few basic schemas.
Traditional generative phonology assumed the symbol manipulation view.
All over the country?
Those are two different things. In terms of cognitive science, the applications have been much more modest, and only a small number of people are working in this field. So, although connectionism has emerged around the world as an engineering device, it has merely begun to be applied to cognitive science. Some people in connectionism claim that it could be used to model the lower level processing of the human mind, like vision, pattern recognition, or perception, but that for more abstract processes like language the symbol manipulation approach could still be useful.
The people who are claiming that are those who come out of Artificial Intelligence, raised in the symbol system tradition. They don't consider connectionism to be biological, they consider it to be another computer science view. What we are trying to do is show how meaning can be grounded in the sensory-motor system, in the body, in the way we interact with the world, in the way human beings actually function. People who try to adapt the physical symbol system view by adding a little bit of connectionism to it are not seriously engaged in that research.
One of the things shown in the PDP book18 is that radial categories arise naturally from connectionist networks. Technically this gives rise to radial categories. In the models we have, we begin to see that the brain is structured in such a way that modules of retinal maps could naturally give rise to cognitive topology. At least, now we can make a guess as to why we are finding that cognitive topology exists.
Moreover, one of the things that Rumelhart discovered is that analogical structuring is a natural component of neural networks. That could explain why there should be metaphor, why the abstract reasoning is a metaphorical version of spatial reasoning. Computer models are secondary, but they may allow us to begin to understand why language is the way it is.
In connectionism, we need to get good models of how image schemas and metaphors work, and good models of semantics, fitting together the kind of work that has come out of cognitive semantics. We also need to put together a much more serious theory of types of syntax. Langacker's work on cognitive grammar is one kind of a beginning, as is my work in grammatical construction theory and Fillmore's work in construction grammar.
Dangerous Things,20 we have both pointed out that the entire notion of epistemology has to be changed within philosophy in such a way that it completely changes most philosophical views. Not only the analysis of language but also the relationship between ontology and epistemology. These systems fit our experiences of the world in different ways, as there is more than one way to fit our experiences perfectly. One consequence is that conceptual systems turn out not to be self-consistent.
In general, human conceptual systems must be thought of as having inherently inconsistent aspects. In law, there is a commitment to objective categories, to objective reality, and to classical logic as the correct mode of human reasoning. In general, the political and social sciences are framed in terms of the old cognitive science, the old objectivist philosophy.
What do you mean exactly by «objectivist philosophy»?
It says that the world is made up of objects, these have objectively given properties, and they stand in objective relations to one another, independent of anybody's knowledge or conceptual system. Categories are collections of objects that share the same properties. Not only objects and properties but also categories are out there in the world, independent of the mind. That means that reason has to do with the structure of the world.
If all that is false, if reason is different, if categories are not based on common properties, then our view of reality has to change. Moreover, our view of thought and language has to change. The idea that our mental categories mirror the world, that they fit what is out there, turns out to be wrong. Correspondingly, the names of our mental categories are supposed to be names of categories in the world; our words can thereby fit the world, and sentences can be objectively true or false.
If meaning depends instead on understanding, which in turn is constructed by various cognitive means, then there is no direct way for language to fit the world objectively. It must go through human cognition. Now, human cognition may be similar enough around the world that in many cases no problems arise.
JAMES L. MCCLELLAND
Toward a Pragmatic Connectionism
When I finished graduate school and went to San Diego, which was in
This was where I became a cognitive scientist as opposed to a cognitive psychologist, I would say. I was no longer doing experiments simply to make inferences from them, but I was also trying to formulate really explicit computational theories. I had new conceptions, and I needed to formulate a different theoretical framework for them.
Typical members of a cognitive psychology department stay very close to the data. They might come up with a model explaining the data from this or that experiment, but mostly they stay close to the facts. Given where I come from, I don't want to lose track of the data. But the notion that one needs an explicit, computationally adequate framework for thinking about the problem at hand, and that one then tries to use this framework to understand the data, is really the essence.
Computational model means that you have to be able to articulate your conception of the nature of the process you are interested in in such a way as to specify a procedure for actually carrying out the process. Now, the word computational gets attached to process, as opposed to just process, because it is ordinarily understood, in cognitive circles, that these activities are carried out removed from the actual external objects that they are about. This is about the whole business of the physical symbol system hypothesis. You are doing some mental activity with a certain result not on physical objects, although there is a physical substrate for the process, but on things that represent those objects, in the same way that we use numbers to calculate bushels of wheat instead of piling up the wheat.
In a computational model, we try to figure out what representations we should have, based on what information, and what process applies to these representations. This is our conception of a computational model, but it also carries with it, for me, the notion of explicitness, of specifying something that allows you to understand in detail how the process actually goes, as opposed to articulating a rather vague, general, overarching conceptualization such as a Piagetian theory, where you have some very interesting but often exceedingly vague concepts that are difficult to bring into contact with facts. Your definition of a computational model and of the intermediate abstraction seems to contradict the new connectionism, which goes down to the neurophysiological level. One idea of the computational model is that you can abstract from the hardware, that only the intermediate level of symbol manipulation matters.
The result of this mental process can be new representations. At some level, this sounds like the physical symbol system hypothesis. Representationalism is essential in connectionism. So the main difference is not about the view that there are representations, that there is symbol manipulation going on.
A symbolic relation to objects in the world simply means that the representation is not the thing itself but something that functions as the object for computational purposes. It can be used as the basis for further processes, which might result in the formation of new representations, just as we have the number 2 to represent the two bushels of wheat from one field and the number 5 to represent the five bushels from another field. The things that come out of it are new representations. But what those representations are, whether they are symbols in the sense of Newell's notion of symbols or not, that is where there is some disagreement.
My feeling is that the attempt to understand thinking may progress with the conventional notion of symbol, but that it is limited in how far it can go and that it is only an approximation of the real right way to think about representations. If we are able to understand more clearly how we can keep what the people who think about symbols have been able to achieve, while at the same time incorporating new ideas about what mental representations might be like, we would be able to make more progress in understanding human thought. A symbol is generally considered to be a token of an abstract type that gains power from pointing to other knowledge. The symbol itself is not the information but only a pointer to it.
When you process the symbol, you have to search down through some complex tree to find that information. That limits the ability of a symbol-oriented system to exploit the contents of the objects of thought. If the objects of thought in the representations we have of them could give us somehow more information about those contents, this would increase the flexibility and the potential context sensitivity of the thought processes. We want to capture what we can with these representations, but we think that we need to go beyond them to understand how people can be insightful and creative, as opposed to mechanical symbol processors.
One thing I share with the people who formulated the physical symbol system hypothesis is a commitment to sitting down and seeing if something can actually work. I myself am not so expert in the kinds of arguments those people propose, but it is actually my suspicion that many of the things Dreyfus points to are things that result from the limitations of symbols as they are currently conceived of by people like Newell, as opposed to some more general notion of thought as manipulation of representations. Once we achieve a superior notion of what representations are like we will be in a much better position to have a theory that won't founder on that particular kind of criticism. One point of that criticism is the so-called commonsense problem.
Symbol manipulation cannot solve the problem of this huge amount of commonsense knowledge that people have as background. The basic notion of connectionism is that a representation is not a static object that needs to be inspected by an executive mechanism. If you have a bunch of processing units, and these units have connections to other processing units, and a representation is a pattern of activity over the processing units, then the relations of the current pattern of activity to patterns of activity that might be formed in other sets of units are in the process themselves, in the connections among the units. The notion that a mental representation is not just a location in the data structure but a pattern of activity that is interacting with the knowledge by virtue of the fact that the knowledge is in the connections between units gives us the hope that we will be able to overcome this particular limitation.
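To make this concrete, here is a minimal sketch, in Python with numpy, of the idea that knowledge lives in the connections rather than in a data structure an executive has to search; it is an illustrative toy, not one of McClelland's own models, and the patterns and network size are made up for brevity.

```python
# Minimal sketch (not an actual PDP model): a tiny associative network in which
# "knowledge" sits in the connection weights, and a representation is a pattern
# of activity that the connections themselves complete.
import numpy as np

patterns = np.array([            # two stored binary patterns (+1/-1 units)
    [ 1, -1,  1, -1,  1, -1],
    [ 1,  1, -1, -1,  1,  1],
])

# Hebbian learning: knowledge about the patterns is put into the connections.
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0.0)         # no self-connections

def settle(state, steps=10):
    """Repeatedly update every unit from its weighted input."""
    s = state.copy()
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1
    return s

probe = np.array([1, -1, 1, -1, -1, -1])   # degraded version of the first pattern
print(settle(probe))                        # completes the probe back to that pattern
```

No unit stores the pattern, and nothing inspects a stored copy; the relation between the current pattern of activity and the patterns the network "knows" is carried entirely by the weights.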
I would say that we do not yet know the limits of what can be achieved by the kinds of extensions to statistical methods that are embodied in connectionist models.
What
Second, connectionist models are now being applied extensively to the task of extracting structure from sequences. In connectionism, we have a similar kind of situation. We have a basic notion about a set of primitives, a notion of what the general class of primitives might be like. Then we formulate specific frameworks within this general «framework generator,» if you like, which can then be used for creating specific models.
At the most abstract level, the idea is that mental processes take place in a system consisting of a large number of simple computational units that are massively interconnected with each other. The knowledge that the system has is in the strength of the connections between the units. The activity of the system occurs through the propagation of signals among these units. An example of a specific framework is the Boltzmann machine, which is a very specific statement about the properties of the units, the properties of the connections, and maybe about the way knowledge is encoded into the connections.
The Boltzmann machine can then be instantiated in a computer program, which can be configured to apply to particular, specific computational problems. What we did in the handbook that Rumelhart and I published5 was collect together a few of the fairly classical kinds of frameworks for doing connectionist information processing. They are statements about the details of the computational mechanisms, which can then be instantiated in computer programs. It is not the computer programs that are essential but the statement of the principles.
We do feel that giving people some firsthand experience maybe gets something into their connections.
Some people build connectionist models of particular neural circuits: they try to get the model to exhibit behavior that captures known properties of that physical circuit, perhaps in order to study particular neurons in a particular area, with the model incorporating facts about the distribution of connections and so on. Other people are building connectionist networks because they want to solve a problem in Artificial Intelligence. They think that the connectionist framework will be a successful one for doing this. In that case, there is absolutely no claim made about the status of these things vis-à-vis human cognition.
There is often the implicit claim that humans are probably doing something like this if you believe that this is the way to build an artificial system, because humans are so much better than most existing artificial systems in things like speech recognition. Of course, when you adopt a particular framework, you make an implicit statement that it will be productive to pursue the search for an understanding within the context of this framework. Critics then point to the failure of one particular model and say, «Therefore we can reject the whole framework out of which it was built.» A similar thing happened to symbol processing.
I really don't subscribe to the notion that all you need to do is to take a back propagation network and throw some data at it and you will end up with a remarkable system. But we are at the point where several state-of-the-art methods for doing things like phoneme or printed character recognition are successfully drawing on connectionist insights.
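For readers who want to see what "throwing some data at a back-propagation network" amounts to in practice, here is an illustrative sketch in Python with numpy; the architecture, learning rate, and task (XOR) are assumptions chosen only for brevity, not anything taken from the interview.

```python
# A minimal back-propagation sketch: a small feed-forward network trained on XOR
# with plain gradient descent. Everything here is illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)     # input  -> hidden
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)     # hidden -> output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    h = sigmoid(X @ W1 + b1)                 # forward pass
    out = sigmoid(h @ W2 + b2)
    d_out = (out - y) * out * (1 - out)      # backward pass (squared error)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 1.0 * h.T @ d_out;  b2 -= 1.0 * d_out.sum(0)
    W1 -= 1.0 * X.T @ d_h;    b1 -= 1.0 * d_h.sum(0)

print(out.round(2))   # should end up close to [[0], [1], [1], [0]]
```

The point of the passage stands: the network learns the mapping, but nothing about this exercise by itself guarantees insight into how people do it.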
What is the reason for the rise of connectionism in the last few years?
The main ideas go back to the 1960s. He said something like, «Well, I got six hours of computer time between midnight and 6 a.m.» The equation might lead to a breakthrough in what kind of training can occur in connectionist networks. The fact that people have workstations on their desks has created the opportunity to explore these ideas and has led to key mathematical insights.
The computational resources contribute to the development of a critical mass. You have to have knowledge about psychology, mathematics, computer science, and philosophy as well. What we have to do is create environments that bring together just the right combinations of things to allow people to make progress. At another institution, they may have a much better chance to make a connection with linguistics, because there are linguists who have the right kinds of ideas about language and connectionists who have the right kinds of tools.
Rumelhart had incredible mathematical modeling skills, and I brought an awareness of the details of the data in a particular area. I learned how to model from working with him, how to get this whole thing to work, how to account for the data I knew something about. Elman said, «Hey, I can see how this relates to language,» and he and I started to work together on language.
ALLEN NEWELL
The Serial Imperative
Computer science as a discipline does not show up until the early or midsixties. Before that, computers were viewed as engineering devices, put together by engineers who made calculators, where the programming was done by mathematicians who wanted to put in mathematical algorithms. As the number of scientists was fairly small, lots of things that are regarded today almost as separate fields were thrown together.
How did this lead to Artificial Intelligence?
Back in the late forties and fifties, there were already speculations about computers, cybernetics, and feedback, about purpose in machines and about how computers might be related to how humans think.

Allen Newell, one of the founders of the fields of Artificial Intelligence and cognitive science, died on 19 July 1992 at the age of sixty-five. He worked for the Rand Corporation as a research scientist. His discussions with Herbert Simon on how human thinking could be modeled led Newell to Pittsburgh, where he collaborated with Simon and earned a Ph.D. Newell joined the CIT faculty in 1961 and was instrumental in building Carnegie-Mellon's School of Computer Science and elevating the school to world-class status. Professor of Computer Science at the time of his death, he wrote and co-authored more than 250 publications and ten books. He was awarded the National Medal of Science a month before his death.

…place, because the military is in fact the civilian population. So you get this immense turbulence, this mixing effect that changes science.
The shift in psychology that is now called the «cognitive revolution» really happens in the midfifties. First came the war, then came this turbulence, and then big things begin to happen.
Was it difficult for you to work together with people from other fields?
I went to the Rand Corporation in Santa Monica. It was only five or six years later that Herb convinced me to go back to graduate school. I considered myself a physicist for a long time, but at Rand I worked on experiments in organizational theory. It makes everyone appreciative of things going on in other areas.
A place like Rand is completely interdisciplinary. Operations research did not exist before the war. At the time I was there, in the fifties, the idea was born that you could look at entire social systems, build quantitative models of them, and make calculations upon which to base policy decisions. Rand is a nonprofit corporation that got its money from the Air Force.
In the early days, the only place where military people were allowed in was the director's office. It exists independently from the government, and its purpose is to provide competitive salaries for scientists, because the government does not pay enough money.
How did you get involved in computer science?
We were deeply immersed in using computers as a simulated environment. In the middle of this emerges the computer as a device that can engage in more complex processes than we had ever imagined. Herb was an economic consultant, but he became very interested in organizations. At this meeting in November 1954, Herb was present.
As soon as I hear us talk, I absolutely know that it is possible to make computers behave in an essentially intelligent way.
The so-called cognitive revolution came later?
The cognitive revolution starts in the second half of the fifties. It was a direct transfer from the analysis being done for radars into models for the humans. In the same year, we wrote about the Logic Theorist.
So you would see the cognitive revolution as a break with behaviorism?
Right, except that some of us, such as Herb and I, were never behaviorists. All this happened in the second half of the fifties. The business schools, which had been totally focused on business, underwent a revolution in which they hired social scientists and psychologists and economists. This revolution was, in large part, due to Herb.
When did you start with your work on the General Problem Solver?
At first, I think, you thought that one could isolate the problem-solving mechanism from knowledge about the world. We had this notion that one could build systems that were complex and could do complex intelligent tasks. The first system we built was called Logic Theorist and was for theorem proving in logic. The reason it was not in geometry was that the diagrams would have increased the momentary burden on the system.
Herb and I have always had this deep interest in human behavior. The issue of always asking how humans do things and trying to understand humans has never been separate. Although we wrote an article in which we talked about it as a theory of human behavior, the connections drawn to the brain are rather high-level nevertheless. So we used the protocols of how humans do tasks in logic.
We thought, Let's write a program that does the tasks in exactly the way humans do it according to the protocol. The paper in Feigenbaum's and Feldman's Computers and Thought is the first paper on GPS that has to do with human behavior. We had this evidence of human behavior, we could see the mechanism, and, given what we then knew about the Logic Theorist, we could build a new version. Later on, one would get concerned that the symbols were formalized things and not commonsense reasoning or language.
The criticism concerning commonsense came later?
One of the things that are true for every science is that discriminations that are very important later on simply have no place in the beginning. They cannot have a place, because you can ask these questions only when you know something about the problem. On the one hand, you write that problem solving is one of the major intellectual tasks of humans, and we think about intelligence as the ability to solve problems. On the other hand, we succeeded in building machines that could do pretty well in problem solving, but we did not succeed so well in pattern recognition, walking, vision, and so on.
Think of cognitive reasoning in terms of computational demand, and then of speech, which is one-dimensional with noise, and of vision, which is two-dimensional. There are still serious problems with respect to actually extracting the signals, but we do have speech recognition systems. We also have visual recognition systems, but there the computation is still just marginal. Today, people try to get around this bottleneck by parallel processing.
I wonder how much your physical symbol system hypothesis is bound up with serial computation. You once wrote that one of the problems with humans is that they have to do one step after the other, with the exception of the built-in parallelism of the eye. Connectionism is absolutely part of cognitive science. There is a need for seriality in human behavior in order to gain control.
If you worked totally in parallel, what you think in one part of your system could not stop what you are thinking in another part. There are functional reasons why a system has to be serial that are not related to limitations of computation. The physical symbol systems hypothesis, the nature of symbolic systems, is not related to parallelism. You can easily have all kinds of parallel symbol systems.
You can think of connectionist systems as not containing any notion of symbols. So you can, in one sense, say that connectionist systems are systems that are nonsymbolic and see how far you can go when you push them to do the same tasks that have been done with symbol manipulation without their becoming symbolic systems. There is certainly an opposition between the physical symbol system hypothesis and connectionism.
What the connectionists are doing is not just trying to deal with speech and hearing. The usual way of putting them together is to say, Clearly, a symbolic system must be realized in the brain. So it has to be realized by one of these parallel networks. Therefore, symbolic systems must be implemented in systems that are like parallel networks.
A lot of connectionists would not buy that, in part because they believe that connectionist systems do higher-level activities in ways that would not look at all like what we have seen in symbolic systems. Yet production systems are massively parallel systems, not serial systems. This distinction, which is also meant to be a distinction between serial and parallel, vanishes.
STEPHEN E. PALMER
Gestalt Psychology Redux
My training in cognitive science began as a graduate student in psychology at UC San Diego. With Donald Norman and David Rumelhart we began doing work in cognitive science before there was actually any field by that name, in that we, as psychologists, were constructing a largescale computer model of how people understand language and retrieve information from their memories.
We went up to
We were doing two pieces of cognitive science, in the sense that we were worried not only about human psychology but also about how to do computer simulation models. If it were done today, it would be recognized immediately as cognitive science, but at that time it was just some strange psychologists doing what computer scientists normally do.
As I said, I think that cognitive science probably got started at UC San Diego.
The connectionist days came later, actually after I was gone. Back then the major project that would be called cognitive science was the memory model project of Lindsay, Norman, and Rumelhart. By the time I came to Berkeley I had already started to become interested in perception more than in language. I had started out in the memory project, and I came to perception through a kind of back door, via language.
We had a nice theory of the structure of the meanings of many of these verbs, but at that time, nouns had no interesting cognitive structure in the theory. I decided we needed a better representation for nouns, and that was how I got interested in perception. Because once you start to worry about nouns and concrete objects, you start to think about the representation of their physical structure, what they look like, and how you identify them. These are the kinds of problems the Gestalt psychologists had been interested in.
I did not have any training in Gestalt psychology because most of the psychologists at UC San Diego were relatively young at the time and did not know the Gestalt literature. Dreyfus, for example, criticizes cognitive science from a Gestalt approach. Real human cognition is more fluid, more flexible, more context-sensitive than classical cognitive science has been able to capture in the kinds of models they use and the kind of paradigm they developed. I see it as a very central and interesting part of what is going on in cognitive science.
Is the difference between the East and West Coasts a coincidence?
The East is where things are very busy and uptight and crammed together. Simon's production systems. The family resemblance between what we were doing and the symbolic systems approach that characterizes the more formal East Coast approach to cognitive science was very strong. It was only later, when McClelland, Rumelhart, and Hinton started the connectionist work, that it got a particularly Gestalt flavor.
How would you outline your critique of the more formal approach and the advantage of a Gestalt approach?
I got interested in Gestalt theory because I was interested in phenomena that Gestalt psychologists had studied, especially contextual phenomena. The idea of holism is essentially that there is global interaction among all the parts and that the whole has characteristics that are quite different from the characteristics of the parts that make it up. You can think of Gestalt psychology as having discovered these phenomena and thus having identified a very important set of problems that need to be solved theoretically. The Gestalt psychologists managed to overthrow the prior prevailing view of perceptual theory, which was the structuralist theory promoted primarily by Wundt7 and his followers.
Eventually, they proposed their own way of theorizing about those things, but these theories were not well accepted. When Köhler8 proposed that electrical fields in the brain were responsible for the holistic interactions, it allowed people to do physiological experiments to try to determine whether these fields in fact existed and whether they had causal properties in perception. Because their physiological hypothesis was wrong, all Gestalt theoretical ideas were discounted. They thought that the brain was a physical Gestalt, that it somehow was a dynamical system that would relax into a state of minimum energy.
The electrical fields they proposed were one way in which this theory could be implemented in the brain, and it turned out to be wrong. Gestalt theory was more or less banished from psychology on the basis of those experiments. There are no electromagnetic fields in the brain that cause perceptions, but there might be neural networks that are working in the same way, which are responsible for the very same kinds of phenomena arising in visual perception. That does not say that connectionism is going to be the right theory, but it has given Gestalt theory a new lease on life.
In one of your articles,10 you define five underlying assumptions of information processing, one of which is the recursive decomposition assumption. The recursive decomposition assumption is the assumption that you can characterize the human mind in terms of a functional black box, in terms of a mapping between inputs and outputs, and that then you can recursively decompose that black box into a set of smaller and simpler black boxes that are connected to one another by a flow diagram of some sort. The mind is probably a nearly decomposable system in the sense that Simon defined, down to some level. The decomposition is smooth only for a few levels, and then you get these extremely strongly interacting sets of neuronlike elements that define that module, and it does not make sense to define intermediate levels or structures within that module.
This is the point where things get very Gestaltlike. It might be that you have a module for recognizing faces that works as a Gestalt, that is, a single black box in there that just takes its input, does something mysterious with it, and in this massively interactive and parallel way produces an output. But those modules are probably put together in ways that are understandable, so that you can in fact do a certain amount of decomposition. This is close to Smolensky's notion that you can build an information processing system out of connectionist modules that would look approximately like symbol manipulation if you look at it on a grand scale.
What do you think about the physical symbol systems hypothesis?
The fact is that we have never been able to get very good models of human perception, pattern recognition, or human learning. We have never been able to get contextual effects to arise naturally out of physical symbol system models.
Do you think an algorithm like back propagation is a satisfying model?
Unlike most people, I am less enthralled with the learning aspects of the connectionist paradigm than I am with the dynamic system aspects.
In one of your articles12 you deal with cognitive representation. One of the attractive things in the connectionist paradigm for me is the idea that information can be represented in a system in a very opaque way. The standard view of representation is always that you have something in the system that is a symbol, and it corresponds in a fairly clean way to something, a relation, a property, or an object in the world. The relationship between the element inside the system and the external elements is conceptually transparent.
This has had a very liberating effect on the way we think about what representations might be like, because there is not necessarily any transparent relationship between the elements in the systems and what is out there in the world. This suggests that some of the problems involved in commonsense knowledge, like ideas we have about causality or about the physical world in terms of «If you push against something, how hard is it going to push back?», may be encoded in these networks in ways that are conceptually opaque. The transparent, symbolic kind of representation may be fine for talking about how a physics student learns something about a domain of formal academic physics, but it probably does not capture anything like the kind of knowledge we have about the world we live in. Maybe the connectionist paradigm will provide an answer for how commonsense knowledge might be represented, but we don't know that yet.
The idea is that there may be some level down to which you can decompose, and what is below that is the background, which can only be captured by some very messy neural network or connectionist kind of scheme. But I still think that psychologists and cognitive scientists have to be concerned with it, because it ends up being the foundation on which so much higher-level cognition is built. There may be certain things you can do without it, but at some point you have to understand what it is. You write13 that the broadest basis of cognitive science is the information processing paradigm.
Searle says that even if you could describe a river in terms of an information processor, it does not lead to new psychological insights. We can describe what a thermostat does as information processing, but we would not call it cognition. The central issue is the nature of consciousness. What people are really talking about when they say that people and animals have cognitive systems but rivers do not is the issue of consciousness.
A good example would be the human immune system.
What the argument is really about is consciousness. The deep reasons have to do with the extent to which you want to make science a third-person as opposed to a first-person endeavor. What the information processing approach can do is divide a cognitive system into a part that can be accounted for by information processing and a part that cannot. The part that can be accounted for is all the stuff an automaton could do if it were doing the same information processing.
The relation between the first- and third-person view of science is that to say what these experiences are like is a first-person statement, and as long as the science we are doing is restricted to third-person statements, we cannot get into that. It is methodologically closed to the kind of science we do currently. We might come to some extended notion of science that would allow those things to be part of it. When I read your articles, I thought that you believed the information processing paradigm to be the most promising approach in cognitive science.
But now you say that there is no notion of consciousness in it. I do think that information processing is the most promising approach in cognitive science, because there is not any approach I know of that would give us an account of consciousness. I just do not see how information processing is going to give us a complete account of consciousness. I do not see how information processing can ever tell the difference between a person who has conscious experience and a complete computer simulation that may or may not.
It has no features for saying whether a person or a creature or a computer is conscious or not. Let us return to the enterprise of cognitive science. The problem with cognitive science as a discipline is that the methods of the component disciplines are so different. Linguistics does not enter into it, except maybe indirectly if you want to study the interface between language and perception.
Some people think linguistics has to be a major part of cognitive science. Historically, linguistics has been a very important part of cognitive science, largely because of the relationships between modern linguistics and computers through computer languages and automaton theories. This was one of the foundations of cognitive science. But it is not necessarily an important part of cognitive science for every cognitive scientist.
For example, I do not think it is important at all for me to know government and binding theory15 in order to do interesting work in perception. Actually, I think that vision is probably the best example in which cognitive science is really working, because we know more about the biology of the system than we do in the case of language. We can actually look at the relationship between biological information processing and the computational models that have been constructed to do vision. Within vision, there is this one very restricted domain, which I think is the very best example of cognitive science anywhere in the field, and that is color perception.
You can tell a very interesting and extended story about color vision that goes all the way from the biological aspects, like the three different cone types that we have in the retina, and how that representation is transformed into an opponent red/green, black/white, and blue/yellow representation and so forth, through the psychological structure of color space and how they relate to one another, all the way up into anthropology, where you can tell the Berlin and Kay story about basic color terms in different cultures16 and how those basic color categories are grounded biologically in terms of the underlying processes and structures of the visual system. It is a beautiful story that connects all the way from biology through psychology and up to anthropology. It is the best example we have of how the different disciplines of cognitive science can all contribute to the understanding of a single topic in a way in which each one illuminates the others. It is a real problem.
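The cone-to-opponent transformation in that story can be sketched as a simple linear recombination of the three cone responses. The weights below are schematic placeholders chosen for illustration, not physiological measurements, and the function name is an assumption of this sketch.

```python
# Schematic illustration only: recombining three cone responses (L, M, S) into
# the opponent channels described in the text. The numbers are made up for
# clarity, not physiological values.
import numpy as np

def opponent_channels(L, M, S):
    cones = np.array([L, M, S], dtype=float)
    T = np.array([
        [ 1.0,  1.0,  0.0],   # black/white (luminance):  L + M
        [ 1.0, -1.0,  0.0],   # red/green:                L - M
        [-0.5, -0.5,  1.0],   # blue/yellow:              S - (L + M)/2
    ])
    return T @ cones

print(opponent_channels(L=0.8, M=0.4, S=0.1))   # e.g. a reddish stimulus
```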
It is the reason why we at Berkeley have gone very slowly in the direction of a cognitive science program. They started to move into the first-order approximation to cognitive science at places like UC San Diego, where they are designing a graduate program in cognitive science from the ground up. By this I mean that the students coming out of this program will probably know how to write LISP code, how to do computer simulation, and how to do experiments and analyze data so that they can work as psychologists as well as computer scientists. And they probably will have some knowledge of other fields as well.
In psychology, the principal methods are empirical. The techniques a psychologist learns are very different from those a computer scientist learns. You can talk about Artificial Intelligence research as being empirical in the sense that they get an idea, they build a system to see whether it can do what they think it can do, and if it cannot, they revise it and go back to do something else. You can do psychological experiments without using a computer at all or even knowing what a computer is.
The knowledge you get in graduate training in philosophy is about logic and logical argument. This is training we do not get in psychology, for example.
Is this an advantage or a difficulty?
One way is to be master of one methodology, and to have a working knowledge of other methodologies so that you can understand what is going on and be informed by what is going on in those disciplines, at least enough to influence the work you do in your own methodology. I might try in my experiment to test some theory that was constructed by a computer scientist in vision. That certainly constitutes cognitive science research, in my view. In psychology there are no such experiments where the results send a whole field into a quandary over how to solve some very deep problem.
It is because the underlying mechanisms of cognition and the human mind are much more complicated. There is always more than one way to test any given idea, and when you test it in different ways, there is always the possibility that you get different answers, because there are some factors that you have not taken into account.
Methodologically, this is a behaviorist viewpoint, appropriately aligned with the information processing approach, in the sense that if you get a machine to behave like a person, doing the same kind of things in the same amount of time and with the same kinds of errors, then you can claim that you capture the critical information processing capabilities of the organism that you study. We can do the very same experiment on a computer simulation that we do on a person. If the answer comes out the same way, we make the inference that the computer may have the same information processing structure as the person. The way we make the decision that they have the same internal structures is by doing methodologically behavioral experiments, but it is entirely compatible with cognitive science in that sense.
On the basis of all these data we can construct a spatial representation of what your color experiences are like. Maybe there we will be able to say what conscious experiences are about. We may be able to say whether there is some biological property that corresponds to something being conscious or not. But I do not see how we can get beyond information processing constraints, without changing something about the way we do objective science.
I, for one, would be loath to give up this kind of science. We had problems with this in psychology long ago when introspectionist views were predominant. If one assumes that, then in order to be intelligent it is necessary to have consciousness, intentionality… There is no evidence that for something to be intelligent it is necessary for it to be conscious.
There are programs now that apparently can play chess at the level of masters or even grand masters. It is clear to me that, as far as behavior goes, this is intelligent. One of the most interesting things that we have found out so far in cognitive science with computer simulations is that the high-level processes seem to be a lot easier to do than the low level ones. We are much closer to understanding how people play chess than we are to understanding how people recognize objects from looking at them.
Suppose we have a machine that not only plays chess but can make breakfast. Let's further assume, with John Searle, that it is not conscious, although I do not think we have any reason to say either way. My view is that intelligent behavior should be defined without reference to consciousness. Some people may claim that it is an empirical fact that only conscious beings achieve those criteria, but I do not think so.
We already have convincing examples of intelligence in nonliving things, like the chess computers. The reason why we do not have general purpose intelligence is that so much of what is required for this is background knowledge, the low-level stuff that seems so difficult to capture.
Do you think that the connectionist approach will continue to fit into cognitive science?
The people who say that connectionism is not really cognitive science because «they are only building biological analogues» are the hard-line symbolic systems people. They seem to take a very narrow view of what cognition is, and if it turns out to be wrong, they will have to end up saying that there is not any cognition. There are phenomena of cognition, and there is, therefore, a field of cognitive science defined as the interdisciplinary study of these phenomena. It was for this very reason that I wrote the article about the information processing approach to cognition.
The symbolic systems hypothesis is a stronger view than the information processing approach. The physical symbol systems hypothesis is fairly well characterized now, but we are just beginning to understand what connectionism is about and what its relationship is to symbolic systems. The biggest breakthrough was the breakthrough that allowed cognitive science to happen.
Why did it happen so late?
They made some very well founded arguments against a certain class of perceptron-type theories, showing that these theories could not do certain things. They did not prove that those same things could not be done by a more complex class of perceptrons, which was the principal basis of the more recent advances. I doubt whether there will be another breakthrough soon, because you usually get breakthroughs under conditions where there is dissatisfaction with the way things are. Part of the underlying conditions for the connectionist movement to happen was that there was a bunch of people who were not terribly happy with certain shortcomings of the symbolic systems approach.
They found a way around some of these problems. The symbolic systems people are happy with the standard view and will continue to work within it. Most of the people who had problems with that view are either neutral or going to the connectionist camp. These people are busy just working now, because there is so much to be done within the connectionist framework.
It will keep them busy for a long time, and it will be a long time before they get to the point where they become dissatisfied in a fundamental way with the connectionist paradigm. And only under the conditions of deep-seated dissatisfaction is another revolutionary breakthrough likely. As far as the current situation goes, the next decade is going to be spent working on what the connectionist paradigm is and how it relates to the symbolic systems approach.
HILARY PUTNAM
Against the New Associationism
DAVID E. RUMELHART
From Searching to Seeing
JOHN R. SEARLE
The Hardware Really Matters
HERBERT A. SIMON
Technology Is Not the Problem
JOSEPH WEIZENBAUM
The Myth of the Last Metaphor
I was introduced to computers in the early fifties, at Wayne University in Detroit. A mathematics professor there decided to build a computer for the university. After a long while, I participated in the design of a computer system for the Bank of America. When the design of the bank system was over, the General Electric
I was very interested in the use of the computer in socially significant areas. I also got to know Kenneth Colby, a psychoanalyst in the Palo Alto area. I did develop that list processing system, and I got an honorary appointment at the Stanford Computation Center.
Joseph Weizenbaum was born in Berlin in 1923 and emigrated to the United States.
At Wayne State he was involved in designing and producing a computer before moving into industry in 1953. In 1963, he joined MIT as Professor of Computer Science, a position he held until his retirement in 1988.

Several people could use the machine at the same time, each sitting at the console as if they had the whole computer. That was the basic idea.
Under the influence of this conversational computing, I became interested in actual conversation, in natural language conversation with the computer. I began to work on that project, and it got me right into one of the key areas of Artificial Intelligence, which was then and is still the processing of natural language by the computer.
That was the time that you developed ELIZA?
I had developed a lot of machinery for looking at written text, and I was now ready to program some sort of conversational mechanism. It was clear to me that when two people talk to each other, they both share some common knowledge. I was confronted with the problem that today might be called knowledge representation, which was not what I was interested in at that moment. I tried to think of a kind of conversation where one of the partners does not necessarily have to know anything.
So I thought of cocktail party conversations, or the conversations people have with bartenders who simply say «Yes, yes» or «Oh, that's the saddest story I've ever heard» and things like that.
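A minimal sketch of that kind of knowledge-free conversation, written in modern Python rather than in Weizenbaum's original script language, might look like this: keywords are matched, pronouns reflected, and a canned template filled in. The rules and responses here are invented for illustration.

```python
# Illustrative sketch in the spirit of ELIZA's mechanism (not the original code):
# the "listener" needs no knowledge, only pattern matching and reflection.
import re, random

REFLECT = {"i": "you", "my": "your", "am": "are", "me": "you", "your": "my"}
RULES = [
    (r"i am (.*)",       ["Why do you say you are {0}?", "How long have you been {0}?"]),
    (r"i feel (.*)",     ["Tell me more about feeling {0}."]),
    (r"(.*) mother(.*)", ["Tell me more about your family."]),
    (r"(.*)",            ["Yes, yes.", "That is the saddest story I have ever heard.",
                          "Please go on."]),
]

def reflect(text):
    return " ".join(REFLECT.get(w, w) for w in text.lower().split())

def respond(utterance):
    for pattern, responses in RULES:
        m = re.match(pattern, utterance.lower().strip(".!? "))
        if m:
            groups = [reflect(g) for g in m.groups()]
            return random.choice(responses).format(*groups)

print(respond("I am unhappy about my job"))
```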
I look, with a critical eye, on the whole computer field and, in fact, on modern science and technology in general. One thing I have learned in my life, which is now getting longer, is that I, as well as many other people, am very much tempted by binary thinking. If, say, my wife is angry with me, it's the end of our marriage, the end of the world. I had lived as a Jewish boy in Hitler's Germany for three years, and I have remained, to this day, very interested in the history of fascism in Germany and throughout the world and in the political history of Europe.
I knew the role, I might say the miserable role, that German scientists, including physicians, played in German military history, also during the first World War. I saw myself in the same situation as these German scientists before and during World War II.
Why did you undertake to program ELIZA at all? What was the idea of simulating a conversation with a psychiatrist?
I have already explained that it was a way to avoid the difficult problem of knowledge representation. I had the idea, perhaps wrongly, that I understood their position better than kids who had been brought up here.
I did not think that I was making a direct contribution to social science or to psychology or psychiatry. Technically, ELIZA was an interpreter. ELIZA plays the psychiatrist, but there were others. Specifically, I remember one script in which one could define things that the system would then «remember,» for example, things like «The diameter of a circle is twice the radius of the circle.» This was typed in English, and the system stored it.
And «The area of the circle is π times the square of the radius of the circle». Then one could type in «A plate is a circle» and «The diameter of the plate is two inches» and ask later, «What is the area of the plate?» The thing would see that the plate is a circle, that it did not have a definition of the area in terms of the diameter, but that there was a relationship between the diameter and the radius. It would look that up and finally tell you what the area of the plate was.
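The chain of lookups described here can be re-created in a few lines. The sketch below is a modern paraphrase under assumed data structures, not Weizenbaum's original system; it only shows how the stored definitions combine to answer the question.

```python
# Re-creation in modern terms of the chained lookup: stored definitions are
# applied until the requested quantity can be computed.
import math

facts = {"plate is a": "circle", "diameter of the plate": 2.0}   # inches

def area_of(thing):
    if facts.get(f"{thing} is a") == "circle":
        diameter = facts[f"diameter of the {thing}"]
        radius = diameter / 2.0            # the diameter is twice the radius
        return math.pi * radius ** 2       # the area is pi times the radius squared
    raise ValueError("no applicable definition")

print(area_of("plate"))   # pi * 1**2  ->  about 3.14 square inches
```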
Do you take the Artificial Intelligence community seriously?
One very critical component of such programs is that they formulate a picture of what it means to be a human being. It describes, in my view, a very small part of what it means to be a human being and says that this is the whole. That idea made the Holocaust possible. It would have been impossible without such an idea.
We, people in general, Americans included, find it necessary to have an image of the enemy in an animallike form, like the «Japs» in World War II.
Not long ago, at a public meeting at Harvard University at which both
I consider myself a member of the human family, and I owe loyalty to this family that I don't share, to the same extent, with other families. Well, I have special values for the human family and, by extension, for life, carbon-based life. «What I am trying to say is that with these world-encompassing projects that strive for superhuman intelligence and for the ultimate elimination of the human race and so on emerges a subtly induced world picture, expressed, for example, in Dan Dennett's statement that we have to give up our awe of living things.» That enters our language and the public consciousness, and that makes possible a treatment of human beings vastly below the respect and dignity that they ought to have.
«Intelligence» community, as Searle would say.
I do think that the idea of computation has served psychology very, very well. Initially, in the late fifties, it was the idea of computation that almost dealt a death blow to behaviorism. Now comes the computer, and we can actually see internal logical operations take place and result in things that resemble thinking. Some people claim that it will be the last metaphor. As it happens, the latest technology has always claimed to be the last metaphor. So that, to take a current example, connectionism is a modification, an enlargement of the computational metaphor. And computing being such a powerful metaphor for thinking about thinking, this enlargement becomes immediately suggestive.
Now this is the last metaphor. Perhaps machines with biological components. The beginning could be largely silicon-based machines with a few biological components, then with more and more biological components. Finally somebody could get the idea that a man and a woman should come together and produce a thinking machine which is entirely biological, which is a baby.
In the debate between connectionism and the physical symbol systems hypothesis, the connectionists say that the underlying «hardware» is relevant, the biological structure. They claim that they are more concerned with the human brain. The machine need not consist of neurons, but the neurons have to be simulated. Finally, another stage is the neuronal machine, which is a kind of connection machine, but whose connectivity is to be imitative of how the neurons in the brain are connected to one another.
The second of these is still a Turing machine, in my view. What I am suggesting is that this machine is imitable by any other machine, so that the physical symbol system hypothesis is not falsified by the appearance of the connection machine. That, in turn, means that whatever limitation one sees in the physical symbol system hypothesis continues to exist for the connection machine. Let me now think of a neuronal connection machine, which, at the moment, is merely a fantasy.
I don't think that, in principle, it avoids the problems of this hypothesis and that it is a step ahead. One of the things that has entered our language as a result of computers is the term «real time.» If someone had said «real time» as recently as fifty years ago, nobody would have known what he could possibly mean. It would be very hard to build a simulated frog out of a machine that operates at a very low speed. From a philosophical, as opposed to a practical, point of view, real time and speed are not necessary.
That sort of consideration, among many others, enters into Dennett's argument against the Chinese Room.
If one considers quantum mechanics and relativity, for example, to be finally the real representation of the real world, the final metaphor, then it would not be useful or correct to damn all these physicists who have done physics «on the wrong basis» all these years before it came along.
Why Play the Philosophy Game?
TERRY A. WINOGRAD
Social Values
LOTFI A. ZADEH
The Albatross of Classical Logic
That was six years before the term Artificial Intelligence was coined. It shows how easy it was and still is to overestimate the ability of machines to simulate human reasoning.
When you talk about cognitive science you are not talking about something that represents a well-organized body of concepts and techniques. We are talking not about a discipline but about something in its embryonic form at this point, because human reasoning turns out to be much more complex than we thought in the past. And the more we learn about human reasoning, the more complex it appears. In my own experience, at some point I came to the conclusion that classical logic is not the right kind of logic for modeling human reasoning.
These are some of the principal components of what is called cognitive science today. The meaning of the term cognitive science is changing with time. In the near future, there might be some more programs and even some more departments. I do not anticipate that cognitive science will become numerically an important field.
It will be difficult for people with a degree in cognitive science to get a job. Whether they apply to psychology, linguistics, or computer science departments, the answer will be that they are at the periphery and not in the center of the field.
How did you experience the development of Artificial Intelligence? Are there stages in its development?
The term Artificial Intelligence did not yet exist at that time, but this discussion was already going on. It was at that conference that the field was launched and Artificial Intelligence came into existence. In the late fifties and early sixties, the concerns of Artificial Intelligence centered on game playing and, to a lesser extent, on pattern recognition, problem solving, and so on. I would say that in the late sixties and early seventies attention shifted to knowledge representation, meaning representation, and, in connection with that, natural language processing.
In the mid-seventies, some expert systems made their appearance.
So a definite shift took place, as problems like game playing and pattern recognition receded in importance. Artificial Intelligence has become a big field. Some of the big conferences have more than five or six thousand participants. It is a field that is undergoing rapid change, but right at this point there is a feeling that the expectations were exaggerated.
Many of the expectations did not materialize.
When did this feeling appear?
There were some articles critical of Artificial Intelligence that pointed to all sorts of things that were promised but not delivered. I see as a reason for the lack of accomplishments the almost total commitment of Artificial Intelligence to classical logic. Artificial Intelligence has become identified with symbol manipulation. I think that symbol manipulation is not sufficient when it comes to finding solutions for the tasks Artificial Intelligence has set for itself in the past, like natural language understanding, summarization of stories, speech recognition, and so forth.
Artificial Intelligence people are likely to disagree with me there, but I am convinced that in the end they will be proved to be wrong.
You said that three or four years ago there was some criticism of Artificial Intelligence. What about Dreyfus's criticism?
He was right as to the fact that some of the expectations that were built up by the «artificial intelligentsia» were exaggerated. Predictions of this kind soured many people on Artificial Intelligence. I part company with Dreyfus when it comes to the use of computers to solve many problems that can be solved effectively by humans.
No, not like human beings.
You will be able to call and ask, for example, «When is … Airlines flight 325 going to arrive at Los Angeles?» There will be no human operator on the other side, but a machine that understands and analyzes your question, looks through its database, and, with a synthesized voice, gives you the answer, something like «Flight 325 has been delayed.» Systems like that will be in wide use. You will be able to ask questions like «If I want to do this and that, how do I do it?» The VCR may be able to tell you certain things. I think voice input-voice output systems will have an enormous impact, because the most natural way of communicating with a machine may well be voice input-voice output.
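To show the shape of such a system, here is a minimal sketch, not from the interview, of the pipeline described here: a question is analyzed, a database is consulted, and an answer is produced. The flight numbers, the status table, and the function name answer_flight_query are hypothetical, and speech recognition and synthesis are reduced to plain strings.

```python
# Illustrative sketch of a voice query pipeline, with speech handling
# replaced by plain text. The data below is made up for this example.
flight_status = {"325": "delayed", "410": "on time"}

def answer_flight_query(question: str) -> str:
    # Very crude "understanding": pull a flight number out of the question.
    digits = "".join(ch for ch in question if ch.isdigit())
    status = flight_status.get(digits)
    if status is None:
        return "I have no information on that flight."
    # In a real system this string would be passed to a speech synthesizer.
    return f"Flight {digits} is {status}."

print(answer_flight_query("When is flight 325 going to arrive at Los Angeles?"))
```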
Jules Verne said, at the turn of the century, that scientific progress is driven by exaggerated expectations. There was a debate between Donald Michie and Sir John Lighthill about Artificial Intelligence and robotics. In any case, it was a mistake for the British government to stop its support of Artificial Intelligence.
Artificial Intelligence people tend to have a superiority complex with regard to cognitive science. The Artificial Intelligence community views the cognitive science community as comprising, for the most part, people who are not capable of doing serious work insofar as Artificial Intelligence is concerned. I do not think that in the cognitive science community there is a similar superiority complex in relation to Artificial Intelligence. Artificial Intelligence people think that they have nothing to learn from the others, so they don't go to their conferences.
I think the cognitive science community would probably like to have more interaction with the Artificial Intelligence community, but, as I said, this feeling is not mutual. Within the cognitive science community, there is more interest in neural networks, in connectionism at this point. The attitude toward neural networks and connectionism within the Artificial Intelligence community is somewhat mixed, because the Artificial Intelligence community is, as I said earlier, committed to symbol manipulation. Connectionism is not symbol manipulation.
That is the situation within the American Artificial Intelligence community. In Europe, things might be different. There, Artificial Intelligence people perhaps have a closer relationship with cognitive science, and that relationship is likely to grow. In the United States you have to sell research, and therefore you have to be a salesperson.
In Europe this is less pronounced because of the tradition of research funded by the government.
But there are some programs that claim to do summarizing?
These programs do not have the capability, at this point, to summarize nonstereotypical stories of a length of a thousand words or so, and they are not likely to have it in the future. We cannot implement the knowledge this requires on our machines, and at this point we do not have the slightest idea how it could be done.
What are the major problems that Artificial Intelligence has to overcome to make progress?
One of the most important problems in Artificial Intelligence is the problem of commonsense reasoning. Many people realize that a number of other problems, like speech recognition, meaning understanding, and so forth, depend on commonsense reasoning. In trying to come to terms with commonsense reasoning, Artificial Intelligence uses classical logic or variations of it, like circumscription, nonmonotonic reasoning, default reasoning, and so on. But these techniques make no provision for imprecision, for fuzzy probabilities, and for various other things.
I think it is impossible to come to grips with commonsense reasoning within the framework of traditional Artificial Intelligence. We need fuzzy logic for this; human reasoning does not work that way. As long as you merely manipulate symbols, you have only a limited capability of understanding what goes on in the world.
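As an illustration of the difference, here is a minimal sketch, not from the interview, contrasting a crisp, classical-logic predicate with a fuzzy one for a commonsense notion such as «tall». The 180 cm cutoff and the 165-190 cm membership breakpoints are hypothetical choices made only for this example.

```python
# Illustrative sketch: crisp versus fuzzy treatment of a commonsense predicate.

def crisp_tall(height_cm: float) -> bool:
    # Classical logic: "tall" is either true or false at a sharp cutoff.
    return height_cm >= 180

def fuzzy_tall(height_cm: float) -> float:
    # Fuzzy logic: "tall" holds to a degree between 0 and 1,
    # rising linearly between 165 cm and 190 cm (assumed breakpoints).
    if height_cm <= 165:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 165) / (190 - 165)

for h in (160, 175, 179, 181, 195):
    print(h, crisp_tall(h), round(fuzzy_tall(h), 2))
```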
Database systems can answer all kinds of questions, not because they understand but because they manipulate symbols, and for many questions that is sufficient. What Searle basically points to is the limitation associated with symbol-manipulating systems. But, as I said earlier, it is not necessary to go through everything Searle has done to explain that limitation. To people in Artificial Intelligence, these are not really deep issues, although the debates have been going on for maybe forty years.
There are certain issues, one of which concerns not Artificial Intelligence as such but the capability of computers to store large volumes of information, so that our lives can be monitored ever more closely. I have a collection of horrible stories about people who got stuck with errors made by computers.
What could be done about these future problems?
Even now, intelligence agencies monitor telephone conversations, and once the capability of processing huge amounts of information exists, they can monitor everybody's telephone conversations. I don't think this will happen, but the capability is there. And once the capability is there, there will be the temptation to misuse it.
What do you see as the future of Artificial Intelligence?
Artificial Intelligence may not maintain its unity. It is very possible that within the next decade we will see a fragmentation of Artificial Intelligence. The branch of knowledge-based systems started within Artificial Intelligence, but it is like a child that has grown up and is leaving its parents. There are many people working in these fields who are not members of the Artificial Intelligence community.
The two currently most important fields within Artificial Intelligence are going to leave Artificial Intelligence. Then there is the field of voice input-voice output, and that too is leaving Artificial Intelligence. If you look at the programs of Artificial Intelligence conferences and subtract these things, you are left with things like game playing, and some Artificial Intelligence-oriented languages. These things will not be nearly as important as Artificial Intelligence was when knowledge-based systems were a part of it.
Artificial Intelligence will be like parents left by their children, left to themselves. Artificial Intelligence, to me, is not going to remain a unified field.
What about connectionism?
I think there are exaggerated expectations at this point with regard to connectionism. Many people think that connectionism and neural networks will solve all kinds of problems that, in reality, will not be solved that way.