Are large language models slightly conscious?
Diving into the Twitter debate between Ilya Sutskever and Yann LeCun
Recently there was a debate on Twitter between leading AI scientists about the question of whether today’s large language models are already conscious. Ilya Sutskever tweeted that “it may be that today’s large neural networks are slightly conscious”. Yann LeCun replied to this tweet: “Nope. Not even for true for small values of ‘slightly conscious’ and large values of ‘large neural nets’. I think you would need a particular kind of macro-architecture that none of the current networks possess”.
This was a huge claim by Ilya and I wanted to know more. However, I was left disappointed, as the exchange did not lead to any further discussion or any elaboration of the ideas being talked about. These people are giants in the field of AI and surely know what they are talking about, but it would have been great if they had made their positions clearer and explained the reasons behind their claims. Then again, maybe I am too used to detailed discussions and debates about the smallest of ideas on philosophy Twitter. It is perhaps unreasonable to expect scientists who are busy competing to reach the state of the art to go into detail on such a contentious issue. Still, given that the topic has been explored in great detail for a century now, and since consciousness can be talked about in so many different ways, I would expect any thought leader in the field to back up claims about AI consciousness with at least some description of their definitions and metaphysical assumptions. In this blog post, I try to guess what they are talking about.
I am not saying that they should state whether they are physicalists, eliminativists or panpsychists. I think it is safe to assume that they are not dualists or idealists, and are probably physicalists. Ilya’s statement could be read as compatible with some forms of eliminativism and even some forms of panpsychism, but if one has to guess, both of them are physicalists.
However, it would have been insightful to know whether they are talking about phenomenal consciousness (P-C) or access consciousness (A-C). The distinction matters because it is not at all obvious that mental states like thoughts and language are accompanied by P-C; in fact, a lot of people do not believe they are. The answer would also have largely settled my curiosity about their metaphysical positions with respect to P-C. Since A-C is physical anyway and independent of P-C, questions such as whether P-C exists at all and whether it fits into our current understanding of physics become redundant for this discussion. (I am taking a leap here, and some would disagree that A-C is entirely physical. It is also not clear whether A-C can exist without P-C and vice versa. I am assuming both can exist, that they have different bases, and that A-C is definitely physical.)
Thinking back, I feel it is safe to assume that they are talking about A-C here. The P-C claim is the stronger one. It would require definite explanations from Ilya about the theory he is propounding, such as Global Workspace Theory or Integrated Information Theory, and about where a large language model sits within it. Currently there is no widely accepted theory of P-C, and its metaphysics is also highly debated. A-C, on the other hand, is a lesser claim. It is not insignificant in any way; it is still huge, just lesser than P-C. And any AI scientist would probably first have a crack at the easy problems of consciousness before claiming to have solved ‘the hard problem’.
As creatures with A-C, we have many abilities. When Ilya says large language models are slightly conscious, he is probably picking out a few aspects of A-C and saying that large language models possess them. I think he is talking about certain cognitive abilities: having access to your input signals for gaining knowledge of the world, knowledge of your own existence within this world, and knowledge of how different entities within this world relate to each other. Basically, ‘accessing’ what would otherwise also be available to P-C, and building a world model.
Further support for the guess that Ilya is referring to this kind of cognitive ability is OpenAI’s claim that GPT-3 is capable of some rudimentary reasoning. And as Ned Block pointed out in his seminal 1995 article, “The mark of A-C is availability for use in reasoning and rationally guiding speech and action.” Reasoning ability does not necessitate all the features of A-C, but I think the ability to build a world model is a prerequisite. Reasoning is basically the ability to derive valid statements about the world model given a representation of the relations between its entities. Any statement about an entity of a world model that is coherent with the known relations between the other entities is a reasonable statement about that world model. Is it not likely that a model which makes new and valid statements about the world, having read language that describes the world, has probably learned a world model? Or at least a model of the part of the world it has read about.
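To make this picture of “entities, relations, and statements coherent with them” concrete, here is a minimal, purely illustrative sketch in Python. The entities, relation names and the single inference rule are all invented for this example; the point is only that a new, valid statement can be derived from a stored world model.

```python
# A toy world model: entities and relations stored as triples, plus a rule
# for deriving new statements that are coherent with the stored relations.
# The entities, relation names and rule are all invented for illustration.

facts = {
    ("coin", "has_side", "head"),
    ("coin", "has_side", "tail"),
    ("head", "opposite_of", "tail"),
}

def derive(facts):
    """Derive new statements: 'opposite_of' is symmetric, so for every
    (a, opposite_of, b) we may also assert (b, opposite_of, a)."""
    derived = set()
    for (a, rel, b) in facts:
        if rel == "opposite_of":
            derived.add((b, "opposite_of", a))
    return derived

def answer_turn(facts, side_up):
    """If side_up has an opposite side, turning the object brings that side up."""
    for (a, rel, b) in facts | derive(facts):
        if rel == "opposite_of" and a == side_up:
            return b
    return None

print(answer_turn(facts, "head"))  # -> 'tail'
```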
Suppose I say to the model: “I tossed a thing called a ‘coin’ which has two sides called ‘head’ and ‘tail’. I got a ‘head’. If I turn it over, which side will I get?” If it replies ‘tail’, then it has probably learnt the concept of a ‘side’ of an object, that if one side is ‘up’ the other is ‘down’, and that when I turn over what is ‘down’ it becomes ‘up’ and vice versa. The concepts of ‘side’, ‘up’ and ‘down’, and their relations to each other, would have been learnt. Unless, of course, it has seen similar sentences before and is just spitting out words from memory based on a probabilistic model of which words go together in a given context. OpenAI claims that this is not the case because it removed such sentences from the training data before asking such questions. However, people have disputed this claim by questioning OpenAI’s deduplication methods (deduplication is what rules out the possibility of the model memorising similar sentences seen during training). Yannic Kilcher says that the memorisation ability and the next-word prediction ability have probably combined to give the false impression of reasoning ability.
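For concreteness, here is roughly how such a probe could be posed to GPT-3 through the completion-style OpenAI Python API of that era. The engine name, prompt wording and decoding settings are my own illustrative choices, so treat this as a sketch of the kind of query, not the experiment OpenAI or anyone else actually ran.

```python
import os
import openai  # the pre-chat, completion-style API available in the GPT-3 era

openai.api_key = os.environ["OPENAI_API_KEY"]

prompt = (
    "I tossed a thing called a 'coin' which has two sides called 'head' "
    "and 'tail'. I got a 'head'. If I turn it over, which side will I get?\n"
    "Answer:"
)

# temperature=0 makes the completion (nearly) deterministic, which is what
# we want when probing for a single short answer.
response = openai.Completion.create(
    engine="davinci",
    prompt=prompt,
    max_tokens=5,
    temperature=0,
)

print(response.choices[0].text.strip())  # hoping for something like 'tail'
```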
Over and above this dispute, people like Gary Marcus have gone on to show that GPT-3 is in any case very poor at some of the reasoning tasks that humans are good at: https://www.technologyreview.com/2020/08/22/1007539/gpt3-openai-language-generator-artificial-intelligence-ai-opinion/. Gary’s arguments can be countered as far as the ability to build a world model and to reason is concerned. It may be that GPT-3 has not learnt every aspect of the world in its world model, but in principle it can, and maybe some future large language model will be able to do all sorts of reasoning.
There are ways to refute Yannic’s hypothesis as well. Suppose we deliberately removed the words ‘coin’, ‘head’ and ‘tail’ from our question to GPT-3 and replaced them with, say, ‘X’, ‘A’ and ‘B’ respectively. If it correctly answers ‘B’, that would be very compelling. Even if people still say it has not learnt the concepts of a ‘side’, ‘up’ and ‘down’ of an object, it has surely learnt that the relationship of ‘A’ and ‘B’ to ‘side’, ‘up’ and ‘down’ is the same as that of ‘head’ and ‘tail’ in the case of an object with two sides like a coin. And this qualifies as learning some aspect of the world model. However, I don’t know whether it can actually do this, and hence whether Yannic’s hypothesis can be refuted.
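Here is a minimal sketch of what such a substitution probe could look like. The `ask` function is a hypothetical stand-in for whatever completion call (such as the one sketched above) would actually query the model, and the substitution table is just the one described in the paragraph.

```python
# Build the abstracted version of the coin question by swapping the familiar
# words for arbitrary symbols, so that memorised coin sentences cannot help.
SUBSTITUTIONS = {"coin": "X", "head": "A", "tail": "B"}

def abstract_prompt(prompt: str, substitutions: dict) -> str:
    for word, symbol in substitutions.items():
        prompt = prompt.replace(word, symbol)
    return prompt

def ask(prompt: str) -> str:
    """Hypothetical stand-in for a call to the language model."""
    raise NotImplementedError("wire this up to an actual completion API")

original = (
    "I tossed a thing called a 'coin' which has two sides called 'head' "
    "and 'tail'. I got a 'head'. If I turn it over, which side will I get?"
)
probe = abstract_prompt(original, SUBSTITUTIONS)
print(probe)  # the question with 'coin'/'head'/'tail' replaced by 'X'/'A'/'B'

# If the model answers 'B' here, memorised coin sentences cannot explain it:
# answer = ask(probe)
# print("passes probe:", answer.strip().strip("'.\"") == "B")
```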
But if GPT-3 has in fact achieved this, is that rudimentary A-C? I think it is a rudimentary world-model-building capability, which is probably one aspect of A-C. A more complete world model, with an understanding of concepts and time, would be a step further. However, Yann has questioned whether this is possible. He says it would need a particular macro-architecture. Why does he say this, and what does it mean? In principle, a world model can be learnt by reading language that describes the world. I think I am able to build a model of the Harry Potter world by reading about it; when something violates physics in Harry Potter, I am able to attribute it to magic. But the question here is whether a large language model with a certain kind of architecture (namely, stacks of attention-based decoder blocks) that learns to predict the next word in a sentence can also do this.
I don’t think I am predicting the next word when I am reading a novel, but I do get very good at predicting the plot of the novel (if not the next word) once I have started reading it and have begun to form a model of it. Maybe the objective function needs to change in these language models. I don’t think I reduce the error between the word I read and the one I was expecting, but maybe I do reduce the error between what I read and what I thought was plausible in the world model I built. Maybe another layer of neural networks is required, one that reduces the error between the predicted plot (the next few sentences) and the sentences actually read. Is that what Yann is calling a macro-architecture? Who knows. Maybe he is talking about something entirely different, like the multilevel cognitive architecture suggested in this paper, which does have something called macro-cognition, described as knowledge-based reasoning: https://www.reservoir.com/wp-content/uploads/2018/04/springer-multilevel-cognitive-architecture.pdf
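To make the contrast concrete, here is a small PyTorch sketch of the standard next-word objective these models are actually trained with, followed by a purely speculative ‘plot-level’ variant of the kind I am gesturing at. The plot-level loss (comparing a summary of the predicted next few sentences with a summary of the text actually read) is my own guess at what such an objective might look like, not something Yann or OpenAI has proposed.

```python
import torch
import torch.nn.functional as F

def next_word_loss(logits: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
    """Standard language-modelling objective: at every position, penalise the
    model for assigning low probability to the word that actually came next."""
    pred = logits[:, :-1, :]     # model outputs at positions 0..T-2
    target = token_ids[:, 1:]    # the tokens observed at positions 1..T-1
    return F.cross_entropy(pred.reshape(-1, pred.size(-1)), target.reshape(-1))

def plot_level_loss(pred_summary: torch.Tensor, read_summary: torch.Tensor) -> torch.Tensor:
    """Speculative alternative: instead of scoring individual next words, compare
    a vector summarising the *predicted* next few sentences with a vector
    summarising the sentences actually read (both assumed to come from some
    higher 'macro' layer). This is an illustration, not an established objective."""
    return 1.0 - F.cosine_similarity(pred_summary, read_summary, dim=-1).mean()

# Shapes only, to show how the two objectives would be wired in:
logits = torch.randn(2, 16, 1000)         # (batch, sequence length, vocabulary)
tokens = torch.randint(0, 1000, (2, 16))  # observed token ids
print(next_word_loss(logits, tokens))

pred_summary = torch.randn(2, 256)        # summary of the predicted continuation
read_summary = torch.randn(2, 256)        # summary of what was actually read
print(plot_level_loss(pred_summary, read_summary))
```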
However, all this still leaves out a lot of the features that we, as A-C creatures, have. We have strong intuitions about the self, time and physics. Can these be learnt by reading language? I don’t think so. They can probably only be learnt by interacting with the world. I think employing reinforcement learning within an environment, together with grounded language learning, should be a good direction for features like these. We also have goal-directedness, desires, intentions, and the thoughts that arise from them. I can’t even imagine whether these can be learned. Maybe they are innate or would have to be hard-coded?
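To be explicit about what “interacting with the world” means computationally, here is a schematic agent-environment loop of the kind RL and grounded-language-learning setups use. The environment, its textual observations and its reward are all hypothetical placeholders, standing in for whatever richer world an agent would actually learn from.

```python
import random

class ToyWorld:
    """Hypothetical environment: the agent receives a textual observation,
    acts, and gets a reward, which is how intuitions about objects and
    physics could in principle be grounded rather than merely read about."""

    def reset(self) -> str:
        self.coin_side = "head"
        return f"A coin lies with '{self.coin_side}' facing up."

    def step(self, action: str):
        if action == "turn the coin":
            self.coin_side = "tail" if self.coin_side == "head" else "head"
        observation = f"A coin lies with '{self.coin_side}' facing up."
        reward = 1.0 if self.coin_side == "tail" else 0.0
        done = self.coin_side == "tail"
        return observation, reward, done

env = ToyWorld()
obs = env.reset()
for _ in range(5):  # a random policy stands in for a learned, goal-directed one
    action = random.choice(["turn the coin", "wait"])
    obs, reward, done = env.step(action)
    print(action, "->", obs, "| reward:", reward)
    if done:
        break
```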
And what about P-C? Can large language models have P-C? That depends on what your beliefs about P-C are. If you are an eliminativist, P-C does not exist. If you do believe in P-C, you could say it emerges in complex physical systems like human brains; is a large language model that kind of system? There are theories of consciousness that try to explain how it arises, but none of them is tested and proven. A proponent of panpsychism might believe that the proto-consciousness of fundamental particles they espouse can combine to create consciousness in a complex system like a language model. But all these metaphysical theories aside, there is another question whose answer determines one’s stand on the P-C of language models: even if you believe in P-C, is there something it is like to be thinking or doing language? I am torn about this question.
Is there something it is like to be thinking the thoughts I am thinking right now? It is almost as if I can hear myself say these words, although I am not uttering them. There is an internal voice; it is as if I am imagining myself saying them. But is that only because I have the ability to hear and say things? Would a person who is deaf and unable to speak, but who can read, also have this internal voice? If they don’t, then this quale is probably dependent on sense perception, and inherently there might be nothing it is like to have a thought. If I take this line, then a language model, which presumably has no sense perception, should not have P-C. On the other hand, when I am not thinking, or am in a rare meditative state, it is quite different from when I am thinking. So maybe there is something it is like to be thinking, but I am not able to point to what it is. I need to think more about this! Maybe some other time, in some other article.