The future of AI is not LLMs: Yann LeCun
- from Shaastra :: vol 03 issue 09 :: Oct 2024
Artificial intelligence scientist Yann LeCun makes the case for thinking beyond language models to build AI that is more efficient at reasoning and planning.
Yann LeCun is over large language models. Even as every other tech start-up boasts of building its own LLM, LeCun, a Professor at New York University and Chief AI Scientist at Meta, is encouraging young graduates in artificial intelligence to think beyond these models. For him, the future of AI is in models that can reason and plan intuitively, not just simulate intelligence. "Language is not the only substrate of thought. We need to come up with new architectures where machines can think in more abstract terms," LeCun observed during an interaction with students and professors at IIT Madras in October 2024.
In the late 1980s, LeCun laid the groundwork for the convolutional neural network model, which enabled image and video understanding and improved human-computer interactions. In 2018, he won the ACM A.M. Turing Award for engineering breakthroughs in neural networks. Today, LeCun is looking at developing hierarchical architectures that can help AI plan its actions and offer day-to-day assistance to people. "It would be like walking around with a staff of really smart people assisting us. And I don't think we should feel threatened by that idea; I certainly work with people smarter than me!" he says.
Later that same day in Chennai, LeCun gave a second talk – on achieving human-level intelligence – as part of the Subra Suresh Distinguished Lecture Series hosted by IIT Madras. Shaastra caught up with him for a conversation. Excerpts:
"To some extent, LLM technology development is not that exciting anymore. Work instead on architectures that can reason, understand the world, and plan."
A lot of the AI models we see today predict actions based on the recognition of patterns. Can there be a more reasoning-based judgment call that would count as true intelligence and not a simulation of it?
The type of architecture that LLMs have today is input-output based: you feed them an input, and with a fixed number of computational steps, you get an output. Reasoning, by contrast, can take a variable number of steps depending on the complexity of the problem. An essential characteristic of reasoning is searching for an answer among multiple possibilities. People who worked on classical AI before machine learning became popular believed that reasoning and search were what it was all about. Think about classical AI problems – if I give you a map and ask you for the shortest possible distance between two cities, that is a search through a set of solutions. If I ask you to play chess, you need to explore the tree of all possible moves that you or the opposing player can make, and then select the move that is most likely to lead you to victory. So, it used to be that AI was all about search. Then when deep learning came in, it became about something else; it is now just computation with a fixed number of layers.
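The shortest-route example maps directly onto classical search algorithms. As a minimal sketch of that framing – not anything LeCun or Meta has built – here is Dijkstra's algorithm in Python over a small map; the city names and distances are invented purely for illustration.

import heapq

# Hypothetical road map: city -> list of (neighbour, distance in km).
ROADS = {
    "A": [("B", 5), ("C", 2)],
    "B": [("A", 5), ("D", 1)],
    "C": [("A", 2), ("B", 8), ("D", 7)],
    "D": [("B", 1), ("C", 7)],
}

def shortest_distance(start, goal):
    """Dijkstra's algorithm: search the space of partial routes,
    always expanding the cheapest one found so far."""
    frontier = [(0, start)]          # priority queue of (cost so far, city)
    best = {start: 0}                # cheapest known cost to reach each city
    while frontier:
        cost, city = heapq.heappop(frontier)
        if city == goal:
            return cost              # with non-negative distances, this is optimal
        if cost > best.get(city, float("inf")):
            continue                 # stale queue entry, skip it
        for neighbour, d in ROADS[city]:
            new_cost = cost + d
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(frontier, (new_cost, neighbour))
    return None                      # goal unreachable

print(shortest_distance("A", "D"))   # -> 6, via A -> B -> D

The point of the sketch is only that the answer emerges from exploring a set of possibilities, with the amount of exploration depending on the problem – unlike a fixed-depth forward pass.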
Does that still involve learning from past data, or can there be a more intuitive understanding of the world?
For a system to have intuitive understanding, it needs a world model – a system trained to predict the state of the world that would result from taking an action. It might not predict a single state; it may be multiple states, because the world is not completely deterministic. But if you have a mental model of the world, you can imagine what a sequence of actions is going to lead to, and you can measure to what extent the resulting state satisfies a task objective. That would be a system that can plan, and that is my idea of the architecture of future AI systems.
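To make that planning loop concrete, here is a toy Python sketch of the idea. The world_model and objective functions below are invented stand-ins, not any real system: the world model predicts the next state given an action, and planning means imagining candidate action sequences and keeping the one whose predicted outcome best satisfies the objective.

import itertools

def world_model(state, action):
    # Hypothetical deterministic dynamics: the state is a position on a line,
    # and an action moves it left, right, or not at all.
    return state + {"left": -1, "stay": 0, "right": +1}[action]

def objective(state, goal=5):
    # Higher is better: negative distance to the goal position.
    return -abs(goal - state)

def plan(initial_state, horizon=4, actions=("left", "stay", "right")):
    """Search over candidate action sequences, roll each one out in the
    world model, and keep the sequence whose predicted end state scores best."""
    best_seq, best_score = None, float("-inf")
    for seq in itertools.product(actions, repeat=horizon):
        state = initial_state
        for a in seq:                # imagine the consequences of this sequence
            state = world_model(state, a)
        score = objective(state)
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq

print(plan(initial_state=2))   # -> ('stay', 'right', 'right', 'right'), which reaches the goal

Exhaustive enumeration of sequences is used here only for clarity; a real planner would search this space far more cleverly, but the structure – predict, score, choose – is the same.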
Doesn't this need a deep understanding of causality? How good do you think humans are at that?
That is a very important question, and there is no clear answer to it. Humans are good at it, but far from perfect. The way you establish causal relationships is that you take an action and then observe the result. The problem is that there are a lot of mechanisms in the world on which you cannot act yourself; you can only observe the effect and then have to imagine the cause. And as humans we have evolved to attribute causes to many things that don't have good physical explanations, right? That's how we invented gods! (Laughs). We need a reason for everything.
You've said before that human intelligence is not general but highly specific, and so, you don't like the hype around Artificial General Intelligence. How do we differentiate between what is hype and what is truly transformative technology?
First of all, we don't need to have human-level AI to transform the world. That was the case with LLMs, and with convolutional neural nets. We need pretty sophisticated AI for content moderation and ranking in social networks to take down illegal content. And then there are other applications – for example, automatic braking systems in cars. They use deep learning to save lives. There are promising applications in medicine; we have seen this with the Nobel Prize for protein structure prediction. But you don't need human-level intelligence for any of this. What you might need it for is to empower people with intelligent assistants that will help them in their daily lives.
"We don't need to have human-level AI to transform the world... What you might need it for is to empower people with intelligent assistants that will help them in their daily lives."
Do you think human-level intelligence is the next technological revolution waiting to happen?
I'm not sure if it is the next one, but it will happen at some point. If all the theories we have at the moment work, and there are no big obstacles, we will know within six to seven years whether we are on a good path towards animal-level intelligence. And then, if we are lucky, human-level intelligence within the decade.
What is the larger scientific idea that drives you, a goal you would wish to see achieved in your lifetime, even if it is not by you but by others building on your work?
There are two motivations: one is unravelling the mystery of intelligence. The only way to answer this question is to build a system that is intelligent, because you can theorise all you want, but how else do you verify it? And then there are all the potential applications of doing this: accelerating the progress of science and medicine, empowering everyone, helping them make good decisions, and bringing knowledge and education to everyone in the world. To me, there is nothing more motivating than making people smarter. Even if it is through the help of machines. Perhaps that is why I am a Professor; that is my job, after all!
"To me, there is nothing more motivating than making people smarter. Even if it is through the help of machines. Perhaps that is why I am a Professor..."
When you introduced convolutional neural networks in the late 1980s, there was hesitation in their widespread acceptance by the industry. Does history repeat itself: are there research areas today that are being ignored in favour of more established architectures?
I think that is the case for LLMs. There is nothing wrong with working on LLMs if you need to deploy applications. But I tell students who are interested in doing graduate studies in AI not to work on LLMs, because teams at companies like Meta, Google and OpenAI are already working on this with enormous resources. To some extent, LLM technology development is not that exciting anymore. Work instead on architectures that can reason, understand the world, and plan.
I have seen this multiple times in the history of AI, computer vision and speech recognition: you have a paradigm that is dominant, there is one hint that a new way of looking at the problem may completely change things, and then you have a revolution. Deep learning brought a revolution to speech recognition in 2009-10; for computer vision, it was 2013; for LLMs, it was 2016; and now it is going to be embodied AI for robotics.
Your contemporaries, like 2024 Nobel Laureate Geoffrey Hinton, have warned against malicious AI, but you are more optimistic. What is it that reassures you that AI will not be a force of evil?
Geoff and I have been friends for 40 years. I did my post-doc with him, we worked together, and we collaborated a lot in the 2000s, before the deep learning revolution. We are good friends but we disagree on this. My understanding is that he went through some epiphany in 2023... The quest of his life was to discover the learning algorithm of the cortex, and he gave up after he saw how things like LLMs work. He said maybe the cortex doesn't use backpropagation, which is the algorithm we use to train all of this, and maybe backpropagation works better than whatever it is that the brain uses. And he said, oh, so maybe they are smarter, and maybe they will take over, because whatever is smarter will take over.
He is wrong on two counts, in my opinion. The first is that LLMs do not have the architecture needed for intelligent behaviour. Geoff believes that LLMs have subjective experience, but he is an outlier in that view. And the second is that entities that are smarter do not necessarily take over ones that are not as smart. It is not the smartest among us that is the chief – clearly (laughs).
Do the two of you still meet and discuss these things?
He and I cherish our friendship, so we try not to be too confrontational about this stuff!