Rethinking the AGI Race, with Benjamin Goertzel — Transcript

Thank you. Thank you. Welcome to yet another keynote. I’m your host, Lutz Finger.

I’m on the faculty at Cornell, where I teach about AI and data product strategy. And I spend a lot of time helping leaders separate what AI is from what it is not: what AI actually changes in products, markets, and customer behavior versus what, frankly, is a little bit of market theater. My own work has focused, for years now, on how digital systems shape what gets seen, trusted, and acted on.

I built fine-tuned models for search and discovery in e-commerce at a company I recently sold. So if you want to roll up your sleeves and code with me, then join me in one of my eCornell programs, including two different types of three-hour workshops where we build agents that essentially do, like, your back office. If you want to join, the link is below as usual. But let’s talk about today’s topic.

We started slightly late. Sorry for that. But the topic is actually super important. It’s about AGI and what the race to AGI means.

Almost every CEO I’m currently talking to tells me they get told AGI is around the corner. And that obviously has huge implications for how you set up your company, how you focus your resources, and so on and so forth. If you ask Sam Altman, it’s already there. Elon Musk goes even further and talks about the singularity.

So it’s not only AGI, it’s even more. And we will get to the terms. Demis Hassabis, at the recent Google I/O, kind of moved his timeline forward, towards 2029.

And he said we are at the foothills of the singularity. Anthropic is telling Washington: powerful AI systems are coming, beware. And yet, if you vibe code, you will get screwed every other second. Oh, the system deleted your database.

Oh, honest review. Yes, I screwed up. And so on and so forth. And all of you know what I’m talking about.

Currently, it’s still not there. One of my students says, like, clearly we’re not there. Well, workflows are not there yet.

But AGI, I don’t know. So that’s a reason to cut through the hype. That’s the reason why we have Ben here.

Dr. Ben Goertzel is the CEO of SingularityNET and the Superintelligence Alliance. He co-founded the OpenCog Foundation and TrueAGI. And he is actually the architect behind OpenCog Hyperon. He has authored more than 25 books and 150 papers.

And he was chief scientist at Hanson Robotics, the company you might know from the robots Sophia, Desdemona, and Grace. And he has run a very successful AGI research conference for over 20 years. Now, I’m super happy to have Ben here because he actually, to some extent, coined the term AGI.

He was one of the first to bring it up. He has been a very optimistic voice in the AGI space, but he has also been very, very skeptical that this whole rise of LLM scaling is getting us there. So welcome, Ben. Welcome to the show.

Yeah, yeah, thanks. Good to be here. And these are indeed interesting things to talk about. And I’ve been thinking about them since time immemorial, it seems like.

And now, you know, AGI and the path to real thinking machines is sort of the issue of the day.

It’s the issue of the day, totally. Now, fun fact, right? We have AGI, we have singularity, we have superintelligence. These are all terms. So to get us grounded, give us a little bit of an explanation, Ben.

What is what? You coined AGI 20 years ago, but put those terms into context. What is AGI? What is the singularity?

And what is superintelligence?

Yeah. Yeah.

And the story behind the term AGI was actually… I was putting together an edited volume of papers on trying to make what I was thinking of as AIs that can really think in a general sense like people. My provisional title for the book was Real AI, but that was sort of a stand-in and I never liked it because of course, narrow AI that does only one thing like trade the markets or drive a car is still real. In fact, it was the only kind of AI that was real at that point, right?

So we sort of brainstormed what would be a better title for the book. My colleague Pei Wang suggested general AI, GAI, and that’s how you would order it in Chinese. But we switched it around to AGI, artificial general intelligence. And that was on an email list of the authors of papers in that book, which included Shane Legg, who had previously worked for me in my earlier company, Webmind, and went on to co-found DeepMind later, which was then acquired by Google, and a few others.

So that book, with the title Artificial General Intelligence, sort of put the term AGI on the map. We put it together in 2002, and it was published in 2005. Then in 2006, 2007, I started organizing the annual AGI research conference.

Now, we later found that a guy named Mark Gubrud had used the term artificial general intelligence in an article on nanotechnology in 1997, which no one had really paid attention to. So, I mean, the word itself had been around, but doing that book and that conference is what sort of put it out there. And a couple of the papers in that book actually were dedicated to defining what AGI is. And you had some fairly rigorous mathematical formalizations of what general intelligence is. For example, Marcus Hutter had a paper in there that was a predecessor to what was in his 2005 book, Universal AI, where it was like: general intelligence is the ability to basically achieve arbitrary computable reward functions, and then you can weight them, for example, by complexity. So solving a complex one is harder, solving a simple one is easier, so you can weight simple reward functions higher and complex ones lower. And then you get into all the math of, like, how do you measure complexity. The thing with this...

Let me stop you for a second, Ben, because mind the audience. The audience is executives and managers. So let’s level it up for one second. If you think about it, what’s AGI in layman’s terms? What’s the singularity in layman’s terms?

Yeah, so in layman’s terms, Marcus Hutter’s math definition of AGI is just: a truly general intelligence should be able to figure out how to achieve any goal in any environment, right? And of course, in the ultimate extent that is an all-powerful god mind, not something we’re actually going to build. And then you confront the fact that humans don’t do very well at that either. Like, I can’t run a maze in 750 dimensions very well. I can’t even figure out what my eight-year-old wants for breakfast, right? So, I mean, we in some ways seem like the most generally intelligent animals on the planet, but we fall short of being able to learn how to achieve any goal in any environment in some tractable period of time.

However, the key point there is really the learning part, right? To be an AGI, you should be able to pivot beyond whatever you were programmed for and whatever you were trained for. And if you’re given an unfamiliar environment and an unfamiliar goal, you should be able to figure out on the fly, with some creative imagination and trial and error, how to achieve this unfamiliar goal in this unfamiliar environment. So that’s really the ability to generalize, right? And I think that’s the key point. These hairy math definitions of AGI are not just saying this is an AI that can do a lot of things. What they’re saying is this is an AI that can take a leap beyond what it was prepared for and generalize from what it knows to do new things.

And this is quite key, because when you see, like, Sam Altman here and there saying, “My definition of AGI is an AI that can do 95% of human jobs,” I mean, that’s really not it. Because if you could make a system to do 95% of human jobs, but it does it just by kind of copying a huge amount of training data that contains a million examples of each of those jobs, that may be incredibly economically valuable, it may have immense humanitarian value, it’s super interesting, but it’s not making big leaps of generalization beyond its training data. So it’s not an AGI in the sense of the math, or in the sense of the field of AGI.

Let me summarize this, because what you essentially say is: so far, large language models, ChatGPT and so on, are doing a very good job at looking stuff up and summarizing, but not necessarily at generalizing.

It’s subtler than that, right? Because they’re doing, like, in-context learning, they’re doing zero-shot learning. So they actually are non-trivial at generalizing beyond their training data. And it’s very hard for us, or for the developers, to tell when they are exercising their limited but interesting ability to leap beyond their training data, and when they’re just looking things up in the training data, right?

Yes, because my argument against it is this: in-context learning sounds like the AI is learning, but very often in-context learning is a one-shot example to point you to the right place in the database to grab whatever you find there.

But it’s not... I mean, I don’t think LLMs are AGI, nor that just scaling up LLMs is going to get you to AGI. On the other hand, LLMs are doing a bunch more than what you said. Even back to GPT-4 or something, if you describe an alien civilization with, like, seven different sexes and bizarre different conventions for marriage among four kinds of aliens versus three other kinds of aliens, and you define a social situation in this alien society, the LLM will reason it out from the perspective of all the different aliens there in a pretty subtle way. That’s not in the training data, right? I mean, that’s some pretty serious inductive and abductive reasoning that it’s doing, because that’s just some weird thing that I made up.

And then there is the ability to do math proofs, which has become a big thing in the last year. My PhD was in math, so I’m fairly hardcore there. I mean, it’s quite dramatic. Indeed, it’s not doing what, you know, Gauss or Grothendieck or the greatest mathematicians in history did, but it is combining concepts in ways that have never been done before and proving math theorems that great human mathematicians banged their heads on without success. So it’s not looking up those proofs online, right?

Makes sense, makes sense. Which is like the combination. And you see this in drug design, right? Humans had a limitation in testing out the different combinatorics of drugs. Now you have certain structural heuristics, now you can test many more, and therefore suddenly you have a huge explosion of possibilities. But I think there are two important points. One is that AGI is the step towards generalizability, and we should talk later on about the ARC-AGI-3 test. But it’s a step towards generalizability. Cool. And the second thing you said is that it’s not easy to figure out, if you look at the model, when it is actually grabbing information from a data structure and reasoning around it, and when it is actually really generalizing, right? What is the singularity, then?

So the singularity, this term was coined by the science fiction writer Vernor Vinge in a funny novel called Marooned in Realtime, where these people are sort of frozen for a while in suspended animation. They emerge from suspended animation.

They’re like, wait, the Earth is empty. And everyone else has evolved into some other superhuman form and pissed off somewhere else entirely. Right? Which is, as a science fiction writer, it makes sense.

Once people have uploaded themselves into superhuman form and disappeared into the other dimension, it’s hard to craft an entertaining narrative. So you write the story about the people who are left behind, right? Because they have a human story that we could grasp onto. Now, Ray Kurzweil wrote a book, The Singularity Is Near, in 2005, the same year that I published this book on AGI.

And this was a successor to earlier books on similar themes, like his book The Age of Spiritual Machines in the late 90s. But what Kurzweil laid out in The Singularity Is Near was a sort of curve-plotting argument, looking at Moore’s law and allied laws regarding amounts of memory and other aspects of networking. He was arguing that basically, based on the ongoing exponential increase in various sorts of compute power, by around 2029 a single server was going to have the processing power of a human brain.

So he figured that once that hardware capability was there, by hook or by crook, the software world would figure it out. And we’d probably get human-level AGI by around 2029. He then extrapolated further. So by 2045, a single server could have the processing power of the whole human species.

And he was figuring, well, okay, then we’re on to superintelligence full-on, right? And looking at it now, the idea that we may have human-level AGI by 2029 seems plausible to a lot more people than it did in 2005. I mean, not everyone. There are skeptics and there are people who are even more bullish than that.

But I would say that estimate is no longer an insane outlier, as it kind of seemed in 2005.

It used to be an insane outlier, yes.

But the next part of it: I think if you ask those same people who now think 2029 is reasonable for human-level AGI, most of them would say superintelligence will come much sooner than 16 years after that. Just because, looking at even what LLMs can do now, they’re so good at math and programming, it seems like if you have a human-level AGI in 2029, this thing is going to be able to optimize its own code and analyze its own foundations and advance beyond the human level in less than the 16-year gap.

So many of us feel that the exponential Ray was plotting is going to get a steeper slope once it’s AGIs rather than human beings driving the next part of the exponential curve. To answer your question more directly, though: what Ray meant by the singularity was really a point at which the rate of progress in a variety of relevant areas, like AI, biotech, nanotech, and space, was so fast it would feel infinite to a legacy human being like us. I mean, that’s pretty much what he meant by it.

Right. And I think the slope discussion is nicely said. So AGI: it’s the G for generalizable, meaning it actually goes beyond what we trained it on, it looks beyond the data we gave it, it develops novel things, it can figure itself out in a novel environment. That’s AGI. Singularity: then we have such an explosion of intelligence that we don’t really know what superintelligence will look like, because we can’t imagine it. And therefore that switch-over from AGI to superintelligence, where the growth curve is actually super steep, that’s then the singularity.

Seems like it, yeah. That’s referred to as recursive self-improvement, or RSI, right? The AI makes a smarter AI. And this was described by I. J. Good, another mathematician, in 1965 as the intelligence explosion. So, I mean, that was written about in detail the year before I was born. So these ideas are not new in concept. What’s new is the palpable feel of reality about them.

Yes. Now, a few folks have already written questions. This is a live show, and we have a relatively full room, so I’m going to ask you to type in your questions, and we will get to them by the end of the show. See the link below, type in whatever questions you have, and Ben and I will get onto them. While you’re doing this, and thanks to those of you who wrote already, I will bring them in as we talk.

But since we now have the concepts clear, and as you pointed to the fact that more and more people are more open to the idea that, well, AGI might actually happen: I thought Google’s I/O last week was pretty telling. We had the Nobel Prize laureate Demis coming on and talking about how we are in the foothills of the singularity. Remember, the singularity is after AGI has happened, right? And he gave a 2029 kind of outlook for AGI, and you say that might be okay. What I thought was interesting is that this was completely buried in the I/O, right? It came very much at the end: by the way, we do AGI as well. But most of the whole keynote was about: oh, we have AI in this interface, and we have a chatbot here, and we can talk to the tool there. It was not about AGI at all for the whole keynote. It was more about the utility function of a new interface. So what do you say to all the people, the Elon Musks and Sam Altmans and Jensen Huangs of the world, who keep saying it’s coming?

I mean, I agree with those guys. I think it is coming. On the other hand, attention spans both of people and of Wall Street analysts are shorter and shorter. So I can see how, if you’re a big publicly listed company saying we’re going to get to AGI in 2029, then everyone will be like, “I’ll dump your stock and buy it again once you’re closer to AGI, and invest in something else that’s going to moon this year,” right? So, I mean, from Google’s point of view, of course they need to highlight how Gemini is going to kick the butt of ChatGPT and Claude in all these ways, because that’s what’s going to drive their quarterly profits, right?

So, I mean, I can’t say if it’s right or wrong to organize that event the way they did, but I can see why they would do that, right? I mean, DeepMind was founded with the objective of building AGI. Google was founded with the objective of making the world’s best search engine. Now, DeepMind has sort of internally been taking over more of Google over the last few years, but the two are not the same.

Yeah. I think the more interesting tension there, to me, is that DeepMind is not really an LLM shop, right? I mean, they can do it. We can all do it now, and it’s not that hard, and it’s interesting. But transformer neural nets, which are the core architecture behind LLMs, were invented in Google Brain in Mountain View, not in DeepMind, and then obviously scaled up and commercialized first by OpenAI. Now DeepMind has embraced it, and they’ve helped Google put Gemini everywhere. But the core of how the DeepMind guys want to build AGI is not really scaling up transformer neural nets. I mean, I knew Demis before he founded DeepMind. Shane, their other co-founder, used to work for me, and I knew him for a long time. And they, like me and everyone else, can see LLMs are very powerful and should be an ingredient or component of an AGI, just because, given the realities of modern compute and data, they’re doing so many things right. But when Demis says we’re in the foothills of the singularity and we’re going to get to real AGI by 2029, he’s not thinking, “I’m just going to scale up Gemini into Gemini 6 or something,” right? He’s thinking they’ve got a bunch of other cool stuff going on in DeepMind that hasn’t seen the light of day of public launch yet.

Actually, let’s dig into this. But first, let’s put a bow on your last argument, because I think this is novel for many folks in the audience. When we say AGI, most of us do not think about transformers. Transformers have an amazing attribute: we have attention, which we use to figure out how to keep information in our mind, and now we have attention modeling, and therefore transformers can predict, in a series, the next word, the next event, whatever it is. Now, when many of us think about AGI, they say it’s not transformers. So two questions as follow-up. First, why is that? What are the shortcomings of transformers, if you could explain those? And then, what are people thinking instead?

So the shortcomings of transformers could be looked at two ways: in practice, what they can’t do right now, and then what the underlying shortcomings of the whole methodology are. Right now, in a transformer neural net, due to limitations of the underlying backpropagation algorithm used to train the network, learning and inference are separate phases. So you train a model, you freeze it as, like, 5.5 Pro, and then the weights in the neural net are frozen, and then you use it, right? And then it does inference. It does some in-context learning in the space of activations coursing around in the neural net, but the weight matrix isn’t changing. And the reason you do that is that the underlying learning algorithm of backpropagation doesn’t yet support continual learning. And so if you try to fill the network up with knowledge and then keep on going, at some point you’ll get what’s called catastrophic forgetting, where adding a little new knowledge will cause it to forget a whole lot of old knowledge.

So this is a basic problem with how all modern deep neural nets in industry are trained. Now, this leads to other issues, right? This is connected to why, while these systems know a lot, they don’t really know who and what they are and who you are.
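The catastrophic forgetting described here can be shown in miniature. The following is a toy sketch under assumptions that are not from the talk: a two-weight linear model, two hypothetical tasks (`task_a` and `task_b`), and plain gradient descent standing in for backpropagation-style training. A single set of weights that fits both tasks exists, yet training on the tasks one after the other overwrites the first one.

```python
# Toy illustration of catastrophic forgetting.
# Assumptions (not from the talk): a 2-weight linear model, two made-up
# tasks, and plain gradient descent as the training procedure.

def predict(w, x):
    return w[0] * x[0] + w[1] * x[1]

def sgd_step(w, x, target, lr=0.1):
    # One gradient step on the squared error for a single example.
    err = predict(w, x) - target
    return [w[0] - lr * err * x[0], w[1] - lr * err * x[1]]

def train(w, examples, steps=2000):
    # Cycle through the given examples, one gradient step per example.
    for i in range(steps):
        x, target = examples[i % len(examples)]
        w = sgd_step(w, x, target)
    return w

def sq_error(w, example):
    x, target = example
    return (predict(w, x) - target) ** 2

# Two tasks that share both weights; w = [2, -2] would fit both exactly.
task_a = ([1.0, 0.5], 1.0)
task_b = ([0.5, 1.0], -1.0)

w_after_a = train([0.0, 0.0], [task_a])              # learn task A only
w_sequential = train(w_after_a, [task_b])            # then learn task B only
w_interleaved = train([0.0, 0.0], [task_a, task_b])  # learn both together

print("task A error after training on A:    ", sq_error(w_after_a, task_a))
print("task A error after then training on B:", sq_error(w_sequential, task_a))
print("task A error with interleaved training:", sq_error(w_interleaved, task_a))
```

In this sketch the error on task A is near zero after the first phase, becomes large again after the second phase, and stays near zero when both tasks are trained together. Interleaving old data is the usual workaround, which is part of why learning and deployment end up as separate phases.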

Like, they’re not sort of reorganizing themselves in the context of real-time interactions in the way that parts of the human brain are, because you have this huge frozen weight matrix which is then being deployed to answer questions. And in a sense, that’s sort of the wrong kind of thing if you want an AGI. I mean, the way a human brain starts is you have a smaller amount of knowledge, but it’s evolving interactively with life experience. And then the knowledge that’s gained is integrated with understanding of self and other and life experience as the knowledge base expands.

And of course, an AGI doesn’t have to be exactly human-like and copy the growth curve of a human baby. But it seems that having a knowledge base that can adapt flexibly in real time based on interaction with the world is important, and is tied in with the system integrating its knowledge base with its understanding of itself and the others that it’s interacting with, right? So there are a lot of shortcomings, and you could also look at it in a more sort of computer science-y way.

You could say the way a transformer neural net stores knowledge internally is too much like a huge catalog of special cases. Now, the subtle thing is it’s not exactly a huge catalog of special cases, right? It is abstracting in some very interesting ways, or it couldn’t do the math and programming that it does. But it’s more like a library of special cases than the human memory is.

And you pushed back on me earlier. Rightly so, but I still would stand my ground here. Let’s picture LLMs as a huge catalog with just a very neat interface on top of it. If that’s the case, that’s a reason why they’re not AGI. That’s a reason why they’re limited.

But the thing is, it’s a lot subtler than that, because they are abstracting, and they’re storing abstractions that are just sort of the wrong kind of abstractions for human-level general intelligence. But they are abstractions that are very useful for many things, right? It’s just a really weird kind of cognitive system.

Exactly. But if you take the abstraction as a non-moving entity, as the encoder, like the embedding state of whatever you put in, and now you can move and sort different abstractions together, you find combinations which humans haven’t thought about. But as you said, there’s catastrophic forgetting, the more abstractions you put in.

There is the autoregressive error, right? Because it is like a self-feeding part. And there is the issue of not being self-learning. And all of this makes large language models, as we know them today from ChatGPT and Gemini, technically just a very bad candidate for AGI.

It certainly does. But I think there are multiple paths you can use to get from there to AGI. And now there’s just so much excitement, so many resources, and so many tools for making the next steps of progress that even though I sort of agree with Yann LeCun that on the path to AGI, LLMs are an off-ramp, I still think we can get there by 2029 or even sooner because the pace of progress is just insane.

And there are a lot of other techniques that are now being tried out at smaller scale. And we’re very soon going to be scaling them up and seeing what they can do.

Yes, actually, let’s talk through those. So I think you and I, and I go here with Yann LeCun, agree that LLMs are awesome, have super good utility value, are very fascinating, and you can do loads of stuff with them, but they’re probably an off-ramp due to all the three things which we mentioned. Now, if you look beyond LLMs, towards AGI, what are the novel directions you would see?

So I’d say there are two separate directions I want to call out, and they can work together or they can be pursued separately.

Yeah, thanks. So much for AGI, by the way. It was the right sign, but it was the wrong reaction from our intelligence here.

Yeah, this is because Apple is not that strong at AI. This is a macOS problem, I think. Yeah, so the first direction I want to call out, which has not been my own main pursuit, but we’ve been working on it with a smaller team in SingularityNET, is that you can keep a basic deep neural net architecture but replace the learning algorithm. So replace backpropagation, which was invented in 1958 and has served us well.

It’s the standard learning algorithm for training most commercial deep neural nets. But there are alternatives. There’s something called predictive coding, which more closely resembles how the brain works.

And there’s a large academic literature on that. And there’s plenty of evidence that, on smaller neural nets anyway, predictive coding-based neural nets are better at continual learning, in that they suffer less from catastrophic forgetting.

There’s other advantages, like in a predictive coding neural net, each neuron can sort of learn on its own asynchronously and independently of the others. So then you don’t need to be like swooping through the whole network in sync. And you can then do inference and learning sort of in the same phase. The weights can update themselves in the network while you’re interacting with the network.
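To make the idea of local updates concrete, here is a minimal sketch of a single-layer predictive coding model. It is an illustration under stated assumptions, not the speaker’s code and not the API of any library mentioned in this talk: the function names (`infer`, `learn`), the tiny dataset, the learning rates, and the zero-mean prior on the latent values are all hypothetical choices. Each weight `W[i][j]` is updated using only the prediction error at output unit `i` and the activity of latent unit `j`, and a weight update follows every single inference, so inference and learning are interleaved rather than split into separate phases.

```python
# Minimal single-layer predictive coding sketch (illustrative assumptions:
# linear predictions W @ z, a zero-mean prior on z, hand-picked step sizes).

def predict(W, z):
    # Top-down prediction of the observation from the latent state z.
    return [sum(w_ij * z_j for w_ij, z_j in zip(row, z)) for row in W]

def infer(W, y, steps=20, lr_z=0.1):
    # Inference: relax the latent state z so that prediction errors shrink.
    z = [0.0] * len(W[0])
    for _ in range(steps):
        err = [y_i - p_i for y_i, p_i in zip(y, predict(W, z))]
        z = [z_j + lr_z * (sum(W[i][j] * err[i] for i in range(len(y))) - z_j)
             for j, z_j in enumerate(z)]
    err = [y_i - p_i for y_i, p_i in zip(y, predict(W, z))]
    return z, err

def learn(W, z, err, lr_w=0.05):
    # Local rule: W[i][j] changes based only on err[i] and z[j].
    for i in range(len(W)):
        for j in range(len(z)):
            W[i][j] += lr_w * err[i] * z[j]

def mean_sq_error(W, data):
    total = 0.0
    for y in data:
        _, err = infer(W, y)
        total += sum(e * e for e in err)
    return total / len(data)

# Made-up observations; the third coordinate is the sum of the first two.
data = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 2.0], [1.0, -1.0, 0.0]]
W = [[0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]

error_before = mean_sq_error(W, data)
for _ in range(50):
    for y in data:
        z, err = infer(W, y)   # inference on this observation...
        learn(W, z, err)       # ...immediately followed by a local weight update
error_after = mean_sq_error(W, data)

print("mean squared prediction error before:", error_before)
print("mean squared prediction error after: ", error_after)
```

Because no update needs a global backward sweep, the per-weight loop in `learn` could in principle run asynchronously or on separate machines, which is the property the discussion here emphasizes. Whether this carries over to large models is exactly the open question raised in the talk.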

Very promising. So there will be loads of NSF grants and so on.

Now, no one has yet shown you can make predictive coding do its amazing stuff at even, like, GPT-3 scale or something, let alone the scale of current models.

But on the other hand, it’s remarkable how little resources the world has put into this. Like academics are getting NSF grants denied for predictive coding work because it’s too far out of the mainstream, right? And it’s not that far out of the mainstream, right? I mean, in the end, you’re still calculating gradients in deep neural nets.

You’re just arranging some loops a little differently, right? So that’s one direction, which is sort of keep the paradigm kind of the same. Improve the learning algorithm. Yeah.

Let me ask about predictive coding for a second, because one of the folks in the audience, Sonada, and sorry if I say your name incorrectly, actually asked about modern AGI undergoing this paradigm shift: how do you as a research community actually transition over to different models? So far you did backpropagation, now you go over to predictive coding. What does it actually mean from a research point of view?

I mean, that’s a fairly small shift compared to the other AGI approach I’m going to talk about in a moment. And in my own team at SingularityNET, in the project I’m running, we’ve launched a software library called Fabric PC, which is sort of the analog of Torch or TensorFlow, but for predictive coding rather than backpropagation neural nets.

So it’s a separate software library and framework that tries to help predictive coding networks run fast on modern multi-GPU machines. So the paradigm isn’t that different, but what you find is that when you’re doing PC rather than backprop, you can explore different neural architectures. You can put in more recurrent connections, more top-to-bottom connections, in your neural architecture. And you end up structuring the neural nets differently as well as changing the algorithm.

But in the end, it’s kind of the same sort of thing as you’re doing with backpropagation neural nets. You just have a different learning algorithm, different neural architectures, and more interesting properties, right? I think one thing to note is that the need for hyperscaler server farms is less, because the learning is asynchronous. So you could put part of a huge net on one server farm and part of a huge net on another server farm over there, and they don’t need to be tightly connected or tightly synchronized while learning.

Let me ask you something.

Got it. So let me ask something else as we go towards AGI, because your own bet, as you said, is that we need neural nets plus symbolic reasoning plus evolutionary learning. And that’s complementary to the shift from backprop to predictive coding, right? Because the evolutionary learning would be in predictive coding, right?

No. I mean, we started a team working on predictive coding for training transformers about nine months ago. And we’re training small transformers that way now.

And then we’re also looking at taking big open models, like GLM or these big Chinese open models, and using predictive coding to train a small layer on top of them. So you keep the frozen model and then have an upper layer that does some continual learning. So that’s all interesting. But what I’ve been spending the last couple of decades on is a different sort of approach, where the basic data structure isn’t a neural net.

The basic data structure is this huge, dynamic, self-modifying knowledge graph living in RAM across multiple machines. You can put neural nets in there. You can also put logical propositions in there and do logical theorem proving.

You can also put program code in there, and then you can evolve program code by means of something called genetic programming, which tries to simulate evolution by natural selection. And then you can wrap an overall cognitive architecture around it, where you say this system has a few top-level goals it’s trying to pursue. It can replace those eventually, but it starts with the top-level goals you gave it. And it’s trying to use all the learning mechanisms at its disposal to try to learn what procedures, if it follows them in the current context, will help it achieve its goals, right?
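As a rough illustration of evolving program code, the sketch below runs a stripped-down, mutation-only form of genetic programming (no crossover) over small arithmetic expression trees. Everything in it is an assumption made for illustration rather than a description of the Hyperon system: the tree encoding, the target function `x * x + x`, the population size, and the helper names (`random_tree`, `mutate`, `evolve`) are all hypothetical.

```python
import math
import random

# Mutation-only genetic programming over tiny expression trees.
# A tree is "x", a float constant, or a tuple (op, left, right).
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

# Made-up target behavior: the program should compute x * x + x.
SAMPLES = [(x, x * x + x) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]

def random_tree(rng, depth):
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.7 else float(rng.randint(-2, 2))
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree

def error(tree):
    # Sum of squared differences from the target; non-finite values count as worst.
    total = 0.0
    for x, y in SAMPLES:
        diff = evaluate(tree, x) - y
        total += diff * diff
    return total if math.isfinite(total) else float("inf")

def mutate(rng, tree, depth=2):
    # Replace a randomly chosen subtree with a fresh random one.
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(rng, left, depth), right)
    return (op, left, mutate(rng, right, depth))

def evolve(seed=0, pop_size=60, generations=30):
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=error)
        history.append(error(population[0]))
        survivors = population[: pop_size // 4]   # selection: keep the fittest
        children = [mutate(rng, rng.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children         # variation: mutated offspring
    population.sort(key=error)
    return population[0], history

best, history = evolve()
print("best program:", best)
print("best error per generation:", history)
```

Because the fittest programs always survive into the next generation, the best error in this sketch can never get worse over time. In a goal-driven architecture of the kind described here, the fitness function would be tied to the system’s goals rather than to a fixed target formula.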

And so there you’re putting neural stuff together with logical and evolutionary stuff in this heady mix of this self-organizing knowledge graph. And the neural nets in this sort of integrated approach need to be something like predictive coding. They can do local learning. Backpropagation doesn’t work well there because it doesn’t play well with others.

It needs a coordinated overall learning sweep of the whole network. But in that approach, a predictive coding neural net is just one among a number of different ingredients that are sort of entrained within an overall goal-driven cognitive architecture. But I don’t think there’s necessarily only one way to get to an AGI, right? Like you can maybe make a more brain-like way using a collection of predictive coding-trained neural nets.
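The point about local learning can be illustrated with a toy example. The sketch below is a single-layer, Rao-Ballard-style predictive coding model in plain Python; it is an assumption-laden illustration, not the training method used by the team, and the names (`relax_latent`, `local_weight_update`), the starting weights and the learning rates are all made up. What it shows is that each update uses only the prediction error and the activity at the two ends of a connection, with no backward sweep through the whole network.

```python
def predict(W, x):
    """Top-down prediction of the observation from the latent state x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def errors(W, x, y):
    """Prediction error at each observed unit."""
    return [yi - pi for yi, pi in zip(y, predict(W, x))]

def energy(W, x, y):
    """Half the squared prediction error, the quantity being reduced."""
    return 0.5 * sum(e * e for e in errors(W, x, y))

def relax_latent(W, x, y, lr=0.1):
    """One inference step: each latent unit moves using only the errors
    of the observed units it is directly connected to."""
    err = errors(W, x, y)
    return [xj + lr * sum(W[i][j] * err[i] for i in range(len(W)))
            for j, xj in enumerate(x)]

def local_weight_update(W, x, y, lr=0.1):
    """One learning step: weight (i, j) changes using only err[i] and x[j]."""
    err = errors(W, x, y)
    return [[W[i][j] + lr * err[i] * x[j] for j in range(len(x))]
            for i in range(len(W))]

def train(W, y, epochs=100, inference_steps=10):
    """Alternate a few inference steps with one local weight update."""
    x = [0.0] * len(W[0])
    for _ in range(epochs):
        for _ in range(inference_steps):
            x = relax_latent(W, x, y)
        W = local_weight_update(W, x, y)
    return W, x

if __name__ == "__main__":
    W0 = [[0.2, 0.1], [0.1, 0.3], [-0.2, 0.1]]  # made-up starting weights
    y = [1.0, 0.5, -0.5]                        # made-up observation
    W, x = train(W0, y)
    print(energy(W0, [0.0, 0.0], y), energy(W, x, y))
```

Backpropagation, by contrast, would compute every weight change from one error signal propagated backwards through all layers, which is the coordinated sweep described above.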

And you could make a less brain-like way using this Hyperon system of ours that puts together logic, evolution, and neural nets. But these two approaches may have different properties when you get to the next level. My hypothesis would be the Hyperon approach is going to be better at ascending to a superintelligence. Because the human brain sort of didn’t evolve to self-modify into superintelligence, right?

Whereas we’re actually designing this Hyperon system precisely for recursive self-improvement. Yeah. Very often when we talk about AGI, and you do the same thing at the moment, we talk about models and model performance and model improvements.

Demis, at one point in time, and also Tim O’Reilly (I wrote a book for O’Reilly Media), they both kind of said, maybe AGI will not be a model, but more a systematic approach. For sure. Like it kind of…

Not a model as such, but a structure of things. Like Tim O’Reilly… AGI will be a complex… It will be a complex self-organizing dynamical system, right?

Yeah. I spent a bunch of time back in the day doing chaos theory work and complex systems work. I mean, yeah. A model in the literal sense, like a model of the world.

Like we want a model of the world, but the model of the world is not our whole intelligence. Yeah. The model of the world is one thing, but it’s also a big piece of our intelligence. And I mean, how the rest of our mind interacts with the model is also important.

Got it. This is, by the way, a super important one, because when you say it’s not only the model, it is as well how we interact, which… And I’m looking a little bit at the time, so let’s move away from the actual mathematical structure and to William, one of our listeners, who had the question of what it actually means for businesses. So the question is, how do we plan the next step after chatbots and multi-agent and agent solutions?

Like how would an AGI actually interact in our world? So that’s… That’s a big question, right? And…

And I mean, I think at the super high level, the right answer is just not to overcommit to any particular technology at a particular moment, because if the singularity idea is at all real, things are going to just be unfolding faster and we’re all going to have to pivot and re-pivot and re-re-pivot, right? But let me highlight a couple of practical things that have been coming up from the work I’ve been doing that partly answer that question.

I mean, one… We’ve created an OpenClaw-type agent called OmegaClaw. We start by re-implementing OpenClaw in our own AGI programming language, which is MeTTa, M-E-T-T-A. So we still use an LLM for user interfacing. We have a sort of OpenClaw-type interaction loop, but we put a symbolic long-term memory graph there. So it’s kind of like an OpenClaw agent that remembers everything it ever does, builds a logical model of itself, and can do some structured logical reasoning on itself and its own history. And what we find is that interacting with one of these agents feels more like interacting with some weird autistic guy living in your computer rather than just an agent that’s doing one query after another, because it remembers everything.

And it’s always trying to improve itself. And if it does something wrong, you can say, well, why don’t you hack your code so you don’t do that wrong again? It can self-modify its code in the context of the interaction and reload it.

So I would say what we have now is an early-stage research prototype. It’s in GitHub. We’ll drop it publicly in a few weeks as a Docker container.
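As a rough picture of what "an LLM in a loop plus a symbolic long-term memory graph" could look like, here is a minimal Python sketch. It is not the OmegaClaw code (which is described above as written in MeTTa); the `MemoryGraph` and `Agent` classes, the triple format and the `stub_llm` function standing in for a real model call are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryGraph:
    """Hypothetical long-term memory: (subject, relation, object) triples."""
    triples: set = field(default_factory=set)

    def remember(self, subject, relation, obj):
        self.triples.add((subject, relation, obj))

    def query(self, subject=None, relation=None, obj=None):
        """Return all stored triples matching the fields that are given."""
        return sorted(
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (relation is None or t[1] == relation)
            and (obj is None or t[2] == obj)
        )

def stub_llm(prompt):
    """Placeholder for a real LLM call; it only echoes the prompt."""
    return f"(reply to: {prompt})"

@dataclass
class Agent:
    memory: MemoryGraph = field(default_factory=MemoryGraph)
    turn: int = 0

    def step(self, user, message):
        """One loop iteration: recall, respond, then store the exchange."""
        self.turn += 1
        known = self.memory.query(subject=user)
        reply = stub_llm(f"{message} | known facts: {len(known)}")
        self.memory.remember(user, "said", message)
        self.memory.remember("agent", f"replied_turn_{self.turn}", reply)
        return reply

if __name__ == "__main__":
    agent = Agent()
    agent.step("lutz", "hello")
    print(agent.step("lutz", "schedule a call"))
    print(agent.memory.query(subject="lutz"))
```

The difference from a plain query-after-query agent is that the memory outlives each call and can be inspected or reasoned over; a real system would add logical inference on top of the stored triples.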

But I would say we can see already how going beyond just LLM in a loop and adding some symbolic memory and reasoning, it does give more of a feel of an agent with some persona and mind and persistence. Like if you had a Google assistant running this sort of OmegaClaw system, and we’ll do something like that eventually, it would feel more like having an actual personal assistant who you had a relationship with, who remembered everything you’d done before. So I think that’s coming soon. I mean, we’re doing it using our own neural-symbolic approach.

Big tech may take a different approach, but I mean, building products and business models around that, around agents with persistent memory, the ability to reason on that memory, and then more personality and more knowledge of you. I mean, that’s one point I wanted to make. The other point is I’ve been asking, where can our weird AI approaches get a big advantage over what LLMs are already doing?

And that has some relevance to the question you asked, because my biggest answer was when we have data which is sort of a mix of quantitative and qualitative in a confusing, tangled-up way, then LLMs currently are not great at sifting through the quantitative and structured part at scale. I mean, they can write Python code for you, but then that’s a whole interactive process. Yes. On the other hand, just crunching data using machine learning, even using LLMs to turn unstructured into structured data, I mean, that’s laborious also.

So when we have, I mean, medical data is a lot like this. Like you have a lot of qualitative information. You also have a lot of numbers and, you know, clinical records with hard quantitative data, but also a lot of fudgy qualitative stuff. And many areas of finance are like this.

Like I’ve been working with some people on a project in catastrophic risk estimation, right? That’s like, you know, how do you price the risk of a revolution in some developing country? So there, there’s a lot of quantitative data. There’s also a lot of special qualitative factors, right?

And there, this is where I’m seeing a big advantage to being able to do some symbolic reasoning and some, you know, creative leaps, as well as having the LLM there just to crunch everything and having machine learning there to deal with quantitative data. So I would say in a business context, this is sort of a new frontier, which could be addressed with my own technology or others. Like if you just have a huge block of text and want to ask qualitative questions of it, LLMs are not bad at that, and they can synthesize documents and so forth. But when you have real data that is complexly interwoven with a lot of text, we don’t yet have machinery for dealing with that well, but it’s going to evolve in the next year or two.

And whoever can exploit that best will have a lot of advantage. Totally. And that’s a different story in each vertical, right? But by the way, Ben, from that question you actually took two pathways, right?

One is you talked about agents and you talked about LLMs. Like you talked about what we currently have in terms of, yeah, like agents become better, pathways become better, workflows become better. And by the way, there is a good announcement here. Like if you want to, if you want to build workflows, Cornell does offer workshops to build agentic workflows, right?

And there is a link below, but these are the ones we know. Now, if we say, well, world models will come, or other models will come, which are more self-learning and can figure their way out… Essentially, I have not seen a good structure for us in the last few years. And I think this is a good example of how this would actually change our setup, right?

The ones who have it first will probably make a killing because they can preempt a lot of efforts. But what will the new world structure look like, to the question which William asked? William, I at least don’t know. The agents and the world models, of course, fit together, right?

So like our OmegaClaw agents, because they have a persistent sense of themselves and a persistent sense of who you are, I mean, they’re in practice building their own world model as they go, which is stored in logical structures and in their symbolic knowledge base. And you can then inspect that and look at it. So I do think world models should be considered in the context of actions rather than abstract ideas.

Rather than abstract structures, right? Like the point of a world model is it lets you predict what will happen when you carry out certain actions in the world. So creating agents with persistent memory and persistent long-term goals means creating agents that need to build world models to do their thing. And I think that’s how world models will come into it, rather than like, my job is to build a world model.
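The idea that a world model exists to predict what happens when you act can be sketched in a few lines of Python. This is a deliberately simple, hypothetical example (a count-based table, with made-up state and action names), not a description of any particular system: the agent records what followed each action and then picks the action whose predicted outcome matches its goal.

```python
from collections import Counter, defaultdict

class WorldModel:
    """Hypothetical count-based model of (state, action) -> next state."""

    def __init__(self):
        self.outcomes = defaultdict(Counter)

    def observe(self, state, action, next_state):
        """Record one experienced transition."""
        self.outcomes[(state, action)][next_state] += 1

    def predict(self, state, action):
        """Most frequently seen outcome, or None if never tried."""
        seen = self.outcomes.get((state, action))
        return seen.most_common(1)[0][0] if seen else None

def choose_action(model, state, actions, goal):
    """Pick the first action the model predicts will reach the goal."""
    for action in actions:
        if model.predict(state, action) == goal:
            return action
    return None

if __name__ == "__main__":
    model = WorldModel()
    model.observe("door_closed", "push", "door_closed")
    model.observe("door_closed", "pull", "door_open")
    model.observe("door_closed", "pull", "door_open")
    model.observe("door_closed", "pull", "door_closed")
    print(choose_action(model, "door_closed", ["push", "pull"], "door_open"))
```

The model here is built only as a side effect of acting toward a goal, which is the sense in which world models come from agents rather than from a separate "build a world model" project.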

Yes, absolutely. And this is actually good, because there is one last question which I want to bring in from the audience. It’s actually from Patrick. Does this mean, because now we’re going back and forth between current LLMs and world models, that current LLMs are actually stuck in this fast thinking mode, reflexive, intuitive, predicting words, broadening patterns, but lack the capacity for deliberate multi-step logical and physical reasoning?

And will world models actually do those? You can’t really project human psychology stuff into LLMs that simplistically. Like if you use GPT 5.5 Pro and ask it to think through a detailed math or science query, it takes 45 minutes to come up with an answer. It really can go quite deep.

And I don’t think it’s fair to say it’s stuck in something like the human fast thinking mode, right? I mean, in fact, it’s doing complex lateral conceptual integration that would take me a week to do and it’s doing it in half an hour, right? But I mean, it’s neither like human fast thinking nor like human slow thinking mode, right? It’s just some weird other kind of thinking.

But what we can say is that LLMs are not interleaving their thinking with their understanding of themselves and the other in the way that humans are, right? And that, like most things, could be a plus or a minus. And in math, it doesn’t necessarily always matter, right? But for other domains, it would matter.

Yes, absolutely. Yeah, totally. And math works because, while it’s more complicated than coding, we have a very clear structure. The human mind, like the problem with humans is that whenever they say something, it might mean several things. Math doesn’t have that problem. That’s one among many problems with humans.

There’s a lot of them. That’ll be a whole other talk though. Yes, totally. Now, we are at time, Ben, but I would like to end a little bit with the outlook from you.

Where do you see us going? You’re very optimistic about human-level AGI within the next few years. But put a number on it, put a confidence level on it, like put in what should happen. What are the next steps? I think that 2029 for human-level AGI is perfectly reasonable.

I mean, we might get there next year or 2028. It might take a few years beyond that, but I think it’s not likely to take like eight years to get to human-level AGI or something. I mean, I think we’re really quite close, and not just because of naively looking at the smartest things that LLMs can do, but from looking at the work in my own team and a bunch of other teams, including DeepMind for that matter, but also other startups, that are bringing a variety of other concepts to bear on the next generation. And you can see just in terms of the business world, in the last year, you see huge amounts of VC money going into non-LLM-scaling startups, which you didn’t see before, right?

You see a billion dollars here, half a billion there, like, you know, Silver left DeepMind to do his own thing. Dileep George, who had Vicarious Systems, he has like a billion-dollar division of Astera. There’s a bunch of non-LLM-oriented funding out there, which is going to be aggregating more brilliant people around different approaches. And then there’s my own approach, which I’m quite enthused about.

So I think we’re likely to get human-level AGI for real in the next few years. And I think we’re likely to get a real superintelligence within a few years after that. It is hard to predict exactly which domains of industry will be revolutionized in which order. I mean, it does seem clear that physical stuff like plumbing, roofing and all that, right?

Electrical work. There’s likely to be a couple of years’ delay in automating all that after we get to AGI, right? Like once you get to the human-level AGI, there’s no doubt the human-level AGI can power a plumber bot and a roofer bot, right? But there’s still work in scaling up manufacture of those bots.

And even with an early-stage AGI helping, all this physical stuff takes a bit of time. There are a whole lot of parts to the supply chain, but still, I don’t see why that takes 10 years either, right? Like if you have AGIs building robots that build robots, I mean, things go relatively fast. I think what worries me personally in all this is not so much that the superintelligence is a bad guy.

I mean, it’s possible, but I have some faith that if we create the first AGIs to help us, then they will self-evolve into systems that continue to be helpful. But I do wonder, after you have the first human-level AGI, but before you have a superintelligence that can bring, you know, massive abundance on earth, like what happens to the world economy in that period? Like you’ve got the human-level AGI that in principle can automate any job or almost any job, say maybe not a preschool teacher or hospice worker, but any job that’s not just founded on deep human connection, right? So you have an AGI that can automate almost any job and it’s systematically, you know, crunching through one industry domain after another, but you don’t yet have a superintelligence that can airdrop a molecular nano-assembler in everyone’s backyard to 3D print whatever they want.

What happens, particularly in the developing world, in that interval, right? Like who gives universal basic income in the Central African Republic, right? I mean, so there’s a lot of interesting issues. And of course the world’s major governments are having a hard time dealing with much simpler issues than this one at the moment.

However, I think this wraps up the session actually pretty well, because, and you just answered, by the way, a question from Om, thanks Om for asking about how likely you see AGI coming, essentially we have those two stories, right? AGI is true generalization beyond training. And Ben, you argued that LLMs are powerful but limited, and that predictive coding, your symbolic systems and so on might get us way better to AGI, and your timeline is 2029.

But then additionally you say it’s not only the model, it’s the context. It’s not only the data context, but it’s the real life around us, the bits and atoms, right? Which means how do we work with robots? How do we work with medicine?

And so on and so forth. We had Syria wrote in here, like two weeks ago on the show, which was an amazing talk, where he actually said, okay, well, we use AI to create nanoparticles and RNA to change the way cancer is developing, right? So we will have both sides. We will have model developments, 2029.

Let’s see then. And we will have the interaction with the real world. Yeah. Yeah.

Thank you so much. And I think everyone in this audience, which I can’t see, I can only see Lutz, but I mean, I think while there’s a lot of mess we can expect in the global economy, I mean, the folks in this audience are probably super well aware of this, but I think they’re all poised to, you know, both build interesting stuff, make money, and deliver value during this transition period. Right. Because I mean, even if the end game is we don’t have any work left to do, except enjoy ourselves and be nice to each other.

I mean, during the transition period, there’s quite a lot of work for people to do and, and for better or worse, not that high a percent of the human population is really wrapping their brain around the primary dynamics on the planet. Right. And I think that’s a, that’s a very, very neat push because I see many of my students worried, right? Like what does it mean?

And I always tell them no matter where we go, whether you believe in AGI or not at the moment, there is a lot to do in terms of how to implement, like just talk about the buckets that are controlled by people and organizations that don’t understand current technologies to the extent that probably everyone in this audience does. So there’s a lot to do, and there’s a huge opportunity to leverage our partial understanding of present and future technology, even though we’re all confused and are gonna continue to be confused about many points. That is an awesome ending. Ben, thank you so much for coming onto the show.

And to everybody out there, thanks for sticking with us. Very sorry that we started slightly late. As usual, follow up, stay connected. Thank you.

Thanks a lot. Thank you.