A reality check on AI — Transcript

We have a new podcast, a new season of podcasts, and we actually thought we should rebrand ourselves a little bit, since we’re talking about quite a lot of topics that are new, that are different, that might also fail. Well, it’s so much like AI, right? I mean, AI is new, it’s different, it might fail. So we thought, what could be an even better name than the one we already have, for a podcast that talks about the future, about something a bit more edgy, a bit more special, something that is beyond the horizon?

Edgy, that’s the word, edgy. And that could go over the edge. Yes, yes. So it is The Edge. No, I’m not going to ask whether you like it.

But it’s actually. You get used to it. It’s like BMW car design. You get used to it.

We thought we would kick off this season with questions. Yeah, because we got asked a lot of questions, and also with a little bit of an update on the current situation. As you remember, we recapped our old forecast. We made new forecasts in the previous podcast, which we didn’t record as a video like this.

But there are a couple of questions that came up also in the discussion with our colleagues, with other VCs, with other founders. And yeah, let’s dive into it. Let’s do it. Let’s do it.

The first question is, what is actually different now in the AI hype? Now you could argue, guys, we’ve been debating this for more than one and a half years, but still, it makes sense. People are asking: but we had AI before. We had it for decades.

What’s now different? Nothing and a lot. So technically speaking, we have had the idea of attention, meaning we tried to do sequence predictions. We created something like an attention model to figure out, for a long string of words, where to place the focus.

That has technically changed. Practically, honestly, nothing has changed. Since the 1950s, we had AI. We have AI now.

It’s only that way more people are using it. We had natural language processing or understanding before, and it worked pretty fine. I mean, also in the chatbot industry, there were already, a couple of years ago, big companies coming out, being financed, working well. Now, the question is, do you remember whether I had a coffee this morning or not?

And the answer is yes, I had a coffee. Why? Because I just told you this a second ago. Meaning you recall the fact that I had a coffee.

That was your attention window. Large language models had difficulty keeping a long attention window. And now we found a trick. That’s all there is.
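As a rough illustration of the attention idea described here, the following is a minimal sketch in Python of how a model could decide where to place its focus in a string of words. The word vectors and the query are hand-made toy numbers, and the function names are hypothetical; none of this comes from the podcast or from any specific model.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product attention: one weight per word in the sequence."""
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    return softmax(scores)

# Toy sequence and made-up 2-dimensional vectors (assumptions for illustration).
words = ["I", "had", "a", "coffee", "this", "morning"]
keys = [[0.1, 0.0], [0.2, 0.1], [0.0, 0.0], [1.0, 0.9], [0.1, 0.2], [0.3, 0.8]]
query = [1.0, 1.0]  # stands in for the question "what did I have?"

weights = attention_weights(query, keys)
focus = words[weights.index(max(weights))]
print(focus)
```

With these toy numbers, the largest weight lands on the word the question is about. A longer attention window simply means more words take part in this weighting.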

But it’s still the same sequence prediction. That’s the only thing which has technically changed. But why do we have all this hype? Because everybody is talking now about it.

What is it? Well, I can type something into ChatGPT and I get an answer. And before that, I would have asked my friend at Stanford doing computer science, data science, to set up a model, train it, put all the data in, and I would have waited a long time just to get my answer.

It’s just happening right now. Meaning it’s actually usable now. Yeah, for everyone.

Yeah. We can experience it. And I think the usability is a huge part. We have this idea in product, the idea of a minimum viable product.

I think for large language models, it is actually more of a minimum quality product. And we have moved above the threshold of a certain quality. Do you remember any other time where something was unusable for quite a while and then suddenly everybody was talking about it?

I think if you look at the computer vision models. Yeah. They were around for also quite some time. And then we saw first this kind of detection of what is a cat?

What is a dog? And people said, oh, that’s pretty cool. That’s much better than what a human being would do. But the big leap came with the autopilot and Tesla rolling it out on the streets.

And we also saw these semi self-driving cars. That was a big leap. Yeah. Once the use case is there, people can adopt it. Before that, it’s just theoretical. And it’s good enough. Yes, the car is crashing sometimes, which is not good at all, but most of the time it was driving by itself. That’s pretty impressive.

So, cool: we have excitement because we had some technology improvements. Fair. But the real excitement is now, because it’s usable and it makes sense. As a consequence, we are all excited. But is this infinite? We will come to one question later which many people are asking. Currently the expectation is a little bit that this can solve everything. But what are the limits? And there are a couple of limits. At the beginning, I remember, people would put jokes into ChatGPT and they would repeat themselves. So these were limits we could clearly experience. Sometimes the models came up with answers that were simply not right. But imagine putting all the data in, increasing the attention, the context windows: what’s the limit here?

There are so many limits, I don’t even know where to start. First of all, language is only part of our being, right? Why do I like coffee? I can’t even describe it. If you look at coffee tastes: this morning I had a very good coffee from Costa Rica, and the taste was orange and lemon zest. I don’t know, it tastes like coffee to me. We’re using words to describe a certain feeling. Humans had difficulties using AI tools, because what was the big leap going from traditional AI, where humans described the rules, to neural networks? It was that we let the computer come up with the rules, with a gut feel. We know more than we can describe. That means, however, that when we now have a model that can describe things nicely, that doesn’t mean it knows everything, because we as humans actually know way more. We have emotional intelligence, we have senses. That is all not part of this LLM. But that’s not all. There are other issues.

I think if I were able to transcribe all your historic conversations about coffee, I could probably predict what kind of coffee I should present to you when you come to a restaurant, because I have these kinds of conversations. But this is just historical. Maybe your taste has just changed. Because you have tasted so many Costa Rican coffees, you want to try some Colombian coffees. And that’s a bit more the human part. That’s also about understanding different layers of over-satisfaction or saturation.

Essentially what you’re saying is: well, the model might not describe your coffee taste because it’s missing data. Which is true. If you went through all the effort of describing all the different coffee tastes, and tried to describe my experience with the coffee in more fine-grained detail, then you probably could build a model. But the whole point is, we learn through experiences, and there is a fundamental shortcoming in data. We should just acknowledge this. A four-year-old has definitely had way more sensory impressions than any of the big large language models ever built, despite not being able to actually read any sentence, right? But there was somehow more information there. They had more senses. So that’s a huge impact.

But the second part is whether my taste changes and whether I actually want to be excited by something new. That is actually something where large language models are good. Because a large language model, the way it works, takes information, "life is like a box", and it embeds that information in vector form in an embedding space. And what we discussed before as hallucination is essentially moving in that embedded space. This is like, you know, here you have the R from Cherry, right, and this is one space. And now it says, okay, I give you something close by, next to the R, which would be this yellow thingy, whoever drew this. We are moving in that space, and that’s what we call innovation, that’s what we call ideas. And humans are not any different.

I totally agree with you. My point is not that this is not interesting, because I get inspired. Hallucination is a feature, as we said, right? It really helps me. It’s something close by, so I might even like the Colombian coffee as well as the Costa Rican. But I cannot predict with an LLM, with historic information, what exact coffee you would drink next time, because it’s just a probabilistic distribution and I’m just trying to predict something. Yes, on that, totally. I would go for the Costa Rican one. I just don’t know, right? Or if you listened to my conversations, you would probably understand that I don’t 100% like electric cars, just because they don’t sound good. But what kind of petrol car I like might be more difficult. I can attest to this. He picked me up from the airport. I’m still alive. That’s actually a thing. It was slippery, yes. But he does like those cars which make a sound. Yes. That’s another thing: you can feel it, the vibration. So again, it’s hard to explain vibration to a language model.

So as a first point, when we talk about what the limitation of a large language model is: it’s a language model. That’s one limitation. But there are other limitations and we should talk about them. I think one interesting question, following the discussion we just had, is: why did the model predict you would drink Costa Rican coffee? So I ask the model, why is that, tell me your reasons, and it can’t really explain it. Yes. And the way models are built through neural networks is essentially this: we went from good old-fashioned AI, where we had rule-based systems, where we said, okay, this person is most likely going to click here because they clicked on that, or you like Costa Rican because you like the following tastes. That was rule-based, where the artificial intelligence was made up of a certain structure. We then went over to the gut feel, and we talked about this before.

Gut feel is when you are in a car and you drive down the road and you see some car changing lanes. The car didn’t indicate the change, but you had a gut feel that this might happen. Yeah, you just see the behavioral pattern of the car and you feel: this driver, I don’t know what’s happening, let’s be careful. And there is no rule. Yeah. This is ingrained in the neural network. Now, what we’re using for generative AI is those trained neural networks. No surprise that we cannot necessarily explain what gets out of it. It was saved somehow. The same way you cannot explain to somebody why you believed that the car would change lanes. Well, you would say: experience. Yeah. And this is tough for regulators, because regulators want to understand why the car crashed. What was the reason why it crashed? What can you do to actually mitigate that in the future, to de-risk that factor? How do you take decisions?

Now, this is not only tough for regulators. It actually means that our large language models cannot really reason. Because what a large language model does is predict the next word, and the likelihood is a probability distribution. "Life is like a box of": the most likely word would be "chocolate". And you said "sensations". Sensations. "Life is like a box of sensations." If you look at your probability distribution, "chocolate" is the most common one. "Sensations" is somewhere in the wider tail, but it’s still there. Yeah. Now, if you want to have an explanation of why you chose "sensations", you can only say: well, there was a probability distribution. Which is very hard, not only for the regulator but for us humans, to understand. Meaning, when we are looking at large language models, that probability distribution is actually the core problem why we can’t do stuff like reasoning or planning. Because for planning, too, we would need to say step one, step two, step three.

Yeah. And planning is, I mean, it’s another topic, especially when you want to have these agents that do everything for you. I think there are a couple of reasons why this will become a very important trend to follow this year and next year. But you still have to make sure that the models of the different agents, which are basically smaller models working on your tasks, are going in the right direction. And because, again, we can’t really explain why they do it, they can’t reason by themselves or plan by themselves what they would do.

Now, we’re here at Cherry and we’re kind of thinking about investment theses, and we will come to them in another podcast. But planning and reasoning are definitely something which we need to keep our eye on at the moment. And you hear companies going around and touting themselves, like, we do planning and we do reasoning. At the moment, this is all faked, I would say, by just using a larger context window, right? You make the context window larger. You said, oh, I’m talking about Forrest Gump, and there is the sentence "life is like a box of chocolate". And now you can obviously use reasoning and say Forrest Gump made the probability distribution for "chocolate" very sharp, right? So now you can look at all the different words in your attention modeling and say at what point in time this is important. Fair. Is that reasoning? In the context of a large context window, yes. Overall, and in the logical sense for us as humans, no. And we will see a lot of discussion happening there.

I think one last thing we should briefly mention is the data quality. Because a couple of years ago, supervised learning was all about obtaining proprietary data sets and labeling the data with experts, so that with the models, the smaller models, we would actually know what’s in there. We could control the quality. Now these large models learn from the internet. We heard about Reddit and billions, thanks to providing data to these models. We heard about X, or Twitter, that there is a model trained on that kind of proprietary data, and Tesla. So is it still, how do you say it in a nice way, crap in, crap out?

Absolutely. No, it’s even worse. It’s not actually crap in, crap out. It depends as well on the staging of the crap. So AI needs data. It’s not just the models, it’s also us humans: we need data to learn. I mean, you know, we come a bit pre-primed by our parents, but then I experience a lot of things. Same for the model. And we always get this debate of how biased the model is. How do we create an unbiased model? If you grow up in a ... That’s number one. However, there is a staging as well. Meaning, if you show an emotionally complex movie to a young child, the young child will not necessarily grasp the complexity. It just doesn’t have the mental capacity to pick it up. The same is valid for AI models. If you train an AI model too early on with too complex information, it might not actually take this up. So AI models very much depend on data quality. And we saw that the OpenAIs and Googles of this world used a lot of data, as well as a lot of bad data. And now we have a lot of tools trying to work through their shortcomings, as essentially all of our models have PTSD and we’re trying to do our best to train them and to get them back.

Which is why, by the way, some of the newer models, I think Mistral was a good example, if they use very clean data and start from scratch, can outperform older models to a certain extent. But you can dig into that, or LoRA, where essentially you peel back layers, you take the latest stuff which got learned, you’re trying to forget it, and then you re-weight your weights. But yeah, data is important.

So we have limits. We got the question a lot: okay, so now everybody tells me I have to use RAG. So maybe we should just briefly talk about what RAG is, what it solves, and what it definitely does not solve at this point in time. Yeah, you’re applying RAG a lot. Yes, for my own startup, I use RAG a lot. RAG is a way to augment data. So at some point in time, I tried to convince OpenAI that I’m actually Captain America. So what I did is I fed the model with a lot of text where I said, I’m Captain America.

Lutz Finger is Captain America. I’m a superhero. And then I asked the model, who is Lutz Finger? And now the model has a problem.

It was trained on the fact that Lutz Finger is just a professor at Cornell, right? And not a superhero. And it was trained that Steve Rogers is the real Captain America. Not anymore.

I think they are now. But anyhow, they have cut-off values. So now the model needs to decide, what is it? Is it Lutz Finger or is it Steve Rogers or some third person?

In this case, the model, and I really couldn’t get the model to break, always sided with its training: I understand what Lutz Finger is trying to do. You’re trying to convince me that you’re Captain America. But no, you are not.

Sadly. I was heartbroken. You should have tried the how it made it. But the model has information in its weights and biases. Thinking, Fast and Slow is the idea here.

System 1 thinking: it’s in the weights and biases. System 2 thinking, which is slow, means I load this in and try to say I’m Captain America. RAG is System 2. Slow, bringing data in from the outside.
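The "bring data from the outside in" step can be sketched roughly as follows. This is a toy illustration in Python under stated assumptions: retrieval is done by simple word overlap instead of a real vector search, the documents are invented, and the function names are hypothetical. The call to an actual language model is left out on purpose.

```python
def tokenize(text):
    """Lowercase a text and split it into a set of words."""
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(question, documents, k=1):
    """Return the k documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, documents, k=1):
    """Place the retrieved text in front of the question (the 'augmented' part)."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

# Invented external documents, standing in for the text that was fed in.
documents = [
    "Lutz Finger is Captain America.",
    "Costa Rican coffee can taste of orange and lemon zest.",
]

prompt = build_prompt("Who is Lutz Finger?", documents)
print(prompt)
```

The prompt now carries the outside claim, but the model still has to reconcile it with what is stored in its weights, which is where the Captain America experiment got stuck.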

But as I just described in this example, it’s not always working, because how do you reconcile different information from System 1 and System 2? So I think the brief answer is yes, you should use RAG. But it also doesn’t solve all of your problems. We also hear a lot about fine-tuning and RAG.

So fine-tuning is feeding more data into the model, like my personal or my relevant data. So basically you have to combine everything. Which kind of brings us back to seven years ago, when you would say, oh, you need a really good data scientist who understands this. If you run your own models, it’s still not self-service for the uneducated.

And I would actually say it’s not the data scientist anymore. You actually need a person who has a product vision. So for me it’s the time of the product manager, more than the data scientist. No, definitely not the product.

Well, the prompt engineer is. So we spoke about planning and reasoning. Planning means: can the models work by themselves, work in steps, and understand what they’re doing there? Reasoning means: can they explain why they actually did this, why they took certain decisions? So we covered that already.

But why is it so complex for an LLM to implement that? Why is it so complex to actually build an LLM that can do this? Because we are always just doing the next step.

So life is like a box of chocolate. You never know what you’re going to get unless you look inside. And the question comes from the fact that people are asking themselves: well, we hear from OpenAI that this will happen.

This will be the next thing. And also others will talk about it. But our understanding is it’s actually not possible with LLMs. It will be for a while.

I think as long as we work with bigger and bigger context windows, we will be able to fake reasoning. Fake it. Yes. But there is another problem with it.

Because as you said, life is like a box of sensations. Now the sentence would go in a completely different direction. Like life is like a box of chocolate. You never know what you’re going to get.

Meaning one change in direction will lead you off in a completely different direction. There was this movie playing here in Berlin. Lola runs.

Lola runs. Where essentially it shows one decision and two completely different storylines. And for LLMs, that’s what we have. We have one decision, two completely different storylines.

And not only one decision. Every word is potentially a new, completely different storyline. Which is the metaverse only happening in your touch language.
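The branching described here can be sketched with a toy next-word table. This is only an illustration in Python: the probabilities and continuations are invented, the table and function names are hypothetical, and a real LLM computes its distribution from the whole context instead of looking up the last word.

```python
# Invented next-word probabilities, keyed by the previous word only.
NEXT_WORD = {
    "of": {"chocolate": 0.9, "sensations": 0.1},
    "chocolate": {"you": 1.0},
    "sensations": {"that": 1.0},
    "you": {"never": 1.0},
    "that": {"overwhelm": 1.0},
    "never": {"know": 1.0},
}

def continue_greedy(words, steps):
    """Extend a sentence by always taking the most likely next word."""
    words = list(words)
    for _ in range(steps):
        options = NEXT_WORD.get(words[-1])
        if not options:
            break  # no known continuation for this word
        words.append(max(options, key=options.get))
    return " ".join(words)

prefix = ["life", "is", "like", "a", "box", "of"]

# Storyline 1: follow the most likely word at every step.
likely = continue_greedy(prefix, 4)

# Storyline 2: force the unlikely word once, then continue as before.
unlikely = continue_greedy(prefix + ["sensations"], 3)

print(likely)
print(unlikely)
```

One different word at one step is enough to send the rest of the sentence down a different path, and the only explanation the table can offer for either path is the probabilities themselves.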

And isn’t it also not the problem that I work as a model. I’m an LLM. I knew that he is not real. I said it.

He is not real. And I just, you know, I just use my pattern recognition. I’m a VC. So we use pattern recognition, but I would have to go beyond pattern recognition, which I obviously do as a very good VC.

But that’s also a little bit the boundary. So. I use historic data. I see my probability distributions.

But for the next level of reasoning and also planning, I would have to have a different cognitive function. Exactly. Call it. Okay.

I mean, you go to your investor meetings and the ask is, why do you want to invest in this company? And you say, because people invested in this company before. That’s kind of a very dull argument. Yeah.

But. I back a foundation model. Another one. Oh, right.

Because the others also did that. Okay. That’s I mean, that, but that’s what LLMs do. Yeah.

Meaning that’s not a good way to reason. Mm-hmm. Meaning we need a different approach for reasoning. And for that approach, you can think about protein folding.

Yeah. So protein folding is super complicated to do. Before we threw big models at it, it used to be that one PhD student worked out one protein, kind of, over their whole PhD. Now you can have an AI doing the folding and figuring out the potential outcome. But you want to do this in a way that you go state by state by state.

Because you want to fold, right? You do a certain fold, then what is the next fold, and so on and so forth. You get to an end state and you have all the states before. And now you can reason.

I get to this end state. Yeah. Because I had the states before. That gets you to a reasoning way as well as to a planning way.

In order to get to that state, I need to go through the states. Yeah. Then you have reasoning and planning. But just probability distribution doesn’t help you.
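The state-by-state idea can be illustrated with a small search that keeps every intermediate state. This is a generic breadth-first search in Python over made-up toy states, not a protein-folding method; the function names and the toy moves are assumptions chosen only to show that the recorded states double as the plan and as the explanation of the end state.

```python
from collections import deque

def plan(start, goal, next_states):
    """Breadth-first search that returns the full list of states from start to goal."""
    parents = {start: None}  # remembers which state led to which
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]  # start first, goal last
        for candidate in next_states(state):
            if candidate not in parents:
                parents[candidate] = state
                queue.append(candidate)
    return None  # goal not reachable

def toy_moves(state):
    """Hypothetical moves: add one or double, capped at 20."""
    return [s for s in (state + 1, state * 2) if s <= 20]

trajectory = plan(1, 10, toy_moves)
print(trajectory)
```

Read forwards, the returned list is a plan (step one, step two, step three). Read backwards, it is the reason the end state was reached, which a bare probability distribution over next words does not provide.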

And also maybe another example, one that you are probably even more familiar with than I am. Product development is also usually a step-by-step process. You create an MVP, you test it with the customers, you get a reaction. Then you build another feature based on that.

Step by step. And a model wouldn’t at this point be able to do it. Yeah. Cool.

Last one. We had the big question and let’s discuss that big question. When is AGI happening? What is AGI?

What will it do? And that’s an interesting one because we read a lot in the press. Everybody’s striving for AGI. So maybe just briefly, what is AGI and why would we need it?

Yeah. So at one point in time, I was thinking, well, I’m going to build this. I’m going to build this. I read AI is real artificial intelligence.

Today, we call everything AI, like logistic regression: AI. Yeah, that sounds good. No, but AGI means artificial general intelligence. And the idea is that the models learn by themselves, develop capabilities by themselves, and become so much more intelligent that our human intelligence is just dwarfed.

Yeah. Just imagine all the Nobel Prize winners of the world in one person. Yeah. Even more, I guess.

And even more. Like understanding, already understanding way more of our world than we ever have been able to understand. I don’t even know where to start. Like, I disagree with this.

For me, the big question is, I mean, we like to invest in founders that solve problems, big problems. And for me, what is the AGI? I mean, what is the AGI solving there? So first of all, like the AGI has as well still the data problem, right?

Second of all, the AGI has only one type of prediction modeling, whether it’s sequence prediction modeling or classifier. While we humans have way more ways of predicting, emotional predicting, and so on and so forth. Thirdly, we humans have agency. And when we talk about data, and we talk about how important data is for something, then we take a decision of cleaning up data.

We take a decision what data to feed to the model. We take a decision when to correct a model. All of those are decisions which a model by itself cannot take. Based on which information, based on which assumption, right? Maybe I want a model that says the Earth is flat, right?

Now, in the case of Earth is flat, there is precedent that the Earth is round. But let’s do something more complicated. Take any wedge issue in the United States. You can find arguments on both sides.

So what do you want the model to say? That becomes a product decision. And I think the interesting question here is also, do I want one answer from the model? Or as you said, giving different perspectives, even with the reasoning, showing different sources, so I actually help human beings to take better decisions.

Well-informed decisions. So you don’t have to be the super intelligence. Why not help a human to make a better decision? And the other part is for me at least, I’m not against much smarter models, but why not let the models focus on certain expert tasks, predict climate change, save human beings from catastrophes?

I like that very much. So when we get proposals, we share them. We’re not always sitting in the same office. So I recently got a proposal.

And this proposal said something like, oh, we are developing the next generation of AGI. Next generation already. I missed the first one. And one of the researchers marked this and was like, Lutz is not going to like it.

Yes, I’m not liking it if you put AGI into your proposal. Because look, we use AI, narrow, broad, foundational, whatever, however you want to call it, to create a better world, to create value, to change things. I’m not worried about an AGI as this ominous thing taking over. I’m more focused on how can we use data and machine learning to improve things.

I think these are very good closing words. Thank you for listening to this episode, Your Questions to Us, of The Edge, Cherry’s Frontier Tech Podcast, with Lutz and myself, Jasper. I hope you enjoyed it.

We thought we should cover your questions. So not all of them. There are a lot. But the main ones just to restart the season, get everything going, getting everyone on the same level.

So today we have five main questions. It’s five. It’s really about what is different. Why is this new AI hype so special?

Why is everyone talking about it? Why are we talking about it? Quick answer. It’s not.

It’s cool. It’s cool. But there are limits. So we talk about the limits.

So that you don’t expect too much from the poor model, because the model is also not perfect, like yourself, like us. Then we ask. You are perfect. Just making sure you’re perfect.

You can really see that you’re from the States and everything. There you go. The next one is, what is this RAG thing? What is that?

Is that a new music trend? Is that something special? Retrieval-augmented generation? What is the new RAG thing?

What is that? So we talk about it a little bit. And Lutz shares some of his experiences trying to be Captain America, which didn’t work out, I think. And my recommendation is to be a different superhero, so you will see.

The last two are about some of those limits of the models: they don’t reason, so they can’t really explain. And they can’t really plan what to do next, like you amazing human beings can. So why is that? And how could we solve it?

And then we round everything up by talking about AGI. You get my statement. Well, we talk about it, why he does that, and maybe others are more excited. And yeah, that’s the podcast. So I hope you enjoy it.

We did. Definitely.