Did our predictions hold true? — Transcript

Welcome back to another episode of Lutz and Jasper, our podcast about AI in practice and everything you can do with it. Lutz, how’s your coffee? Awesome. It’s blue skies and palm trees and a fresh brewed coffee.

It’s doing well. Well, it’s already pretty dark here in Berlin and cold, but we’re warming up. So yeah, we had half a year of vacation, both of us. So we prepared this podcast extremely well.

So you will hear us talking a lot today. We actually tried to limit ourselves a little bit, but yeah, you will see we definitely missed each other. So, well, what did we discuss? We essentially discussed our 2024 predictions.

Let’s run through them very quickly, high level. If you listen on, what are you going to hear? So the first thing we discussed, obviously, is open versus closed models. So at that time, OpenAI was definitely very big.

Anthropic was another model. And we were talking about who will win in the short term and in the long term. And yes, let’s recap on that. Then we talked about multimodal, our predictions on how multimodal will work.

And we were right here almost as well, 50-50. And listen on. Yeah. And then with the multimodal part comes interface.

So is text now the new interface, chat? We have had that a lot with ChatGPT. But how are other applications doing? That’s what we should talk about.

All down to the business, as you will discuss over and over. Well, then we go into MLOps. That prediction holds true, even when there is not yet a clear winner. And then music creation.

So the future is indie, which we really, really like. So we will see more creativity happening in the music industry. And then we discuss how Jasper made a fortune with Nvidia. No, but we are discussing essentially.

Yeah. About the chip wars and how our predictions there actually held true. Yeah. And then we have to recap, obviously, the very simple prediction.

Will there be more startups now using AI in a new form, solving problems? And how many of them have survived? Yes. And then the big question is AGI taking over the world.

Or will we have an AI winter? Yeah. No, not here in California. But yes, we do talk about AGI.

To poop the party: neither of us believes in AGI, but we should actually discuss why that is. So let’s not. Good.

Let’s kick it off. So, yeah, as the audience might have realized, we took some vacation. We just felt, okay, we should really come back. No, the reality is we had a lot of work to do.

So more to come and to announce. Obviously, I was very active on the investment side, this was very, very... Hold on.

You mean the AI revolution is not happening by itself? So far, the chatbot hasn’t replaced me. People are still calling. I’m working on it.

Okay. Yeah. And I heard you’re also announcing something soon. So, yeah, I’ll be prepared.

But essentially, we still had nice discussions, but we felt we should take a break. And now there are so many things happening that we felt, hey, this is a good time to catch up, also with some predictions we made last year. Nobody should, you know, bury those predictions, but actually come back to them.

You know, the other thing is, and all of you listeners out there know this: the biggest problem of the Internet is not technology, it is getting customers. Right. So, like, I could build a Facebook today, but nobody would care.

So when you make a podcast like we did, the biggest question is who is actually listening. And what I thought was very, very heartwarming: we had so many people who reached out to us and said, oh, well, guys, why did you stop?

This is awesome. I was in a discussion with somebody, I met this person for the first time, and he’s like, you know what? I really appreciate talking to you, because I have listened to all of your episodes. All this feedback was actually pretty amazing. We are not the million-subscriber podcast, but the ones who listened actually came back to us.

And so if you’re listening to us, note that we don’t go for clicks. We go for feedback. So yeah.

Yeah. And we thought the first feedback we should give to ourselves. So as you remember, we made some predictions and we felt, hey, let’s take a look at them again. And that’s what we will do now.

Are we right or not? So our first hypothesis was the very, very early one, also a discussion, about open and closed models.

So closed models, just to recap, are models where I have no idea what happens in there. Essentially, you don’t know the weights in the model. You don’t know how they fine-tune it, or the data they’re using.

I think even more details of how the whole model works. And then the open ones are like open source, essentially. So I could copy and paste it. I can use the model myself.

I understand everything that is working in there. And we started by discussing, Lutz, if I recall it correctly:

We see these closed models being very dominant. I think everybody agrees. ChatGPT. There was a lot of money invested in those models.

And we discussed the fact that yes, you can use the money not just for compute, but also for fine-tuning. The reinforcement learning part was very important. All this costs a lot of money, but it makes the models much, much better. So that’s why we said they were dominant first.

But then there’s a big advantage of the open models. And I think our prediction has been proven true here, right? We have seen a lot of open models coming on the market.

And essentially once you have the core idea of the weights of language or the weights of an image, then it becomes easy to adapt it and to retrain it. And technically speaking, the way to think about it is when we look at a neural network, right? The first layer is very easy: lines.

And then the deeper you go into the neural network, the more complexity, for example from an image or language, you start to map out. Meaning we just train what language is, like sentence completions, like "life is like a box of chocolates", as we often discussed, right? So we are training that logic of human language. Once we have that, we don’t change our language so much.

So now it actually becomes a commodity, meaning all of those huge investments have the issue, essentially, that they become very quickly commoditized. And if somebody pushes an open source model, the whole thing breaks down.

So our prediction was actually correct here. It was correct. Plus, one thing we added was that companies like Meta, and I think also Google to some extent, have proven in the past that they have a high interest in getting infrastructure costs down.

So Llama was an example where they actually open-sourced something that they did internally, something that the Mistral team from France, I think, also capitalized on when they started developing their own model, because some of them came from the Meta research center in Paris. So Facebook is not a hyperscaler. They don’t earn money from compute, and they don’t have the interest Microsoft has.

So they would publish these things. I don’t know if Elon Musk, I think Grok was just released a couple of days ago,

has the same interest, but who can read his mind? Well, like, I think you need the whole X community to read Musk’s mind. At least it looks like that. Everybody tries to read his mind.

Actually, let’s stay on Facebook for one second, because we said all of those large language models need to have a business purpose. I mean, that seems so dull, but it is actually one of the topics we discussed many times, and it became so true.

If you look at what Facebook is doing: why is Facebook open-sourcing it? For me, it’s not only to be kind of like the punch in the eye for Google and Microsoft. It might be that as well; I think Zuckerberg wouldn’t mind.

But there is a reason: if you allow all your users to create text, to create music, to create images, you just increase your inventory, and Facebook needs inventory to show pictures, stories, images to folks.

That was the same at Snapchat, as I was working for Snapchat. The whole point for us was: give a filter, because a filter is fun. So people create more. So those large language models,

we had the podcast about the future being indie, right? Large language models allow people to create more images, text, music. That’s what Facebook wants.

And it’s an interesting incentive, right? Because they want the attention of the user, because they can also show advertisements. Which, to your point, creates an explosion of content.

So people might stay longer on all of their media. But I think, yeah, I mean, just to cut this a little bit because we want to talk about our predictions: it’s interesting how different the incentives are.

So Facebook: more content. That’s amazing. Google: it’s actually search. They also want to show advertisements, but they want to show relevant results.

So they shouldn’t care that much if it’s artificially created content or not, as long as the user is interested in it and can then still click on the ads. I think we made a good prediction.

Yeah, let’s move on. All right. The next prediction we made was that the LLMs,

and not just the LLMs, also the vision part and the voice part, will open up new interfaces to interact naturally with machines. So text interaction, we said, like ChatGPT, everybody knows what it is, becomes the new preferred interface, especially for consumers. It’s just easier sometimes than search by click or search by typing into the search engine.

And I would say we saw a lot of startups that would claim they replace existing models just because they are chat-first, or more like a copilot setup. I haven’t seen that many really taking off except Copilot.

I don’t know how many users, by the way, paying users, they have at this moment. So Microsoft Copilot, the coding support, but that’s a different type of interface. We haven’t seen that much in the voice space. I would say cars are actually the best ones I’ve seen in the market, but that’s also older technology, and then looking at it in a more holistic way.

And then [inaudible] model and showing multimodality. Essentially, the core idea from us is still right, right? We said models like neural networks, they create something like a latent image. Think about Esperanto for computers, or a latent representation like a feature vector, which only the model understands.

Pinecone, embeddings, whatever it is. And we discussed this at length. So if I have the ability to encode anything into this Esperanto for computers, then I should be able to switch back and forth. And that’s what we see.

We saw this with CLIP, or like text to images. And now we see it with voice. We see it with voice to voice, right? Where you have encoding from one language into a voice representation of another.
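To make the "Esperanto for computers" idea a bit more concrete, here is a minimal sketch, assuming that text and images have already been encoded into the same vector space. The vectors below are invented by hand for illustration; a real system would get them from trained encoders, and the names `TEXT_EMBEDDINGS`, `IMAGE_EMBEDDINGS` and `best_match` are hypothetical.

```python
import math

# Toy, hand-written latent vectors. In a real multimodal model these would
# come from trained encoders (one per modality); the numbers are made up.
TEXT_EMBEDDINGS = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a beach": [0.1, 0.9, 0.2],
}
IMAGE_EMBEDDINGS = {
    "dog.jpg": [0.8, 0.2, 0.1],
    "beach.jpg": [0.0, 1.0, 0.3],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(query_vector, candidates):
    """Return the candidate name whose vector is most similar to the query."""
    return max(
        candidates,
        key=lambda name: cosine_similarity(query_vector, candidates[name]),
    )


if __name__ == "__main__":
    # Text to image: find the image closest to a sentence.
    print(best_match(TEXT_EMBEDDINGS["a photo of a beach"], IMAGE_EMBEDDINGS))
    # Image to text: the same space works in the other direction.
    print(best_match(IMAGE_EMBEDDINGS["dog.jpg"], TEXT_EMBEDDINGS))
```

Because both modalities live in one space, the same lookup works from text to image and from image to text, which is the switching back and forth described above.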

We see those, but we kind of missed the timeline here, right? I think you’re right. It is not as strong as we had hoped. But it’s also a hard problem to solve being in a very, I mean, you want the interaction to be almost perfect, I guess.

So whatever input I give through whatever channel, I want a good reaction and I don’t want the model to ask me millions of questions or to actually be wrong about what I asked for. So what’s the challenge there? Well, yeah, well, LLMs are essentially only probability distributions, right? Like you have your next token as a probability distribution, the next token in language or the next token in the vector space, whatever it is, right, in your representation.

And that is just a guess. Like if it’s "life is like a box of", okay, well, the next probable guess is chocolates, or it could as well be pralines or ants or, like, surprises. So the problem is we have seen a lot of those models fail to be extremely focused and controllable. And if you have a machine which is not controllable, then it actually sucks as a machine, right?
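The "box of chocolates" point can be sketched in a few lines. This is a toy illustration, not how any particular model is implemented: the scores in `NEXT_TOKEN_LOGITS` are invented, and the function names are hypothetical. It shows why sampling from a distribution gives varying answers, while always taking the most likely token is deterministic.

```python
import math
import random

# Hypothetical scores (logits) a model might assign to continuations of
# "life is like a box of ..."; the numbers are made up for illustration.
NEXT_TOKEN_LOGITS = {
    "chocolates": 4.0,
    "pralines": 2.0,
    "surprises": 1.0,
    "ants": 0.0,
}


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution over tokens."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sample_next_token(logits, temperature=1.0, rng=random):
    """Draw one token at random according to the distribution."""
    probs = softmax(logits, temperature)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


def greedy_next_token(logits):
    """Always pick the most likely token: deterministic and repeatable."""
    return max(logits, key=logits.get)


if __name__ == "__main__":
    rng = random.Random(0)
    print(softmax(NEXT_TOKEN_LOGITS))
    # Higher temperature flattens the distribution, so unlikely tokens
    # such as "ants" show up more often.
    print([sample_next_token(NEXT_TOKEN_LOGITS, 1.5, rng) for _ in range(5)])
    print(greedy_next_token(NEXT_TOKEN_LOGITS))
```

Lowering the temperature or decoding greedily makes the output more predictable, but the model is still only ranking guesses, which is the controllability problem raised above.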

This is okay for your friend at the bar at 2 a.m. to not be controllable, but not for your computer. Yeah, I guess the multimodal reality then also comes with combining the LLM with rules, a little bit like what we saw in the kind of old chatbot space, or with the company Ultimate AI that recently got sold, which was also a combination of both. So yeah, maybe that’s something for this year, to take a look at how to guardrail and control, and then the interface might become stronger.

Totally. But that is we actually had this, I think, as a discussion last year that we said there is narrow AI. There always will be narrow AI. Like take Google, right?

You think there is one search algorithm? Not at all. Google takes tiny little clues from: did you go, did you come back, did you click on several links, right? All of these are separate models which together make your experience.

Now, I think for multimodality, we were right that multimodality is a thing. We were slightly overly optimistic about how fast it would come. I had a discussion in, like, early January with somebody at Google, and he said the running joke at Google is: what’s the new programming language? And the new programming language is actually English.

But that’s only partially true, because that is a translation from language to code, and code is something very structured. So that’s the reason why Copilot, by the way, Microsoft Copilot, is a very good product. I mean, we have a lot of customers around it. But that works, right?

You have, we discussed this in the last season, there is something like a minimum quality product now. Yeah. And so for Copilot, the minimum quality is actually relatively easy to achieve because you have the human interaction for explaining to you [inaudible] knowing what’s going on.

And we said, hey, there will be an explosion, not a Cambrian explosion, but an explosion of these tools. And because people want to put this into production, they will test it. There will be a lot of open source, but also tools that want to get money from the users for it. And then on the other side, we also said, hey, but still the big platforms like Bedrock, also Google, they have an incentive.

So they will also publish tools to make the application of these models more feasible, easier. So we saw a lot of startups, definitely. I mean, last year you can probably look at a couple of statistics and you see investments in the space. There were a couple of hundred million rounds.

A company like Databricks, which is not MLOps per se, is seeing a big surge because people want to use it to apply machine learning in certain fields. So it really makes sense. The thing that we still haven’t figured out yet is:

what is here to stay and what will be disrupted in the future? Because we’ve seen new tools coming out every month, also from OpenAI, which has an enterprise version, so you don’t need these startups anymore. And by the way, it’s the same thing for DevOps.

The DevOps tooling space is also seeing constant, not disruption, but actually commoditization, because you have it in the open source community and the big platforms are also launching products. Now, the reason why we had this prediction is clear, right? Because as we said, LLMs just produce probability distributions.

Meaning an LLM is nothing other than your smart partner at your favorite consulting firm, right? It doesn’t know anything, but sounds very intelligent. And it is then later on the integration, the teamwork with the company, which makes it a real thing for a consulting company. Now, LLMs have that problem.

Yeah. And that [inaudible] Therefore, LLMs cannot remember things. They have just weights. They have system one thinking, as in Thinking, Fast and Slow.

They only think fast. They cannot remember. They cannot retrieve things. That’s the reason why we have all those RAG discussions.

They cannot reason. We don’t get them to reason because it’s a probability distribution only. And they cannot plan. All of this doesn’t exist for LLMs.

So the MLOps space tries to overcome this by orchestrating different models. Now, we said it’s coming and it’s definitely coming. The question is, and I’m very skeptical, but that’s a 2024 prediction: will we actually see companies who help with reasoning, help with planning and so on and so forth?

And then those parts are not. We will need an MLOps layer. We will need a layer to operate different narrow AIs in a structure. And all the startups we are talking to, like even my own startup, right?

I’m building this orchestration layer to make sure, okay, now I have a RAG answer. Is that RAG answer really true to the question I have been asking? Oh, now I have an answer from my LLM. Is it really true or is this LLM just hallucinating?

So those control problems are there and they won’t go away because the LLM won’t solve them. And therefore, we said we are going to need MLOps and yes, we see it. We need it.
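As a rough illustration of what such a control step could look like, here is a minimal sketch that checks whether an answer is backed by the retrieved passages. It is not the orchestration layer mentioned above: the word-overlap heuristic, the `threshold` value and all function names are assumptions chosen to keep the example small, and real systems typically use stronger checks.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(answer, passages):
    """Fraction of answer tokens that also appear in the retrieved passages."""
    answer_tokens = tokenize(answer)
    if not answer_tokens:
        return 0.0
    context_tokens = set()
    for passage in passages:
        context_tokens |= tokenize(passage)
    return len(answer_tokens & context_tokens) / len(answer_tokens)


def check_answer(answer, passages, threshold=0.7):
    """Accept the answer only if enough of it is backed by the passages."""
    score = support_score(answer, passages)
    return {"answer": answer, "score": score, "accepted": score >= threshold}


if __name__ == "__main__":
    retrieved = ["The invoice was paid on Monday by the finance team."]
    print(check_answer("The invoice was paid on Monday.", retrieved))
    print(check_answer("The contract expires next Friday.", retrieved))
```

The point of the sketch is the structure: the LLM produces a candidate answer, and a separate, narrow component decides whether to pass it on, reject it, or ask for another attempt.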

Yeah. And at the end of the day, at least as a large enterprise, I don’t want to trust the infrastructure provider with controlling the infrastructure. I would love to do it myself.

And to your point, then I need to compromise. I need to compare and I need a different tool. What was the next one, Jasper? So the next one was around music creation, because obviously generative AI creates new things and music was one part of it.

So we discussed what actually happens. We introduced a little music jingle ourselves, created by Google’s music creator, a mix of jazz and, I don’t know, a little bit of hip hop, I think, we put in. So that actually made sense. We also saw a lot of songs.

I think one song from Drake, or a combination there, was one of the first. Drake and The Weeknd. Drake and The Weeknd. Thank you.

Yeah. I’m more an old hip hop guy, I guess. The interesting thing here is it kind of vanished when you look at the media coverage and everything. So that was a big thing, also the strike of the Hollywood creators.

And then it kind of vanished, I would say, after autumn. And what we still see, though, is that in the background, or let’s say in the basements, people use these tools.

So a lot of AI tools for music production. We know that all the big music labels are experimenting with this. All the music producers are doing that. The podcast industry is using it.

So you have jingles everywhere changing, but I would say publicly the whole thing kind of died down. So, yeah, I totally agree. We were so spot on. It is absolutely right.

The future is indie, and not only for music: for image generation, for text generation, for all of this, because essentially we lowered the cost of creation. Now the interesting part, I think there are two things which are very interesting to see. Number one is that a big shift of the business model is not to be expected here, because distribution already created this.

Free distribution coming through the internet changed the business model of news organizations, changed the business model of music to music streaming, right, from producers down to Apple Music and others. So they had already changed the business model. The business model will stay the same.

So no big change to be seen, right? But still the cost was coming down. So this is the second part, where I think we again have overestimated the speed. I talked lately to an executive from Walmart and they said like, look, I mean, I’m a tech guy.

We’re not yet using any generative AI images here because we are scared of PR backlash. And it’s true, right? I mean, look at Disney.

They use a little bit of AI and everybody’s yelling at them. But in reality, many people are using generative AI. So you will see a groundswell of generative AI and that will reduce costs and that will allow other people.

I have students who create Instagram channels with videos and stories and they create those Instagram videos within 30 minutes and they publish them because they just use tools which are for them almost free. And now they have the production of a large studio, kind of thing. Yeah. And I would also add to that: as a creator, if I say I did this a hundred percent myself, I probably get more because people perceive it as more effort.

So I get a higher price. So as long as I can, I maybe wouldn’t like to tell people, because yeah, I have a higher margin. So it’s a good thing. Plus the backlash and everything.

But, and this is the good thing, and we come again to the shortcomings of LLMs: the human is still very much needed because we are looking only at probability distributions. I don’t know.

Did you test out Perplexity? Perplexity now has a podcast, taking Perplexity search. You go to search terms, write it up with an LLM, and then use ElevenLabs to create a voice.

And you have a podcast. Gee, is that boring, guys. Stay with our podcast, right? You get opinion.

You get human. You get 100% just for real. Not this kind of LLM stuff. And it’s interesting.

I mean, really, the touch and feel is completely different. Yeah. And excitement also comes from surprise and improvising. Yes.

That’s why you still have live shows nowadays, right? It’s not all scripted. Yeah. And that, however, while we are recording this, like we are recording this both on our ends and we are using good equipment, but we don’t have a huge studio because we will use AI later on to make Jasper look 50 years younger and his background more beautiful.

It’s like, what? No. Then I wouldn’t be alive. I think you.

No, but the reality is we are both recording this on our machines and we bring it in and AI will bring it together. So our production cost here has gone down. The future is indie. We are the best example for it.

Yeah. A hundred percent. All right. So we have four more predictions left, Lutz, and yeah, you can really see we enjoy talking to each other after this half year of a break.

So I guess we should do the last four as a little bit of a quick fire, which is kind of a common thing in podcasts, I realized. So one was startups and AI integrations.

We said there will be a surge in startups trying to leverage AI. Many might fail because of the challenges associated with integrating AI effectively and modifying human behavior. Wow. That was not an easy one.

We predicted, I would say it’s still a bit early to say all of them failed, but it’s very tough. So we saw a lot of pitches pre-product where people say, hey, we do the same old thing, but we now do it with kind of an AI layer. And the thinner the product layer was, and the more they were reliant on data that would actually fine-tune or RAG the models, the tougher it was for them to control the quality.

So kind of something we already discussed before. So yes, we see a lot of startups failing because they don’t really solve problems. They just, I would say, throw AI on something very old. That was unsustainable.

And that was a failure that [inaudible] Yes, we see even more startups. So there are some big ones failing. Graphcore was in the media. I don’t think they’ve failed, but at least they’re struggling.

So it’s very, very tough against NVIDIA. I mean, the stock market price tripled now again, six months later after I sold my NVIDIA stocks. So I wasn’t very smart there, I guess, but I bought them pretty early. But they also just released a new chip, which is pretty, pretty powerful.

The B200 GPU. And you can really see them doubling down on all their earnings. So investing in hardware, they are supposedly doing something in the cloud space as well. So we see more and more specialized chips, especially in the edge computing.

But these are all very early. And as we know from hardware development, this will take some time before they can actually kind of attack, if you want to call it like that, the big NVIDIA, the third most valuable company nowadays in the world. And I think one thing people shouldn’t underestimate about NVIDIA is not just the hardware. But it’s also the whole developer community, the tooling for the chips.

So they really build an ecosystem and that’s their moat. Totally agree. And what is fascinating to see is the whole discussion around chip wars. And NVIDIA is spot on there.

So, but that is more a discussion for 2024 and ongoing, but we saw very much how the actual infrastructure, meaning those chips, is dominating how we’re building, how we can develop and who can develop, right? Yeah. Look at how Nat Friedman suddenly made an entrance as a VC by saying, actually, you know what?

I have an infrastructure. And we have seen this before. We have seen this with media companies saying, oh, I have a platform to get you customers, because customers were the biggest thing on the internet. Nowadays the biggest thing on the internet is: do you have chips for me? Yeah.

But I really wonder, since they all secured the H100, so that was the NVIDIA chip, yeah, what these guys are now thinking, because the new chip is apparently 30 times more powerful with 25% less energy consumption.

So as soon as they’re available, nobody will want the old ones anymore. And by the way, I mean, good thought, but it is very much a bet on the development cycle. By the way, very similar to Zuck, right? Zuckerberg.

A lot of capacity for AI chips. Also a good thought. But if somebody else, like Groq, comes out with a chip which is way more effective and way faster, that might have been a bad investment. At the moment,

everybody thinks it’s a cool investment, but as with any investment, there’s risk. Agree. He had so much money left, moving on from the metaverse, you know. Yeah.

Well, that was good. All right. Then we have the next one. We had.

Obviously, we had a discussion around whether or not there will be yet another AI winter. We actually said there won’t, because there is a lot of hype which will carry us forward. And this is true.

There is not yet an AI winter. We probably will review this again for 2024 because we are definitely in an overhyped cycle. 100% agree with all you said, right?

It is the integration. It’s the existence of a business case. All of this will play out.

But for now, the hype cycle is still strong. So that was the right prediction. And then we had the discussion about AGI. Yes.

Which is fascinating because guys, there is no AGI. Let’s take the G out of our discussion, because all that we see is models, and those models are distributions and we cannot work around that. Yeah.

Like, it’s autoregression and everybody is excited, right? Because it sounds like humans, but it ain’t yet. So that’s the whole idea of MLOps: we see more and more structures around those.

And that is not AGI. It just makes AI more usable. We can now talk to our computer via voice, via text, via image. Good for us.

But it doesn’t mean that the computer has an opinion, or has agency, or has taken over. Those things can’t even plan my trip.

Good luck. Right. So the whole AGI discussion became more of a political discussion. Yeah.

And I’m looking at you, Mr. Musk, kind of trying to push their ideas. Yeah. And I think the interesting one will be when GPT-5 comes out, apparently this year.

I think there was something today from Sam Altman. What we heard is it will be more about reasoning. So really understanding why the model is presenting something to me or actually took a decision here and there, but not so much about becoming an AGI. So I have not seen it yet, but I call BS on it.

This reasoning idea, because everybody’s talking about reasoning. Right. Google talks about it. Technically, it’s not feasible.

So what do they need to do? Stick to the old transformer architecture? Yeah, exactly. So either they come up with a new architecture, and there’s a lot of very cool work around that,

but people like Yann LeCun have tried this for ages. Right. So the reasoning, we are not there. The one from Meta, for the audience who don’t know him.

Oh, yeah. Yes. Like one of the granddaddies of AI. Now, what we will see, though, is a combination:

we get a statistical output and now we are trying to reason around it. But, I mean, I built my first startup in that space before we actually had RNNs or bag-of-words word vectors. I kind of created: OK, what’s the noun? What’s the verb?

And how do I connect those? That’s the kind of thing we want to do now for reasoning. It took another 10 years till we suddenly had transformers. And I’m pretty sure that the AGI discussion won’t happen so quickly.

What will happen, though, is that models, LLMs, have trained-in information. Talk to OpenAI about liberalism, and, dear Europeans, you have a different opinion about liberalism than the US. Well, guess what? At the moment, OpenAI’s view of liberalism counts.

And therefore, that’s not true. That’s the reason why you’re stuck with one opinion. And Mistral enters the space, saying, actually, you know what? There is a different opinion.

So this is the discussion we will have, but not an AGI discussion. What a pleasure. And yes, we definitely did talk too long. I’m like, we have to do this live sometime, I think.

We definitely ran over time. And I think you had a lot of coffee. Was always good. But let’s for next time.

We will actually write up our predictions for the next six months in 2024, or the next 12 months, 2024, 2025. We have a lot of good content there. And you guys out there, if you think about a good integration, if you think about MLOps, if you think about how to build things around LLMs, or like AI planning or AI reasoning, come to us.

We love to talk to you. We love to fund you. Like, well, I shouldn’t say this. This is Jasper’s money.

But we definitely are very interested, and you will get our feedback. You will get our ideas on how to deal with it. Plus, we would also love to hear your predictions. So not just us being put on the spot here.

And yes, we will recap our predictions also next year again. But if you have anything to predict, please let us know. Write in the comments. Yeah.

Post it on LinkedIn. Happy to hear. You heard it. We’re doing it for you, not for the clicks.

Come back. We’ll talk to you. Come back. Ciao.