Building AI products: 5 lessons from our founders' workshop — Transcript

Greetings from London. London, yeah. We are in Soho here and we’re having a very nice coffee. I have filter.

What do you have? I have a Gibraltar, espresso-based with a plant-based nog. It’s actually pretty good. Nice.

I forgot the name of my coffee, but it’s very fruity. That’s what I can say. Why are we meeting here? Yeah, we’re meeting here because it’s actually a very nice place and we wanted to show you a little bit where we meet in the world.

Today, it’s beautiful in London, beautiful weather. So we’re recording a podcast after we had a nice workshop yesterday at Station F, where we spoke with a lot of founders about how to build products with AI and the typical mistakes and challenges. So we thought we’d do a quick recap. We had yesterday amazing discussions in Station F, as Jasper just said, about how to use AI.

And I think what’s helpful: if you have an existing product idea or an existing customer or an existing product, you can build products with AI. So the question is, can we use AI? So it’s another podcast of Lutz and Jasper.

It’s called The Edge. Welcome. And we love to discuss now a little bit. It’s rebranded.

Yes. Yes, it’s rebranded. We still have to work on that. And we like to speak about common mistakes in AI, but being also a bit more positive, what is actually important when you build a product with AI, because it’s a slightly different thing.

We actually should do a longer podcast about that. I believe there is a doom time coming because we have now all the excitement in the market, but now it comes down to the real work, the product work, kind of like build it in. I guess I’m a more positive person, but that’s why we do this podcast. I’m hoping for a great time here because there’s still so much we can do with AI.

Maybe we actually just interview a couple of founders on the topic. Yeah, there is. I think when I say doom time, it’s like there will be a lot of applications where AI is not used in a right way. Yeah.

Those. Will be said. So guide us through the structure which we use when we chat with founders. Yeah, I think to kick it off, and we spoke about it various times, building a product with AI is slightly different from just building standard or old, very deterministic software, because essentially you have to run a couple of experiments to figure out if the AI, the machine learning, the large language model you’re using is actually relevant for the product you want to build, for the problem you want to solve.

And you just don’t know exactly at the beginning if that’s possible. Now, we’re pretty lucky that the latest large language models already deliver a lot of value without fine-tuning or even creating your own models. So you can kick it off pretty fast. However, as you might know from your ChatGPT or other experiences, it’s not perfect yet.

And we spoke about this a lot. So the first question we always ask ourselves, and obviously debate with founders, is, how do we build a product that is not so much of a problem for the problem you are building? And we chatted about this before, there is this concept of a minimum viable quality, not product, but quality, because the AI is not always there. But I see many startups actually just putting, oh, we do generative AI now as well.

And I was like, how so? So I see many founders putting AI in the product, but not in the solution. And I think that’s a really good point. So there is some kind of interface advantage with the latest developments. So I actually believe you’re right. If you take the scale, and for AI the question is always, what does it change if you apply it? So the value creation comes down to: do I have a decision which is different because I use the AI? So for the scale, if the scale allows me to be reminded about my diet regimen, yeah, it’s like, okay, thanks. Do not eat this croissant in the morning. I just want to inform you, you gained again. Maybe some people in Silicon Valley like it. I heard you guys. So I’m more interested in those things. Do you remember, it’s not an AI thing, but it’s typical, right? I think BMW at one point built a predictor that you fall asleep, and it was a rule-based system, but they measured how much you blink and how long you close your eyes. The longer you close your eyes and the more often you blink, the more tired you are.

So the mirror, the rear-view mirror, would measure this and then inform you that you are tired. And the user would say, I know, damn it. Don’t annoy me with this. I think it still does.

I think you still have this attention measurement. My car doesn’t have it. It’s too old. So what we did to work a little bit with the portfolio, but also with new companies, we created kind of an AI fit analysis, which is definitely not perfect.

It’s not 100%, but it’s a good start. And to kick it off, there is a question around precision. There is a question around the output. Mainly, it’s like the discussion we had about bad things in, bad things out, to say it nicely.

So that’s the first thing we always discuss and check with companies. Yes. So essentially around the quality of the data, right?

And one of the issues which we have with the quality of the data is that there is actually never ever a way or possibility for us to have non-biased data, right? Because data is biased. And we as product people need to decide how much we want to de-bias the data, right? Yeah.

Plus, maybe sometimes the bias is good because I could say I don’t want averages. I want the best outcome for the product. So maybe I inject very, very good data versus just the average data. If I want to identify certain patients, if I want to have a certain output in my machinery, I don’t want to have what the machine did in the past all the time, but just the best parts of it.

Yes, totally. Now, the other thing, do you have an example from a company which struggled to have the right data set? There was a company I invested in in the accounting space where just the data we would obtain to automate any accounting wasn’t good enough. And also the labeling process.

So the labeling process was obviously a challenge. So it was longer ago. But there we had to obtain the data from the clients in a new fashion. We wouldn’t get the historic data that easily.

So you would have to figure out a way that the client would share data with you in the future. So basically you have to build a kind of a wedge or product that lets the client share the data first. And then over time, which could be much, much later, a couple of years even, you would actually be able to train an AI, or nowadays use a large language model, on the data.

So data acquisition was a larger part to kick it off. The problem for an entrepreneur is always that they might have the data at that moment in time. But as you said, they can only make it valuable over a much, much longer time range. Right?

Yeah. So the question is, does the quality increase with the number of possible decisions? Because, I mean, very essentially, if I know I go left or right and there’s a reason why I go left and I go right, I could have a rule engine. Yes.

So giving maybe the customer care example, you see many chatbots. You have probably also seen more modern versions. And I mean, ChatGPT could also be one. You could say, hey, where’s my package?

I know there is a number, a tracking number I put in. Obviously, when I put the number in, that’s 100% that I find the package. So I’m done. No large language model.
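The package example can be sketched as a small router: if the message contains a tracking number, a deterministic lookup answers it, and only open-ended questions are handed on. This is a minimal sketch under stated assumptions. The tracking-number format, the `SHIPMENTS` table, and the `NEEDS_LANGUAGE_MODEL` marker are hypothetical and were not mentioned in the conversation.

```python
import re
from typing import Optional

# Hypothetical tracking-number format: two letters followed by nine digits.
TRACKING_PATTERN = re.compile(r"\b[A-Z]{2}\d{9}\b")

# Hypothetical stand-in for a real shipment database.
SHIPMENTS = {"AB123456789": "Out for delivery"}


def extract_tracking_number(message: str) -> Optional[str]:
    """Return the first tracking number found in the message, if any."""
    match = TRACKING_PATTERN.search(message.upper())
    return match.group(0) if match else None


def answer(message: str) -> str:
    """Use the deterministic lookup when possible; otherwise defer."""
    tracking_number = extract_tracking_number(message)
    if tracking_number is not None:
        status = SHIPMENTS.get(tracking_number)
        if status is not None:
            return f"Package {tracking_number}: {status}"
        return f"No shipment found for {tracking_number}"
    # Only open-ended questions would reach a language model (not called here).
    return "NEEDS_LANGUAGE_MODEL"


if __name__ == "__main__":
    print(answer("Where is my package ab123456789?"))
    print(answer("Can I change my delivery address?"))
```

The design choice is simply to try the rule first: the lookup is exact and cheap, so a model is only worth its cost and error rate for the questions the rule cannot cover.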

Yeah. Fascinating. Because I believe that, and we all have this from chatbot conversations, a chatbot which only comes back to the same set of three potential questions it can answer is completely annoying to us. And we talked about how large language models are the new type of interface.

So when we are looking at how AI is used, there are many different kinds of AI, from good old-fashioned AI over neural networks and classification down to large language models. Large language models actually allow us to make an interface more usable because it offers more decisions. So, I expect many companies who used to think about UX click interfaces, or think about, like, why don’t you like Microsoft?

Because it has too many potential options. Yeah. If you say, no, no, there’s only one option, which is in a text interface, then it becomes usable. Yeah.

But the question, when you kick off your company, when you even try to enhance your product with AI, is: at what part in the workflow can I actually have a rule engine making a decision for me, where would I show different options to a human being, and when could the AI actually make a better decision or a good enough decision? And I think that’s the core part here for the fit analysis.

And this is funny because we should actually tell this, right? The fit analysis means there is an option for AI to be used. If it comes down to large language models, it is very often not clear how. Like not technically how, but from a user interface, the how to be used.

But, you know, that’s the reason why we’re in the game. Yeah. The next question then is, do you benefit from being able to model a lot of complex decisions? So does it actually have value if I do a lot of these decisions with AI versus maybe just a few?

Because essentially it comes back to the point, maybe I just built a rule engine around it and that’s good enough. And giving a couple of options to the user is good enough. But if I have a very long string of decisions, or actually various options that diverge into a very complex system, then an AI can actually help me not just to make these decisions, but also guide me through them.

You could argue this when you look now at the latest legal tech companies, where you see a lot around contracts that would guide you through. I saw this, there’s a deviation here.

Comparing this to another contract, etc. Oh, we want, you’re searching for this kind of case. There are three cases that are very similar. These can be decision engines.

But also enhanced with AI because they let you search faster and maybe compare faster and easier. Totally. If it’s a linear problem, then good old-fashioned AI, where the human creates the rules, works. If you have a complex decision space, nonlinear problems, and the search engine is already a nonlinear problem, or Amazon’s recommendation engine is a nonlinear problem, then it’s time to deploy an AI to find the tiny clues.

And here it is most likely not a generative AI because we are not generating.

We are trying to find in a complex maze the right approach. Yeah. And maybe just doubling down on that quickly. It doesn’t even mean AI in a sense of large language model.

Sometimes a simple regression analysis that also does prediction might be good enough and even more precise. Not like that’s obviously linear. But if you have like also here. If it’s a linear problem.

It’s complex enough. It’s here. Yeah. Yeah.

If I have modeled that and I, you know, I have my left and right decisions. The AI will guide me through. Maybe I would converge again to kind of a decision tree with rules because I now know what the most likely decision is. Maybe I can even put a probability there.

But if my problem then changes over time, maybe the data changes, the underlying data, the infrastructure. We spoke to companies that use AI to support infrastructure decisions on the cybersecurity side. You know, since we’re in a new security space now, it becomes pretty interesting: because of different outside influences, I might have to make other decisions. Yes.

And the AI can use the pattern recognition and guide me through that again. Yes, I would actually see the value on a slightly different level, because if the problem changes over time, you have a moat for your AI, right? Meaning, it’s a complex decision, it benefits from data, it’s not linear. Take the type of cancer: as long as it’s one type of cancer, you make your analysis, you train your CNN, your neural network, and you identify this as the cancer.

And this is also helpful. But once that is done, anybody could now actually determine the cancer with the same data set. You’re not very protected by your algorithm. If you start identifying different types of cancer, if the cancer changes, then you benefit obviously from having a new tool and a new detection mechanism over and over again.

Yeah, that makes sense. And that’s definitely a very complex problem. You see this, by the way, in, for example, in all the social media monitoring tools, which kind of like all use the same standard of NLP, natural language processing models. But since the underlying data is constantly changing, everybody has yet another edge.

The next one, and we already mentioned it, is around the quality. And we deliberately mention this a little bit later. You could say this should be mentioned first, but does the whole solution, the product, tolerate imperfect decisions?

One could argue, obviously, that this has to come first. But just to make you very aware of it: you start by understanding, is it a simple decision tree? Is there a change over time? Do we have very complex decisions, and all these kinds of things.

And then you should ask yourself, OK, now if I’ve modeled that, understood that, and maybe I even have an MVP that I can use, how does the output actually look now? And how good is the output? How much can I actually improve the output?

And is that good enough for the customer to see a value there and trust, obviously, also the AI? And that’s, I guess, the most challenging one, which you can’t even answer from day one when you start this. You just have to do a lot of experiments and see what happens, which makes the whole process very different from software development.

We actually had two startup founders with whom we discussed this at Station F yesterday, who were on totally opposite ends of the spectrum of the quality need. And maybe we start with a very quick framework for this. The framework is, how is the quality? How good is your decision?

Or like, mathematically speaking, it would be a false positive rate, for example. And how important is the decision? Like, how is the value of a false decision? Or like, how bad is it?
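The two questions in this framework, how often the model is wrong and how bad a wrong decision is, can be combined into a rough expected cost of errors. The sketch below assumes the simplest possible combination, error rate times cost per error times number of decisions. All numbers and names are hypothetical illustrations, not figures from the workshop.

```python
from dataclasses import dataclass


@dataclass
class DecisionProfile:
    """One AI-supported decision in a product (illustrative only)."""
    name: str
    false_positive_rate: float      # share of decisions that are wrong
    cost_per_false_decision: float  # cost of one wrong decision, in dollars
    decisions_per_month: int


def expected_monthly_error_cost(profile: DecisionProfile) -> float:
    """Expected cost of wrong decisions per month: rate * cost * volume."""
    return (
        profile.false_positive_rate
        * profile.cost_per_false_decision
        * profile.decisions_per_month
    )


# Hypothetical numbers: a fairly accurate model in a high-stakes setting
# versus a much less accurate model in a low-stakes setting.
high_stakes = DecisionProfile("high-stakes", 0.02, 50_000.0, 100)
low_stakes = DecisionProfile("low-stakes", 0.20, 1.0, 1_000)

if __name__ == "__main__":
    for profile in (high_stakes, low_stakes):
        cost = expected_monthly_error_cost(profile)
        print(f"{profile.name}: expected error cost per month = {cost:,.2f}")
```

With these made-up inputs the less accurate model is far cheaper to run, which is the point of asking both questions together rather than looking at the error rate alone.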

So we had one company there which thought about doing maintenance, like predictive maintenance for infrastructure. Obviously, an error could be very costly, and you do not want to have a low-quality prediction here. The other company did social media outreach in order to hire. Well, okay, if your model is wrong, then you reach out to one more person, and you anyhow have a very low close rate on that outreach.

So it’s not that bad. So those are the things you need to think about. And you can mitigate both. So you could say for the social media outreach, I do it in a very automated way, because I don’t care if I miss one or two.

But obviously, I want some kind of feedback loop. So if I miss them, I can check the outreach and say, well, maybe you should correct this and that, the wording or the way you address the person. For the infrastructure decision, it could be what we discussed in previous podcasts, more an augmentation first. So you would guide engineers with decisions, maybe simply create tickets in Jira, and just see how they interact with them.

It doesn’t mean if it’s imperfect, and it has a critical outcome, that you should not use AI, but you have to find a way in the product to basically deal with it. Totally. So the last one is actually quite interesting, because we hear it quite often at a very, very early stage. Because of prices of large language models, many people ask themselves, how does scaling when I grow affect my business?

But we want to tackle this from a different perspective, because we think, and we discussed this, that the infrastructure costs of these models, and also the latency, so the time between when I prompt the model and when I get a response, will all improve massively. But the big question is, do I have a benefit if this whole system, the product I create, scales exponentially? Because there the AI can really shine.

Just one example. Chatbots again, the traditional chatbots were rule engines. So if I have a problem, I can just say, I’m going to do this. If that’s the question, then this is the typical answer.

Or you just ask another question. But that is done by humans, the rules were created by humans. So if you have a lot of different conversations, a lot of humans have to create the rules. So that’s not scalable.

We tried this for autonomous driving, I guess, at the beginning, that’s not scalable, too many decisions. But the AI can help that if you have thought about all the previous steps. There is a scalability in the business case, right? Is a business as such scalable?

Can I, through network effects or quality advantages, scale the business? And then obviously, is the cost setup of my AI scalable? We’re often saying, oh, yeah, generative AI is a new interface. Well, an interface has very low cost.

If I pay 10 cents for every query, that’s a very bad type of interface, right? Because it’s super costly, and then it doesn’t scale. Except if, for example, you make a healthcare application, and suddenly the doctor is supercharged and becomes way better. The doctor charges like $100 for the consultation and has two queries, which are 20 cents.
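The arithmetic behind this comparison can be sketched as the share of revenue that goes to model queries. The consultation numbers ($100, two queries at 10 cents each) come from the conversation. The second case, a consumer interface earning 5 cents per interaction, is a hypothetical contrast added for illustration.

```python
def ai_cost_share(
    revenue_per_interaction: float,
    queries_per_interaction: int,
    cost_per_query: float,
) -> float:
    """Return the fraction of revenue consumed by model queries."""
    if revenue_per_interaction <= 0:
        raise ValueError("revenue_per_interaction must be positive")
    return queries_per_interaction * cost_per_query / revenue_per_interaction


# From the conversation: a $100 consultation with two 10-cent queries.
consultation_share = ai_cost_share(100.0, 2, 0.10)

# Hypothetical contrast: an interface that earns 5 cents per interaction
# but still needs one 10-cent query each time.
cheap_interface_share = ai_cost_share(0.05, 1, 0.10)

if __name__ == "__main__":
    print(f"Consultation: {consultation_share:.2%} of revenue goes to queries")
    print(f"Cheap interface: {cheap_interface_share:.2%} of revenue goes to queries")
```

A share well below 1 means the query cost disappears inside the price of the service, while a share above 1 means every interaction loses money until the cost per query falls.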

That’s kind of an economics which works. So for me, the scaling part is very often an economics discussion. And we saw this in the last wave of AI, where we had neural networks. In order to train neural networks, we suddenly had way more data, we had way more layers, and everybody was saying, oh, it’s super expensive.

Same discussion today. But it becomes an economic question of, okay, it becomes expensive. Yes, but I can scale it because I can scale my market, and the cost will come down. It came already down.

If you think about it, we now have Llama, which is open source, and all you have is the hosting cost. Yeah. And interestingly, the exponential scaling part can even be a moat. Obviously, the first one to the market is not always the most successful one.

It’s the first one that kind of cracks this challenge of output quality, the decisions, the most viable or most sellable product at the beginning. If you reach that point where you know, okay, now I can make many more decisions, I can expand to more segments, I can even expand my AI across the product, the product decisions, and you’re faster than the others, then you can scale this very, very fast because everyone else would still have to do all the experiments, pay the cost, and have an inferior product. So then you can really push down the pedal and move fast. Yeah, that’s the podcast.

So I hope you enjoyed. We did. Definitely. Definitely.