I know it’s been a few months since I published here. I dedicated much of that time to working more than full-time at an early-stage technology company. They’re deep in the AI space, bringing to market a truly transformative way of discovering, capturing, and mobilizing enterprise knowledge. Now, back to our regularly scheduled programming.
The past twelve months have been fantastic! Well, there were some low points. But overall, it’s been a great year, personally, professionally, and even in that portion of the Venn diagram where the two overlap. What distinguished the year was the velocity of change I opened myself up to. In that time of curiosity and adventure, I saw the future. The future is search. Awesome search. But there’s a catch.
Finally, AI
From 2016 to 2022, I ran an enterprise search technology company. During that time, search engines began to exploit machine learning and semantic analysis (“AI”) in new and exciting ways to make content more findable. While plenty of legacy search engines are powered by decades-old tech, we were pushing the edges of the possible. Unfortunately, many customers didn’t understand how AI helped despite measurable, differentiated results. AI was a magical black box that could not be trusted. Full stop.
AI technologies have been around for decades but have only recently become broadly accessible to the developer community. There is now an entire ecosystem of AI tech, from ML frameworks like TensorFlow and PyTorch to AI services on major cloud platforms and NLP libraries. But even still, AI remained a sideshow to the main tent of traditional deterministic programming.
So, what’s changed during my year of discovery? Has it been the emergence of ChatGPT and other large language models (LLMs)? Is that what’s driving the future I’ve seen?
Sort of. Capable and readily available LLMs are the catalyst for the sudden shift in business sentiment about AI. The real achievement of ChatGPT wasn’t the miraculous tech. It was a magical user experience. Even if the answers were wrong, they were magnificent. People, buyers, were wowed. Fear has been replaced by curiosity. With the release of ChatGPT in November 2022, customers' imaginations were unshackled in a massive transformational moment. Now, it’s time for AI to solve real problems.
Search is about to get real.
When I first used ChatGPT, I thought it would make an excellent search engine. That’s somewhat attributable to my frame of reference. I’ve spent a lot of time in the search space. More importantly, it’s my point of view as a user. A conversational interface is just a person seeking answers to questions. It’s search by another name.
Search engines, from Google to the horrible tools corporations deploy on their websites, are all average to awful. Even if they’re good at finding source material, like Google, and unlike any corporate search engine, they’re horrible at finding information.
My least favorite search experience is Siri. Ask Siri anything other than the weather. She will respond with, “I found this on the web for you. Check it out,” without telling you the answer. It’s words! You communicate verbally! Read the damn thing and tell me what it says! Maybe this is what Tim Cook is working on.
LLMs, on the other hand, have lots of information. The big problem with LLMs is they don’t cite their sources. They are opaque. You get no reference materials for the answers, so there’s no way to verify an answer.
This was a notorious problem early on when so much was wrong. OpenAI has improved the tech, so there are fewer stunningly wrong answers. However, just because answers aren’t obviously wrong doesn’t mean they’re right. The lack of obvious errors seems even more disquieting, given the mistakes of the past.
With the integration of Bing, ChatGPT has begun citing sources when it searches to support an answer. That’s a step in the right direction and very similar to what Google has done with its generative search.
Early last year, I looked at a promising start-up called Perplexity. At that time, I was underwhelmed by the results. Fast forward a year, and Perplexity has improved. The answers are solid and the UI much more coherent. The process of answer generation is transparent, even more so if you pay the $20 per month for the Copilot version.
With the Perplexity search results, you get links to the websites that are the source material. This increases the confidence one has that the results aren’t hallucinations. Running several experiments between ChatGPT and Perplexity (which uses, among others, ChatGPT as the language model), I have found that I prefer ChatGPT’s conversational output. It’s more readable, though that’s an ambiguous measure. That said, Perplexity shows source material and does something ChatGPT utterly fails at: consistency.
You can ask ChatGPT the same question several times and get very different responses. They may not be wrong, but they are very different, which feels wrong. ChatGPT questions that require Bing are a bit more consistent but still highly variable. Perplexity, on the other hand, being grounded in very specific content, appears to give results that, while not identical, look like they came from the same expert.
Perplexity is using a technology called RAG to produce its results. RAG is an acronym for Retrieval Augmented Generation. This technique “augments” the generation with some data—in this case, primarily data derived from an internet search—to serve as the source material for the language model’s answer generation.
The way I think of this is that they’re curating the body of source information before they generate an answer. While the technique doesn’t deliver inherently better text generation, it does provide the user with more confidence that the answer is correct by sharing source documents.
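The retrieve-then-generate loop can be sketched in a few lines. This is a toy illustration, not Perplexity’s actual pipeline: the scoring here is naive keyword overlap, where a real system would use a search index or vector embeddings, and the final prompt would be sent to a language model rather than printed.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by how many query words they contain (toy retrieval)."""
    terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, sources: list[str]) -> str:
    """Augment the generation prompt with the retrieved source material."""
    context = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    return (
        "Answer using ONLY the sources below, citing them by number.\n"
        f"Sources:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

docs = [
    "Perplexity cites the web pages it draws answers from.",
    "TensorFlow and PyTorch are machine learning frameworks.",
    "Google monetizes search primarily through advertising.",
]
query = "How does Perplexity cite its answers?"
prompt = build_prompt(query, retrieve(query, docs))
# `prompt` now carries the source material; a real system would pass it
# to an LLM, which generates an answer grounded in (and citing) those sources.
```

The key point is that the model never answers from its opaque internal knowledge alone; the retrieved sources ride along in the prompt, which is what makes the citations possible.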
What excites me, though, is not the technology. The search monopoly could be broken. Perplexity could be heir to Google! Or, at least, a very solid competitor. While Perplexity does make a few bucks selling subscriptions, I believe the big commercialization opportunity is in selling ads.
I watched a presentation by Perplexity co-founder Aravind Srinivas where he mentioned Google but didn’t talk about building a Google-like business. It’s a bit confusing as to what the ultimate business model will be. There seem to be several paths.
The first is the subscription path. Convincing people to pay $20 a month for search that they get for free today could be a challenge. You’d really have to demonstrate an advantage beyond what we see so far.
The second could be an ad model very similar to what Google does. While Aravind disparaged the Google “sea of ads” in his presentation, those ads are a cash volcano for Google.
The third is an enterprise search model. Helping sales reps find the right content to close the deal, making it possible for customer contact centers to answer questions quickly and correctly, or allowing underwriters to access the right risk and policy information to make better pricing decisions are all valuable enterprise use cases for this sort of tech.
The reality is that Perplexity will likely follow some version of all of the above, but Google has demonstrated that an advertising model is very lucrative.
But there’s a catch.
The Covenant is Broken
Search relies upon every website essentially granting the search engine a perpetual license to use their proprietary data for free, forever. Website owners do this because the implicit deal Google cut with all of them was: Give us your content, and if it’s good enough, we’ll send you visitors. Or you can pay us to send you visitors.
Generative answers break this covenant. If the generative text is good enough at providing a correct answer, you don’t need to click. Even if Perplexity is offering links to the pages where conversational answers are sourced, those links end up being social proof of a correct answer rather than a link to be followed. Once you have confidence that Perplexity is correct, you’ll never click a link. So why would content owners continue to make their content crawlable?
It’s an excellent question.
Conversational AI can’t deliver a product to your door, so commerce websites are still in. There are a bunch of other use cases, such as product support, where a content owner might initially be ambivalent about the traffic as long as a correct answer is delivered. But media sites? No way. Traffic is their entire game. Brand sites. Same thing. The covenant with content owners is broken.
Some of this will be litigated. But the vast majority of this conversational future—again, it is really search by another name—is going to require a negotiation between content owners and search engines about the value of their content.
The future is answers. The boost to personal and professional productivity from this new technology will be dramatic. But the model will also break some traditional business patterns, and there are no clear indicators of what patterns will emerge to satisfy both answer seekers and content owners. Litigation is only the beginning. Conversational platforms need to ensure the economics work for all the stakeholders.
I’m Back
I’ll be publishing every Wednesday morning. My topic area will remain the enterprise with articles related to emerging software, marketing, and business transformation. If you have any comments or suggestions, please ping me through the normal channels. I look forward to the conversation.