LLMs – An Operator's Blog

AI Musings #8 – Thoughts from South Park Commons Demo Day

Quick observations on the latest AI startup products.

Attended an amazing South Park Commons Summer Demo Faire yesterday!

Reporting back a few thoughts running through my head in real-time:

1. Essentially, the capabilities & design of every core SaaS use case are being reimagined by AI founders as we speak. In a future steady state, I see many of them living inside larger product suites as “features”, either via the incumbent fast-following and shipping them, or via small acquisitions/acqui-hires.

2. Consumer AI products remind me a lot of the 1st gen iPhone apps. Founders (developers) rapidly shipping entertaining, almost “toy-like” use cases. Like in mobile, will something massive eventually come out of these? So hard to tell…

3. An underlying capability of AI that a majority of products seem to be leveraging is “contextual artifact creation”. Eg. creating videos & decks in real time, replacing specific elements instantly in pre-existing media etc.

4. While the underlying “intelligence” capabilities of the products seem to be next-level, the UI/ UX as of now seems quite incremental relative to the mobile/cloud era. Lots more discovery & risk-taking needs to happen here.

5. Across enterprise & consumer/prosumer, it’s clear that these products can only manifest their power when they have access to extremely differentiated & diverse sources of data. In some contexts, it was unclear how a startup would get access to many such datasets in a fresh & relevant manner.

6. In legacy industries like govt/ public sector, AI-native products, even with game-changing capabilities, will still need to deal with age-old GTM challenges (long sales cycle, who will buy, what are the incentives for users to adopt etc).

7. Finally, it’s still pretty effin’ hard to pull off a glitch-free, low-latency AI demo.

Congrats to all the presenting SPC founders. Can’t wait for how these products shape up going forward!

AI Musings #7 – latest on AI in the Valley

Given the exponential rate of change in AI, Dev Tools appear to have the most risk of use case durability, compared to Infra and Applications.

Earlier this week, I attended the AGI Builders Meetup in San Francisco. The event had 5 product demos ranging from new AI features by Twilio to a YC S24 startup building an AI agent to handle calls on behalf of users.

These demo-based events are always helpful for me to track the latest in AI. Also ended up chatting with a few engineers and founders attending the event, to get their thoughts on what they are seeing in their respective domains within AI.

One key insight I got from this event was that LLMs aren’t the real future of AI. No one really knows what’s going on inside them. They hallucinate much more than desirable (especially for accuracy-driven enterprise use cases). They are prone to prompt injection hacking.

In fact, the presenting Twilio PM said that “working with LLMs is like getting toddlers to do something”. They don’t take instructions promptly. You don’t know what’s going on inside their heads. You have to proactively ensure their safety as they are doing a task. I found this framing really interesting.

Building on this further, I have an updated (but still working) POV on the 3 buckets of AI – infra, applications, and dev tools.

#1 Infra

Irrespective of where AI goes, hardware infra like GPUs will always be needed. Hence, it makes sense that Nvidia is doing so much capex.

Also, Big Tech software infra players (Microsoft, Google, Meta, AWS, etc.), as well as AI-native hyperscalers (OpenAI, Anthropic, etc.), will continue playing a key role in defining where AI goes from here.

Sadly, as a micro VC, I can’t play much in this bucket (except personally investing in the public markets).

#2 Applications

Application layer founders that are starting up today are leveraging the capabilities of AI from Day 0 to solve customer problems. Their core focus still remains commercial-first – using the best-available software capabilities to solve customer problems, rather than getting overly enamored by the research aspects of AI and where it’s headed.

In a sense, these startups are centered on customer problems, not AI per se. Wherever AI ends up going, LLMs and beyond, these founders will leverage whatever capabilities they can get their hands on, and modify their architectures accordingly.

As an investor, the key is to back founders who are starting up now with an AI-first mindset and therefore, are fresh enough and agile enough to keep evolving their software as the underlying AI capabilities evolve.

Therefore, it’s reasonable to expect that these AI-native application startups should be fairly resilient to changes in the overall AI landscape. Hence, I feel reasonably comfortable in backing them (eg. portcos like Confido Health, Loop, and Soulside).

#3 Dev tools

This is the bucket I am most confused (and concerned) about. From seeing these demos, it seems like dev tools startups are essentially using the mental models of the previous cloud & mobile waves to make assumptions on use cases.

Further, I have observed that many of them are solving short-term, immediate pain points that could easily become irrelevant due to where AI goes from here, and/ or from the competition (eg. open source alternatives, AWS quickly launching it as a feature, etc.).

As I was seeing these demos, I looked up how much capital some of these companies had raised. Many of them have raised anywhere from $10-35Mn. The capitalization of these companies seems out of sync with the durability of their underlying use cases and revenue.

Essentially, what all this means is that I have a macro “Why Now?” question around the AI dev tools bucket. A top Bay Area engineer who recently left a cushy Big Tech job to start up was recently saying – “Given how things are changing every month, I am really not sure what to build right now”. I feel this is an intellectually honest view, rather than a FOMO-based approach that many VCs are taking.

Based on what AI practitioners like this person are saying about the exponential rate of change in AI, I fear that a majority of these dev tool use cases won’t endure.

Again, this is just my working POV. Would love to hear your views on what you are seeing.

to my weekly newsletter where in addition to my long-form posts, I will also share a weekly recap of all my social posts & writings, what I loved to read & watch that week + other useful insights & analysis exclusively for my subscribers.

AI Musings #6: The Bull Run Is Just Beginning (90s Telecom Boom Vibes)

The current AI landscape is giving strong mid-90s telecom boom vibes. Studying that cycle suggests that we might only be at the beginning stages of a multi-year bull cycle, wherein the ongoing massive investments in chips, infra and foundational models will ultimately enable the rise of enduring AI applications.

The last few days have been hectic in the world of AI. Introspecting on how things are unfolding at present in public, private, and commercial markets, where AI is right now is giving me major mid-90s telecom boom vibes.

Hyperscalers continue to raise huge $$$

1/ Microsoft did a sort of acqui-hire of Inflection AI for $650Mn via an interesting licensing deal structure. Given that Inflection was one of the high-flying foundational model startups and had raised $1.3Bn from Microsoft and Nvidia (cash + cloud credits) at a valuation of $4Bn in June last year, it’s unclear whether this is a good or bad outcome for employees and investors. Although, going by this tweet from Reid Hoffman where he mentions “good future upside”, looks like shareholders got some sort of equity package from this deal.

2/ Amazon concluded its initially committed $4Bn investment in Anthropic by investing the second tranche of $2.75Bn. Am assuming like other hyper scaler deals, this is a mix of cash and cloud credits. As per CNBC, this tranche was done at the first tranche’s valuation of $18.4Bn. The press release from Amazon hinted at deep integrations between both platforms, with Anthropic using AWS as its primary cloud provider internally for product development while also offering the latest generations of Claude-3 foundational models to AWS customers via Amazon Bedrock (a fully managed service for LLMs).

3/ As per The Information, Canadian pension fund PSP Investments is about to co-lead a fresh round of financing in Cohere at a ~$5Bn valuation. The company’s last round was at a ~$2.1Bn valuation last year, and given it’s reportedly at a $22Mn ARR currently, this is a rich revenue multiple to pay. Based on my conversations with BigTech AI operators, Cohere is significantly lagging Anthropic in terms of foundational model capabilities.

The current phase of AI seems to be like the beginning of the telecom boom in the mid-90s

Personally, I am finding it hard to predict which of the current foundational model hyperscalers and AI-first application companies will survive. Further, given the inflated valuations these deals are being done at, barring logo grabbing, I don’t see how investors can make outsized venture returns in these deals.

In parallel, while meeting super-early AI companies in categories like dev tools, security, and deep domain applications, I am struggling to see a clear right-to-win for a majority of them. Given data and distribution advantages of incumbent products both in Enterprise and Consumer, it’s unclear which seemingly-white spaces are actually viable startup opportunities.

However, amidst these struggles as a venture investor, I am feeling good about one hypothesis – all this capital going into infra and foundational model companies is actually building capacity for the next generation of enduring AI products to be built. This is quite similar to the role that in hindsight, the telecom boom of the 90s ended up playing for the adoption of the Internet.

There is an excellent Fabricated Knowledge post outlining the history of the telecom bubble and its comparisons with AI today. I especially loved this insight:

Source: Lessons from History: The Rise and Fall of the Telecom Bubble (Fabricated Knowledge)

As the Telecommunications Act of 1996 opened up the telecom sector to competition, a host of new entrants came in to become ISPs. They were followed by companies like Cisco, Ciena, Lucent, Nortel, and others who were desperate to sell networking equipment to these telecom companies.

Comparing this to today’s AI landscape, cloud providers seem to be similar to telecom companies, while semiconductor companies selling chips to these cloud providers are like the networking equipment companies eg. Cisco.

Also, during this boom, telecom capex was unlike ever seen before. As per the earlier cited post, just in the year 2000, capital spending by publicly traded telecom service providers was at an astonishing ~$120Bn (~$213Bn in today’s dollar terms).

This telecom boom capex is one of the largest capital bases ever built in such a short amount of time. I can see the same vibes in the amount of dollars going into AI chips, infra, and foundational models today.

Btw, one more learning from the telecom boom is how these flywheels become even stronger as the adoption of new tech starts reflecting in productivity gains. Here are some interesting excerpts on this from the Fabricated Knowledge post:

As this telecom boom was unfolding, LTCM blew up and therefore, the Fed ended up cutting rates in 1998 to avoid negative ripple effects. This is like adding tons of gasoline to a raging fire, ultimately leading up to a massive dotcom bubble.

Cut to today, as AI continues to drive massive private market investments while public markets continue to rip, the Fed is still talking about 3 rate cuts being on the cards for this year. Sounds familiar? 1998-99 and 2020-21 vibes anyone?

Looking ahead…

If I play out the AI cycle like the 90s telecom boom, we might only be at the beginning stages of a multi-year bull cycle, similar to say 1995-96 (perhaps the launch of ChatGPT is similar to the Netscape IPO?).

Investments into the buildout of AI infra could run into trillions of dollars. In parallel, it seems the public markets have already started pricing in some of the future promises of AI. Going by the telecom boom, this pricing-in of future expectations could significantly accelerate for several years from hereon, driving stocks of both the telecom service equivalents (Cloud providers) as well as the networking equipment equivalents (Nvidia and perhaps any new entrants into chip manufacturing?).

That all this is happening in a higher-interest rate environment is a critical point. If for any reason (economic, geopolitical, or otherwise) the Fed starts cutting rates (which they are publicly saying they will), this could provide a major kicker into an already accelerating bull market.

So, we can reasonably posit an oncoming AI bull market for the next few (at least 3-5) years. Ultimately, like all bull markets, it will transform into a bubble, which will then peak and eventually crash. If you look at the Internet wave, the massive telecom capex of the 90s ultimately enabled the rise of enduring Web 1.0 companies like Google and Facebook, but only after the dotcom crash. Hence, I tweeted this yesterday:

What do you think?

Bonus Section: Commentary On Sequoia Capital’s AI Ascent 2024

As I was trying to make sense of recent AI funding developments, I chanced upon the just-released videos from Sequoia Capital’s AI Ascent 2024. I found these points from the keynote particularly interesting:

1/ If we draw parallels with the Cloud wave, in 2010, the entire global software TAM was ~$350Bn, of which Cloud was a tiny ~$6Bn sliver. Cut to 2023, the global TAM has grown to ~650Bn but more importantly, Cloud has grown to a ~$400Bn large piece of this pie (~40% CAGR over 15 years).

The starting pie for AI is not just software products, but also services that can be automated. So the hypothesis is that the starting pie for AI is ~$10Tn.

2/ The “Why Now” for AI is really strong, wherein a set of additive waves, starting from semiconductors in the 60s to Cloud and Mobile in the 2000s, has brought us to this stage. The ingredients to take AI from research to commercial applications are all there today.

Source: Sequoia Capital’s AI Ascent 2024 Keynote

Personally, I feel Sam Altman created the ‘iPod’ moment for AI by taking the power of AI to everyday users via a step-change ChatGPT product. In parallel, Jensen Huang should get shared credits for this catalytic moment given Nvidia’s rapid progress on giving chips more power at smaller form factors and hopefully over the next few years, making compute considerably cheaper and easier to access.

3/ In the last Cloud and Mobile transition waves, a host of new categories were created and new leaders were born in each of them. For AI, most of the major categories – (a) Infra, (b) Security, (c) Data, (d) Developer, and (e) Apps, are open right now. Hence, a massive opportunity for new category leaders to be created.

Interestingly, by depicting the white spaces this way, Sequoia also seems to hint that the Infra category is likely to be dominated by BigTech incumbents in chips and cloud. I am also reading the sub-text that Sequoia, in a way, views hyperscalers like Anthropic to be embedded within the existing cloud ecosystems (and hence, no separate logos depicted).

4/ Sequoia estimates that Generative AI companies are clocking in ~$3Bn in annual revenues in aggregate at present. As a comparison, SaaS took 10 years to get to this aggregate revenue scale as an industry, something that AI has achieved in almost the first year out of the gate.

5/ One of the early signals that AI is a real transformative wave is the sheer traction that the early products are getting across both Enterprise and Consumer.

6/ Over the last year, a majority of the capital has gone into the foundational model companies. In the Web 1.0 wave, the Application companies that came later in the cycle (eg. Google) captured the most value. The current uneven distribution of funding indicates that the Applications layer in AI hasn’t even gotten out of the stables yet.

7/ The usage numbers of AI-first products are still way behind incumbents. Eg. the median DAU/MAU ratio of AI-first products is a mere ~14%, compared to ~51% for incumbent products. This indicates that AI adoption is still in its infancy.

It’s encouraging to see that the ability of foundational models is on a continuous upward trend. At some point, this will translate into product capabilities that meet the expectations of users, which will then eventually reflect in better usage and retention numbers.

8/ I loved this slide that showed how when the iPhone was launched, the first generation of apps were either gimmicky or basic utilities. It wasn’t until a few years later that companies learned how to harness the capabilities of the iPhone to build enduring products.

Reasoning by analogy, we should expect that it will take a few more years (though perhaps a smaller number than previous waves?) for enduring AI applications to emerge.

9/ Sequoia is calling AI primarily a “productivity revolution”, similar to farm mechanization. At a macro level, this should bring down the costs of doing any task or delivering services, creating a strong deflationary force in areas like education and healthcare that have historically seen a perpetual rise in costs.

AI Musings #3 – LLMs for Beginners

Sharing my notes from an awesome talk by Andrej Karpathy (top researcher at OpenAI) titled ‘Intro to Large Language Models’. A simple and quick primer on AI and LLMs for a general audience.

Source: my notes from Intro to Large Language Models by Andrej Karpathy.

Just listened to this awesome presentation by Andrej Karpathy (top researcher at OpenAI) on The Busy Person’s Intro to LLMs. I found it extremely helpful and was making notes throughout. Sharing them below for anyone who wants to understand AI and LLMs in the simplest way possible:

1/ What is a Large Language Model (LLM)?

LLMs are intelligent pieces of code that can learn from a vast universe of data, make inferences using it, and use those inferences to answer a user’s questions or perform certain tasks.

An LLM typically consists of 2 files – (1) Parameters file and (2) Run file.

a) Parameters are the weights that go into the LLM and power the neural network within it.

b) The Run file is some sort of code to run logic on the neural network within the model.

LLMs can be of 2 types:

a) Proprietary – users don’t have access to the actual parameters/ weights within the model, and can only use it via a web interface or API. Eg. OpenAI’s ChatGPT, Google’s Bard etc.

b) Open source – users have access to the full model, including the parameters/ weights. They can easily modify the model for their own specific purpose. Eg. the Llama-2 model by Meta.

Here’s the example of Llama-2 given by Andrej, wherein the Parameters file is 140GB and the Run file is just 500 lines of code written in any language like C, Python etc.

Interestingly, you just need a standard computer to be able to run an LLM like Llama-2 and derive inferences. It runs locally on the machine and hence, even Internet connectivity isn’t a requirement.

However, training an LLM is a significantly more computationally heavy task that requires thousands of GPUs and needs millions of dollars in investment.

2/ What is the LLM’s neural network really doing?

Very simply, a neural network tries to predict the next word in a sequence with the highest confidence. Eg., for the sequence “cat sat on a…”, the next word is predicted to be “mat” with 97% confidence.

The next word prediction task forces the neural network to learn a lot about the world and assimilate tons of knowledge on the Internet, including the interconnections within it.

The outcome of this basic learning and word-prediction capability is the ability to create various kinds of documents eg. a piece of Java code, an Amazon product description, or a Wiki article.

The now-famous Transformer architecture is what powers this neural network.

3/ How the neural network “really” works is still a mystery

While we can input Billions of parameters into a network, and iteratively adjust them for better predictions, we really don’t know how the parameters are exactly collaborating within the network

We know that these Parameters build and maintain some kind of knowledge database, but it’s a bit strange and imperfect. For eg., the model can answer “who is Tom Cruise’s mother?” as “Mary Lee Pfeiffer” but if you ask “who is Mary Lee Pfeiffer’s son?”, it answers “I don’t know”. This is popularly called a Recursive Curse.

LLMs aren’t like a car, where we know each part and how they exactly work with each other.

Think of LLMs as mostly inscrutable artifacts. The only thing we truly know about them is whether they work or not, and with what probability.

We can give LLMs inputs, and empirically evaluate the quality of outputs. We can only observe their behavior.
Andrej Karpathy (paraphrased from Intro to Large Language Models)

4/ How do you obtain the secret sauce of an LLM – the Parameters? Stage 1: Pre-training

Getting the Parameters requires essentially “compressing the Internet”.

To illustrate how Llama-2 was trained, imagine taking a chunk of the Internet (~10TB of text, procured by crawling the Internet). You then compress this large chunk of Internet data dump using 6,000 GPUs* for 12 days. FYI, this would cost ~$2Mn as GPUs are super expensive. [*A graphics processing unit (GPU) is an electronic circuit that can perform mathematical calculations at high speed. This is what Nvidia makes and AI is the reason behind its stock ripping.]

What this GPU computation throws out is the Parameters file – a compression of all this Internet data, almost like an intelligent distillation of it.

5/ Stage 2 of training – Finetuning

The document generation capability of a base-trained LLM is of limited use. What we want is a Q&A-type assistant model. The way to achieve this is to take the model through the next stage of training called Finetuning.

In this stage, we swap out the generic, large Internet dataset, and replace it with a specific Q&A dataset that has been collected manually.

This step of finetuning is done by companies like OpenAI and Anthropic, who hire people that create ideal answers to questions that users might typically ask. This human-curated dataset of conversations then becomes labeling instructions for the LLM to be finetuned upon.

It’s important to note that at this stage, quality is more important than quantity.

After finetuning, your base LLM becomes an assistant model. Interestingly, and again we don’t understand how it really happens, but the model is able to use the generic Internet knowledge from the pre-training stage and combine it with the conversational datasets from the finetuning stage.

The below slide summarizes both parts of LLM training. Pretraining is very expensive so update training typically happens once a year. Finetuning is more of a model alignment exercise and can happen as frequently as once a week.

6/ Stage 3 of training – comparisons

There is another subsequent stage of finetuning that can happen where for each question, human labelers compare multiple answers from the assistant model and label the best one. OpenAI calls this step Reinforced Learning from Human Feedback (RLHF).

7/ How do labeling instructions given to humans look?

Here is an excerpt from InstructGPT paper by OpenAI. It shows how human trainers should craft responses that are helpful, truthful, and harmless. These instructions can run into hundreds of pages.

Rather than being totally manual, labeling is fast becoming a human-machine collaboration exercise where LLMs can create drafts for humans to splice and put together, LLMs can review and critique labels based on instructions etc.

8/ How to compare various LLMs and find the best one for your use case?

There are platforms like Chatbot Arena (managed by Berkeley) where models are compared against each other and ranked as per an Elo rating, kind of like comparing and ranking chess players.

As seen below, closed proprietary models are ranked much higher than open-source models. However, the former can’t be modified for your use while the latter can.

9/ LLM scaling laws

Performance of LLMs is a smooth, well-behaved, predictable function of (1) N – no. of parameters in the network and (2) D – the amount of text we train on.

And the trends don’t show signs of topping out.

We can expect “more intelligence for free” by scaling.
Andrej Karpathy (paraphrased from Intro to Large Language Models)

Essentially, algorithmic progress isn’t necessary for better models. We can just throw more N and D via more compute power and get better intelligence.

Another interesting point – we can get better ‘general capability’ simultaneously across all areas of knowledge by training a bigger model for longer. Essentially, with more training, we should expect the performance of models to rise more across all areas for free.

This is the reason driving the gold rush for (1) data and (2) compute, as the more we throw of it at LLMs, we can get exponentially better models. Algorithmic progress then just becomes a bonus.

10/ Future Idea #1: System 1 vs System 2 thinking by LLMs

Currently, LLMs are System 1 thinkers, similar to what Daniel Kahneman calls the intuitive, fast-thinking part of the human brain (like speed chess). Where researchers are trying to go is converting them into System 2 thinkers, mimicking the slow, deliberate, rational, reflective part of the human brain (like professional chess players thinking through decision trees on the spot).

Turning LLMs into System 2 thinkers is about trading off time for accuracy.

11/ Future Idea #2: self-improvement

Another area of future improvement is making LLMs capable of self-improvement. The AlphaGo program by DeepMind has already demonstrated that in a narrow, sandbox environment with a clear reward function (telling the program whether how it played is good or bad), AI can self-improve to become better than the best human player. Can this be achieved in other, more open contexts?

12/ Future Idea #3: custom LLMs

This is already being done by OpenAI via GPT agents that can do custom tasks and are available on the GPT Store (more details in my post on the OpenAI Dev Day).

Bringing it all together – LLM OS

I don’t think it’s correct to think of LLMs as just chatbots or word predictors. Think of it as the kernel process of an emerging operating system.

This process is coordinating a lot of resources like memory and computational tools, for problem solving.
Andrej Karpathy (paraphrased from Intro to Large Language Models)

As per Andrej, this is how the LLM OS framework will likely look:

There is another similarity between today’s OS ecosystem and the LLM OS view. Today, there are proprietary OS like Windows, iOS, and Android, with a fragmented ecosystem of open-source products co-existing. Similarly in LLM OS, there are proprietary LLMs like ChatGPT, Claude, and Bard, with a burgeoning parallel ecosystem of open-source LLMs, many of which are presently powered by Meta’s Llama series.

Appendix: LLM Security

As per Andrej, one of the major security threats facing LLMs is various ways to jailbreak them and get answers to undesirable questions. For eg. a user can role-play within a question to make it sound harmless to the LLM. In other cases, users can create encoded versions of harmful questions and bypass the LLM’s security check.

Another type of security threat is prompt injection, where bad actors can infect images, documents, and web pages with harmful prompts that can cause undesirable behavior by LLMs when the user tries to input these materials.

The final type of threat is data poisoning where attackers can inject trigger words within training data that corrupts the model and catalyzes specific undesirable behavior.

AI Musings #1 – How The Odds Are Stacking Up?

From OpenAI getting close to $100Bn valuation and Anthropic partnering with Amazon, to Google and Meta doubling-down on their LLMs faster than ever before, the AI chess game is getting more intriguing by the day.

In this post #1 of the ‘AI Musings’ series, I share a few running thoughts on the odds for each category of players.

**This is the first post in a series called ‘AI Musings’ that I hope to write regularly over the next few months. The idea is to periodically analyze major developments and milestones in AI, both from a startup and BigTech perspective.

Frantic activity around AI continues in the US. Just in the last week, OpenAI is looking at a $80-90Bn valuation for a secondary sale of existing employee shares. Even as Anthropic announced a strategic collaboration with Amazon last week, which includes up to a $4Bn investment, there is news today of the company raising another $2Bn from Google and others at a $20-30Bn valuation. This is a 5x jump from its last round valuation in March.

Greylock has gone AI-first with its newest early-stage fund. The Nvidia stock continues to rip (read my post on how it illustrates The Bunches Principle). Dharmesh Shah (Co-founder and CTO of Hubspot) is back to coding and selling, building ChatSpot over a weekend of hacking as a first step towards making his CRM AI-powered.

Amidst all this action, I have been meeting academics, founders, investors, and BigTech operators working on the frontiers of AI, trying to refine my hypothesis on the space. Here’s a working version of some of my thoughts:

1/ High confidence that AI is real and here to stay

Though the space is definitely in a financing hype cycle, to me, it’s now beyond doubt that AI as a platform shift will be transformative for the world. Unlike Web3, progress around AI has been driven by large tech companies since the very beginning. These companies are much too shrewd and tracked to spend significant resources on something that is merely a low-probability moonshot. Therefore, they have been focused on driving real commercial value from LLMs from Day 0.

OpenAI first launched ChatGPT on Nov 30, 2022. The fact that Generative AI capabilities are already integrated into mainstream products like the MS Office suite, Google Search, LinkedIn, Notion etc. in less than a year just goes to show that this particular platform shift is happening significantly faster than the Internet, Mobile, and Cloud.

Another confidence booster for me personally has been the commercial revenue traction of AI-native hyper scalers. Here are some numbers based on my research:

Company	Started	Latest Valuation	Current Revenue Traction (Est.)	Source
OpenAI	2015	~$80-90Bn, reported as of Sep’23	$80Mn est. MRR (~$1Bn annualized), reported as of Aug’23	Reuters
Anthropic	2021	~$20-30Bn, reported as of Oct’23	$200Mn proj. revenue in 2023, reported as of Sep’23	Information
Cohere	2019	~$2.1Bn, reported as of Jun’23	Sub $50Mn proj. revenue in 2023, reported as of Aug’23	Industry Sources
Hugging Face	2016	~$4.5Bn, reported as of Aug’23	$30-50Mn est. annualized revenue, reported as of Aug’23	Axios

These are tangible business revenues generated from enterprises, SMBs, and individual developers as customers. And the ramp-up over the last 12 months is astonishing. Honestly, looking at the depth of commercial traction these hyperscalers are showing, the valuation numbers don’t look entirely out of whack.

2/ Large incumbents are highly likely to capture disproportionate value from AI

About 9 months back, when Google’s stock was tanking as a reaction to ChatGPT’s growth and OpenAI’s partnership with Microsoft (a botched Bard demo made things worse!), I asked this simple question:

In hindsight, this was a very pertinent question to ask. As various BigTech-AI hyperscaler partnerships are playing out, it’s becoming clearer that large incumbents are strongly positioned to capture a significant portion of market value created from AI. They have a unique combination of the following:

Chips and cloud computing infrastructure to train and deploy foundational models, as well as build custom applications that are reliable, safe, and secure.
Distribution reach to get Generative AI in the hands of exponentially more customers.
Capital to place bets on AI hyper scalers and align with them to leverage their core strengths around faster and more disruptive innovation.

Bill Ackman, who runs Pershing Square and is one of the top-performing hedge fund managers, has been doubling down on Google since its price hit the $80-90 range post-ChatGPT. Here’s his rationale on why Google is strongly positioned in an AI world:

Bill Ackman’s (Pershing Square) pitch on Google’s positioning in AI

Based on my conversations with senior AI operators at the likes of Google and AWS, I believe the AI manifestations we are currently seeing in their mainstream products are not even the tip of the iceberg. Think of them as small experiments or POCs. The depth and range of their pipeline of AI capabilities are beyond regular imagination.

Btw, I am a believer in Bill Miller’s thought – “The economy doesn’t predict the market. The market predicts the economy“. Going by how BigTech stocks are ripping amidst a rather cool economic and market environment, the wisdom of public markets also suggests that these incumbents are poised to reap huge dividends from AI.

So, amidst all the noise and hype, if you are trying to figure out a simple, risk-adjusted way to benefit from this AI platform shift, here’s a thought to consider:

3/Early-stage startup plays are still fuzzy

After spending significant bandwidth meeting AI founders, I am seeing that, as opposed to the BigTech and AI Hyperscaler plays, there is significantly more fuzziness in the early-stage ecosystem (and rightfully so!).

Inspired by the recent SaaStr session between David Sacks (Craft Ventures) and Jason Lemkin, here are my running thoughts on 3 categories of AI startups:

(I) Infrastructure

These include LLMs and other aspects of foundational AI infra. This bucket is really challenging to invest in simply because:

Building AI infra requires deep technical chops and/ or very specific prior experience, ideally in a particular set of companies. These teams are rare, extremely hard to source, and often get spotted very early by the likes of Sequoia and A16Z.
AI infra startups require large amounts of capital and therefore, need major VCs to be in them from very early on. In other words, these companies are hard to bootstrap, and funding them requires playing a very different kind of game that’s hard for a small check writer to play.

(II) Classic vertical SaaS with AI capabilities

The hypothesis here is that given AI is a massive platform shift, does it create new gaps in existing verticals like healthcare, education, sales, customer support etc. that a fresh generation of AI-first startups can exploit?

The hurdle I face while evaluating these startups is – why wouldn’t an existing growth or late-stage company just leverage AI as a new capability in their existing product suite? Incorporating AI features into an existing installed base (eg. what Microsoft is doing with OpenAI) seems like a superior ROI proposition compared to taking a brand-new product to market.

If this generalization is indeed true, it definitely raises the bar for this bucket. However, again to think out loud, there are some contexts where there could be a real commercial case for new AI-powered vertical software. For eg.:

Legacy verticals where fewer growth-stage startups of the prior generation have entered – say transportation? Or construction? The argument here is that it’s easier to beat old incumbents by using AI as tech leverage, compared to other late-stage startups who might be equally good at incorporating it.
Verticals where brand new paradigms are opening up, which will change the game itself – given winner-takes-all dynamics in tech, most incumbents are hard to beat at their own game. But, if the game itself changes (often due to a tech inflection), then David has a better chance against Goliath (read my post “David (Microsoft) vs Goliath (Google)“). Eg. using AI in genomics, drones, automotive etc. to solve problems and deliver work in totally new ways.

(III) Job co-pilots

The hypothesis here is that AI will spawn a generation of job-specific assistants called co-pilots, that will make a specific job more efficient and effective. So everyone from a doctor and lawyer to CFO and marketer will have a co-pilot that does everything from workflow automation to insights generation, all in a conversational UX.

This seems to be an extension of the productivity-software thesis that many VCs followed over the last 5 years. Sounds interesting and plausible, though I am still not able to build conviction on what a winning company in this space could potentially look like, how it would need to be capitalized and built, and whether it can generate venture returns.

I am learning new thesis, approaches and frameworks every week, especially related to the early stage startup plays in AI. More to follow in AI Musings #2…

Tag: LLMs

AI Musings #8 – Thoughts from South Park Commons Demo Day

AI Musings #7 – latest on AI in the Valley

Subscribe

AI Musings #6: The Bull Run Is Just Beginning (90s Telecom Boom Vibes)

Subscribe

AI Musings #3 – LLMs for Beginners

Subscribe

AI Musings #1 – How The Odds Are Stacking Up?

Subscribe