I particularly found the part on what Ed calls “defense” highly intriguing – he says: “It just takes one weak link to finish you off”. Essentially, my understanding is that among other things, defense involves avoiding a bunch of things that when done repeatedly, could essentially wipe you out (as in, you die!).
Ed gives the example of “a terrible skiing accident”, but other things that come to mind include, say, helicopter rides, binge drinking, bungee jumping, living next to a dangerous turn on a busy street, etc. While the probability of death in a one-off instance of these activities might be low, the chance of death becomes non-trivial with repeated exposure.
Nassim Nicholas Taleb calls this concept Risk of Ruin, and I remember getting blown away when I read about it for the first time in his books ‘Skin in the Game’ and ‘Antifragile’. So much so that Ruin has become an essential mental model in my toolkit both personally and as an investor.
In particular, I love 2 specific examples of Ruin that Taleb frequently cites:
The Casino Experiment – let’s say we know that only 1% of all gamblers playing at the Casino win on any given day. If a fresh batch of 100 gamblers visits the Casino each day for 100 consecutive days, then each day roughly 99 will be wiped out and 1 will walk away with money. Now take a different case – one gambler visits the Casino daily for 100 consecutive days. In this case, his probability of getting wiped out at some point is effectively 100%.
Russian Roulette (quoting Taleb directly here) – “Assume a collection of people play Russian Roulette a single time for a million dollars –this is the central story in Fooled by Randomness. About five out of six will make money. If someone used a standard cost-benefit analysis, he would have claimed that one has 83.33% chance of gains, for an “expected” average return per shot of $833,333. But if you played Russian roulette more than once, you are deemed to end up in the cemetery. Your expected return is … not computable”.
Taleb calls the former case where different groups do an activity, and probabilities and expected values are computed “in average”, as Ensemble Probability whereas the latter case of a single person repeatedly doing an activity across time as Time Probability.
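To make the gap between the two views concrete, here is a minimal Python sketch using the Russian Roulette numbers above. The function name and the choice of 20 rounds are my own illustrative assumptions, and the sketch assumes every round is independent with a 1-in-6 chance of ruin.

```python
def survival_probability(p_ruin_per_round: float, rounds: int) -> float:
    """Chance that one person survives `rounds` independent plays."""
    return (1.0 - p_ruin_per_round) ** rounds


P_RUIN = 1 / 6  # one bullet in a six-chamber revolver (assumed independent each round)

# Ensemble view: many people each play once; about 5 in 6 walk away.
ensemble_survivors = survival_probability(P_RUIN, 1)

# Time view: the same person keeps playing; survival decays geometrically.
time_survivor_20 = survival_probability(P_RUIN, 20)

print(f"Ensemble (1 round each): {ensemble_survivors:.1%} survive")
print(f"Time (same person, 20 rounds): {time_survivor_20:.1%} survive")
```

The per-round odds never change, yet the person who keeps playing ends up with a survival chance below 3% after 20 rounds. That is the whole difference between Ensemble Probability and Time Probability.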
As common sense would tell us, Ensemble Probability represents a mathematically driven, academic (almost artificial?) scenario analysis, whereas Time Probability represents how we as individuals get exposed to risk in real life.
Time Probability suggests that there is an underlying Risk of Ruin in many more things & activities in real life than our brains can cognitively appreciate in the thick of the action.
Over the years, as I have read/ listened to more thinkers across fields, many of them highlight this same concept of avoiding Ruin in their own words. Examples include:
It is remarkable how much long-term advantage people like us have gotten by trying to be consistently not stupid, instead of trying to be very intelligent. – Charlie Munger
Never forget the six-foot-tall man who drowned crossing the stream that was five feet deep on average. – Howard Marks
The 1st rule of investment is don’t lose money. And the 2nd rule of investment is, don’t forget the 1st rule. – Warren Buffett
[Paraphrasing] “Arithmetic returns are false hopes; the truth lies in geometric returns” / “Profit is finite. Risk is infinite”. – Mark Spitznagel
Essentially, all these quotes are pointing to the same underlying idea:
The most important force that governs life, be it in health, relationships or portfolios, is compounding.
Given its cumulative nature, the optimal strategy for enabling the magic of compounding is combining avoidance of ‘Ruin’ (going to zero/ total destruction) with ensuring ‘Survival’ over a long enough period of time.
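A small Python sketch can show why compounding punishes large losses so heavily. The return series below are made-up numbers chosen purely for illustration, and the helper names are my own.

```python
import math


def arithmetic_mean(returns):
    """Simple average of per-period returns."""
    return sum(returns) / len(returns)


def compounded_growth(returns):
    """Wealth multiple after compounding every return in sequence."""
    return math.prod(1.0 + r for r in returns)


# Hypothetical series: up 50%, then down 40%.
volatile = [0.50, -0.40]
print(f"Arithmetic mean return: {arithmetic_mean(volatile):+.1%}")
print(f"Wealth multiple after compounding: {compounded_growth(volatile):.2f}x")

# Hypothetical series with one total loss ('Ruin') in the middle.
with_ruin = [0.50, 0.50, -1.00, 0.50]
print(f"Arithmetic mean return: {arithmetic_mean(with_ruin):+.1%}")
print(f"Wealth multiple after compounding: {compounded_growth(with_ruin):.2f}x")
```

In both series the arithmetic average looks healthy, but the compounded result tells the real story: the first series loses money, and the second goes to zero and stays there no matter what comes after.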
Say you are managing a corpus of $100 and intend to invest it across a portfolio of ‘N’ bets, how should you determine the size of each bet?
The Kelly Formula, as outlined by the famous math professor, investor, and gambler Ed Thorp, shows us a path.
One question that I have been studying for a while now is how to bet optimally on a given deal. Public market investors call this ‘position sizing’ – say you are managing a corpus of $100 and intend to invest it across a portfolio of ‘N’ bets, how should you determine the size of each bet?
A. Nuances for every strategy
Based on studying how some of the best public market and venture investors approach position sizing, it’s clear to me that, like most things in life, it’s part art and part science.
Every investor will have their own nuanced perspective on this topic based on their individual ‘strategy’. This includes the following aspects:
1/ Asset class
Publicly listed stocks are liquid and trade in a mostly efficient market with little information asymmetry. On the other hand, venture capital is perhaps the most inefficient market, governed by intense power laws and extreme loss ratios.
In a totally different game, real assets generate cash flows and are, therefore, conducive to debt financing that improves levered-equity returns.
2/ Beliefs and personality
Warren and Charlie believe in buying extraordinary businesses at fair prices. Joel Greenblatt believes in special situations. YC and 500Startups believe in the ‘Moneyball’ style of venture investing. Benchmark and Kleiner Perkins believe in the classical, craftsperson style of venture capital. Brookfield believes in buying high-quality real assets on a value basis.
3/ Circle of competence
Often called an ‘edge’ or ‘competitive advantage’. Peter Thiel and Vinod Khosla understand revolutionary technologies better than others. Li Lu gets China more than Western fund managers. Berkshire is unique in its understanding of insurance businesses.
4/ Selection criteria
Don Valentine, Founder of Sequoia, famously said that “great markets make great companies”. Keith Rabois of Founders Fund has a founder-driven investing style where he looks to figure out whether this founding team can build an iconic company that changes the world.
Public market OG Chuck Akre’s investment criteria are captured in the ‘three-legged stool’ – (1) extraordinary business, (2) talented management, and (3) great reinvestment opportunities and histories.
5/ Portfolio construction
On the public market side, Charlie Munger’s Daily Journal Corp has a super-concentrated portfolio of 4 stocks (~40% Wells Fargo, ~40% Bank of America, ~15% Alibaba Group, and the rest is U.S. Bancorp). Bill Ackman of Pershing Square has a classical, concentrated ‘10×10’ portfolio that presently includes 8 stocks, with each position being 10-20% of the portfolio. Seth Klarman of Baupost Group is comfortable with a bit more diversification, owning 28 stocks at present with the largest holding at ~15% of the portfolio, and a bunch of positions in the single digit % range.
In venture capital, given its high-risk profile, the importance of diversification is generally well-understood. Yet, firms exhibit significant variance in their approaches to portfolio construction. While the likes of Brad Feld and Mark Suster believe in taking 30-40 shots even from reasonably large $300Mn+ funds, Mike Maples Jr. of Floodgate believes that a typical venture portfolio becomes statistically diversified at 12 companies, and beyond 25, there is no incremental value from excess diversification. Miriam Rivera of Ulu Ventures has studied data from LPs and concluded that even the best VCs have ~4.5% picking skills and therefore, a portfolio of 70-100 shots at goal is needed. Finally, an accelerator like YC funded 229 startups in just one Summer 2023 batch.
So, as we can see, position sizing approaches can vary dramatically based on the investing context and strategy being followed. But are there any broad rules and heuristics that can be useful for any investor out there?
B. The Kelly Formula
John L. Kelly was a researcher at Bell Labs in the 1950s. He developed a mathematical theory on how to bet optimally from a finite bankroll in favorable gambling games.
Without going into the mathematical details of it, here’s the basic idea behind the theory as explained by Rob Vinall of RV Capital in his H1 2023 Investor Letter:
The basic idea is that the greater the upside relative to the downside, the more an investor should bet. However, if there is a probability of a total loss, the bet size should be zero as the product of any series of numbers with a zero in it is zero.
Based on studying the Kelly system, including commentary on it from the likes of Ed Thorp and Rob Vinall, here are some key rules that any investor should be aware of while position sizing for any strategy:
1/ Play only when you have an advantage
Here’s how Ed Thorp describes it:
The Kelly system calls for no bet unless you have the advantage. Therefore, it would tell you to avoid games such as craps and slot machines. However, if you have the knowledge and skill to gain an edge in blackjack, you can use the Kelly system to optimize your rate of gain.
Warren Buffett’s ‘Circle of Competence’ rule is also a play on this idea. To have the best odds of winning, choose a game you have an edge in and choose to play at the table with weaker players.
TLDR: focus on identifying your edge before thinking through bet sizing.
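For readers who want to see the “no edge, no bet” rule in numbers, here is a minimal Python sketch of the textbook Kelly fraction for a simple win/lose bet, f* = p − q/b, where p is the probability of winning, q = 1 − p, and b is the net payout per unit staked. The formula is the standard one from the Kelly literature rather than something spelled out above, and the function name and example probabilities are my own assumptions.

```python
def kelly_fraction(p_win: float, net_odds: float) -> float:
    """Fraction of bankroll to stake on a simple win/lose bet.

    p_win    -- probability of winning the bet
    net_odds -- profit per unit staked if the bet wins (1.0 = even money)

    Returns 0.0 when the bet has no positive edge.
    """
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be between 0 and 1")
    if net_odds <= 0:
        raise ValueError("net_odds must be positive")
    raw_fraction = p_win - (1.0 - p_win) / net_odds
    return max(raw_fraction, 0.0)  # never bet without an advantage


# Hypothetical even-money game where you have a small edge.
print(f"Edge in your favour (p=0.52): bet {kelly_fraction(0.52, 1.0):.1%} of bankroll")

# Hypothetical even-money game where the house has the edge.
print(f"Edge against you (p=0.49): bet {kelly_fraction(0.49, 1.0):.1%} of bankroll")
```

Note how small the recommended stake is even with a genuine edge, and how it drops to zero the moment the edge disappears.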
2/ Avoid the risk of ruin
In repeated games (say, a coin toss where heads you win and tails you lose your entire stake), suppose you bet everything on each turn because you know you have an edge (say, a 0.52 probability of heads on each turn). As the number of turns ‘N’ increases, the probability that you will be ruined tends to 1, i.e., certainty.
In the Kelly system, you never bet everything in a single turn so the chance of ruin is zero.
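The coin-toss example can be checked with a few lines of Python. This is a sketch under simplifying assumptions of my own: even-money payouts, independent tosses, and an infinitely divisible bankroll. The function names are hypothetical.

```python
import math


def ruin_probability_all_in(p_win: float, turns: int) -> float:
    """Chance of losing everything within `turns` tosses when staking 100% each time."""
    return 1.0 - p_win ** turns


def expected_log_growth(p_win: float, fraction: float) -> float:
    """Expected log growth per toss when staking `fraction` of bankroll at even money."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1); staking 100% risks total ruin")
    return p_win * math.log(1.0 + fraction) + (1.0 - p_win) * math.log(1.0 - fraction)


P_WIN = 0.52  # the edge used in the coin-toss example above

for turns in (1, 10, 50):
    print(f"All-in for {turns} turns: ruin probability {ruin_probability_all_in(P_WIN, turns):.2%}")

# For an even-money bet the Kelly stake simplifies to 2p - 1.
kelly_stake = 2 * P_WIN - 1
print(f"Kelly stake: {kelly_stake:.0%} of bankroll per toss")
print(f"Expected log growth per toss: {expected_log_growth(P_WIN, kelly_stake):.5f}")
```

Betting everything turns a favorable game into near-certain ruin within a handful of tosses, while the small Kelly stake keeps expected log growth positive and never puts the whole bankroll on the line.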
3/ Bet more when asymmetry is high
The Kelly formula tells us to bet large where there is a big asymmetry between upside and downside. Conversely, it shows that if the risk of loss is too high on a single bet (eg. in Roulette), it’s too dangerous to bet a large fraction of your bankroll.
The former scenario is the method that top value investors follow – betting a big proportion of the fund (10-20%+) on each high-conviction, high-quality business with a large margin of safety. Case in point: Berkshire has ~50% of its publicly traded portfolio in Apple.
We don’t put the most money into things that are going to give us 7-10x returns. We put the most in positions where we will never lose money.
Conversely, the latter ‘Roulette’ insight is the method venture capital investors follow while investing in extremely high-risk startups. They play on the right side of the power law curve – assembling an optimally diversified portfolio of high-risk, high-reward bets; and deploying enough capital in each bet so as to ensure enough ownership per company such that if and when it wins, it wins big enough to compensate for all the other losses in the portfolio.
4/ Value of holding cash
By using concepts like bankroll, betting small portions of it at a time, and not going broke, the Kelly formula also implicitly suggests the value of always holding cash in the portfolio.
Berkshire is famous for holding significant amounts of cash ($100Bn+ in recent years) on its balance sheet at all times.
We believe in always having cash. There have been few times in history where if you don’t have it, you don’t get to play the next day.
Cash is like oxygen. It’s there all the time but if it disappears for a few minutes, it’s all over.
Holding cash also helps in going on offense when unforeseen crises like the dotcom crash or GFC occur. As asset prices crash, these become once-in-a-lifetime opportunities to deploy capital.
C. Incorporating special considerations
While following the above heuristics from the Kelly criterion, it’s also important to keep some room to account for special considerations in your personal context. E.g., Rob Vinall keeps some buffer in case LPs want to withdraw money on short notice.
Position sizing is both an art and a science. Having a well-defined view on it that is congruent with your overall investment strategy is crucial for any investor.
As you think through its nuances, it’s useful to keep in mind the guardrails that the Kelly formula tells us. The Kelly heuristics guide us towards the most optimal, risk-adjusted path for generating returns in probabilistic games like investing while avoiding the risk of total ruin.
During a recent train ride in London, I observed an interesting pattern in the crowd that “rang a bell” in my head.
Here’s why understanding patterns in crowd behavior is important for successful investing.
For those of you who regularly follow my writings, I am sure you have observed by now my fascination with behavioral economics/ finance & the psychology of crowds. One of my major insights from studying the work of OG investors like Charlie Munger, Howard Marks and Bruce Flatt is that the key to superior (i.e., above market average) returns is to be non-consensus & right. Getting a read on how the crowd is behaving at any point in time is one of the important analytical tools necessary to achieve non-consensus behavior.
To simplify, a crowd is a set of largely independent & uncoordinated entities, though you can define it in many other ways as per your context. There are many mental models to visualize the properties & behavior of a crowd. These include the Madness of Crowds, Herd Behavior, Social Proof, Incentives etc. However, during a recent trip to London, the city’s “Tube” train system brought back the most fundamental of these models right in front of my eyes – the normal distribution, popularly called the bell curve.
So, here’s the story. Last week, I landed at Gatwick on a busy morning, and boarded the train to Heathrow. The first thing I observed is how significantly better the London transit system is compared to anything I have experienced in the US. Even the NYC subway is nowhere close in terms of quality, multi-modality & cleanliness.
This particular train (I think it was called the Southeastern) had a very cool feature wherein it displayed how crowded each carriage was in the train, so people could shuffle around. Check out the below pic I took of the display in my train – do you notice an interesting pattern within it?
The distribution of the crowd across carriages is very close to a bell curve. Out of 12 carriages, the middle 5 are “standing room only” (yellow), 3 on the right and 2 on the left are “few seats available” (dark green) and the 2 carriages on extreme left & right are “plenty of seats available” (light green).
Seeing this pattern in a random, real-life event involving hundreds of independent & uncoordinated strangers blew my mind. I couldn’t resist taking its picture even while hanging on to 2 large bags while getting jostled in a..wait for it..middle carriage (see the bottom part of the above pic, it says “you are in coach 7”). I was myself in the middle bulge of the bell curve!
Now, besides this being a nerdy but cool anecdote, is there anything to learn from it? The applicability or importance of a normal distribution is not the main point here. The real insight is that attempting to decode & model how the crowd is behaving in a certain environment, as well as its potential implications, can by itself give investors a massive head start.
So, to be successful at contrarianism, you have to understand (a) what the herd is doing, (b) why it’s doing it, (c) what’s wrong with it, and (d) what should be done instead & why.
The importance of rigorously decoding crowd behavior (or what we often call “the Market”) can’t be emphasized enough for the simple reason that the crowd is right most of the time. When the investor-crowd is signaling that a company is un-fundable, most of the time it has correctly identified a weak business. If the market is predicting an interest rate cut by the Fed in the next few quarters, its combined wisdom is likely to be more accurate than most experts. If investors at large are investing in the AI wave or piling into an EV stock, they are indeed spotting a market opportunity that is likely to be exponential. If investor interest is low in a particular real estate location or type, most of the time it’s due to the right reasons.
While going blindly against the market consensus is flawed, first-order thinking, asking the right questions around “what” the market is doing & “why” is the first step of rigorous, second-order thinking.
The difference between “the market has spotted/ rejected an opportunity correctly” vs “the market is overly optimistic/ pessimistic on the said opportunity” is a fine nuance that can create a big delta on long term returns.
In particular, second-level thinkers understand that the convictions of the masses shape the market, but if those convictions are based on emotion instead of sober analysis, they should often be bet against, not backed.
Abstracting this idea of understanding patterns in crowd behavior a bit more, I believe there is tremendous value in seeing various aspects of life as a distribution of outcomes. Personally, I find probability distributions more helpful in understanding how the real world works in a continuum, as opposed to statistical distributions, which are like static snapshots of reality & more academic in their usefulness.
Probability reflects how life operates in the “grey”. I have found viewing the world probabilistically to be immensely helpful in managing risk & uncertainty in every aspect of life. Too bad they don’t teach these applications while covering the subject in school!
Btw, coming back to the earlier train story, I put the bell curve pattern in how Londoners board trains to practical use by lining up at either the extreme beginning or the extreme end of the platform during subsequent trips. Oh, the joy of boarding an empty carriage at the busy London Bridge station. Just goes to show that being a bit nerdy can sometimes be useful in practice!