If You're Not First, You're (Maybe? Potentially? Probably?) Last
Convexity of long-tail outcomes and why they matter
One of the most common traps investors fall into, regardless of asset class or sector, looks something like this: Company A leads an emerging growth category. Company B is #2 or #3 in that category but has comparable technology and trades at a much lower multiple. Therefore, Company B is an attractive investment.
On the surface the logic is reasonably tight. If this emerging sector is real and growing, then surely the whole competitive field will benefit. And if the whole field benefits but the leader is trading at a premium multiple today, then isn’t it reasonable to underwrite Company B as a cheaply priced version of the same investment?
The problem is that this is almost always wrong. Intuitively we know this because power law dynamics tend to emerge: winners keep winning and losers keep losing.
That’s ultimately the end-stage result, but I think there’s something more precise and less articulated that helps explain why some investors are able to appropriately underwrite this dynamic[1]. That specific thing is the convexity and asymmetric structure of probability-weighted tail outcomes. The proverbial “catch-up” trade ignores (or at a minimum incorrectly sizes) the tails, where much of the information lives.
Candidly, the genesis of this piece came from watching an increasing number of people make the case for Lighter in the perp DEX space. I’ll say a few words later on this specifically, but I think the actual framework for thinking about the broader pattern that consistently plays out across a bunch of categories is more interesting.
Putting aside the difficulty many “value” investors have had over the past decade-plus, it’s reasonable to expect that in mature, stable markets, relative value investing can work beautifully. If Pepsi trades at a lower-than-normal forward multiple to Coke, there’s a reasonable thesis that market forces[2] will close this gap over time. The defining feature of these markets is that they’re large, established and unlikely to either (i) experience some new order-of-magnitude growth catalyst or (ii) collapse entirely. Similarly, these companies have deeply entrenched distribution, real brand equity and customer relationships, meaning any spread between them is likely pricing very specific and actionable ~things~.
This is not the case for emerging technology categories. The space that Pepsi and Coke operate in is largely a defined environment whereas nascent markets represent a probability distribution of possible futures, many of which involve the market becoming dramatically larger or smaller than whatever the current consensus is. And so the typical analyst will make the mistake of anchoring to their base case:
“I believe this market will grow from $X billion to $Y billion over the next 5 years”.
Under this assumption they build some kind of fundamental view and attach a revenue-multiple model suggesting that at $Y billion of market revenue, Company B, with a 20% share, is worth $Z. And since Company B currently trades at a discount to $Z, it looks cheap.
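The base-case math above fits in a few lines. All the inputs below (market size, share, multiple) are illustrative placeholders, not estimates from this piece:

```python
# Naive base-case valuation: anchor on a single market forecast,
# apply a share assumption and a revenue multiple. This is the math
# the "catch-up" trade rests on.
def base_case_value(market_revenue_bn, share, revenue_multiple):
    """Implied value (in $bn) for a company with a given share of a
    forecast market, at an assumed revenue multiple."""
    return market_revenue_bn * share * revenue_multiple

# Illustrative: a $50bn market, 20% share, 8x revenue -> $80bn implied.
implied = base_case_value(50, 0.20, 8)
```

Nothing in this snippet is wrong per se; the problem, as argued above, is everything it leaves out.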
To be clear, there’s nothing inherently wrong with this math. The problem is that the investor is implicitly assigning zero value to the tails of the distribution[3], which in emerging categories can be fatal. These markets tend to have the fattest and most asymmetric tails[4].
If you simplify this:
Scenario A (base case): the market grows roughly in line with expectations. The category leader benefits, as do the followers. Revenue multiples stay roughly in line with current levels, and the catch-up trade might even work here.
Scenario B (the right tail): the market grows dramatically larger than anyone could have expected. We see this happen all the time in technology: the internet, smartphones, cloud, even DeFi all grew way beyond what the most bullish early participants predicted. This has arguably become the defining characteristic of transformative technology, so much so that now people like Cathie Wood have made their whole personality squawking outrageous growth claims.
Scenario C (the left tail): the market fails to develop at the expected scale. For whatever reason the timing is wrong, the technology isn’t ready, society or culture isn’t willing to adopt it, regulation intervenes or the business model turns out to just be broken.
So we have a few scenarios here but in each case the leader and its competitive set are not affected equally.
In Scenario B, the market leader often captures a disproportionate share of the upside. The mechanisms that drove the leader to establish its dominant position (better technology, deeper liquidity, stronger network effects, better unit economics, more talent, etc) compound much more aggressively when the market itself is accelerating. A leader who owns 60% of a $10 billion market tends to capture >60% of the share as it grows into a $100 billion market because the same competitive advantages it leveraged to create a moat in the smaller market become more pronounced and decisive in the larger one. The leader tends to attract better capital, talent, and its tentacles spread faster to create a cycle that the second, third and fourth best in the category can’t replicate or keep up with. The faster this new market emerges and the more exponential its growth, the more likely it is that the leader swallows most of the value creation.
In Scenario C, the asymmetry cuts the other way just as hard. If the market fails to materialize at scale, the leader is of course going to suffer. But the odds that it goes to literal zero are much smaller than its competitors’. Will it perform poorly on an absolute basis? Almost definitely. Will it disappear completely? So long as the market is large enough to support someone, then no. On the flip side, if the market isn’t large enough to support multiple players at any real scale, then the competitive set gets zeroed out completely.
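A probability-weighted sketch of the three scenarios makes the asymmetry concrete. Every probability and payoff below is a made-up illustrative number, not a claim about any specific company:

```python
# Expected value for a leader vs. a follower across the base case (A),
# the right tail (B) and the left tail (C). Payoffs are illustrative
# multiples of today's price.
scenarios = {
    #             P(scenario), leader payoff, follower payoff
    "A_base":        (0.60,        1.5,           1.8),  # catch-up trade can work here
    "B_right_tail":  (0.20,       10.0,           3.0),  # leader's advantages compound
    "C_left_tail":   (0.20,        0.4,           0.0),  # follower gets zeroed out
}

def expected_value(payoff_index):
    """Probability-weighted payoff across all scenarios."""
    return sum(p * payoffs[payoff_index]
               for p, *payoffs in scenarios.values())

leader_ev = expected_value(0)    # 0.6*1.5 + 0.2*10.0 + 0.2*0.4 = 2.98
follower_ev = expected_value(1)  # 0.6*1.8 + 0.2*3.0  + 0.2*0.0 = 1.68
```

Note that the follower wins in the base case, which is exactly why the catch-up trade looks attractive; the leader only pulls ahead once the tails are priced.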
The math here is what matters. Obviously any underwriting is going to believe the base case is most likely (duh), but you can’t really rationally price an investment without probability-weighting the tail outcomes, especially in fast-moving and disruptive markets. It’s adjacent but imo distinct from the “winner-take-most” framing that many investors like to fall back on.
That framing effectively says that in network-effect businesses, the leader tends to accumulate share and the market converges. It doesn’t explain why the multiple should be higher today, before that convergence happens. It also doesn’t help identify how to treat the rest of the competitive set: are these companies actually undervalued, or is the market sniffing out their ultimate fate? One thing it does do is allow investors to be a bit hand-wavey and imprecise when thinking through the future[5]. The convexity of tails can help explain why a meaningful gap in multiple exists (and persists): the leader deserves a higher multiple because it has a better probability-weighted claim on the upside tail and also a higher floor in the downside tail scenario.
There are a bunch of representative examples of this:
Uber & Lyft: Lyft was roughly one turn cheaper on revenue during their last private rounds. The rideshare market grew well beyond 2019 consensus expectations, but the growth compounded almost entirely to Uber’s benefit. Global expansion, Uber Eats, Uber Freight and advertising all helped create a platform with operating leverage that Lyft couldn’t replicate. The right tail scenario materialized and Uber’s structural advantages widened the gap. Both companies saw their multiples compress sharply post-IPO, but Uber’s settled into a more mature 3-4x range while Lyft’s collapsed to <1x as the market priced it as structurally capped.

Fwd rev: Ser. D→~$400M est | Ser. E→~$3B est | Pre-IPO→$14.1B FY2019 actual | IPO→~$16B pre-COVID consensus | FY2021→$31.9B FY22 | FY2022→$37.3B FY23 | FY2024→$51B FY25 est (kind of dated but mostly unch since so deal with it)

Cloudera / Snowflake / Databricks: When Snowflake went public in September 2020 at a 60x forward revenue multiple, critics pointed to Cloudera as the “cheap” alternative in a world where cloud data was expected to grow meaningfully. The more established enterprise data company was trading at 7-10x revenue, so surely this meant it was relatively cheap and would also stand to benefit if cloud data growth was even larger than expected. The cloud data market indeed grew way beyond base case expectations, but Snowflake’s >150% NRR[6] meant its compounding revenue nearly doubled every year. At the same time, Cloudera’s legacy on-prem architecture became a liability. Snowflake grew revenues to $3.6B by FY2025 while Cloudera was unceremoniously taken private in 2021. Maybe just as interesting is that all the while, Databricks was also building here. Most people thought it was complementary to Snowflake, up until its SQL product launch in 2020-2021. Databricks had bet earlier and more aggressively on the convergence of data engineering, ML and enterprise AI, and is now growing at more than twice the rate of Snowflake[7]. The broader implication here is that the right question for any emerging category is “what does the market look like in the right tail outcome scenario, and who is architecting for that version?”
(the exact numbers in these curves are less relevant than the shape + implications)



There are other examples here, like US sports betting, or, in the case of a market that never reached expectations, the daily-deals sector[8]. But the last specific case worth talking about is the original inspiration for writing this piece.
To understand Hyperliquid today, it’s important to understand where the perp DEX market came from. In early 2023, dYdX held ~70% of total decentralized perp volume and for all intents and purposes was publicly viewed as the undisputed category leader. It had the deepest liquidity, institutional backing, a robust codebase and what some believed was a durable first-mover advantage. That belief obviously turned out to be misplaced, as a new wave of perp DEXs led by Hyperliquid has rendered dYdX irrelevant[9].
The argument ahead of Lighter’s launch from some was that Hyperliquid would suffer a similar fate as dYdX. “People will just continue to farm the next perp exchange in a never-ending cycle of mercenary capital flight”.
That has (thus far) been largely proven false. But the more recent argument now sounds more like:
“perps and on-chain trading will eat a larger share of the financial world, Lighter offers a genuine technical differentiator and is undervalued even if it never overtakes Hyperliquid”.
I think this could maybe be true. But I also think it’s more likely not the case. And the burden of proof is certainly on Lighter (and every other exchange competing for the same pie) to prove its skeptics wrong.
No single metric is perfect, but volume/OI for perp exchanges is particularly useful. Lower = more organic activity (capital stays open in real positions). Higher = more churn, farming or incentive-driven turnover. Stable, consistent ranges are how you get a sense that an exchange is normalizing and exiting any incentive programs or volume explosions around anticipated airdrop farming. Lighter and Aster both showed this classic arc and have now compressed as post-airdrop volume normalizes relative to OI. Variational saw a spike that coincided with the launch of their Omni points program. Hyperliquid’s OI has grown in proportion to its volume, reflecting a healthy pattern of real adoption.
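The heuristic above is trivial to compute. The threshold below is my own hand-wavy illustration of where “organic” shades into “churny”, not a standard cutoff:

```python
def volume_oi_ratio(daily_volume, open_interest):
    """Daily volume / open interest. Lower suggests capital sitting in
    real positions; higher suggests churn or farming-driven turnover."""
    return daily_volume / open_interest

# Hypothetical threshold, purely for illustration.
def looks_incentive_driven(ratio, threshold=5.0):
    return ratio > threshold
```

The point is less the threshold itself than watching the ratio's trajectory: a spike-and-compress arc around a points program vs. a stable range as an exchange normalizes.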
I didn’t want to spend too much specific time on Hyperliquid vs Lighter because most people are too tribal and will assume I am just a hater. Realistically I would much prefer LIT becomes a token I can buy because the company is doing well. I want more good tokens.
The last thing I will highlight is more broadly applicable than just crypto[10], but is easily observed through the lens of these two protocols. In traditional equity markets, dividend yield is just annual dividend per share / current share price. The mechanical reality of this formula makes it extremely dangerous as a standalone metric since yield and price move inversely. If a company’s stock falls 50% and the dividend remains flat, the yield doubles. Should we consider this a more attractive investment now that the yield has gone up? Or is the market repricing the equity lower the true signal?
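The mechanical inversion is easy to see in code; the dividend and prices below are arbitrary numbers chosen only to show the mechanics:

```python
def dividend_yield(annual_dividend_per_share, price):
    """Trailing dividend yield as a fraction of the share price."""
    return annual_dividend_per_share / price

# Flat $2 dividend; the stock falls 50%.
before = dividend_yield(2.0, 100.0)  # 2% yield
after = dividend_yield(2.0, 50.0)    # 4% yield -- doubled with zero change
                                     # to the underlying business
```

The yield "improved" purely through the denominator, which is exactly why it cannot be read as a standalone signal.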
The relationship between this yield and valuation quality depends a lot on what phase of the business cycle we’re analyzing.
In a regime where the market is mature and stable, then a high yield is a feature: think of utilities or REITs. There’s limited reinvestment opportunity and the TAM isn’t expanding meaningfully. Here it’s fair to say that yield is a legitimate valuation anchor.
Then there’s a second regime where there’s real growth alongside rising valuation, in which case a falling yield is a feature. In high-growth categories, a falling yield means the market is pricing in future growth at a rate that overwhelms the current payout. An analyst who said “Apple’s yield is too low, it’s overvalued relative to AT&T” in 2012 was applying the wrong framework to the wrong asset.
The final regime is a distressed business with collapsing valuation, in which case an exploding yield is actually a warning signal. This is the proverbial yield trap. Telecom companies in the early 2010s or European banks in 2008-12. Retailers that face disruption or shifting consumer habits fall into this bucket all the time. The yield looks attractive when the reality is that the market has already made a judgment that the business faces structural headwinds that it is unlikely to overcome.
If we applied this to the perps DEX space today then one could argue Hyperliquid sits somewhere in regime number two: a growth business where the right analytical framing is revenue compounding, market share growth and TAM expansion. Its buyback mechanism is well-understood, but the dominant signal is that the market is pricing in a future in which Hyperliquid captures a material share of a $50-100B+ annual fee pool that doesn’t exist yet. In that scenario, a lower current buyback yield makes sense as the denominator is pricing future growth. The buyback yield shrinking over time reflects exactly this[11].
Lighter, on the other hand, faces a fundamentally different situation. First, the FDV has compressed materially since TGE. It’s possible this is short-term distribution and just weak broader markets weighing more heavily on it. But the denominator of the buyback yield is smaller now, so mechanically the yield is going to look higher without any change to the underlying business. It’s also important to note that an outsized share of the revenue generating buybacks was earned during a farming-incentive period, whether explicit or implicit. That revenue has declined sharply as volume normalizes, so the burden of proof is on the company to prove that the numerator of the fraction won’t continue to shrink. I’d actually prefer to see this come to fruition because more “good” tokens is what crypto needs, but if it doesn’t, then this is kind of textbook regime three dynamics: a yield that looks high precisely because the market is (correctly) repricing the asset down in anticipation of revenue deterioration.

For this to exhibit a healthy repricing, you’d want to see the yield compress (i.e. price recovering faster than revenues) or revenue stabilizing while price begins trending higher. Neither is happening, which suggests the market is essentially saying the current revenue run-rate is the new normal and the token is being priced accordingly every day.
Even if you disagree with how I’m framing this specific example, the broader point that matters in my view is an appreciation for why assets trade and are valued the way they are. Every argument or framework that flattens the way we think about pricing assets, relative value, dispersion and long-tail outcomes does a disservice to the craft of investing.
1. and simultaneously recognize where relative value does actually exist
2. this can obviously mean many things
3. the slightly less lazy version of this is doing the bear/base/bull case. but even this is pretty arbitrary because realistically this isn’t capturing tails, it’s just capturing the first standard deviation around the base case
4. understanding kurtosis and skew, and then applying your pov on these to the markets you spend time in, is probably worthwhile (nfa)
5. “the market will be so large it doesn’t matter bro”
6. existing customers were not just renewing but meaningfully expanding their usage y/o/y
7. Databricks completed a Series L (lol) in December at $134B, which to be fair I suspect would trade meaningfully lower if it were a public company today
8. Groupon, LivingSocial
9. some of this was self-inflicted tbf
10. though it is particularly apt in crypto as we have more limited ways to push value back to tokenholders atm
11. One of the legitimate criticisms of Hyperliquid is the static nature of its buyback program (“the capital could be deployed more effectively to growth initiatives”). In my opinion this is a fair (though moot) argument. The counterpoint is that it works because of the flywheel it reinforces (fees → HYPE buybacks → HYPE price support → validator incentives → better infra → more volume → more fees) but that flywheel only functions at scale. For a challenger doing 5% of Hyperliquid’s revenue, the same dollars deployed into competitive tactics would almost certainly generate higher returns than buybacks.








