<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Bunnyhopping]]></title><description><![CDATA[Bunnyhopping to AGI]]></description><link>https://tmychow.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!Q5OG!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7e5f76-c6c6-40e6-bd6f-28e0cf1b0fb3_856x856.png</url><title>Bunnyhopping</title><link>https://tmychow.substack.com</link></image><generator>Substack</generator><lastBuildDate>Mon, 09 Mar 2026 01:12:05 GMT</lastBuildDate><atom:link href="https://tmychow.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Trevor Chow]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[tmychow@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[tmychow@substack.com]]></itunes:email><itunes:name><![CDATA[Trevor Chow]]></itunes:name></itunes:owner><itunes:author><![CDATA[Trevor Chow]]></itunes:author><googleplay:owner><![CDATA[tmychow@substack.com]]></googleplay:owner><googleplay:email><![CDATA[tmychow@substack.com]]></googleplay:email><googleplay:author><![CDATA[Trevor Chow]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[RL turns smart models into useful products]]></title><description><![CDATA[Diffusing the AGI with product-model co-design]]></description><link>https://tmychow.substack.com/p/rl-turns-smart-models-into-useful</link><guid isPermaLink="false">https://tmychow.substack.com/p/rl-turns-smart-models-into-useful</guid><dc:creator><![CDATA[Trevor Chow]]></dc:creator><pubDate>Tue, 19 Aug 2025 10:25:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!DB4M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39233045-96e3-4fca-8390-3e9f1615d45e_1222x762.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Github repo <a href="https://github.com/tmychow/financial-autocomplete">here</a></p><div><hr></div><p>Some people think of it purely in terms of intelligence. Today, the models are smashing academic olympiads <a href="https://x.com/SherylHsu02/status/1946478334013321231">left</a> and <a href="https://x.com/SherylHsu02/status/1954966109851119921">right</a>, as well as <a href="https://x.com/joel_bkr/status/1953531298510979235">completing longer tasks</a> up to <a href="https://x.com/andresnds/status/1945655849822670896">10 hours at a time</a>. 
To them, it might feel like we&#8217;re pretty close to AGI.</p><figure><img src="https://substackcdn.com/image/fetch/$s_!VciH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F34a7ac35-a591-4309-adef-3bc0655dbda1_1600x942.png" alt="" /></figure>
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I went into university expecting to do an economics PhD, and switched to working on ML research due to some optimistic extrapolations of the <a href="https://arxiv.org/abs/2005.14165">GPT-3 paper</a>, so my view of AGI is driven by its potential to be economically transformative.</p><p>From this perspective, the diffusion and impact of AI across the broad economy remains limited: <a href="https://en.wikipedia.org/wiki/Productivity_paradox">you can see the AI age everywhere except in the productivity statistics</a>.</p><h3>Is there a capability overhang?</h3><p>One explanation for this divergence is that products haven&#8217;t caught up to model intelligence. While there are valuable products which are yet to be created by scaffolding current models, this need for scaffolding reflects a deeper issue: people buy products for usefulness, not intelligence.</p><p>Usefulness requires agency, reliability, low cost and latency, alignment with user intent, integration with business processes, and much more. Along all of these dimensions, the models are simply not capable enough right now.</p><p>One direction which people have been betting on is using RL to bake these traits into the model, making it less reliant on scaffolding. RL seems to be very sample-efficient at getting domain-specific improvements, and certainly more cost-effective than pre-training.</p><p>Yet the majority of RL progress has been limited to a few domains like software engineering and competitive mathematics. Can we actually use it to accomplish economically valuable labour?</p><h3>How do you apply RL to a real task?</h3><p>The best way to understand this is to actually go after a real in-the-wild task, so I picked the problem of giving inline completions for financial data as you are typing. </p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;809575d4-1fae-46a8-8141-a29383e91387&quot;,&quot;duration&quot;:null}"></div><p>This maps onto a decent chunk of work in finance, which is retrieving information when writing a memo or creating a deck. It is also an unsolved problem: there is no good inline autocomplete for writing, and certainly not one which retrieves financial data accurately. 
Large models are far too slow and expensive, while on-device models struggle to chain tool calls, figure out when data is needed, and avoid making up numbers altogether.</p><p>To simulate this task, I populated a database with 10-K and 10-Q data, synthetically generating many prompt-completion pairs from it, along with some prompts which did not require any completion.</p><figure><img src="https://substackcdn.com/image/fetch/$s_!oBfc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F747df088-4c76-4f30-82d5-4dae6ac95c49_1176x422.png" alt="" /></figure>
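<p>For concreteness, here is a minimal sketch of that kind of generator. The rows, templates, and field names are illustrative placeholders rather than the actual schema in the repo.</p><pre><code>import random

# Hypothetical rows pulled from the 10-K/10-Q database, in the form
# (company, period, metric, value). Not the repo's actual schema.
ROWS = [
    ("Acme Corp", "FY2023", "revenue", "$12.4 billion"),
    ("Acme Corp", "Q2 2024", "operating margin", "18.3%"),
]

# Templates whose natural continuation is a specific datapoint.
TEMPLATES = [
    "In {period}, {company} reported {metric} of",
    "{company}'s {metric} for {period} came in at",
]

def make_pair(company, period, metric, value):
    prompt = random.choice(TEMPLATES).format(
        company=company, period=period, metric=metric
    )
    return {"prompt": prompt, "completion": " " + value}

# Prompts where the correct behaviour is to complete nothing at all.
NO_COMPLETION = [{"prompt": "Overall, we remain optimistic about", "completion": ""}]

dataset = [make_pair(*row) for row in ROWS] + NO_COMPLETION</code></pre>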
loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Then I built an environment where a model would be given a prompt, as well as the ability to use a &#8220;search&#8221; tool to find the right information from the database and use a &#8220;return_answer&#8221; tool to either return the datapoint or an empty string for the no-completion cases. The answer would be judged on whether it was accurate compared to the ground truth pairs I had synthetically generated.</p><p>For evaluation, I held out some companies, time periods, metrics, and prompts, and as a baseline, I evaluated GPT-4.1, which achieved 80% accuracy with a p95 latency of 3.3 seconds. By contrast, Qwen 2.5 3B Instruct, which can be run locally on a laptop, achieved only 43% accuracy with a p95 latency of 2.6 seconds.</p><p>I then RL fine-tuned the Qwen model for a few hours on a single A100, and it went up to an accuracy rate of 93% while dropping its p95 latency to 1.1 seconds. 
In other words, RL transformed the model into a genuinely useful product with frontier-level performance at 3x the speed.</p><figure><img src="https://substackcdn.com/image/fetch/$s_!DB4M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39233045-96e3-4fca-8390-3e9f1615d45e_1222x762.png" alt="" /></figure>
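<p>Concretely, the environment&#8217;s rollout loop looks roughly like the sketch below. The <code>model.act</code> interface, the toy <code>search</code> lookup, and the binary exact-match reward are simplifying assumptions of mine, not the repo&#8217;s exact implementation.</p><pre><code>def search(db, query):
    # Toy retrieval over a dict of "company period metric" keys to values.
    terms = query.lower().split()
    return [v for k, v in db.items() if all(t in k for t in terms)]

def run_episode(model, db, prompt, ground_truth, max_turns=8):
    """One rollout: the model alternates tool calls until it answers."""
    transcript = [prompt]
    for _ in range(max_turns):
        act = model.act(transcript)  # hypothetical interface returning a tool call
        if act["tool"] == "return_answer":
            # The empty string is the right answer for no-completion prompts.
            return 1.0 if act["arg"].strip() == ground_truth.strip() else 0.0
        transcript.append(str(search(db, act["arg"])))
    return 0.0  # ran out of turns without answering</code></pre>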
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It seems like RL gives us a clear line-of-sight towards model improvements. If it&#8217;s such a free lunch, why isn&#8217;t everyone doing RL?</p><h3>What are the footguns in applied RL?</h3><p>Let me describe 4 lessons from <a href="https://github.com/tmychow/financial-autocomplete">doing this experiment</a>, which make this less straightforward.</p><p>The first is the importance of obsessing over data quality: <a href="https://nonint.com/2023/06/10/the-it-in-ai-models-is-the-dataset/">the &#8220;it&#8221; in the AI models is the dataset</a>. For example, some of the financial metrics were getting saved with the wrong units, such that $9 billion was getting shown to the model as $9 billion billion. This is the kind of thing that can confuse a model, but is hard to notice without manually looking through lots and lots of rollout trajectories.</p><p>The second is that &#8220;model empathy&#8221; matters. Try and simulate what you would do if you had only the information it was given, and no additional context. This requires talking with the model a bunch, to calibrate on what the model knows.</p><p>For example, looking at the logs made me realise that the model was getting told that its tool calls were invalid, because it had used &#8220;selling, general &amp; administrative&#8221; and &#8220;sg&amp;a&#8221;, instead of &#8220;selling, general and administrative&#8221;. It&#8217;s just not very reasonable to expect a model to iterate through all the different ways of typing about a particular metric in order to figure out which is acceptable, so I made the environment more tolerant of different names for the same thing.</p><p>The third is that reward shaping is hard, reward hacking is always looming, and every step forward can feel like a step back.</p><p>Here&#8217;s an illustration:</p><ul><li><p>You start by giving the model only a reward signal depending on correctness.</p></li><li><p>This is very sparse, so the model struggles to learn to reason about tool use.</p></li><li><p>You add a reward signal on the intermediate steps e.g. 
<p>The third is that reward shaping is hard, reward hacking is always looming, and every step forward can feel like a step back.</p><p>Here&#8217;s an illustration:</p><ul><li><p>You start by giving the model a reward signal based only on correctness.</p></li><li><p>This is very sparse, so the model struggles to learn to reason about tool use.</p></li><li><p>You add a reward signal on the intermediate steps, e.g. rewards for using the &#8220;search&#8221; tool correctly, and penalties for using &#8220;search&#8221; in a no-completion case or too many times.</p></li><li><p>The model only ever returns &#8220;no completion needed&#8221;.</p></li><li><p>You do some back-of-the-envelope calculations and realise that, given the relative weights of the rewards and penalties, the likelihood of at least one wrong tool call, and the fraction of the dataset which is no-completion, the expected value of trying to answer is lower than that of simply answering &#8220;no completion needed&#8221; every time.</p></li><li><p>You re-adjust the weights, but notice that the model now brute-forces the task by using many turns to try many different tool calls, so you impose a maximum number of turns.</p></li><li><p>The model suddenly seems to be performing well, but the number of turns has gone up.</p></li><li><p>You take a look and realise that the model is spamming non-ASCII characters in each turn until it hits the maximum turn limit. This leads the environment to return &#8220;No response&#8221; to the judge, which treats that as correct in the no-completion cases.</p></li></ul><p>In other words, you need to be paranoid every step of the way.</p>
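<p>To make these failure modes concrete, here is a minimal sketch of a shaped reward of this kind. All the coefficients and episode fields are hypothetical; the point is that their relative magnitudes, not just their signs, determine what policy the model converges to.</p><pre><code>def shaped_reward(ep):
    """Correctness plus shaping terms. Every coefficient is a lever that
    can silently flip the optimal policy, so sanity-check the maths."""
    r = 1.0 if ep.answer_correct else 0.0
    r += 0.05 * ep.valid_search_calls           # well-formed "search" calls
    r -= 0.10 * ep.searches_when_no_completion  # searching when nothing is needed
    r -= 0.02 * max(0, ep.num_turns - 4)        # discourage brute-forcing turns
    return r

# Back-of-the-envelope check: if the expected shaped reward of attempting an
# answer falls below that of always returning "no completion needed", the
# model will learn to abstain on everything.</code></pre>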
<p>The fourth is a corollary of the three above: tooling matters. One part of this is observability, e.g. inspecting the data, seeing what the model is seeing, and tracking what is happening in the rollouts. A lot of the time spent babysitting a training run goes on looking through metrics on wandb and using them to diagnose emerging issues; for instance, right as the model drifted into the weird part of the policy which involved spamming non-ASCII characters, both the KL divergence and the gradient norm spiked.</p><p>The other part of tooling is the infrastructure to make iteration faster. This helps with the usual ML science, e.g. sweeping hyperparameters, but it is especially useful for RL, where the speed of your rollouts makes a big difference to wall-clock time, since a larger fraction of training is spent on inference.</p><h3>What determines the right to win?</h3><p>None of these pitfalls is individually intractable, but they nonetheless deter many folks from applying RL because they require the <a href="https://www.youtube.com/watch?v=gLwiPrwUDJ8">co-design of product and research</a>.</p><p>Unlike pre-training, where you are trying to improve capabilities broadly, in RL you are after a specific goal. This means you need to know how the model fits into the end product, so that you can decide on the right data and environment, what to optimise for, and how to craft the right rubric.</p><p>Figuring this out looks a lot like what I spent my time doing as a founder, i.e. talking with customers, dealing with the enterprise sales cycle, and finding an internal champion at the company. In many cases, what most helps get a customer to conviction is giving them a working demo very quickly, which they can play with and use to advocate internally.</p><p>At the same time, your ability to ship quickly depends heavily on your research tooling. This project took me a bit over a week, but much of that was spent debugging infrastructure issues. Now that I have done it once, I can recycle it for the next project, and the one after that.</p><p>Very few teams are set up to ship at startup speed while also building a reusable research platform. One view from the <a href="https://www.8vc.com/resources/the-ai-services-wave-lessons-from-palantir-in-the-new-age-of-ai">full-stack AI startups</a> and the <a href="https://www.theinformation.com/articles/venture-capitals-latest-strategy-private-equity-style-roll-ups?rc=gxzu9x">private equity rollups</a> is to run a &#8220;services-to-product&#8221; playbook that starts with owning a vertical and gradually building the tooling around it.</p><p>However, I&#8217;m more excited about a different approach, one where the frontier AI labs and <a href="https://www.chemistry.vc/post/rl-reigns-supreme">RL-as-a-service companies</a> leverage their existing infrastructure to deploy rapidly across domains. Unlike incumbents at many other points in history, these companies are still incredibly young.</p><p>They also have an advantage in cost structure, since they can amortise fixed research costs and bring down the average cost of each deployment. Owning the research means they can integrate traits they care about into the training pipeline, e.g. baking certain general capabilities into pre-training to make RL easier, or distilling smaller models to make inference cheaper.</p><p>If they can scale this co-design of product deployment with model research, we have a real shot at AGI, not just as a singular academic milestone but as a diffuse economic achievement.</p>]]></content:encoded></item><item><title><![CDATA[Pre-training isn’t dead, it’s just resting]]></title><description><![CDATA[GPT-4.5, the value of RL, and the economics of frontier training]]></description><link>https://tmychow.substack.com/p/pre-training-isnt-dead-its-just-resting</link><guid isPermaLink="false">https://tmychow.substack.com/p/pre-training-isnt-dead-its-just-resting</guid><dc:creator><![CDATA[Trevor Chow]]></dc:creator><pubDate>Mon, 21 Apr 2025 07:05:25 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3f6e6104-ed07-48ab-b3f6-5ab4d629059a_1162x720.png" length="0" type="image/png"/><content:encoded><![CDATA[<p>For two years after GPT-4, every model OpenAI released was smaller. On February 27, 2025, that finally changed. OpenAI launched GPT-4.5, calling it &#8220;<a href="https://cdn.openai.com/gpt-4-5-system-card-2272025.pdf">the next step in scaling the unsupervised learning paradigm</a>&#8221;.
Here&#8217;s what people thought:</p><figure><img src="https://substackcdn.com/image/fetch/$s_!8rxK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc4c8494-99d8-4d14-a226-a19bdc59018f_1178x342.png" alt="" /></figure>
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9SHq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6585c3bf-c9e3-4bde-8506-b91cfcce2ca3_1164x468.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9SHq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6585c3bf-c9e3-4bde-8506-b91cfcce2ca3_1164x468.png 424w, https://substackcdn.com/image/fetch/$s_!9SHq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6585c3bf-c9e3-4bde-8506-b91cfcce2ca3_1164x468.png 848w, https://substackcdn.com/image/fetch/$s_!9SHq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6585c3bf-c9e3-4bde-8506-b91cfcce2ca3_1164x468.png 1272w, https://substackcdn.com/image/fetch/$s_!9SHq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6585c3bf-c9e3-4bde-8506-b91cfcce2ca3_1164x468.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9SHq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6585c3bf-c9e3-4bde-8506-b91cfcce2ca3_1164x468.png" width="1164" height="468" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6585c3bf-c9e3-4bde-8506-b91cfcce2ca3_1164x468.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:468,&quot;width&quot;:1164,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9SHq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6585c3bf-c9e3-4bde-8506-b91cfcce2ca3_1164x468.png 
<figure><img src="https://substackcdn.com/image/fetch/$s_!OkWy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffada1ab2-b3bb-460f-9c4f-150cc189b998_1160x424.png" alt="" /></figure>
src="https://substackcdn.com/image/fetch/$s_!OkWy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffada1ab2-b3bb-460f-9c4f-150cc189b998_1160x424.png" width="1160" height="424" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fada1ab2-b3bb-460f-9c4f-150cc189b998_1160x424.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:424,&quot;width&quot;:1160,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OkWy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffada1ab2-b3bb-460f-9c4f-150cc189b998_1160x424.png 424w, https://substackcdn.com/image/fetch/$s_!OkWy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffada1ab2-b3bb-460f-9c4f-150cc189b998_1160x424.png 848w, https://substackcdn.com/image/fetch/$s_!OkWy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffada1ab2-b3bb-460f-9c4f-150cc189b998_1160x424.png 1272w, https://substackcdn.com/image/fetch/$s_!OkWy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffada1ab2-b3bb-460f-9c4f-150cc189b998_1160x424.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nnmc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd20a536b-6e9b-47ab-98ad-ce4904928047_1166x328.png" 
<p>The launch video didn&#8217;t do the model any favours. Rather than focusing on GPT-4.5&#8217;s strengths, it highlighted how small the improvements were relative to GPT-4o, and that the model underperformed o3-mini, a 50x cheaper model.</p><figure><img src="https://substackcdn.com/image/fetch/$s_!lXSS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2d0655b-7867-49c9-bf33-28ed93865661_1046x454.png" alt="" /></figure>
<p>Only a month and a half after the model&#8217;s release, OpenAI announced it was <a href="https://techcrunch.com/2025/04/14/openai-plans-to-wind-down-gpt-4-5-its-largest-ever-ai-model-in-its-api/">going to be deprecated</a>.</p><h3>Why did GPT&#8209;4.5 disappoint?</h3><p>When GPT-4 was released in 2023, it was seen as something fundamentally unprecedented.
<a href="https://arxiv.org/pdf/2303.08774">It blew past GPT-3.5 on the majority of benchmarks</a>, and in some cases, saturated them entirely by reaching human-level performance.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fD4h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f0e307-49f2-47e6-9e34-500f27119efa_1164x622.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fD4h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f0e307-49f2-47e6-9e34-500f27119efa_1164x622.png 424w, https://substackcdn.com/image/fetch/$s_!fD4h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f0e307-49f2-47e6-9e34-500f27119efa_1164x622.png 848w, https://substackcdn.com/image/fetch/$s_!fD4h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f0e307-49f2-47e6-9e34-500f27119efa_1164x622.png 1272w, https://substackcdn.com/image/fetch/$s_!fD4h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f0e307-49f2-47e6-9e34-500f27119efa_1164x622.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fD4h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f0e307-49f2-47e6-9e34-500f27119efa_1164x622.png" width="1164" height="622" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a9f0e307-49f2-47e6-9e34-500f27119efa_1164x622.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:622,&quot;width&quot;:1164,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fD4h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f0e307-49f2-47e6-9e34-500f27119efa_1164x622.png 424w, https://substackcdn.com/image/fetch/$s_!fD4h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f0e307-49f2-47e6-9e34-500f27119efa_1164x622.png 848w, https://substackcdn.com/image/fetch/$s_!fD4h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f0e307-49f2-47e6-9e34-500f27119efa_1164x622.png 1272w, https://substackcdn.com/image/fetch/$s_!fD4h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f0e307-49f2-47e6-9e34-500f27119efa_1164x622.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>GPT-4&#8217;s launch also came with a <a href="https://www.youtube.com/watch?v=outcGtbnMuQ">developer livestream</a> full of feats that seemed unimaginable for GPT-3.5, such as converting a hand-drawn wireframe into a fully functioning website in a single request.</p><p>Where GPT-4 felt like a step change, GPT-4.5 felt like a letdown.</p><p>The biggest contributor to that was the many models released in between the two. Just from OpenAI, we saw updates to GPT-4, GPT-4 Turbo, the many iterations of GPT-4o, and the o-series of reasoning models. This made it hard to see the gains from purely increasing pre-training compute from GPT-4 to GPT-4.5 and begged the question: is pre-training a dead end?</p><h3>Is Pre-Training A Dead End?</h3><p>To answer this question, we need to know what happens when we only vary pre-training compute.</p><p>A natural experiment in scaling pre-training comes from Meta, who released <a href="https://arxiv.org/abs/2407.21783">8 different text-only Llama 3 language models</a> in 2024. The models range from 1B to 405B parameters, and since they came out over the course of only 8 months, we assume that there was relatively little compute efficiency improvement.</p><p>Based on these models, we can approximate a scaling relationship between the amount of pre-training compute and benchmark performance. 
This lets us extrapolate how GPT-4.5 should perform relative to GPT-4, since we know that it used <a href="https://x.com/karpathy/status/1895213020982472863">10x more pre-training compute</a> than GPT-4<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>.</p>
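<p>As a rough sketch of what this extrapolation looks like in practice: fit benchmark score against log-compute for the Llama 3 herd, then read off the prediction at 10x GPT-4&#8217;s compute. The token counts below are assumptions and the scores are placeholders for illustration (the real fit uses the published Llama 3 results, and a sigmoid is a better choice than a line near saturation):</p><pre><code>import numpy as np

# Illustrative (compute, score) pairs for the Llama 3 herd.
# Pre-training FLOPs ~ 6 * params * tokens; the token counts are
# assumptions and the scores are placeholders, not published numbers.
params = np.array([1e9, 3e9, 8e9, 70e9, 405e9])
tokens = np.array([9e12, 9e12, 15e12, 15e12, 15e12])
flops = 6 * params * tokens
scores = np.array([35.0, 45.0, 55.0, 70.0, 78.0])

# Fit score as a linear function of log10(compute).
slope, intercept = np.polyfit(np.log10(flops), scores, 1)

gpt4_flops = 2.5e25            # estimate from footnote 1
gpt45_flops = 10 * gpt4_flops  # GPT-4.5 used ~10x more pre-training compute
predicted = slope * np.log10(gpt45_flops) + intercept
print(f"predicted GPT-4.5 score: {predicted:.1f}")</code></pre>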
<p>What we find is that GPT-4.5 is on track with, if not beating, the pre-training trends<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>:</p><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/f6d2d6a9-44db-4778-850f-09a21cf9a5fd_1582x872.png" alt=""></figure></div><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/67d9d9e8-e22e-40b6-8a0b-3a59bd107d8e_1588x872.png" alt=""></figure></div><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/54c2df68-a9cb-4656-af5e-1ec7fc69f79e_1588x876.png" alt=""></figure></div><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/654eab2a-a865-4d1e-a509-1f887ff4ed56_1584x870.png" alt=""></figure></div><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/cd1c7f82-e633-4fae-8ef6-90d1b777a0c1_1264x684.png" alt=""></figure></div><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/f015db52-41e7-4c7a-a49d-f5a913d9f0e5_1586x874.png" alt=""></figure></div><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/89228352-15f9-4179-869b-8a9ccf91d28c_1584x858.png" alt=""></figure></div><p>In other words, scaling still works. However, there&#8217;s a very good reason most frontier models have been smaller than GPT-4<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>.
It&#8217;s because researchers found ways to get better performance for far less money in other parts of the training stack.</p><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/de2624d2-77f4-4ac7-8fd4-f527ac53c7c9_1264x684.png" alt=""></figure></div><p>Consider the models from the mid-2024 vintage, like GPT-4o and Claude Sonnet 3.6. They were 4 to 8 times smaller than GPT-4 while delivering comparable or better performance. This was driven by a range of techniques in <a href="https://vintagedata.org/blog/posts/what-is-mid-training">mid-training</a> and <a href="https://www.interconnects.ai/p/the-state-of-post-training-2025">post-training</a>:</p><ul><li><p>Annealing on higher quality, distributionally different or longer context data</p></li><li><p>Instruction-tuning on synthetic data</p></li><li><p>Preference-tuning on human and AI feedback, e.g. reinforcement learning from human feedback (RLHF)</p></li></ul><p>Meanwhile, the late 2024 / early 2025 vintage, like o1 and o3, showed staggering results by <a href="https://tmychow.substack.com/p/from-apples-to-strawberries">doing reinforcement learning</a> (RL) on chains of thought (CoT).</p><h3>The Future of Model Training</h3><p>Scaling RL and CoT is a very effective strategy for getting improvements in specific domains. That&#8217;s why the trend in the immediate future will be to continue in that direction. However, there will come a point when the industry&#8217;s focus returns to pre-training.</p><p>This is because of the economic realities of serving models.</p><p>Firstly, if customers have a range of different use cases, serving them well will require doing RL training on a very large set of domains, especially since RL in one area can degrade unrelated capabilities. Models that have been <a href="https://dynomight.net/chess/">RLHF&#8217;ed for chat</a> are worse at playing chess than peers which are <a href="https://x.com/GrantSlatton/status/1703913578036904431">closer to just pre-trained models</a>, and anecdotally, Claude 3.7 Sonnet feels more reward-hacky than its less RL&#8217;ed predecessor. Doing lots of RL is expensive<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>.</p><p>Secondly, if customers are using many CoT tokens at inference, models will become very expensive to serve.
It will eventually become cheaper to have a bigger and smarter model which requires fewer CoT tokens to get the answer<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a>.</p>
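<p>A back-of-the-envelope sketch of that trade-off, with entirely made-up prices and token counts: if per-token serving cost grows roughly with model size, a 10x bigger model still wins on cost per query whenever it needs fewer than a tenth as many total tokens.</p><pre><code># Illustrative serving-cost comparison; every number here is hypothetical.
# Assume per-token cost scales roughly linearly with active parameters.
COST_PER_TOKEN_PER_B_PARAMS = 1e-8  # dollars per token per billion params

def cost_per_query(params_b, cot_tokens, answer_tokens=500):
    """Dollar cost of one query: (CoT + answer tokens) * per-token cost."""
    return (cot_tokens + answer_tokens) * params_b * COST_PER_TOKEN_PER_B_PARAMS

small = cost_per_query(params_b=70, cot_tokens=20_000)  # thinks for a long time
big = cost_per_query(params_b=700, cot_tokens=1_000)    # 10x bigger, thinks less

print(f"small model: ${small:.4f}/query, big model: ${big:.4f}/query")</code></pre>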
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_nuj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e70046f-b52d-4da1-9051-dfe0418c5bd7_1268x822.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_nuj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e70046f-b52d-4da1-9051-dfe0418c5bd7_1268x822.png 424w, https://substackcdn.com/image/fetch/$s_!_nuj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e70046f-b52d-4da1-9051-dfe0418c5bd7_1268x822.png 848w, https://substackcdn.com/image/fetch/$s_!_nuj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e70046f-b52d-4da1-9051-dfe0418c5bd7_1268x822.png 1272w, https://substackcdn.com/image/fetch/$s_!_nuj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e70046f-b52d-4da1-9051-dfe0418c5bd7_1268x822.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_nuj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e70046f-b52d-4da1-9051-dfe0418c5bd7_1268x822.png" width="1268" height="822" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e70046f-b52d-4da1-9051-dfe0418c5bd7_1268x822.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1268,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!_nuj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e70046f-b52d-4da1-9051-dfe0418c5bd7_1268x822.png 424w, https://substackcdn.com/image/fetch/$s_!_nuj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e70046f-b52d-4da1-9051-dfe0418c5bd7_1268x822.png 848w, https://substackcdn.com/image/fetch/$s_!_nuj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e70046f-b52d-4da1-9051-dfe0418c5bd7_1268x822.png 1272w, https://substackcdn.com/image/fetch/$s_!_nuj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e70046f-b52d-4da1-9051-dfe0418c5bd7_1268x822.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It would be far cheaper if there were a way to make the model both more intelligent across domains and also more intelligent in absolute terms, such that it requires fewer tokens to get to the same quality output at inference time. This is exactly what pre-training gets you<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>.</p><p>Thus, pre-training will eventually come back and the OOMs of compute will continue to grow. 
Pre-training isn&#8217;t dead; it&#8217;s just resting.</p><div class="captioned-image-container"><figure><img src="https://substack-post-media.s3.amazonaws.com/public/images/d516462f-0529-4a64-b192-0e3ab2c705b8_1000x552.png" alt=""></figure></div><div><hr></div><p>Thanks to <a href="https://x.com/aidan_mclau">Aidan</a>, <a href="https://x.com/sambrashears">Sam</a> and <a href="https://x.com/SherylHsu02">Sheryl</a> for reading drafts of this.</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>We estimate GPT-4&#8217;s pre-training compute to be 2.5e25 FLOPs, based on a few approaches. Firstly, there are reports that it had <a href="https://semianalysis.com/2023/07/10/gpt-4-architecture-infrastructure/">280 billion active parameters and was trained on 13 trillion tokens</a>, giving an estimate of 6 x 2.8e11 x 1.3e13 &#8776; 2e25 FLOPs. Secondly, there are reports that it was trained on 25,000 A100s for 3 months at 33% efficiency. Each A100 can do 312 TeraFLOPs per second, giving an estimate of 25000 x 90 x 24 x 60 x 60 x 3.12e14 x 0.33 &#8776; 2e25 FLOPs. Thirdly, the largest GPT-3 model took 3e23 FLOPs to pre-train and there is a <a href="https://x.com/karpathy/status/1895213020982472863">100x compute increase between GPT-n and GPT-n+1</a>, implying 3e25 FLOPs.</p></div></div>
<div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>We estimate what GPT-4.5&#8217;s performance should be by plugging its pre-training compute into the estimated relationship directly. An alternative method is to take the difference in pre-training compute between GPT-4 and GPT-4.5 and use the estimated relationship to get the increase in performance expected from GPT-4.5 relative to GPT-4. Given that GPT-4.5 came out two years later, this calculation should consider &#8220;effective compute&#8221;, i.e. physical compute x compute efficiency. If we assume <a href="https://www.darioamodei.com/post/on-deepseek-and-export-controls">4x compute efficiency gains per year</a>, this yields very similar results to our first method.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p><a href="https://epoch.ai/gradient-updates/frontier-language-models-have-become-much-smaller">Epoch AI</a> uses the inference cost of a model and the tokens per second it is served at to approximate its size, giving estimates for GPT-4o and Sonnet. We extend this to o1 and o3 using tokens-per-second data from <a href="https://artificialanalysis.ai/">Artificial Analysis</a>.</p></div></div>
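<p>One crude version of that size estimate, as a sketch rather than Epoch&#8217;s actual methodology: a dense decoder does roughly 2N FLOPs per token at inference, so observed throughput per GPU plus an assumed utilisation bounds the active parameter count N. All inputs below are hypothetical:</p><pre><code># Rough active-parameter estimate from serving throughput (hypothetical inputs).
GPU_PEAK_FLOPS = 9.9e14  # e.g. H100 dense BF16 peak, ~989 TFLOP/s
UTILISATION = 0.3        # assumed fraction of peak achieved at inference

def active_params(tokens_per_sec_per_gpu):
    """Forward pass is ~2N FLOPs/token, so N ~ achieved FLOP/s / (2 * tokens/s)."""
    return GPU_PEAK_FLOPS * UTILISATION / (2 * tokens_per_sec_per_gpu)

# A model served at an aggregate 1,000 tokens/s per GPU under these assumptions:
print(f"~{active_params(1_000) / 1e9:.0f}B active params")</code></pre>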
We also have <a href="https://x.com/kalomaze/status/1896040259915497715">evidence from open-source efforts</a> that RL works <a href="https://x.com/HrishbhDalal/status/1899152460800840007">much better on models with more pre-training compute</a>.</p></div></div>]]></content:encoded></item><item><title><![CDATA[The Intelligence Consolidation]]></title><description><![CDATA[Scaling laws reward consolidation, but investors are diversifying anyways]]></description><link>https://tmychow.substack.com/p/the-intelligence-consolidation</link><guid isPermaLink="false">https://tmychow.substack.com/p/the-intelligence-consolidation</guid><dc:creator><![CDATA[Trevor Chow]]></dc:creator><pubDate>Wed, 09 Apr 2025 07:07:12 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/ae81d1ba-4882-4666-be68-17e16d200704_1660x1187.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Over the past three years, we have seen the rise and fall of dozens of frontier AI labs.</p><p>In many cases, these labs have had shared investors: Sequoia Capital has invested in OpenAI, xAI, SSI, Reflection AI, Magic, Keen Technologies and Harmonic, while Andreessen Horowitz has invested in OpenAI, xAI, SSI, Character AI, Mistral AI, and Anysphere.</p><p>If you are investing in five or more AI labs, is that because you expect there to be an oligopoly where each oligopolist is a good investment, or do you simply lack conviction about who will win?</p><p>The former is a popular story, with a <a href="https://promorphcapital.com/wp-content/uploads/2023/06/mistral.ai-strategic-memo.pdf?ref=nocode.ai">leaked 2023 Mistral memo</a> claiming that &#8220;an oligopoly is shaping up&#8221;. At first glance, this tracks with API usage on <a href="https://openrouter.ai/rankings?view=month">OpenRouter</a>:</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!atlr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F212573aa-3aa8-408d-b5ce-3e6318e1b254_1454x704.png" alt=""></figure></div>
<p>One oligopoly story could be that different frontier AI labs are making different technical trade-offs, each ending up with its own niche and specialties, and many of their fundraising announcements echo this sentiment:</p><ul><li><p>Magic&#8217;s 2024 fundraise focused on their <a href="https://magic.dev/blog/100m-token-context-windows">100m token context 
window</a></p></li><li><p>Cohere&#8217;s Series C focused on <a href="https://cohere.com/blog/announcement">enterprise use cases</a>, like RAG and embeddings</p></li><li><p>Mistral&#8217;s initial memo focused on privacy and domain-specific finetuning</p></li><li><p>Adept&#8217;s launch focused on <a href="https://www.adept.ai/blog/introducing-adept">tool- and computer-use models</a></p></li></ul><h2><strong>The proprietary trading industry</strong></h2><p>An industry that parallels this &#8220;many roads to Rome&#8221; quality is proprietary trading. Not only does it draw from a very similar talent pool as frontier AI labs, but it also sees strong differentiation between the few oligopolist players, due to the variety of asset classes and time horizons.</p><h3><strong>Secrets are all you need</strong></h3><p>Underpinning this differentiation is the incredible emphasis on secrecy. Beyond the &#8220;loose lips sink ships&#8221; culture that pervades the industry, all of these firms use non-competes that can easily be 3 years long. Even after you sit out this garden leave, you&#8217;ll still be bound by stringent confidentiality agreements.</p><p>These agreements are enforced litigiously. Perhaps the most famous and recent example is <a href="https://www.reuters.com/legal/hedge-funds-jane-street-millennium-settle-case-alleging-theft-trading-strategy-2024-12-05/">Jane Street suing Millennium</a>, but this is not an isolated incident. Back in 2003, Millennium faced a <a href="https://www.reuters.com/article/markets/us/renaissance-millennium-settle-trade-dispute-idUSN19159445/">similar lawsuit from Renaissance Technologies</a>, and in 2014, <a href="https://www.bloomberg.com/news/features/2018-11-19/the-triple-jeopardy-of-ke-xu-a-chinese-hedge-fund-quant?embedded-checkout=true">G-Research used private prosecution</a> to put one of their former researchers in prison for stealing their trading secrets.</p><p>This means these differentiated edges persist, helping firms preserve their market share in an asset class even when they might otherwise struggle.</p><h3><strong>Flash boys: not so fast</strong></h3><p>One example of edge with this &#8220;secrets&#8221;-esque property of being difficult to replicate is the use of microwave tower networks to minimize trading latency.</p><p>Although its usage is well known, optimal tower locations are severely limited, and there is only a limited amount of money to be made from being the fastest. Due to these steeply diminishing returns, if you are a firm that has already invested heavily in this infrastructure, it is probably not worth it for anyone else to engage in a Red Queen race to be faster than you.</p><p>Unsurprisingly, the industry has consolidated its investments. The microwave network in the Chicago-New Jersey corridor that connects America&#8217;s biggest exchanges boils down to three main groups:</p><ul><li><p>McKay Brothers, which is used by IMC, Tower Research, Citadel Securities, SIG, Jane Street etc.</p></li><li><p>New Line Networks, which is run by Jump Trading and Virtu Financial</p></li><li><p>Vigilant Global, which is run by DRW</p></li></ul><p>Thus, firms with access to this infrastructure can keep whatever market share is available from being fastest, since it is not worthwhile for new entrants to go after that edge.</p><p>Meanwhile, new players can go after new edges. 
For example, XTX Markets, which famously is <a href="https://www.xtxmarkets.com/liquidity/our-7-market-making-principles/">opposed to the low-latency game</a>, managed to break out and become the largest market maker in European cash equities in the span of a few years, thanks to its superior pricing and <a href="https://files.xtxmarkets.com/publications/kajaani/index.html">investment into AI data centers</a>.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!GFza!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66b5620d-f0e7-42a0-933e-866dab6901a4_1536x832.png" alt=""></figure></div>
<p>The combination of these two effects &#8211; the defensibility of secrets and the diminishing returns to exploiting them &#8211; means proprietary trading has tended to be oligopolistic, with a few firms dominating each asset class where they have some persisting comparative advantage.</p><h3><strong>The bitter lesson strikes again</strong></h3><p>On the surface, it may seem like the different API use cases in AI map onto the different asset classes in proprietary trading, and each AI lab could focus on getting really good at one of them. For example, the top models on OpenRouter vary a lot by use case. In recent weeks, these have been:</p><ul><li><p>Gemini Flash 2.0: Roleplay, marketing, translation, legal, trivia, academia</p></li><li><p>Claude 3.7 Sonnet: Programming, SEO</p></li><li><p>GPT-4o-mini: Technology, science</p></li><li><p>DeepSeek V3: Finance</p></li><li><p>Gemma 3 4B: Health</p></li></ul><p>However, the broader trend of overall usage makes this far less compelling, and suggests models have very limited sticking power. On average, a model only stays in the top 5 most used models for just over 6 weeks. Even this average is pulled up by a few really long-lasting models, with the median landing at a mere 3 weeks. 
This is because API users tend to be very sensitive to performance.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!iav_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4822577b-c8a1-4f9d-973c-25d9c87b4ca0_1412x810.png" alt=""></figure></div>
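<p>To see how a handful of long-lived models can pull the mean well above the median, here is a toy illustration (the tenures below are made up for the example, not the underlying OpenRouter data):</p><pre><code>import statistics

# Hypothetical weeks-in-the-top-5 for ten models: mostly short stints,
# with a couple of long-lived outliers dragging the mean upwards.
tenures = [2, 2, 2, 3, 3, 3, 4, 5, 9, 30]
print(statistics.mean(tenures))    # 6.3 weeks
print(statistics.median(tenures))  # 3.0 weeks</code></pre>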
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>So far, performance in AI has been defined by the <a href="https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson.pdf">Bitter Lesson</a>: that &#8220;leveraging of computation&#8221; is more effective than leveraging &#8220;human knowledge of the domain&#8221;. Thus, performance improvements have not been isolated to specific niches, and instead depend on how much money is deployed on compute.</p><p>This means that secrets are much less important, and even if they were, the regulatory prohibition on non-competes in California, coupled with its open tech culture, means secrets do not stay secret for very long.</p><h2><strong>Scaling up to 2028</strong></h2><p>To put a number on compute, the jump between GPT-4 and GPT-4.5 over the past 2 years has involved increasing effective compute by two orders of magnitude, of which half came from algorithmic progress and half came from more physical compute. If we assume that algorithmic progress continues at its current rate and want to maintain the same overall rate of scaling, we&#8217;d need to increase physical compute by another 1.5 orders of magnitude from today to 2028<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>.</p><p>Since GPT-4 took around $63M to train, the model in 2028 should take 300 times that, which is around $19B. A naive estimate for converting the training run costs into the capex spending required for the datacenter is to assume that the cost of the datacenter gets paid back by the total value of training runs you can do across its operational lifetime. Assuming a 4 year depreciation schedule, the GPT-4 training run which took 3 months (and 25000 A100s) implies a 16x multiple, giving a total cost of $1B.</p><p>With A100s going for $10k a pop, this gives a 4x multiple between the cost of the raw GPUs and the cost of the datacenter. As a sense-check, xAI's $700M Atlanta datacenter had around 12000 H100s and 370 A100s. H100s are around triple the price of A100s, leaving a GPU cost of $364M and a 2x multiple.</p><p>These bounds imply that a 2028 training run would require an upfront datacenter capex of anywhere between $152B and $304B. 
<p>What market dynamics does this imply?</p><h3><strong>Incumbency in a capital-intensive market</strong></h3><p>One analogy which has this similarly ramping capital intensity, and happens to be AI&#8217;s most critical upstream dependency, is chipmaking.</p><p>If we take a look at the CPU market, Intel became the dominant market player after IBM selected its x86 CPUs for the original IBM PC. Within 10 years, it had reached 80-90% market share. AMD would claw its way to 25% market share in the mid-2000s, drop back into irrelevance by 2015 and start gaining market share again in the late 2010s.</p><p>What this shows is that while a single generation of technical edge can help you grab market share, the economies of scale needed to amortize good research still give the incumbent a lot of momentum. Thus the brief period of weakness in the mid-2000s was not enough, and it took a decade of Intel stagnation to truly break its monopoly.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!oDV6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F719e4ff6-fe57-4b8f-9aa3-f4c8b70880f0_797x480.png" alt=""></figure></div>
<h3><strong>High capex drives consolidation</strong></h3><p>By default, this means that you should expect consolidation in these high capital intensity industries. This is exactly what you see in the foundry market too.</p><p>Moore&#8217;s second law states: <em>the cost of a semiconductor chip fabrication plant doubles every four years.</em> Although this law started to weaken in the 1990s, the cost of building a single advanced fab today is still 1,000x higher than in the early 1970s<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>.</p>
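<p>A quick back-of-the-envelope, using the cost figures from footnote 2, shows just how much the law has bent (a sketch; the ~1970 baseline and the 2025 endpoint are the footnote&#8217;s approximations):</p><pre><code># If fab costs had kept doubling every 4 years since ~1970...
base_cost = 31e6  # ~$31M frontier fab, in 2025 dollars (footnote 2)
predicted = base_cost * 2 ** ((2025 - 1970) / 4)
print(f"predicted: ${predicted / 1e9:.0f}B")  # ~$427B

# ...versus the observed $20B-$65B, i.e. roughly a 650x-2,100x increase
# rather than the ~14,000x the unweakened law would imply.</code></pre>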
<p>As the cost of staying at the frontier rises, fewer firms can afford to compete. In the early 2000s, 26 foundries produced 130 nm nodes, but in 2020, only 3 foundries produced 7 nm nodes. Now only TSMC is at the frontier, and it accounts for around 67% of the global foundry market.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!OScu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc59a8cfe-9b7b-43d3-a7bb-a6c83f8e9af3_1600x1034.png" alt=""></figure></div>
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Consolidation if scaling laws hold</strong></h3><p>If we look at the main funders of frontier AI labs, they're overwhelmingly big tech companies. Their capacity to finance these efforts can be roughly estimated from free cash flow and debt capacity (assuming a debt-to-EBITDA of 3):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DsW5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F291cb0f3-f609-4c0f-82b8-b8a3c4e255b6_1694x662.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DsW5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F291cb0f3-f609-4c0f-82b8-b8a3c4e255b6_1694x662.png 424w, https://substackcdn.com/image/fetch/$s_!DsW5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F291cb0f3-f609-4c0f-82b8-b8a3c4e255b6_1694x662.png 848w, https://substackcdn.com/image/fetch/$s_!DsW5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F291cb0f3-f609-4c0f-82b8-b8a3c4e255b6_1694x662.png 1272w, https://substackcdn.com/image/fetch/$s_!DsW5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F291cb0f3-f609-4c0f-82b8-b8a3c4e255b6_1694x662.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DsW5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F291cb0f3-f609-4c0f-82b8-b8a3c4e255b6_1694x662.png" width="1456" height="569" 
<p>The projected 2028 capex estimate of $152B to $304B presents significant financial challenges, even for big tech companies.</p><p>Consider Microsoft: the lower estimate of $152B would consume their entire cash reserves plus a full year&#8217;s free cash flow (FCF). The upper estimate of $304B would require either all cash on hand plus four years of accumulated FCF, or alternatively, using up all of their debt capacity.</p>
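<p>Backing the implied figures out of those two statements is a useful consistency check (a sketch; these are the numbers the claims imply, not Microsoft&#8217;s reported financials):</p><pre><code># cash + 1 * fcf = 152  (lower bound: cash reserves plus one year of FCF)
# cash + 4 * fcf = 304  (upper bound: cash reserves plus four years of FCF)
fcf = (304 - 152) / 3  # ~$51B/yr of free cash flow
cash = 152 - fcf       # ~$101B of cash on hand
debt_capacity = 304    # upper bound also ~= all debt capacity (3x EBITDA headroom)
print(f"implied FCF ${fcf:.0f}B/yr, cash ${cash:.0f}B, debt capacity ${debt_capacity}B")</code></pre>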
<p>Continuing this patronage of AI labs is becoming increasingly costly. If you believe that a handful of frontier labs will make it but don&#8217;t believe the oligopoly story, you are implying that all the big tech companies will leverage their entire businesses just to sustain a handful of frontier AI labs. That&#8217;s not a straightforward bet to make.</p><h2><strong>Consumer</strong></h2><p>So far we&#8217;ve focused on the API market. The other half of frontier AI usage &#8211; including ChatGPT, the breakout product &#8211; has been consumer chatbots. This sub-industry looks even worse for oligopolies, since there is even less differentiation between the labs&#8217; chatbots.</p><p>Instead, consumers tend to stay with the company they know best. Right now, that&#8217;s OpenAI, and that&#8217;s what we see when we take the number of reviews in the app stores as a proxy for market share.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!4WVN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55bc1665-a43c-49b9-a092-34d3b61c9ebd_1110x666.png" alt=""></figure></div>
srcset="https://substackcdn.com/image/fetch/$s_!4WVN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55bc1665-a43c-49b9-a092-34d3b61c9ebd_1110x666.png 424w, https://substackcdn.com/image/fetch/$s_!4WVN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55bc1665-a43c-49b9-a092-34d3b61c9ebd_1110x666.png 848w, https://substackcdn.com/image/fetch/$s_!4WVN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55bc1665-a43c-49b9-a092-34d3b61c9ebd_1110x666.png 1272w, https://substackcdn.com/image/fetch/$s_!4WVN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F55bc1665-a43c-49b9-a092-34d3b61c9ebd_1110x666.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>Search is a consumer monopoly</strong></h3><p>This is not surprising. ChatGPT shares many characteristics with one of the clearest consumer monopolies today: web search.</p><p>User behaviour is sticky and habitual. Importantly, consumers don&#8217;t query the same thing across five search engines, but pick a search engine and stay for the long run. They do not have multi-vendor preferences.</p><p>It is then not surprising that there have only been two search monopolies thus far: Yahoo and Google. 
The Google monopoly formed because it was obvious to any consumer that Google was superior, and because consumers were already used to seeing Google results from the era when Yahoo used Google to power its search<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!6JwR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1159edfe-cad3-4579-abbc-cfcc8464facd_482x279.png" alt=""></figure></div>
class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For AI, that means that if a model company manages to build up a persistent lead that is so obvious to consumers on first try that you don&#8217;t need to rely on benchmarks, there is a chance of it taking the monopolist&#8217;s position.</p><h2><strong>The model is the market structure</strong></h2><p>We can see that there are many different narratives for what happens to the market for frontier AI labs. These can be mapped onto different expectations about model capabilities.</p><ol><li><p>If model capabilities stagnate at their current level, mostly products will be the ones which capture value, since open source models will commoditize the AI labs. This means that the value goes to the product and distribution, with the most value captured by a consumer monopoly.</p></li><li><p>If the scaling laws bend e.g. RL doesn&#8217;t generalize, model capabilities may continue to grow, but slower and only as basic research bets pay off. This means that each lab may end up finding their own niche for the type of research they are good at, and there will be an oligopoly of model providers who capture most of the value in each of their niches.</p></li><li><p>If the scaling laws hold until AGI, we can translate <a href="https://x.com/DavidSHolz/status/1904173845998882984">exponentially more compute into linear increases in intelligence</a> and these linear increases into exponential or super-exponential economic returns. This creates strong incentives for more capex spend, and in turn, structural pressure to consolidate. Only a few of the labs will make it, with most bets on frontier AI labs being written to zero.</p></li><li><p>If we get beyond AGI to super-intelligence, then even research labs which are relatively commoditized will capture enormous value, owing to the size of the lightcone. 
However, it&#8217;s less clear whether the current notions of property rights, value and returns would still be as relevant.</p></li></ol><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!XUVJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77979d09-828d-49e0-9fd1-602b839bd6f1_1975x1187.png" alt=""></figure></div>
class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>At a portfolio level, the best case for the current spray-and-pray deployment of venture capital is case 2: indefinite optimism in some sort of AI progress, but without any clear conviction and with a long-term bet against scaling.</p><p>We would not bet against scaling! If you take scaling seriously, you end up in case 3: many of the enormous investments into frontier AI labs should be thought of as having the risk profile of early-stage investments but with the capital requirements of growth-stage investments. Diversifying on that basis isn&#8217;t prudent; it&#8217;s simply indecisive.</p><p><em>Thanks to <a href="https://x.com/ZiChengCaoHuang">Sam</a>, <a href="https://x.com/zeelmpatel">Zeel</a>, <a href="https://x.com/AtiyuM">Atiyu</a>, <a href="https://x.com/arden_eth">Arden</a> and <a href="https://x.com/Moh1tAgarwal">Mohit</a> for reading drafts of this.</em> </p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>We expect there will be some technological improvements to reduce the cost of physical compute, but not enough to materially change the conclusions we draw here.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Today, a cutting-edge fab costs typically around <a href="https://www.construction-physics.com/p/how-to-build-a-20-billion-semiconductor">$20 billion</a> or more, and the TSMC fab in Arizona is projected to cost <a href="https://pcoutlet.com/parts/cpus/tsmcs-2025-deadline-will-arizonas-first-advanced-chip-fab-deliver">$65bn</a>. 
In the late 1960s, a frontier fab cost around $4 million (~$31 million in 2025 dollars).</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p><a href="https://www.fool.com/investing/general/2013/05/20/google-stock-buy-it-and-hold-it-for-life.aspx">https://www.fool.com/investing/general/2013/05/20/google-stock-buy-it-and-hold-it-for-life.aspx</a></p></div></div>]]></content:encoded></item><item><title><![CDATA[From Apples to Strawberries]]></title><description><![CDATA[The ChatGPT moment, redux]]></description><link>https://tmychow.substack.com/p/from-apples-to-strawberries</link><guid isPermaLink="false">https://tmychow.substack.com/p/from-apples-to-strawberries</guid><dc:creator><![CDATA[Trevor Chow]]></dc:creator><pubDate>Fri, 04 Oct 2024 15:51:44 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0c879d1a-b3cd-4f82-8be1-34c43e35682a_1440x786.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Three weeks ago, OpenAI released the <a href="https://openai.com/o1/">o1 series</a>.</p><p>This publicly marks the start of the &#8220;search&#8221; paradigm in modern ML, just as ChatGPT&#8217;s launch in 2022 marked the arrival of the &#8220;learning&#8221; paradigm.&nbsp;</p><p>With this new paradigm, you should expect progress in ML performance over the next two years to be at least as fast as it was in the last two years. In fact, there are reasons to expect it to be even faster.</p><p>Here&#8217;s why!</p><h3><strong>Bitter lesson 2: electric boogaloo</strong></h3><p>When I talk about &#8220;learning&#8221; and &#8220;search&#8221;, I mean the two &#8220;general methods that leverage computation&#8221; which Rich Sutton named in <em><a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">The Bitter Lesson</a></em>. Loosely, &#8220;learning&#8221; is fitting to patterns in the world, while &#8220;search&#8221; is finding the best option in a space of possibilities.</p><p><em>The Bitter Lesson</em> explains why a single model launch, like o1&#8217;s, can catalyse such rapid progress. Since &#8220;the most effective&#8221; methods in ML are these general methods, progress in ML doesn&#8217;t occur steadily, but instead comes in fits and spurts. This is because it relies on researchers finding a technique which gets predictably better with more computing power (&#8220;compute&#8221;). Once they have conviction in it, they can scale that technique with more compute. A model launch is simply a lagging indicator of this conviction.</p><p>With &#8220;learning&#8221;, some researchers had the realisation in 2020 that they could use more compute to train a model with more parameters on more data, and it would just improve. Thus they spent the next two years doing that, and in the process, learnt a lot about how to use that compute effectively. This unlocked ChatGPT in 2022, and has continued to power the last two years of improvements in AI.</p><p>(Check out <a href="https://tmychow.substack.com/p/three-kuhnian-revolutions-in-ml-training">yesterday&#8217;s post</a> if you&#8217;re curious about how this came to pass!)</p><h3><strong>Moving from training to inference</strong></h3><p>However, most of this compute has been used during the training process of the models. This is a result of how &#8220;learning&#8221; works. 
If you want the model to learn more during training, you can scale that pretty arbitrarily, just by making it bigger or having it see more data. By contrast, the model can only learn during inference if you give it in-context examples, but this is constrained by the size of its context window, which isn&#8217;t infinitely scalable.</p><p>This means that spending more compute on inference was always the next frontier to be tackled. Many users of these models have tried to do clever prompting or build convoluted wrappers around them to take advantage of that. You may have heard of techniques like &#8220;in-context learning&#8221;, &#8220;chain of thought&#8221; and so on.</p><p>This is a mistake.</p><p>By handcrafting these scaffolds, they are ignoring <em>The Bitter Lesson</em> at their own peril. What they needed to do instead was to find a method that could scalably absorb compute at inference time, without their intervention. Many researchers at top AI labs have long suspected that &#8220;search&#8221; fits this description:</p><ul><li><p><a href="https://x.com/polynoamial/status/1676971503261454340">Noam Brown (July 2023)</a>: &#8220;I&#8217;ll now investigate how to make these [self-play and reasoning] methods truly general&#8221;</p></li><li><p><a href="https://www.theinformation.com/articles/openai-made-an-ai-breakthrough-before-altman-firing-stoking-excitement-and-concern">The Information (November 2023)</a>: &#8220;among the techniques the [OpenAI] team experimented with was &#8230; test-time computation&#8221;</p></li><li><p><a href="https://www.dwarkeshpatel.com/p/will-scaling-work">Dwarkesh Patel (December 2023)</a>: &#8220;almost all the researchers I talk to in the big AI labs are quite confident they&#8217;ll get self-play to work&#8221;</p></li></ul><p>The basic idea behind &#8220;search&#8221; is that you can solve hard problems by trying many different paths and seeing which gets you the best outcome. Since you can try arbitrarily many paths if you&#8217;re willing to spend the compute, this is something that can pretty naturally scale at inference time.</p><h3><strong>Verification is easier than generation</strong></h3><p>There&#8217;s a long history of &#8220;search&#8221; working in other domains. One notable example is <a href="https://deepmind.google/discover/blog/alphago-zero-starting-from-scratch/">AlphaGo Zero</a>, which was DeepMind&#8217;s system for playing Go. This system had two parts: a neural network and a Monte Carlo Tree Search algorithm.</p><p>The neural net generates:</p><ul><li><p>An estimate of how good the current board state is</p></li><li><p>The probability of different next moves given the current board state</p></li></ul><p>The MCTS algorithm looks at all possible next moves and picks a move. Early in the game, it tries to pick moves it hasn&#8217;t encountered yet. As it becomes more certain about which move is good or bad, it focuses on picking the best possible move. Essentially, it&#8217;s trying to sensibly trade off &#8220;exploration&#8221; and &#8220;exploitation&#8221;. After the MCTS algorithm has picked a move, the neural net evaluates the board state conditional on that move being made and updates the algorithm&#8217;s knowledge of how good each move is.</p>
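<p>To make this exploration-exploitation trade-off concrete, here is a minimal sketch of the PUCT selection rule that AlphaGo Zero uses to pick moves inside MCTS. The constant and the toy numbers are purely illustrative, not the values DeepMind used:</p><pre><code class="language-python">import math

def puct_score(prior, value, visits, total_visits, c_puct=1.5):
    """PUCT score for one candidate move.

    prior: the policy network's probability for this move
    value: average value of simulations through this move (Q)
    visits: how often MCTS has already tried this move (N)
    total_visits: visits across all sibling moves
    """
    # Exploitation: moves whose simulations went well get a high Q.
    # Exploration: rarely-visited moves with a high prior get a bonus
    # that shrinks as the move accumulates visits.
    exploration = c_puct * prior * math.sqrt(total_visits) / (1 + visits)
    return value + exploration

# Pick the move with the best combined score (toy numbers).
moves = {
    "A": dict(prior=0.6, value=0.10, visits=40),
    "B": dict(prior=0.3, value=0.05, visits=5),
    "C": dict(prior=0.1, value=0.30, visits=2),
}
total = sum(m["visits"] for m in moves.values())
best = max(moves, key=lambda k: puct_score(total_visits=total, **moves[k]))
print(best)
</code></pre><p>Early on, visit counts are low and the exploration bonus dominates; as a move accumulates visits, its score converges to the value estimate.</p>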
<p>To improve this system, they got it to play against itself. At the end of each game, the self-play produced a record of what the board state was after each move, what the MCTS algorithm thought it should do and who ended up winning the game. This data could then be used to train the neural network, making it better and better each time.</p><p>At inference, the combination of the trained neural net and the MCTS algorithm produced the superhuman performance of the AlphaGo Zero system. This was rated at over 1000 Elo points above the AlphaGo Lee system that had famously <a href="https://www.youtube.com/watch?v=WXuK6gekU1Y">beaten Lee Sedol</a>.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!50KE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9245057a-4ccd-429d-a395-84500bcb08df_810x830.png" alt=""></figure></div>
<p>Notice that search is key!</p><p>When doing inference without MCTS, the &#8220;raw network&#8221; was over 2000 Elo points below the version with MCTS, and actually underperformed even AlphaGo Lee. In fact, to bridge the gap between the raw network and the network with MCTS without doing any search, the neural network would need to be 100,000x bigger.</p><p>This training-inference trade-off is replicated across other games.
For example, <a href="https://arxiv.org/abs/2104.03113">Jones</a> found that in Hex, a model trained with 10x less compute can attain the same Elo by spending 15x more compute at inference:</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!knbi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51a94575-dd7d-4c1c-8051-2673b29ec330_984x646.png" alt=""></figure></div>
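<p>A quick back-of-the-envelope calculation shows why this trade-off matters economically. All of the numbers below are invented for illustration; only the 10x/15x ratio comes from Jones:</p><pre><code class="language-python"># Toy comparison of two ways to reach the same Elo.
train_flops_big = 1e21            # hypothetical training budget of the big model
train_flops_small = train_flops_big / 10   # Jones: ~10x less training compute...
infer_flops_big = 1e12            # hypothetical per-query inference cost
infer_flops_small = infer_flops_big * 15   # ...for ~15x more inference compute

def total_flops(train, infer_per_query, queries):
    return train + infer_per_query * queries

# The small model wins while query volume is low; the big model
# amortises its training cost once volume is high enough.
breakeven = (train_flops_big - train_flops_small) / (infer_flops_small - infer_flops_big)
for q in (1e6, breakeven, 1e9):
    small = total_flops(train_flops_small, infer_flops_small, q)
    big = total_flops(train_flops_big, infer_flops_big, q)
    print(f"{q:.1e} queries: small={small:.2e}, big={big:.2e} FLOPs")
</code></pre><p>At low query volumes, it is cheaper to under-train and search harder at inference; at high volumes, the bigger training run amortises and wins.</p>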
icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In general, the &#8220;search&#8221; process follows a certain loop:</p><ol><li><p>Get the model to play against itself</p></li><li><p>Break down the &#8220;play&#8221; into steps</p></li><li><p>Have a search algorithm explore over the steps</p></li><li><p>Verify if the end result was good</p></li><li><p>Tell the search algorithm so it learns which steps are good</p></li><li><p>Train the model based on what the search algorithm has learnt</p></li></ol><p>In the case of games like Go and Hex, the existence of pre-defined rules, moves and win conditions mean that this loop is pretty easy to do. However, what about domains that are less well specified?</p><h3><strong>Combining learning and search</strong></h3><p>To start with, researchers have focused on fields like mathematics or coding, where it is possible to do verification to some degree e.g. with theorem provers or unit tests. Much of the exciting research in the past two years has been about bringing &#8220;search&#8221; to these areas.</p><p>The simplest version of &#8220;search&#8221; is one step deep. That means doing many single step generations and then picking the best one. One way of picking is to take the most popular answer. It&#8217;s a bit like the &#8220;Ask the Audience&#8221; lifeline on &#8220;Who Wants To Be A Millionaire&#8221;!</p><p>This is the majority voting approach from <a href="https://arxiv.org/abs/2206.14858">Lewkowycz et al.</a>, and doing so (&#8220;maj1@k&#8221;) with the Minerva model shows dramatic improvements relative to just using model itself across STEM benchmarks:</p><p>A more sophisticated approach is to swap about majority voting for a neural network that tries to pick the best answer. This is known as an &#8220;outcome reward model&#8221;, and <a href="https://arxiv.org/abs/2305.20050">Lightman et al.</a> find that doing this improves their results on the MATH benchmark from 69.6% accuracy with majority voting to 72.4%.</p><p>To make it even better, they break down the problem into multiple steps, and train a &#8220;process reward model&#8221;. 
<p>A more sophisticated approach is to swap out majority voting for a neural network that tries to pick the best answer. This is known as an &#8220;outcome reward model&#8221;, and <a href="https://arxiv.org/abs/2305.20050">Lightman et al.</a> find that doing this improves their results on the MATH benchmark from 69.6% accuracy with majority voting to 72.4%.</p><p>To make it even better, they break down the problem into multiple steps, and train a &#8220;process reward model&#8221;. This model looks at every step of the reasoning process, instead of just the output, and picking the answer using this model brings accuracy to 78.2%.</p><p>This is shaping up to be a similar loop to the one before:</p><ol><li><p>Get the language model to respond to a prompt</p></li><li><p>Each &#8220;move&#8221; is the model generating some reasoning steps</p></li><li><p>A search algorithm picks whether to keep taking more &#8220;moves&#8221;</p></li><li><p>Verify if the eventual answer is correct</p></li><li><p>Tell the search algorithm so it learns which &#8220;moves&#8221; were good</p></li><li><p>Train the model based on what the search algorithm has learnt</p></li></ol><p>By repeating this many times per prompt for billions of prompts, you get a language model that is much better at generating reasoning steps, as well as a search algorithm that knows if a particular set of reasoning steps is good. At inference, you can use this by generating some reasoning steps, and then either re-generating them or taking the next set of reasoning steps, depending on whether the search algorithm thinks this was a good &#8220;move&#8221;.</p><p>Then you can scale this in two ways: by spending more compute at training (e.g. more prompts, more attempts per prompt) or more compute at inference (e.g. setting a higher threshold for what a good &#8220;move&#8221; is, trying many &#8220;moves&#8221;).</p>
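<p>Here is a minimal sketch of what that inference-time procedure could look like. <code>generate_step</code> and <code>prm_score</code> are hypothetical stand-ins for the language model and the process reward model, and the threshold and candidate counts are arbitrary:</p><pre><code class="language-python">def search_reasoning(generate_step, prm_score, prompt,
                     max_moves=10, candidates=8, threshold=0.7):
    """Greedy step-level search guided by a process reward model.

    generate_step(prompt, steps) proposes one more reasoning step;
    prm_score(prompt, steps) rates a partial chain of reasoning in [0, 1].
    Both are assumed stand-ins, not a real API.
    """
    steps = []
    for _ in range(max_moves):
        # Sample several candidate "moves" and keep the best-rated one.
        proposals = [generate_step(prompt, steps) for _ in range(candidates)]
        scored = [(prm_score(prompt, steps + [p]), p) for p in proposals]
        score, best = max(scored)
        if score &lt; threshold:
            continue  # no good move found: re-generate rather than commit
        steps.append(best)
        if "final answer" in best.lower():  # assumed end-of-solution marker
            break
    return steps
</code></pre>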
<h3><strong>Scaling laws, redux</strong></h3><p>This is an interesting hypothesis. Does it actually let us scale inference compute though?</p><p>Here are a few reasons it might not:</p><ol><li><p>Maybe none of the &#8220;moves&#8221; are good enough, i.e. if you asked a 3-year-old an advanced college maths question, it doesn&#8217;t matter how many times you ask them and how well you can search through their answers: they simply aren&#8217;t smart enough</p></li><li><p>Maybe verification is hard enough that the search algorithm doesn&#8217;t learn much and can&#8217;t help us</p></li><li><p>Even if the first two aren&#8217;t completely true, they may be true enough that it is not economical to use inference compute, relative to training compute</p></li></ol><p>Let&#8217;s tackle each in order.</p><p>It&#8217;s not super surprising that if you do more sampling, the probability that at least one of the samples will be the correct answer (&#8220;coverage&#8221;) increases, at least a little bit. What <a href="https://arxiv.org/abs/2407.21787">Brown et al.</a> show us is that this scales to an unbelievable degree (i.e. to tens of thousands of samples) and scales following a predictable power law:</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!SaM6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3707c844-80f2-47b1-90e9-9580df2403f9_1600x432.png" alt=""></figure></div>
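<p>&#8220;Coverage&#8221; here is what the code-generation literature calls pass@k. Given n samples of which c are correct, the standard unbiased estimator of pass@k (from <a href="https://arxiv.org/abs/2107.03374">Chen et al.</a>) looks like this:</p><pre><code class="language-python">import numpy as np

def pass_at_k(n, c, k):
    """Unbiased estimate of pass@k from n samples with c correct."""
    if n - c &lt; k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# e.g. 10,000 samples with 50 correct: one draw almost never works,
# but 1,000 draws almost always include a hit.
print(pass_at_k(10_000, 50, 1))      # ~0.005
print(pass_at_k(10_000, 50, 1_000))  # ~0.99
</code></pre><p>Brown et al.&#8217;s finding is that this coverage keeps climbing smoothly as you add samples, rather than saturating early.</p>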
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>By simply sampling many times and picking the answer based on an automated verifier, this increase in &#8220;coverage&#8221; translates directly into better performance. In fact, they manage to beat the state-of-the-art SWE-bench Lite results by a staggering 13pp (and as of this post, would still top the leaderboard):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Pvdl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84b69e02-9739-4e38-bc55-7e53e73749e7_1600x649.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Pvdl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84b69e02-9739-4e38-bc55-7e53e73749e7_1600x649.png 424w, https://substackcdn.com/image/fetch/$s_!Pvdl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84b69e02-9739-4e38-bc55-7e53e73749e7_1600x649.png 848w, https://substackcdn.com/image/fetch/$s_!Pvdl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84b69e02-9739-4e38-bc55-7e53e73749e7_1600x649.png 1272w, https://substackcdn.com/image/fetch/$s_!Pvdl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84b69e02-9739-4e38-bc55-7e53e73749e7_1600x649.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Pvdl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84b69e02-9739-4e38-bc55-7e53e73749e7_1600x649.png" width="1456" height="591" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84b69e02-9739-4e38-bc55-7e53e73749e7_1600x649.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:591,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Pvdl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84b69e02-9739-4e38-bc55-7e53e73749e7_1600x649.png 424w, https://substackcdn.com/image/fetch/$s_!Pvdl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84b69e02-9739-4e38-bc55-7e53e73749e7_1600x649.png 848w, https://substackcdn.com/image/fetch/$s_!Pvdl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84b69e02-9739-4e38-bc55-7e53e73749e7_1600x649.png 1272w, https://substackcdn.com/image/fetch/$s_!Pvdl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84b69e02-9739-4e38-bc55-7e53e73749e7_1600x649.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Thus, issue 1 isn&#8217;t a problem. However, they also show that for maths problems without theorem provers, verification isn&#8217;t that easy. 
Whether it is with majority voting or using a reward model to search over the samples, it seems that this plateaus quickly, causing a gap between &#8220;coverage&#8221; and the actual success rate:</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!nxSy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7b223d-0e3e-47db-a884-c8799bf53295_1600x541.png" alt=""></figure></div>
<p>This means that issue 2 is only half-solved: it goes away in domains with easy verifiers, but remains elsewhere. What does this mean for issue 3, i.e. trading off training and inference compute?</p><p>This is where <a href="https://arxiv.org/abs/2408.03314">Snell et al.</a> come in. They ask the question: &#8220;if you can 10x the combined amount of training and inference compute, how much should you spend on each?&#8221;. They find that the two types of compute are substitutable, but not perfectly, and inference compute is best spent on easier problems.</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!4taA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7e950f-26ee-4bac-843c-ba1d24e78b3e_1048x688.png" alt=""></figure></div>
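<p>One way to internalise &#8220;substitutable, but not perfectly&#8221; is with a toy model. Everything below is invented for illustration and is not Snell et al.&#8217;s functional form; it just encodes their qualitative finding that extra inference compute helps less on harder problems:</p><pre><code class="language-python">import math

def toy_score(train_flops, infer_flops, difficulty):
    """Toy performance model (illustrative only, not from the paper).

    difficulty in (0, 1]: harder problems blunt the returns
    to inference compute.
    """
    return math.log10(train_flops) + (1 - difficulty) * math.log10(infer_flops)

budget = 1e24  # total compute to split between training and inference
for share in (0.5, 0.9, 0.99):
    easy = toy_score(budget * share, budget * (1 - share), difficulty=0.2)
    hard = toy_score(budget * share, budget * (1 - share), difficulty=0.8)
    print(f"train share {share}: easy={easy:.2f}, hard={hard:.2f}")
</code></pre><p>In this toy, the best split tilts towards inference on easy problems and towards training on hard ones.</p>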
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1d7e950f-26ee-4bac-843c-ba1d24e78b3e_1048x688.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:688,&quot;width&quot;:1048,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!4taA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7e950f-26ee-4bac-843c-ba1d24e78b3e_1048x688.png 424w, https://substackcdn.com/image/fetch/$s_!4taA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7e950f-26ee-4bac-843c-ba1d24e78b3e_1048x688.png 848w, https://substackcdn.com/image/fetch/$s_!4taA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7e950f-26ee-4bac-843c-ba1d24e78b3e_1048x688.png 1272w, https://substackcdn.com/image/fetch/$s_!4taA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7e950f-26ee-4bac-843c-ba1d24e78b3e_1048x688.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>The unhobbling is coming&nbsp;&nbsp;</strong></h3><p>Let&#8217;s take a step back.</p><p>Long before ChatGPT came out, and before even OpenAI existed, Ilya Sutskever had said that &#8220;<a href="https://open.substack.com/pub/dwarkesh/p/dario-amodei?r=zf8d&amp;selection=c2a78293-2702-4a50-8b07-ea908a2109dc&amp;utm_campaign=post-share-selection&amp;utm_medium=web">the models, they just want to learn</a>&#8221;. The belief that you could pour more training compute and scale up &#8220;learning&#8221; is not a new one. 
Yet it took ChatGPT to turn it into a public and undeniable fact.</p><p>For the past two years, we have been in the equivalent of the pre-ChatGPT era for &#8220;search&#8221;. While all of the research I&#8217;ve mentioned above kept pointing in the direction of scaling inference compute, no one had productionised it in a way that made it public and undeniable.</p><p>On September 12th, o1 changed that:</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!Cc5_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28c06891-2af2-4e27-bb7e-3fef8207cdec_1418x538.png" alt=""></figure></div>
<p>By themselves, these results on competition mathematics and coding problems are already impressive. These are hard problems where o1 blows GPT-4o out of the water, but in some ways this is to be expected, since maths and code can have very good verifiers that provide a clear reward signal to the search algorithm.</p><p>What is more staggering is the performance on other domains:</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!tNaJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3936866-24df-473d-91f4-911df573a8de_1478x1042.png" alt=""></figure></div>
src="https://substackcdn.com/image/fetch/$s_!tNaJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3936866-24df-473d-91f4-911df573a8de_1478x1042.png" width="1456" height="1026" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b3936866-24df-473d-91f4-911df573a8de_1478x1042.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1026,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!tNaJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3936866-24df-473d-91f4-911df573a8de_1478x1042.png 424w, https://substackcdn.com/image/fetch/$s_!tNaJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3936866-24df-473d-91f4-911df573a8de_1478x1042.png 848w, https://substackcdn.com/image/fetch/$s_!tNaJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3936866-24df-473d-91f4-911df573a8de_1478x1042.png 1272w, https://substackcdn.com/image/fetch/$s_!tNaJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3936866-24df-473d-91f4-911df573a8de_1478x1042.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For example, <a href="https://arxiv.org/abs/2311.12022">GPQA</a> is a &#8220;Google-proof&#8221; science benchmark that even domain PhDs struggle with, and yet o1 beats them and gpt-4o very handily. 
Nor is this limited to STEM: o1 also improves on GPT-4o&#8217;s performance on the LSAT, econometrics, etc.</p><p>In addition to validating how search can work on more general domains, o1 also shows that it can scale, across both training and inference compute:</p><div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!wNLJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ad8355c-a3b6-46d0-85bd-412268519192_1440x786.png" alt=""></figure></div>
pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Just as ChatGPT fired the starting gun for the race to scale &#8220;learning&#8221;, o1 has done the same for &#8220;search&#8221;!</p><p>Why does this matter?</p><p>Scaling &#8220;search&#8221; on its own will improve domain-specific capabilities. Instead of needing to re-train a model, you can simulate what a larger model&#8217;s performance would be by simply spending more inference compute to solve your problem. This pulls forward the ability to deploy AI systems into real world problems, and makes the price-performance trade-off less discontinuous and dependent on research labs releasing new models.</p><p>Just to give a sense of price, <a href="https://x.com/hughbzhang/status/1838288923656941860">Zhang&#8217;s</a> replication of the scaling plot lets us estimate that the maximum accuracy displayed can be achieved with $1.6 of inference compute per problem. That&#8217;s almost certainly cheaper than how much it would have cost to train a model that gets the same accuracy without &#8220;search&#8221;.</p><p>This alone would make scaling &#8220;search&#8221; as exciting as scaling &#8220;learning&#8221; has been. However, it turns out that "search" will also make "learning" better.</p><p>Already, lots of training data is synthetic i.e. generated by an AI. Now, every single set of reasoning steps by o1 can be used to train the next model. Since o1 is smarter, the quality of these reasoning traces is higher, and it can filter out the bad sets of reasoning steps better than before.</p><p>The reason we couldn&#8217;t make this data flywheel in the past is because the models weren&#8217;t smart enough to generate useful reasoning traces, or filter out the bad ones. Now that we seem to have reached that threshold, we can follow the same playbook which made AlphaGo superhuman: bootstrapping the next model from the outputs of the previous one.</p><p>That&#8217;s why <a href="https://ia.samaltman.com/">Sam Altman</a> is so confident that &#8220;deep learning worked&#8221;, and why you should expect even faster progress soon. 
Remember, the models are the worst they&#8217;ll ever be!</p><p><em>Thanks to <a href="https://kevinniechen.com/">Kevin Niechen</a>, <a href="https://zhengdongwang.com/">Zhengdong Wang</a>, <a href="https://www.jannikschilling.com/">Jannik Schilling</a>, <a href="https://bradleyhsu.com/">Bradley Hsu</a>, <a href="https://devanshpanda.com/">Devansh Pandey</a>, <a href="https://jzmazlish.substack.com/">Zach Mazlish</a> and <a href="https://basilhalperin.com/">Basil Halperin</a> for discussion and feedback.</em></p>]]></content:encoded></item><item><title><![CDATA[Three Kuhnian Revolutions in ML Training]]></title><description><![CDATA[From Kaplan to Chinchilla and beyond]]></description><link>https://tmychow.substack.com/p/three-kuhnian-revolutions-in-ml-training</link><guid isPermaLink="false">https://tmychow.substack.com/p/three-kuhnian-revolutions-in-ml-training</guid><dc:creator><![CDATA[Trevor Chow]]></dc:creator><pubDate>Thu, 03 Oct 2024 15:30:38 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QRi1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a6fca9f-2b4d-424b-bc3a-1fe20eebf7dd_946x812.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Parameters and data.</p><p>These are the two ingredients of training ML models. The total amount of computation (&#8220;compute&#8221;) you need to do to train a model is proportional to the number of parameters multiplied by the amount of data (measured in &#8220;tokens&#8221;).</p><p>Four years ago, it was well-known that if you had more compute to train a model, you should spend most of it on parameters.</p><p>Two years ago, everyone changed their mind and believed you should spend it equally on parameters and data.</p><p>Just last year, it became widely accepted that you should spend orders-of-magnitude more on data than anyone had previously thought.</p><p>Why have the recipes to training these models changed so much, and so frequently? To understand this, we need to take a walk through the intellectual history of scaling modern ML models.</p><h3><strong>The Bitter Lesson</strong></h3><p>The science of training large models starts with <a href="https://arxiv.org/abs/2001.08361">Kaplan et al.</a> in early 2020. In the paper, the OpenAI team tested different configurations of transformer language models. The two most important axes they varied were the number of model parameters (768 to 1.5B) and number of tokens in the dataset (22M to 23B). Implicitly, this also varied the total amount of compute used to train the model, since compute ~ parameters x tokens.</p><p>They found a stable and predictable power-law relationship between compute and the performance of the model. 
Here, performance is measured by the &#8220;loss&#8221;, which is simply the model&#8217;s error in predicting the next token across all the text it is trained on:</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QRi1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a6fca9f-2b4d-424b-bc3a-1fe20eebf7dd_946x812.png"><img src="https://substackcdn.com/image/fetch/$s_!QRi1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a6fca9f-2b4d-424b-bc3a-1fe20eebf7dd_946x812.png" alt=""></a></figure></div><p>They also gave a formula for how to use your compute efficiently, since for any fixed amount of compute, there is a trade-off between making the model bigger and training it on more data. Kaplan showed that if you could 10x the amount of compute, you should 5x the number of parameters and 2x the number of tokens it is trained on.</p>
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/adc08913-ec7f-4f11-8b91-876386e76888_1056x812.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:812,&quot;width&quot;:1056,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!qTIB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc08913-ec7f-4f11-8b91-876386e76888_1056x812.png 424w, https://substackcdn.com/image/fetch/$s_!qTIB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc08913-ec7f-4f11-8b91-876386e76888_1056x812.png 848w, https://substackcdn.com/image/fetch/$s_!qTIB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc08913-ec7f-4f11-8b91-876386e76888_1056x812.png 1272w, https://substackcdn.com/image/fetch/$s_!qTIB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fadc08913-ec7f-4f11-8b91-876386e76888_1056x812.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>With GPT-3, OpenAI had demonstrated that the scaling law was really quite predictable. While &#8220;bigger is better&#8221; wasn&#8217;t the most surprising result, the idea that you could predict exactly how good it would be, and could get a recipe for the exact ratio of parameters-to-tokens was unprecedented.</p><p>They had also shown that size was qualitatively important: while the &#8220;loss&#8221; of the model changed steadily with size, the actual human-relevant capabilities were more discontinuous and emerged above certain thresholds. 
One &#8220;emergent capability&#8221; was the ability of models to learn in-context, i.e. to do a task accurately when given one (&#8220;one-shot&#8221;) or multiple (&#8220;few-shot&#8221;) examples. By being given examples, large models gained a lot more accuracy than smaller models:</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AJXb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6fb9f7-8688-445a-bc3a-069343026144_1178x644.png"><img src="https://substackcdn.com/image/fetch/$s_!AJXb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8a6fb9f7-8688-445a-bc3a-069343026144_1178x644.png" alt=""></a></figure></div>
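<p>Concretely, &#8220;few-shot&#8221; just means prepending worked examples to the prompt, with no gradient updates. A schematic, in the spirit of the GPT-3 paper&#8217;s English-French examples (the exact formatting here is illustrative):</p><pre><code># Few-shot prompting: prepend worked examples so the model can pick up
# the task format in-context, with no gradient updates.
examples = [
    ("sea otter", "loutre de mer"),
    ("cheese", "fromage"),
]
query = "bread"

prompt = "Translate English to French.\n\n"
for english, french in examples:
    prompt += f"{english} => {french}\n"
prompt += f"{query} =>"

print(prompt)  # a sufficiently large model should continue with " pain"
</code></pre>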
class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>After an initial period of shock, everyone quickly followed suit: DeepMind released <a href="https://arxiv.org/abs/2112.11446">Gopher</a> (280B parameters, 300B tokens) in late 2021 and NVIDIA released <a href="https://arxiv.org/abs/2201.11990">Megatron</a> (530B parameters, 270B tokens) in early 2022.</p><h3><strong>Chinchilla outperforms Gopher</strong></h3><p>Yet just as the race to train big models was heating up, it would get interrupted in March 2022 by <a href="https://arxiv.org/abs/2203.15556">Hoffmann et al.</a>, better known as the Chinchilla scaling laws. In this paper, the DeepMind team revisited what the right ratio of parameters to tokens was for any fixed amount of compute. 
What they found was that data was far more important than people realised!</p><p>They took three approaches to this, but the second &#8220;isoFLOP&#8221; method is the most intuitive (a toy version of the fitting recipe is sketched after the figure below):</p><ul><li><p>Take a fixed amount of compute</p></li><li><p>Pick a range of model sizes, where the larger ones will be trained on less data</p></li><li><p>Plot each model&#8217;s performance against parameters and join the dots to get an isoFLOP curve, where each point has the same total compute</p></li><li><p>The lowest point on the curve has the best performance</p></li></ul><p>The Chinchilla team exhaustively trained over 400 models across 9 different isoFLOP curves:</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oAt8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f99f5cc-9595-466c-8c6a-6a602fffb92d_992x866.png"><img src="https://substackcdn.com/image/fetch/$s_!oAt8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2f99f5cc-9595-466c-8c6a-6a602fffb92d_992x866.png" alt=""></a></figure></div>
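<p>Here is a toy version of that fitting recipe on synthetic losses. The quadratic-in-log shape, the noise level and the drift of the optimum are all made up for illustration; only the procedure (fit each isoFLOP curve, take its minimum, regress the minima against compute) mirrors the paper:</p><pre><code>import numpy as np

rng = np.random.default_rng(0)

# Toy isoFLOP fit: for each compute budget C, pretend loss is quadratic in
# log10(params), with a minimum that (by construction) drifts as C^0.5.
budgets = [1e19, 1e20, 1e21]
optimal_logN = []
for c in budgets:
    logN = np.linspace(7, 12, 50)             # candidate model sizes
    true_opt = 0.5 * np.log10(c) - 0.5        # made-up drift of the optimum
    loss = 2.0 + 0.1 * (logN - true_opt) ** 2
    loss += rng.normal(0, 0.002, logN.shape)  # a little training noise
    a, b, _ = np.polyfit(logN, loss, 2)       # fit a parabola per curve
    optimal_logN.append(-b / (2 * a))         # its minimum = best model size

# Regress the minima against compute: the slope is the scaling exponent.
slope, _ = np.polyfit(np.log10(budgets), optimal_logN, 1)
print(f"fitted exponent for N vs C: {slope:.2f}")  # ~0.5 here, by construction
</code></pre>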
<p>By fitting a regression line to the minimum point of each isoFLOP curve, they found that a 10x increase in compute should be split equally between model size and dataset size, i.e. 3.1x each of them, rather than Kaplan&#8217;s 5x and 2x split.</p><p>As with Kaplan, they validated this by training a model with even more compute than their experiments. In particular, they took Gopher&#8217;s FLOP count but trained it with the parameter-token split implied by their experiments. This was &#8220;Chinchilla&#8221;! With 70B parameters trained on 1.4T tokens, it blew past Gopher on every single benchmark, e.g. the Pile, MMLU, BIG-Bench etc.</p><h3><strong>What did Kaplan get wrong?</strong></h3><p>The immediate difference between the two papers was the choice of learning rate (LR) schedule. The learning rate determines how much the parameters of a model change when it sees each token of training data.
A common LR schedule is to start training with a low LR and build it up linearly to let the model &#8220;warm up&#8221;, and then slowly decay from the maximum across the rest of training:</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CiBw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe09750c0-8f0f-4310-96cf-6baead380c07_1038x766.png"><img src="https://substackcdn.com/image/fetch/$s_!CiBw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe09750c0-8f0f-4310-96cf-6baead380c07_1038x766.png" alt=""></a></figure></div>
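<p>For instance, a linear warmup followed by cosine decay, one common choice (the step counts and peak LR here are arbitrary):</p><pre><code>import math

def lr_at(step, max_lr=3e-4, warmup_steps=3000, total_steps=100_000):
    # Linear warmup to max_lr, then cosine decay over the rest of training.
    if step > warmup_steps:
        progress = (step - warmup_steps) / (total_steps - warmup_steps)
        return 0.5 * max_lr * (1 + math.cos(math.pi * progress))
    return max_lr * step / warmup_steps

for s in [0, 1500, 3000, 50_000, 100_000]:
    print(f"step {s:>7}: lr = {lr_at(s):.2e}")
</code></pre>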
<p>Kaplan had picked a LR schedule with 3000 steps of warmup followed by a fixed decay schedule. When Chinchilla first came out, they suggested that Kaplan&#8217;s problem was having the same LR schedule for every model, rather than matching the length of the decay to the amount of data the model was trained on.</p><p>In the past few months, however, new work from <a href="https://arxiv.org/abs/2406.19146">Porian et al.</a> has suggested that there might be more at play. By recreating the Kaplan experiments, they found that Chinchilla&#8217;s hypothesis about LR decay wasn&#8217;t that important. What mattered more was:</p><ul><li><p>Kaplan ignoring the compute used by the last layer of the model</p></li><li><p>Kaplan&#8217;s fixed 3000 steps of warmup (which meant smaller models trained on less data spent relatively more time in warmup)</p></li><li><p>Kaplan using the same hyperparameters for all model sizes, rather than tuning them for each one</p></li></ul><p>All of this is to say, the science of scaling is still pretty nascent! Regardless, one thing is clear: Chinchilla&#8217;s recipe set the bar for how to train large language models. Since then, every time someone says they are training a &#8220;compute optimal model&#8221;, they mean that they are following Chinchilla when deciding the parameter-token split.</p><h3><strong>Llama outperforms Chinchilla</strong></h3><p>Unfortunately, the description &#8220;compute optimal&#8221; ended up being rather misleading, even to researchers in the field. That&#8217;s because what Chinchilla means by &#8220;compute optimal&#8221; is &#8220;training compute optimal&#8221;, i.e. the best parameter-token split if you only consider the compute you spend in training the model. However, you will also want to serve the model for inference, and larger models cost more to serve. Thus very rarely, if ever, do you actually want to train a Chinchilla optimal model.</p>
<p>Instead, you want to train on fewer parameters but more tokens, i.e. moving left from the bottom of an isoFLOP curve (see the red arrow):</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ayJW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8db860-6682-491a-aafc-0bddbc7d27cb_1092x944.png"><img src="https://substackcdn.com/image/fetch/$s_!ayJW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b8db860-6682-491a-aafc-0bddbc7d27cb_1092x944.png" alt=""></a></figure></div>
20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This gets you the same loss as before. You have to spend more compute during training, but in return, you get a smaller model that costs less compute at inference.</p><p>While this insight isn&#8217;t especially surprising, it took Meta releasing its <a href="https://arxiv.org/abs/2302.13971">LLaMA</a> models in February 2023 to make it popular. By continuing to train models on even more data than Chinchilla implied, they were able to produce a 13B model that beat GPT-3 and a 65B model that beat Chinchilla.</p><p>Since then, subsequent LLaMA models like the <a href="https://arxiv.org/abs/2407.21783">LLaMA 3 series</a> have gone even further. For a sense of scale, the 8B model was trained on 15T tokens. That is 75x the Chinchilla-optimal amount of 200B tokens for its size. This allowed it to match Chinchilla across a wide range of benchmarks, despite being an order of magnitude smaller.</p><h3><strong>What about inference?</strong></h3><p>It&#8217;s been four years since Kaplan first came out, and at this point, the core decisions in scaling up pretraining are pretty settled. While it&#8217;s hard to tell what the closed frontier labs are doing, the open-source researchers are broadly following the LLaMA recipe and producing models which are competitive with state-of-the-art closed-source models.</p><p>One key insight unlocked by the LLaMA models is that inference compute matters too, and you can trade off training compute and inference compute. What happens when we start to scale inference compute? 
<p>What happens when we start to scale inference compute? Come back tomorrow to find out!</p><p><em>Thanks to <a href="https://kevinniechen.com/">Kevin Niechen</a>, <a href="https://zhengdongwang.com/">Zhengdong Wang</a>, <a href="https://www.jannikschilling.com/">Jannik Schilling</a>, <a href="https://bradleyhsu.com/">Bradley Hsu</a> and <a href="https://devanshpanda.com/">Devansh Pandey</a> for discussion and feedback.</em></p>]]></content:encoded></item><item><title><![CDATA[Incidental Causes of Polysemanticity]]></title><description><![CDATA[Even in over-parameterised models, polysemantic neurons can arise incidentally.]]></description><link>https://tmychow.substack.com/p/incidental-causes-of-polysemanticity</link><guid isPermaLink="false">https://tmychow.substack.com/p/incidental-causes-of-polysemanticity</guid><dc:creator><![CDATA[Trevor Chow]]></dc:creator><pubDate>Tue, 14 Nov 2023 21:09:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!7Y0a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60501747-5f1b-4c2f-b8c2-2944a387882b_1688x1361.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Update: check out the <a href="https://openreview.net/forum?id=AHfE6WeJLQ">paper</a> and the <a href="https://github.com/tmychow/incidental-polysemanticity">code</a>!</p><p>One obstacle to interpreting neural networks is <em>polysemanticity</em>. This is where a single neuron represents multiple features.</p><p>If there are more features than neurons, it might be &#8220;necessary&#8221; for the model to be polysemantic in order to represent everything. This is the notion of &#8220;superposition&#8221; from <a href="https://arxiv.org/abs/2209.10652">Elhage et al. (2022)</a>.</p><p>Of course, a clear solution would be to train a model large enough to have at least one neuron per feature.
However, what we find in <a href="https://tmychow.com/papers/poly.pdf">&#8220;What Causes Polysemanticity?&#8221;</a> is that polysemanticity can happen &#8220;incidentally&#8221; in the training process, even if we have a large enough model.</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7Y0a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60501747-5f1b-4c2f-b8c2-2944a387882b_1688x1361.png"><img src="https://substackcdn.com/image/fetch/$s_!7Y0a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60501747-5f1b-4c2f-b8c2-2944a387882b_1688x1361.png" alt=""></a></figure></div>
pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Hypothesis</strong></h2><p>When we initialise a neural network, the weights are random. Some neurons will be more correlated with some features than other neurons, just by chance.</p><p>As training happens, the optimiser pushes the weights of those correlated neurons in the direction of the features, so that they can represent the features well. If there is pressure for sparsity, only one neuron will represent each feature.</p><p>Most likely, this is the neuron which was most correlated to the feature at initialisation. If it happened to be the most correlated neuron for multiple features, then it would end up representing multiple features.</p><p>In that case, we get polysemanticity &#8220;incidentally&#8221;.</p><h2><strong>Experiments</strong></h2><p>To test this hypothesis, we consider the simplest possible setup: over-parameterised autoencoders similar to those in <a href="https://arxiv.org/abs/2209.10652">Elhage et al. (2022)</a>. 
That is:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;y=ReLU(WW^T x)&quot;,&quot;id&quot;:&quot;WSJLBDKQEG&quot;}" data-component-name="LatexBlockToDOM"></div><p>where </p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;x \\in \\mathbb{R}^n, W \\in \\mathbb{R}^{n \\times m}, m \\geq n&quot;,&quot;id&quot;:&quot;AHWGQRWJZZ&quot;}" data-component-name="LatexBlockToDOM"></div><p>These models were trained on the standard basis vectors.</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;e_i \\in \\mathbb{R}^n&quot;,&quot;id&quot;:&quot;SBXWCEWXRI&quot;}" data-component-name="LatexBlockToDOM"></div><p>To induce sparsity, we take two separate approaches: introducing &#8467;1 regularisation for the model weights and adding noise after the hidden layer.</p><p>We find that:</p><ol><li><p>&#8467;1 regularisation induces sparsity</p></li><li><p>Some types of noise can induce sparsity</p></li><li><p>The amount of incidental polysemanticity can be predicted</p></li><li><p>It is due to the weight initialisations</p></li></ol><h2>Sparsity from Regularisation</h2><p><strong>Result 1</strong>: <em>&#8467;1 regularisation induces a winner-takes-all dynamic at a rate proportional to the regularisation parameter</em>.</p><h3><strong>Loss with &#8467;1</strong></h3><p>Taking the ith basis vector as the input, the output of the model is:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;(\\text{ReLU} (W_{1,:} \\cdot W_{i,:}), \\cdots, \\text{ReLU} (W_{n,:} \\cdot W_{i,:}))&quot;,&quot;id&quot;:&quot;PHUNMFPJNV&quot;}" data-component-name="LatexBlockToDOM"></div><p>With &#8467;1 regularisation, our loss function becomes:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot; \\mathcal{L} = \\sum_i (\\text{ReLU}(W_{i,:} \\cdot W_{i,:}) - 1)^2 + \\sum_i \\sum_{j \\neq i} (\\text{ReLU}(W_{i,:} \\cdot W_{j,:}) - 0)^2 + \\sum_{i} \\lambda \\| W_{i,:} \\|_1\n&quot;,&quot;id&quot;:&quot;IODXLFMBWA&quot;}" data-component-name="LatexBlockToDOM"></div><p>where &#955; is the regularisation parameter.</p><p>Thus, gradient descent pushes us in the direction:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;- \\frac{\\partial \\mathcal{L}}{\\partial W_{i,:}} = (4 \\| W_{i,:} \\|^2_2 - 1) W_{i,:} - 4 \\sum_{j \\neq i} \\text{ReLU}(W_{i,:} \\cdot W_{j,:}) W_{j,:} - \\lambda \\text{sign}(W_{i,:})&quot;,&quot;id&quot;:&quot;RPFBTMWZCQ&quot;}" data-component-name="LatexBlockToDOM"></div><h3><strong>Forces for sparsity</strong></h3><p>We can split this into the three terms that push</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;W_{i,:}&quot;,&quot;id&quot;:&quot;OYQMIRDOWL&quot;}" data-component-name="LatexBlockToDOM"></div><ol><li><p>Feature benefit: pushes it to be unit length</p></li><li><p>Interference: pushes it to be orthogonal to other ones</p></li><li><p>Regularisation: pushes it to be sparse, if non-zero</p></li></ol><p>The feature benefit and regularisation forces are in competition. Since the regularisation force has a constant value while the feature benefit force is proportional to its length, the regularisation force will dominate for small values and the feature benefit force will dominate for large values.</p><p>Thus W_{i,k} will be pushed to 0 if it is below some threshold &#952;. 
<p>Leaving the derivations for the paper, we find the net effect on sparsity is proportional to how far the weight is from this threshold:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\frac{d \\vert W_{i,k} \\vert}{dt} = (1 - \\| W_{i,:} \\|^2_2) (\\vert W_{i,k} \\vert - \\theta)&quot;,&quot;id&quot;:&quot;VIOKILVGKY&quot;}" data-component-name="LatexBlockToDOM"></div><h3><strong>Speed of sparsification</strong></h3><p>In fact, we can quantify the speed at which sparsity is induced. Again, leaving the maths for the paper, it follows from the above that the &#8467;1 norm at time t is inversely proportional to &#955;t:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\| W_{i,:}(t) \\|_1 = \\frac{1}{\\Theta(\\frac{1}{\\sqrt{m}} + \\lambda t)}&quot;,&quot;id&quot;:&quot;OABFQDSMPG&quot;}" data-component-name="LatexBlockToDOM"></div><p>Since the &#8467;2 norm of W_{i,:}(t) stays &#8776;1 throughout training, the m&#8242; non-zero values at any particular point should each have a magnitude of around:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\frac{1}{\\sqrt{m'}}&quot;,&quot;id&quot;:&quot;TEJOBBFEMZ&quot;}" data-component-name="LatexBlockToDOM"></div><p>thus giving:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\| W_{i,:}(t) \\|_1 \\approx m' \\frac{1}{\\sqrt{m'}} = \\sqrt{m'}&quot;,&quot;id&quot;:&quot;DZHTQDIFUA&quot;}" data-component-name="LatexBlockToDOM"></div><p>The &#8467;1 norm should therefore go from &#920;(&#8730;m) at initialisation to &#920;(1/(&#955;t)) once t &#8805; 1/(&#955;&#8730;m), and reach &#920;(1) once t &#8805; 1/&#955;. This is exactly what we see:</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IV5T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc833f94-d535-4e7a-b145-01b16f04ed46_1314x862.png"><img src="https://substackcdn.com/image/fetch/$s_!IV5T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc833f94-d535-4e7a-b145-01b16f04ed46_1314x862.png" alt=""></a></figure></div>
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc833f94-d535-4e7a-b145-01b16f04ed46_1314x862.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:862,&quot;width&quot;:1314,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!IV5T!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc833f94-d535-4e7a-b145-01b16f04ed46_1314x862.png 424w, https://substackcdn.com/image/fetch/$s_!IV5T!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc833f94-d535-4e7a-b145-01b16f04ed46_1314x862.png 848w, https://substackcdn.com/image/fetch/$s_!IV5T!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc833f94-d535-4e7a-b145-01b16f04ed46_1314x862.png 1272w, https://substackcdn.com/image/fetch/$s_!IV5T!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc833f94-d535-4e7a-b145-01b16f04ed46_1314x862.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Sparsity from Noise</strong></h2><p><strong>Result 2</strong>: <em>noise drawn from a distribution with excess kurtosis induces sparsity</em>.</p><h3><strong>Implicit regularisation</strong></h3><p>In practice, we don&#8217;t get sparsity in neural networks because of &#8467;1 regularisation. A more realistic cause is via noise in the hidden layer, a la <a href="https://openreview.net/forum?id=cxYaBAXVKg">Bricken et al. 
<div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;y = \\text{ReLU}(W (W^T x + \\xi))&quot;,&quot;id&quot;:&quot;PRJPAGYGTI&quot;}" data-component-name="LatexBlockToDOM"></div><p>for</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\xi \\in \\mathbb{R}^m, \\xi \\sim \\mathcal{D}&quot;,&quot;id&quot;:&quot;CVXVTBJJET&quot;}" data-component-name="LatexBlockToDOM"></div><p>Having removed the regularisation term, the loss is rotationally symmetric with respect to the hidden layer (excluding the noise). That means there is no privileged basis, and no particular reason for features to be represented by a single neuron, as opposed to a linear combination of neurons.</p><p>However, if we take the noise into account, we find that one term in the loss is:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\| W_{i, :} \\|_4^4 (\\frac{\\mu_4}{\\sigma^4} - 3)&quot;,&quot;id&quot;:&quot;INYHGHIOVD&quot;}" data-component-name="LatexBlockToDOM"></div><p>where &#956;_4 is the fourth moment of D, and &#956;_4/&#963;^4 &#8722; 3 is the excess kurtosis.</p><p>Thus, when D has negative excess kurtosis, this component of the loss pushes to increase |W_{i,:}|_4. Combined with the constraint from before that |W_{i,:}|_2 = 1, this incentivises W_{i,j} = &#177;1 for some j and W_{i,k} = 0 for all k &#8800; j.</p><h3><strong>Bernoulli vs. Gaussian noise</strong></h3><p>Bernoulli noise taking values &#177;&#963; has excess kurtosis of &#8722;2, while Gaussian noise has excess kurtosis of 0. Thus we would expect the former to induce sparsity (and a fourth norm of 1), while the latter would not. As expected:</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6JWn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebc9323-775e-463b-89c4-90480abfff79_1346x860.png"><img src="https://substackcdn.com/image/fetch/$s_!6JWn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebc9323-775e-463b-89c4-90480abfff79_1346x860.png" alt=""></a></figure></div>
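<p>A quick numerical check of those two excess kurtosis values:</p><pre><code>import numpy as np

rng = np.random.default_rng(0)
sigma, n = 0.5, 1_000_000

bernoulli = sigma * rng.choice([-1.0, 1.0], size=n)
gaussian = rng.normal(0.0, sigma, size=n)

def excess_kurtosis(x):
    # mu_4 / sigma^4 - 3, the coefficient appearing in the loss term above
    return np.mean(x ** 4) / np.mean(x ** 2) ** 2 - 3

print("Bernoulli:", excess_kurtosis(bernoulli))   # exactly -2
print("Gaussian: ", excess_kurtosis(gaussian))    # ~0
</code></pre>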
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ebc9323-775e-463b-89c4-90480abfff79_1346x860.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:860,&quot;width&quot;:1346,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!6JWn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebc9323-775e-463b-89c4-90480abfff79_1346x860.png 424w, https://substackcdn.com/image/fetch/$s_!6JWn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebc9323-775e-463b-89c4-90480abfff79_1346x860.png 848w, https://substackcdn.com/image/fetch/$s_!6JWn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebc9323-775e-463b-89c4-90480abfff79_1346x860.png 1272w, https://substackcdn.com/image/fetch/$s_!6JWn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ebc9323-775e-463b-89c4-90480abfff79_1346x860.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Counting Hypothesis</strong></h2><p><strong>Result 3</strong>: <em>the number of polysemantic neurons can be predicted by a simple combinatorial model</em>.</p><h3><strong>Possible model solutions</strong></h3><p>Recall that the output of the autoencoder is:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;(\\text{ReLU}(W_{1,:} \\cdot W_{i,:}), \\cdots, \\text{ReLU}(W_{n,:} \\cdot W_{i,:}))&quot;,&quot;id&quot;:&quot;RKWXJCMEOX&quot;}" data-component-name="LatexBlockToDOM"></div><p>That is, we would like 
the dot product of W_{i,:} with itself to be 1, and its dot product with every other row to be &#8804;0.</p><p>One way to satisfy this is if W_{i,:} equals the ith standard basis vector in</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\mathbb{R}^m&quot;,&quot;id&quot;:&quot;ASDXARUYCK&quot;}" data-component-name="LatexBlockToDOM"></div><p>This is because WW^T will just be the identity matrix, so</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\text{ReLU}(W W^T e_i) = e_i&quot;,&quot;id&quot;:&quot;AOZMVMYBST&quot;}" data-component-name="LatexBlockToDOM"></div><p>However, when m&gt;n, we have another solution. Take m=4 and n=2, and consider the following weight matrix:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;W = \\begin{bmatrix} 1 &amp; 0 &amp; 0 &amp; 0 \\\\ -1 &amp; 0 &amp; 0 &amp; 0 \\end{bmatrix}&quot;,&quot;id&quot;:&quot;LEEDLOHFFZ&quot;}" data-component-name="LatexBlockToDOM"></div><p>We see that ReLU(WW^Te_1)=(1,0) and ReLU(WW^Te_2)=(0,1), which still satisfies the constraints. This is a polysemantic solution: both features are represented entirely by the first hidden neuron, one with each sign.</p>
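<p>This is quick to verify numerically:</p><pre><code class="language-python">import numpy as np

W = np.array([[1, 0, 0, 0],
              [-1, 0, 0, 0]])
for i in range(2):
    e = np.eye(2)[i]
    print(np.maximum(W @ W.T @ e, 0))   # ReLU(W W^T e_i)
# prints [1. 0.] then [0. 1.]: both features are recovered, even though
# they live entirely on the first hidden neuron.</code></pre>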
<h3><strong>Interference force</strong></h3><p>Knowing that such solutions are possible, we can now ask why training finds them. One force we haven&#8217;t considered in detail is the interference force:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;- \\sum_{j \\neq i} \\text{ReLU}(W_{i,:} \\cdot W_{j,:}) W_{j,:}&quot;,&quot;id&quot;:&quot;ZENHAZBSAO&quot;}" data-component-name="LatexBlockToDOM"></div><p>up to constants.</p><p>This is only non-zero if the angle between W_{i,:} and W_{j,:} is less than &#960;/2. Thus, we can simplify by only considering its effect in the direction of W_{i,:}. It has magnitude:</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;\\big( \\sum_{j \\neq i} \\text{ReLU}(W_{i,:} \\cdot W_{j,:}) W_{j,:} \\big) \\cdot W_{i,:} = \\sum_{j \\neq i} \\text{ReLU}(W_{i,:} \\cdot W_{j,:})^2&quot;,&quot;id&quot;:&quot;EHRMWLIMOI&quot;}" data-component-name="LatexBlockToDOM"></div><p>This means that the interference force should be weak at the start of training, when the dot products between different rows are mean zero, and should only kick in once two rows share some non-zero coordinate k. If the two rows have the same sign in coordinate k, the interference force will push at least one of them to zero. Thus, we would only expect polysemanticity to occur when they have opposite signs, since the ReLU will zero out the negative dot product and both rows will maintain their non-zero values.</p><h3><strong>Balls and bins</strong></h3><p>With nC2 pairs of features, a 1/m probability of the most significant neuron being the same for both, and a 1/2 probability of them having opposite signs, we would predict there to be</p><div class="latex-rendered" data-attrs="{&quot;persistentExpression&quot;:&quot;{n \\choose 2} \\frac{1}{2m} \\approx \\frac{n^2}{4m}&quot;,&quot;id&quot;:&quot;XEBBDQMOFI&quot;}" data-component-name="LatexBlockToDOM"></div><p>polysemantic neurons, which we find:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xqnu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2180894-f88c-4d99-941d-a263ba803706_1298x855.png"><img src="https://substackcdn.com/image/fetch/$s_!Xqnu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2180894-f88c-4d99-941d-a263ba803706_1298x855.png" width="1298" height="855" alt=""></a></figure></div>
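<p>Since the prediction is purely combinatorial, it can also be checked with a quick Monte Carlo sketch (the n and m below are arbitrary choices):</p><pre><code class="language-python">import numpy as np

rng = np.random.default_rng(0)
n, m, trials = 64, 256, 1000
counts = []
for _ in range(trials):
    W = rng.standard_normal((n, m))
    top = np.abs(W).argmax(axis=1)                  # each feature's most significant neuron
    sgn = np.sign(W[np.arange(n), top])
    pos = np.bincount(top[sgn == 1], minlength=m)   # features loading positively on each neuron
    neg = np.bincount(top[sgn == -1], minlength=m)  # features loading negatively on each neuron
    counts.append(int((pos * neg).sum()))           # same-neuron, opposite-sign pairs
print("simulated :", np.mean(counts))               # about 3.94 = C(64,2) / (2 * 256)
print("n^2 / 4m  :", n * n / (4 * m))               # 4.0</code></pre>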
<h2><strong>Initialisation Hypothesis</strong></h2><p><strong>Result 4</strong>: <em>polysemanticity occurs in m&gt;n models due to weight initialisations</em>.</p><p>If initialisations were the cause of polysemanticity, the weights at the start of training should be correlated with the weights at the end. That is, writing W_start for the initial weights and W_end for the final ones, the diagonal entries of W_start W_end^T (each row&#8217;s overlap with its own final value) should be larger than the off-diagonal entries.
As predicted:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JZ2g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc77fd188-b959-4d69-a0e9-773dd637ef4f_805x850.png"><img src="https://substackcdn.com/image/fetch/$s_!JZ2g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc77fd188-b959-4d69-a0e9-773dd637ef4f_805x850.png" width="805" height="850" alt=""></a></figure></div>
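<p>A minimal version of this check, reusing the toy setup from the sketch above (again our illustration, with guessed hyperparameters):</p><pre><code class="language-python">import torch

torch.manual_seed(0)
n, m = 16, 64
W = torch.randn(n, m)
W /= W.norm(dim=1, keepdim=True)
W0 = W.clone()                                      # remember the initial rows
W.requires_grad_(True)
opt = torch.optim.Adam([W], lr=1e-2)
x = torch.eye(n)
for _ in range(5000):
    xi = 0.1 * (2.0 * torch.randint(0, 2, (n, m)) - 1.0)   # Bernoulli noise, as above
    loss = ((torch.relu((x @ W + xi) @ W.T) - x) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
Wf = W.detach() / W.detach().norm(dim=1, keepdim=True)
C = (W0 @ Wf.T).abs()                               # row i at the start dotted with row j at the end
off_diag = (C.sum() - C.diag().sum()) / (n * n - n)
print("mean diagonal overlap    :", round(C.diag().mean().item(), 3))
print("mean off-diagonal overlap:", round(off_diag.item(), 3))
# If initial rows seed the final solution, the diagonal should dominate.</code></pre>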
stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Future Work</strong></h2><p>The <em>incidental polysemanticity</em> we have discussed in our work is qualitatively different from <em>necessary polysemanticity</em>, because it arises from the learning dynamics inducing a privileged basis. Furthermore, the fact that it occurs all the way up to m=n^2 suggests that making the model larger may not solve the problem.</p><p>We look forward to future work which investigates this phenomenon in more fleshed-out settings, and which attempts to nudge the learning dynamics to stop it from occurring.</p>]]></content:encoded></item><item><title><![CDATA[AGI and the EMH]]></title><description><![CDATA[Markets are not expecting AGI in the next 30 years!]]></description><link>https://tmychow.substack.com/p/agi-and-the-emh</link><guid isPermaLink="false">https://tmychow.substack.com/p/agi-and-the-emh</guid><dc:creator><![CDATA[Trevor Chow]]></dc:creator><pubDate>Wed, 11 Jan 2023 23:16:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/bd4c03fc-4192-4bf2-b43e-68838000f34c_1200x742.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>With <a href="https://basilhalperin.com/">Basil Halperin</a> and <a href="https://jzmazlish.substack.com/">Zach Mazlish</a>; cross-posted to <a href="https://forum.effectivealtruism.org/posts/8c7LycgtkypkgYjZx/agi-and-the-emh-markets-are-not-expecting-aligned-or">EA Forum</a> and <a href="https://www.lesswrong.com/posts/ngpC5PFAgxHJMhicM/agi-and-the-emh-markets-are-not-expecting-aligned-or-1">LessWrong</a>.</p><p>Update: we have now turned this into an academic <a href="https://tmychow.com/papers/agi_emh.pdf">paper</a>!</p><div><hr></div><p>In this post, we point out that short AI timelines would cause <em>real interest rates</em> to be high, and would do so under expectations of either unaligned or aligned AI. However, 30- to 50-year real interest rates are low. We argue that this suggests one of two possibilities:</p><ol><li><p><strong>Long(er) timelines.</strong> Financial markets are often highly effective information aggregators (the &#8220;efficient market hypothesis&#8221;), and therefore real interest rates accurately reflect that transformative AI is unlikely to be developed in the next 30-50 years.</p></li><li><p><strong>Market inefficiency.</strong> Markets are radically underestimating how soon advanced AI technology will be developed, and real interest rates are therefore too low. 
There is thus an opportunity for philanthropists to borrow while real rates are low to cheaply do good today; and/or an opportunity for anyone to earn excess returns by betting that real rates will rise.</p></li></ol><p>In the rest of this post we flesh out this argument.</p><ol><li><p>Both intuitively and under every mainstream economic model, the &#8220;<a href="https://www.openphilanthropy.org/research/could-advanced-ai-drive-explosive-economic-growth/">explosive growth</a>&#8221; caused by <em>aligned</em> AI would cause high real interest rates.</p></li><li><p>Both intuitively and under every mainstream economic model, the existential risk caused by <em>unaligned</em> AI would cause high real interest rates.</p></li><li><p>We show that in the historical data, indeed, real interest rates have been correlated with future growth.</p></li><li><p>Plugging the <a href="https://www.lesswrong.com/posts/AfH2oPHCApdKicM4m/two-year-update-on-my-personal-ai-timelines">Cotra probabilities</a> for AI timelines into the baseline workhorse model of economic growth implies substantially higher real interest rates today.</p></li><li><p>In particular, we argue that markets are decisively rejecting the shortest possible timelines of 0-10 years.</p></li><li><p>We argue that <a href="https://twitter.com/ESYudkowsky/status/1455949250320240641">the efficient market hypothesis (EMH) is a reasonable prior</a>, and therefore one reasonable interpretation of low real rates is that since markets are simply <em>not</em> forecasting short timelines, neither should we be forecasting short timelines.</p></li><li><p>Alternatively, if you believe that financial markets are wrong, then you have the opportunity to (1) borrow cheaply today and use that money to e.g. fund AI safety work; and/or (2) earn alpha by betting that real rates will rise.</p></li></ol><p>An order-of-magnitude estimate is that, if markets are getting this wrong, then there is easily $1 trillion lying on the table in the US treasury bond market alone &#8211; setting aside the enormous implications for every other asset class.</p><p><strong>Interpretation.</strong> We view our argument as the best existing <em>outside view</em> evidence on AI timelines &#8211; but also as only <em>one</em> model among a mixture of models that you should consider when thinking about AI timelines. The logic here is a simple implication of a few basic concepts in orthodox economic theory and some supporting empirical evidence, which is important because the unprecedented nature of transformative AI makes &#8220;reference class&#8221;-based outside views difficult to construct. This outside view approach contrasts with, and complements, an inside view approach, which attempts to build a detailed structural model of the world to forecast timelines (e.g. <a href="https://www.lesswrong.com/posts/AfH2oPHCApdKicM4m/two-year-update-on-my-personal-ai-timelines">Cotra 2020</a>; see also <a href="https://nostalgebraist.tumblr.com/post/693718279721730048/on-bio-anchors">Nostalgebraist 2022</a>).</p><p><strong>Outline. </strong>If you want a short version of the argument, sections I and II (700 words) are the heart of the post. Additionally, the section titles are themselves summaries, and we use text formatting to highlight key ideas.</p><h1><strong>I. 
Long-term real rates would be high if the market were pricing in advanced AI</strong></h1><p>Real interest rates reflect, among other things:</p><ol><li><p>Time discounting, which includes the probability of death</p></li><li><p>Expectations of future economic growth</p></li></ol><p>This claim is compactly summarized in the &#8220;<a href="https://en.wikipedia.org/wiki/Keynes%E2%80%93Ramsey_rule">Ramsey rule</a>&#8221; (and the only math that we will introduce in this post), a version of the &#8220;Euler equation&#8221; that in one form or another lies at the heart of <em>every</em> theory and model of dynamic macroeconomics:</p><p>r = &#961; + &#963;g</p><p>where:</p><ul><li><p>r is the real interest rate over a given time horizon</p></li><li><p>&#961; is time discounting over that horizon</p></li><li><p>&#963; is a (positive) preference parameter reflecting how much someone cares about smoothing consumption over time</p></li><li><p>g is the growth rate</p></li></ul><p>(Internalizing the meaning of these Greek letters is wholly unnecessary.)</p><p>While more elaborate macroeconomic theories vary this equation in interesting and important ways, it is common to all of these theories that the real interest rate is higher when either (1) the time discount rate is high or (2) future growth is expected to be high.</p><p>We now provide some intuition for these claims.</p><p><strong>Time discounting and mortality risk.</strong> Time discounting refers to how much people discount the future relative to the present, which captures both (i) <em><a href="https://d101vc9winf8ln.cloudfront.net/documents/27957/original/Cowen___Parfit_-_Against_the_social_discount_rate.pdf?1523454279">intrinsic</a></em><a href="https://d101vc9winf8ln.cloudfront.net/documents/27957/original/Cowen___Parfit_-_Against_the_social_discount_rate.pdf?1523454279"> preference for the present</a> relative to the future and (ii) the probability of death.</p><p>The intuition for why the probability of death raises the real rate is the following. Suppose we expect with high probability that humanity will go extinct next year. Then there is no reason to save today: no one will be around to use the savings. This pushes up the real interest rate, since there is less money available for lending.</p><p><strong>Economic growth.</strong> To understand why higher economic growth raises the real interest rate, the intuition is similar. If we expect to be wildly rich next year, then there is also no reason to save today: we are going to be tremendously rich, so we might as well use our money today while we&#8217;re still comparatively poor.</p><p>(For the formal math of the Euler equation, <a href="https://www.brookings.edu/wp-content/uploads/2005/01/2005a_bpea_baker.pdf">Baker, Delong, and Krugman 2005</a> is a useful reference. The core intuition is that either mortality risk or the prospect of utopian abundance reduces the supply of savings, due to <em>consumption smoothing</em> logic, which pushes up real interest rates.)</p><p><strong>Transformative AI and real rates.</strong> Transformative AI would either raise the risk of extinction (if unaligned), or raise economic growth rates (if aligned).</p><p>Therefore, based on the economic logic above, the prospect of transformative AI &#8211; unaligned or aligned &#8211; will result in high real interest rates.
This is the key claim of this post.</p><p>As an example in the aligned case, Davidson (2021) usefully defines AI-induced &#8220;explosive growth&#8221; as an increase in growth rates to at least 30% annually. Under a baseline calibration where &#963;=1 and &#961;=0.01, and importantly assuming growth rates are known with certainty, the Euler equation implies that moving from 2% growth to 30% growth would raise real rates from 3% to 31%!</p><p>For comparison, real rates in the data we discuss below have never gone above 5%.</p>
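<p>The arithmetic behind these numbers is the Ramsey rule applied directly:</p><pre><code class="language-python"># r = rho + sigma * g under the certainty calibration above
rho, sigma = 0.01, 1.0
for g in (0.02, 0.30):
    print(f"g = {g:.0%} implies r = {rho + sigma * g:.0%}")
# g = 2% implies r = 3%
# g = 30% implies r = 31%</code></pre>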
width="1350" height="778" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/31d0c1d4-d2e8-4ebf-872c-8586276ab94f_1350x778.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:778,&quot;width&quot;:1350,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:286785,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://tmychow.substack.com/i/138118236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31d0c1d4-d2e8-4ebf-872c-8586276ab94f_1350x778.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yMph!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31d0c1d4-d2e8-4ebf-872c-8586276ab94f_1350x778.png 424w, https://substackcdn.com/image/fetch/$s_!yMph!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31d0c1d4-d2e8-4ebf-872c-8586276ab94f_1350x778.png 848w, https://substackcdn.com/image/fetch/$s_!yMph!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31d0c1d4-d2e8-4ebf-872c-8586276ab94f_1350x778.png 1272w, https://substackcdn.com/image/fetch/$s_!yMph!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31d0c1d4-d2e8-4ebf-872c-8586276ab94f_1350x778.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>(Data sources used here are explained in section V.)</em></p><p>The UK in autumn 2021 sold a <em>50</em>-year real bond <a href="https://www.reuters.com/markets/europe/uk-sells-new-2073-index-linked-gilt-with-record-low-volume-2021-11-23/">with</a> a -2.4% rate at the time. 
Real rates on analogous bonds in other developed countries in recent years have been similarly low/negative for the longest horizons available. <a href="https://www.tradingview.com/symbols/TVC-AT100Y/">Austria</a> has a <em>100</em>-year nominal bond &#8211; being <a href="https://www.youtube.com/watch?v=VYcgcWS_NBQ">nominal</a> should make its rate higher due to expected inflation &#8211; with yields less than 3%.</p><p>Thus the conclusion previewed above: financial markets, as evidenced by real interest rates, are not expecting a high probability of either AI-induced growth acceleration or elevated existential risk, on <em>at least</em> a 30-50 year time horizon.</p><h1><strong>III. Uncertainty, takeoff speeds, inequality, and stocks</strong></h1><p>In this section we briefly consider some potentially important complications.</p><p><strong>Uncertainty. </strong>The Euler equation and the intuition described above assumed <em>certainty</em> about AI timelines, but taking into account uncertainty does not change the core logic. With uncertainty about the future economic growth rate, then the real interest rate reflects the <em>expected</em> future economic growth rate, where importantly the expectation is taken over the <a href="https://en.wikipedia.org/wiki/Risk-neutral_measure">risk-neutral measure</a>: in brief, probabilities of different states are reweighted by their marginal utility. We return to this in our quantitative model below.</p><p><strong>Takeoff speeds.</strong> Nothing in the logic above relating growth to real rates depends on <a href="https://www.lesswrong.com/posts/vwLxd6hhFvPbvKmBH/yudkowsky-and-christiano-discuss-takeoff-speeds">slow vs. fast takeoff speed</a>; the argument can be reread under either assumption and nothing changes. Likewise, when considering the case of aligned AI, rates should be elevated whether economic growth starts to rise more rapidly <em>before</em> advanced AI is developed <a href="https://www.lesswrong.com/posts/aFaKhG86tTrKvtAnT/against-gdp-as-a-metric-for-timelines-and-takeoff-speeds">or only does so afterwards</a>. What matters is that GDP &#8211; or really, consumption &#8211; <em>ends up</em> high within the time horizon under consideration. As long as future consumption will be high within the time horizon, then there is less motive to save today (&#8220;consumption smoothing&#8221;), pushing up the real rate.</p><p><strong>Inequality. </strong>The logic above assumed that the development of transformative AI affects everyone equally. This is a reasonable assumption in the case of unaligned AI, where it is thought that all of humanity will be evaporated. However, when considering <em>aligned</em> AI, it may be thought that only some will benefit, and therefore real interest rates will not move much: if only an elite Silicon Valley minority is expected to have utopian wealth next year, then everyone else may very well still choose to save today.</p><p>It is indeed the case that inequality in expected gains from transformative AI would <em>dampen</em> the impact on real rates, but this argument should not be overrated. First, asset prices can be <a href="https://www.journals.uchicago.edu/doi/abs/10.1086/680996">crudely</a> thought of as reflecting a <em>wealth-weighted average</em> across investors. Even if only an elite minority becomes fabulously wealthy, it is <em>their</em> desire for consumption smoothing which will end up dominating the determination of the real rate. 
Second, <em>truly</em> transformative AI leading to 30%+ economy-wide growth (&#8220;<a href="https://moores.samaltman.com/">Moore&#8217;s law for everything</a>&#8221;) would not be possible without having economy-wide benefits.</p><p><strong>Stocks</strong>. One naive objection to the argument here would be the claim that real interest rates sound like an odd, arbitrary asset price to consider; stock prices are certainly the asset prices that receive the most media attention.</p><p>In appendix 1, we explain that the level of the real interest rate affects <em>every</em> asset price: stocks, for instance, reflect the <em>present discounted value</em> of future dividends; and real interest rates <em>determine</em> <em>the discount rate used to discount</em> those future dividends. Thus, if real interest rates are &#8216;wrong&#8217;, <em>every</em> asset price is wrong. If real interest rates are wrong, <em>a lot</em> of money is on the table, a point to which we return in section X.</p><p>We also argue that stock prices in particular are not a useful indicator of market expectations of AI timelines. Above all, high stock prices of <a href="https://twitter.com/robertwiblin/status/1577782210568003585">chipmakers</a> or companies like Alphabet (parent of DeepMind) could only reflect expectations for <em>aligned</em> AI and could not be informative of the risk of <em>unaligned</em> AI. Additionally, as we explain further in the appendix, aligned AI could even <em>lower</em> equity prices, by pushing up discount rates.</p><h1><strong>IV. Historical data on interest rates supports the theory: preliminaries</strong></h1><p>In section I, we gave theoretical <em>intuition</em> for why higher expected growth or higher existential risk would result in higher interest rates: expectations for such high growth or mortality risk would lead people to want to save less and borrow more today. In this section and the next two, we showcase some simple empirical evidence that the predicted relationships hold in the available data.</p><p><strong>Measuring real rates.</strong> To compare historical real interest rates to historical growth, we need to measure real interest rates.</p><p>Most bonds historically have been <em>nominal</em>, where the yield is not adjusted for changes in inflation. Therefore, the vast majority of research studying real interest rates starts with <em>nominal</em> interest rates, attempts to construct an estimate of expected inflation using some statistical model, and then subtracts this estimate of expected inflation from the nominal rate to get an <em>estimated</em> real interest rate. However, constructing measures of inflation expectations is extremely difficult, and as a result most papers in this literature are not very informative.</p><p>Additionally, most bonds historically have had some risk of default. Adjusting for this default premium is also extremely difficult, which in particular complicates analysis of long-run interest rate trends.</p><p>The difficulty in measuring real rates is one of the main causes, in our view, of <a href="https://marginalrevolution.com/marginalrevolution/2015/04/tyler-cowens-three-laws.html">Tyler Cowen&#8217;s Third Law</a>: &#8220;all propositions about real interest rates are wrong&#8221;. Throughout this piece, we are badly violating this (G&#246;delian) Third Law.
In appendix 2, we expand on our argument that the source of Tyler&#8217;s Third Law is measurement issues in the extant literature, together with some separate, frequent conceptual errors.</p><p><strong>Our approach.</strong> We take a more direct approach.</p><p><strong>Real rates. </strong>For our primary analysis, we instead use market real interest rates from <em>inflation-linked bonds</em>. Because we use interest rates <em>directly</em> from inflation-linked bonds &#8211; instead of constructing shoddy estimates of inflation expectations to use with nominal interest rates &#8211; this approach avoids the measurement issue just discussed (and, we argue, allows us to escape Cowen&#8217;s Third Law).</p><p>To our knowledge, prior literature has not used real rates from inflation-linked bonds only because these bonds are comparatively new. Using inflation-linked bonds confines our sample to the last ~20 years in the <a href="https://www.federalreserve.gov/pubs/feds/2008/200805/200805abs.html">US</a>, the last ~30 in the <a href="https://www.bankofengland.co.uk/statistics/yield-curves">UK</a>/Australia/Canada. Before that, inflation-linked bonds didn&#8217;t exist. Other countries have data for even fewer years and less liquid bond markets.</p><p>(The yields on inflation-linked bonds are not perfect measures of real rates, because of <a href="https://www.frbsf.org/economic-research/files/wp08-34bk.pdf">risk premia</a>, <a href="https://www.basilhalperin.com/essays/efficient-restaurant-hypothesis-mental-model-finance-food.html">liquidity issues</a>, and some <a href="https://www.ijcb.org/journal/ijcb12q4a2.pdf">subtle</a> <a href="https://www.cambridge.org/core/journals/journal-of-financial-and-quantitative-analysis/article/abs/tips-from-tips-the-informational-content-of-treasury-inflationprotected-security-prices/76EBC9C8B0FEF951704DCCA4A4781CEC">issues</a> with the way these securities are structured. You can build a model and attempt to strip out these issues; here, we will just use the raw rates. If you prefer to think of these empirics as &#8220;are <em>inflation-linked bond yields</em> predictive of future real growth&#8221; rather than &#8220;are <em>real rates</em> predictive of future real growth&#8221;, that interpretation is still sufficient for the logic of this post.)</p><p><strong>Nominal rates.</strong> Because there are only 20 or 30 years of data on <em>real</em> interest rates from inflation-linked bonds, we supplement our data by also considering unadjusted <em>nominal</em> interest rates. Nominal interest rates reflect real interest rates plus inflation expectations, so it is not appropriate to compare nominal interest rates to <em>real</em> GDP growth.</p><p>Instead, analogously to comparing <em>real</em> interest rates to <em>real</em> GDP growth, we compare <em>nominal</em> interest rates to <em>nominal</em> GDP growth. The latter is not an ideal comparison under economic theory &#8211; and inflation variability could swamp real growth variability &#8211; but we argue that this approach is simple and transparent.</p><p>Looking at nominal rates allows us to have a very large sample of countries for many decades: we use <a href="https://data.oecd.org/interest/long-term-interest-rates.htm">OECD data</a> on nominal rates available for up to 70 years across 39 countries.</p><h1><strong>V. 
Historical data on interest rates supports the theory: graphs</strong></h1><p>The goal of this section is to show that real interest rates have correlated with future real economic growth, and secondarily, that nominal interest rates have correlated with future nominal economic growth. We also briefly discuss the state of empirical evidence on the correlation between real rates and existential risk.</p><p><strong>Real rates vs. real growth. </strong>A first cut at the data suggests that, indeed, higher real rates today predict higher real growth in the future:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IoV9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bb50c4a-e57e-4009-bda5-2462cdaaf9df_1384x710.png"><img src="https://substackcdn.com/image/fetch/$s_!IoV9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1bb50c4a-e57e-4009-bda5-2462cdaaf9df_1384x710.png" width="1384" height="710" alt=""></a></figure></div>
<p>To see how to read these graphs, take the left-most graph (&#8220;10-year horizon&#8221;) for example. The x-axis shows the level of the real interest rate, as reflected on 10-year inflation linked bonds. The y-axis shows average real GDP growth over the <em>following</em> 10 years.</p><p>The middle and right hand graphs show the same, at the 15-year and 20-year horizons. The scatter plot shows all available data for the US (<a href="https://www.federalreserve.gov/pubs/feds/2008/200805/200805abs.html">since 1999</a>), the UK (<a href="https://www.bankofengland.co.uk/statistics/yield-curves">since 1985</a>), Australia (since 1995), and Canada (since 1991). (Data for Australia and Canada is only available at the 10-year horizon, and comes from <a href="https://augurlabs.com/curve/">Augur Labs</a>.)</p><p>Eyeballing the figure, there appears to be a strong relationship between real interest rates today and future economic growth over the next 10-20 years.</p><p>To our knowledge, this simple stylized fact is novel.</p><p><strong>Caveats. </strong>&#8220;Eyeballing it&#8221; is not a formal econometric method, but this is a blog post, not a journal article (TIABPNAJA). We do not perform any formal statistical tests here, but we do want to acknowledge some important statistical points and other caveats.</p><p>First, the data points in the scatter plot are not statistically independent: real rates and growth are both persistent variables; the data points contain overlapping periods; and growth rates in these four countries are correlated.
These issues are evident even from eyeballing <a href="https://raw.githubusercontent.com/basilhalperin/agi_emh/main/r_figs/ts_r_vs_gdp_10.png">the time series</a>. Second, of course this relationship is not causally identified: we do not have exogenous variation in real growth rates. (If you have ideas for identifying the causal effect of higher real growth expectations on real rates, we would love to discuss with you.)</p><p>Relatedly, many other things are changing in the world which are likely to affect real rates. <a href="https://www.forourposterity.com/best-chad-jones-papers/">Population growth</a> is slowing, <a href="https://worthwhile.typepad.com/worthwhile_canadian_initi/2013/10/asset-prices-and-the-retirement-revolution.html">retirement</a> is lengthening, the population is <a href="http://web.stanford.edu/~aauclert/demowealth21.pdf">aging</a>. But under AI-driven &#8220;explosive&#8221; growth &#8211; again, say, 30%+ annual growth, following the excellent analysis of <a href="https://www.openphilanthropy.org/could-advanced-ai-drive-explosive-economic-growth">Davidson (2021)</a> &#8211; we might reasonably expect that an increase in the growth rate this massive would drown out the impact of any other factors.</p><p><strong>Nominal rates vs. nominal growth.</strong> Turning now to evidence from nominal interest rates, recall that the usefulness of this exercise is that while there are only 20 or 30 years of data on <em>real</em> interest rates for two countries, there is much more data on <em>nominal</em> interest rates.</p><p>We simply take all available data on 10-year nominal rates from the set of <a href="https://data.oecd.org/interest/long-term-interest-rates.htm">39 OECD countries since 1954</a>. The following scatterplot compares the 10-year nominal interest rate versus nominal GDP growth over the succeeding ten years by country:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CIrt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7ac8a0-11f6-4aee-8d2d-a84b4e725f8c_1234x1334.png"><img src="https://substackcdn.com/image/fetch/$s_!CIrt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7ac8a0-11f6-4aee-8d2d-a84b4e725f8c_1234x1334.png" width="1234" height="1334" alt=""></a></figure></div>
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a7ac8a0-11f6-4aee-8d2d-a84b4e725f8c_1234x1334.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1334,&quot;width&quot;:1234,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:466416,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://tmychow.substack.com/i/138118236?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7ac8a0-11f6-4aee-8d2d-a84b4e725f8c_1234x1334.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CIrt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7ac8a0-11f6-4aee-8d2d-a84b4e725f8c_1234x1334.png 424w, https://substackcdn.com/image/fetch/$s_!CIrt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7ac8a0-11f6-4aee-8d2d-a84b4e725f8c_1234x1334.png 848w, https://substackcdn.com/image/fetch/$s_!CIrt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7ac8a0-11f6-4aee-8d2d-a84b4e725f8c_1234x1334.png 1272w, https://substackcdn.com/image/fetch/$s_!CIrt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a7ac8a0-11f6-4aee-8d2d-a84b4e725f8c_1234x1334.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Again, there is a strong positive &#8211; if certainly not perfect &#8211; relationship. 
(For example, the outlier brown dots at the bottom of the graph are Greece, whose high interest rates despite negative NGDP growth reflect high default risk during an economic depression.)</p><p>The same nontrivial caveats apply to this analysis as above.</p><p>We consider this data from nominal rates to be significantly weaker evidence than the evidence from real rates, but corroboration nonetheless.</p><p><strong>Backing out market-implied timelines.</strong> Taking the univariate pooled OLS results from the real rate data <em>far</em> too seriously, the fact that the 10-year real rate in the US ended 2022 at 1.6% would predict average annual real GDP growth of 2.6% over the next 10 years in the US; the analogous interest rate of -0.2% in the UK would predict 0.7% annual growth over the next 10 years in the UK. Such growth rates, clearly, are not compatible with the arrival of transformative aligned AI within this horizon.</p><h1><strong>VI. Empirical evidence on real rates and mortality risk</strong></h1><p>We have argued that in theory, real rates should be higher in the face of high economic growth or high mortality risk; empirically, so far, we have only shown a relationship between real rates and growth, but not between real rates and mortality.</p><p>Showing that real rates accurately reflect changes in existential risk is very difficult, because there is no word-of-god measurement of how existential risk has evolved over time.</p><p>We would be very interested in pursuing new empirical research examining &#8220;asset pricing under existential risk&#8221;. In appendix 3, we perform a scorched-earth literature review and find essentially zero existing empirical evidence on real rates and <em>existential</em> risks.</p><p><strong>Disaster risk. </strong>In particular, the extant literature does not study existential risks but instead &#8220;merely&#8221; <em>disaster risks</em>, under which real assets are devastated but humanity is not exterminated. <a href="https://www.aeaweb.org/articles?id=10.1257/pol.5.4.306">Disaster risks</a> do <em>not</em> necessarily raise real rates &#8211; indeed, such risks are thought to <em>lower</em> real rates due to precautionary savings. That notwithstanding, some highlights of the appendix review include <a href="https://www.jstor.org/stable/pdf/2117594.pdf?">a small set of papers</a> <a href="https://www.jstor.org/stable/pdf/2600866.pdf">finding</a> <a href="https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1465-7295.1990.tb00824.x">that</a> <a href="https://www.nber.org/system/files/working_papers/w0887/w0887.pdf">individuals</a> with a higher perceived risk of nuclear conflict during the Cold War saved less, as well as <a href="https://davidandrewfiner.com/jmp/">a paper</a> noting that equities which were headquartered in cities more likely to be targeted by Soviet missiles did worse during the Cuban missile crisis (<a href="https://www.sciencedirect.com/science/article/abs/pii/S1062976922000217">see also</a>). Our assessment is that these and the other available papers on <em>disaster</em> risks discussed in the appendix have severe limitations for the purposes here.</p><p><strong>Individual mortality risk. </strong>We judge that the best evidence on this topic comes instead from examining the relationship between <em>individual mortality risk</em> and <em>savings/investment</em> behavior. The logic we provided was that if humanity will be extinct next year, then there is no reason to save, pushing up the real rate.
Similar logic says that at the <em>individual</em> level, a higher risk of death for any reason should lead to lower savings and less investment in human capital. Lower savings at the <em>individual</em> level need not raise interest rates at the <em>economy-wide</em> level, but it does provide evidence for the <em>mechanism</em> whereby extinction risk should lead to lower saving and thus higher interest rates.</p><p>One example comes <a href="https://www.aeaweb.org/articles?id=10.1257/app.20150369">from Malawi</a>, where the provision of a new AIDS therapy caused a significant increase in life expectancy. Using spatial and temporal variation in where and when these therapeutics were rolled out, the authors found that increased life expectancy results in more savings and more human capital investment in the form of education spending. Another experiment in Malawi <a href="https://repository.upenn.edu/psc_publications/39/">provided information</a> to correct pessimistic priors about life expectancy, and found that higher life expectancy directly caused more investment in agriculture and livestock.</p><p>A third example comes from <a href="https://www.jstor.org/stable/23469683">testing for Huntington&#8217;s disease</a>, a disease that causes a meaningful drop in life expectancy, to around 60 years. Using variation in when people are diagnosed, researchers have found that those who learn earlier that they carry the Huntington&#8217;s gene are 30 percentage points less likely to finish college &#8211; a significant fall in their human capital investment.</p><p>Studying the effect of increased life expectancy on savings and real rates <em>at the population level</em> is potentially intractable, but would be interesting to consider further. Again, in our assessment, the best empirical evidence available right now comes from the research on individual &#8220;existential&#8221; risks, and it suggests that real rates should increase with existential risk.</p><h1><strong>VII. Plugging the Cotra probabilities into a simple quantitative model of real interest rates predicts very high rates</strong></h1><p>Our earlier empirical analysis used historical data to go from <em>the current real rate</em> to a very crude <em>market-implied forecast</em> of growth rates; in this section, we instead use a model to go from <em>existing forecasts</em> of AI timelines to <em>timeline-implied real rates</em>. We aim to show that under short AI timelines, real interest rates would be unrealistically elevated.</p><p>This is a useful exercise for three reasons. First, the historical data can only speak to growth forecasts, and therefore can only provide a forecast under the possibly incorrect assumption of <em>aligned</em> AI. Second, the empirical forecast assumes a <em>linear</em> relationship between the real rate and growth, which may not be reasonable for a massive change caused by transformative AI. Third, and quite importantly, the historical data cannot transparently tell us anything about uncertainty and the market&#8217;s beliefs about the full probability <em>distribution</em> of AI timelines.</p><p>We <a href="https://docs.google.com/spreadsheets/d/10ULqcFRKXD-UT5JYDMrQeLtUfzkyzNUxCPQvYqbEdig/edit#gid=850885346">use</a> the canonical (and nonlinear) version of the Euler equation &#8211; the model discussed in section I &#8211; but now allow for <em>uncertainty</em> about both how soon transformative AI will be developed and whether or not it will be aligned. 
The model takes as its key inputs (1) a probability of transformative AI each year, and (2) a probability that such technology is aligned.</p><p>The model is a simple application of the stochastic Euler equation under an <a href="https://en.wikipedia.org/wiki/Isoelastic_utility">isoelastic utility function</a>. We use the following as a baseline, before considering alternative probabilities:</p><ul><li><p>We use smoothed <strong><a href="https://www.lesswrong.com/posts/AfH2oPHCApdKicM4m/two-year-update-on-my-personal-ai-timelines">Cotra (2022)</a> probabilities for transformative AI</strong> over the next 30 years: a 2% yearly chance until 2030, a 3% yearly chance through 2036, and a 4% yearly chance through 2052.</p></li><li><p>We use the <strong>FTX Future Fund&#8217;s median <a href="https://archive.ph/9UQTA">estimate</a> of 15% for the probability that AI is </strong><em><strong>un</strong></em><strong>aligned</strong> conditional on the development of transformative AI.</p></li><li><p>With the arrival of <em>aligned</em> AI, we use the <strong>Davidson (2020) assumption of 30% annual economic growth</strong>; with the arrival of unaligned AI, we assume human extinction. In the absence of the development of transformative AI, we assume a steady 1.8% growth rate.</p></li><li><p>We calibrate the pure rate of subjective time preference to 0.01 and the consumption smoothing parameter (i.e. the inverse of the elasticity of intertemporal substitution) to 1, following the economic literature.</p></li></ul><p>Thus, to summarize: by default, GDP grows at 1.8% per year. Every year, there is some probability (based on Cotra) that transformative AI is developed. If it is developed, there is a 15% probability the world ends, and an 85% chance GDP growth jumps to 30% per year.</p><p>We have built a spreadsheet <a href="https://docs.google.com/spreadsheets/d/10ULqcFRKXD-UT5JYDMrQeLtUfzkyzNUxCPQvYqbEdig/edit#gid=850885346">here</a> that allows you to tinker with the numbers yourself, such as adjusting the growth rate under aligned AI, to see what your timelines and probability of alignment would imply for the real interest rate. (It also contains the full Euler equation formula generating the results, for those who want the mathematical details.) We first estimate real rates under the baseline calibration above, before considering variations in the critical inputs.</p>
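<p>For readers who prefer code to spreadsheets, here is a minimal sketch of the calculation as we read it. The exact year cutoffs for the hazard schedule and the use of continuous discounting are our simplifying assumptions, so the output lands near &#8211; but not exactly on &#8211; the spreadsheet&#8217;s figures:</p><pre><code>import math

# Sketch: timeline-implied T-year real rate from a stochastic Euler equation
# with isoelastic utility. Extinction states contribute zero marginal utility.
rho, gamma = 0.01, 1.0           # time preference; inverse EIS
g_normal, g_tai = 0.018, 0.30    # growth without / with aligned TAI
p_misaligned = 0.15              # P(unaligned | TAI)
T = 30                           # horizon in years

# Smoothed Cotra-style hazards: ~2%/yr to 2030, 3%/yr to 2036, 4%/yr after
hazard = [0.02] * 8 + [0.03] * 7 + [0.04] * (T - 15)

# Zero-coupon real bond price: P = E[ exp(-rho*T) * (C_T/C_0)**(-gamma) ]
price, survive = 0.0, 1.0
for t, p in enumerate(hazard, start=1):
    # aligned TAI arrives in year t: growth jumps to 30% thereafter
    c_ratio = (1 + g_normal) ** t * (1 + g_tai) ** (T - t)
    price += survive * p * (1 - p_misaligned) * math.exp(-rho * T) / c_ratio ** gamma
    survive *= 1 - p
# no TAI within the horizon: steady 1.8% growth throughout
price += survive * math.exp(-rho * T) / (1 + g_normal) ** (gamma * T)

r = -math.log(price) / T
print(f"implied {T}-year real rate: {r:.1%}")</code></pre><p>Zeroing out the hazards recovers the model&#8217;s 2.8% no-AI benchmark, and the Cotra hazards push the 30-year rate to roughly 6%, in line with the baseline results discussed next.</p>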
<p><strong>Baseline results.</strong> The model predicts that under <em>zero</em> probability of transformative AI, the real rate at any horizon would be 2.8%. In comparison, under the baseline calibration just described, based on Cotra timelines, the real rate at a 30-year horizon would be pushed up to 5.9% &#8211; roughly three percentage points higher.</p><p>For comparison, the 30-year real rate in the US is currently 1.6%.</p><p>While the simple Euler equation somewhat overpredicts the level of the real interest rate even under zero probability of transformative AI &#8211; the 2.8% in the model versus the 1.6% in the data &#8211; this overprediction is explained by the radical simplicity of the model we use and is a known issue in the literature. Adding other factors (e.g. <a href="https://en.wikipedia.org/wiki/Precautionary_savings">precautionary savings</a>) to the model would lower the level. Changing the level does not change the model&#8217;s <em>directional</em> predictions, which help <a href="https://ideas.repec.org/a/ijc/ijcjou/y2017q3a1.html">quantitatively</a> <a href="https://www.nber.org/papers/w30024">explain</a> the fall in real rates over the past ~30 years.</p><p>Therefore, what is most informative is the three percentage point <em>difference</em> between the real rate under Cotra timelines (5.9%) versus under no prospect of transformative AI (2.8%): Cotra timelines imply real interest rates substantially higher than their current levels.</p><p>Now, from this baseline estimate, we can also consider varying the key inputs.</p><p><strong>Varying assumptions on P(misaligned|AGI).</strong> First, consider changing the assumption that advanced AI is 15% likely to be unaligned (conditional on the development of AGI). <a href="https://docs.google.com/spreadsheets/d/1YNzk_p-IHtzhV6Eupahhpo-hRSwDmTA3iAGd0ASAdCA/edit#gid=1179700904">Varying this parameter</a> does not have a large impact: moving from a 0% to a 100% probability of misalignment raises the model&#8217;s predicted real rate only from 5.8% to 6.3%.</p><figure><a href="https://substackcdn.com/image/fetch/$s_!DBe8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa7a961-048d-493f-88c1-169be74f3707_1290x800.png"><img src="https://substackcdn.com/image/fetch/$s_!DBe8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa7a961-048d-493f-88c1-169be74f3707_1290x800.png" width="1290" height="800" alt="Chart: model-implied 30-year real rate as the assumed probability of misalignment varies"></a></figure>
srcset="https://substackcdn.com/image/fetch/$s_!DBe8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa7a961-048d-493f-88c1-169be74f3707_1290x800.png 424w, https://substackcdn.com/image/fetch/$s_!DBe8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa7a961-048d-493f-88c1-169be74f3707_1290x800.png 848w, https://substackcdn.com/image/fetch/$s_!DBe8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa7a961-048d-493f-88c1-169be74f3707_1290x800.png 1272w, https://substackcdn.com/image/fetch/$s_!DBe8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fa7a961-048d-493f-88c1-169be74f3707_1290x800.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" style="height:20px;width:20px" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Varying assumptions on timelines.</strong> Second, consider making timelines shorter or longer. In particular, consider varying the <em>probability of development by 2043</em>, which we use as a benchmark per the <a href="https://archive.ph/9UQTA">FTX Future Fund</a>.</p><p>We scale the Cotra timelines up and down to vary the probability of development by 2043. 
<p>As <a href="https://docs.google.com/spreadsheets/d/1eC1xLpb_UWJ5rO-wScHjM5ERwYldw6Qxa3x0tzao_tk/edit#gid=1601822419">the next figure</a> shows, and as one might expect, shorter AI timelines have a very large impact on the model&#8217;s estimate of the real rate.</p><figure><a href="https://substackcdn.com/image/fetch/$s_!mSX2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68604c54-0a61-41df-b20f-82bef6f5885f_1310x798.png"><img src="https://substackcdn.com/image/fetch/$s_!mSX2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68604c54-0a61-41df-b20f-82bef6f5885f_1310x798.png" width="1310" height="798" alt="Chart: model-implied 30-year real rate under different probabilities of transformative AI by 2043"></a></figure><ul><li><p>The original baseline parameterization from Cotra corresponds to the FTX Future Fund &#8220;upper threshold&#8221; of a 45% chance of development by 2043, which generated the 3 percentage point increase in the 30-year real rate discussed above.</p></li><li><p>The Future Fund&#8217;s median of a 20% probability by 2043 generates a 1.1 percentage point increase in the 30-year real rate.</p></li><li><p>The Future Fund&#8217;s &#8220;lower threshold&#8221; of a 10% probability by 2043 generates a 0.5 percentage point increase in the real rate.</p></li></ul><p>These results strongly suggest that financial markets are <em>not</em> expecting any timeline as short as the Cotra timeline, or shorter.</p><h1><strong>VIII. 
Markets are decisively rejecting the shortest possible timelines</strong></h1><p>While it is not possible to back out <em>exact</em> numbers for the market&#8217;s implicit forecast of AI timelines, it is reasonable to say that the market is <em>decisively</em> rejecting &#8211; i.e., putting very low probability on &#8211; the development of transformative AI in the <em>very</em> near term, say within the next ten years.</p><p>Consider the following examples of extremely short timelines:</p><ol><li><p>Five-year timelines: With a 50% probability of transformative AI by 2027, and the same yearly probability thereafter, the model predicts 13.0pp higher 30-year real rates today!</p></li><li><p>Ten-year timelines: With a 50% probability of transformative AI by 2032, and the same yearly probability thereafter, the model predicts 6.5pp higher 30-year real rates today. (The sketch below shows how these headline probabilities translate into yearly hazards.)</p></li></ol>
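<p>The translation from &#8220;50% by year N&#8221; into a constant annual hazard &#8211; which can then be fed into the model &#8211; is a one-liner; the conversion here is ours, not the spreadsheet&#8217;s:</p><pre><code># "50% by 2027" / "50% by 2032" as constant annual hazards (5 and 10 years out)
for label, years in (("2027", 5), ("2032", 10)):
    p = 1 - 0.5 ** (1 / years)   # solves (1 - p)**years = 0.5
    print(f"50% by {label}: constant annual hazard of {p:.1%}")</code></pre>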
<p>Real rate movements of these magnitudes are wildly counterfactual. As previously noted, real rates in the data used above have never gone above even 5%.</p><p><strong>Stagnation.</strong> As a robustness check, <a href="https://docs.google.com/spreadsheets/d/10ULqcFRKXD-UT5JYDMrQeLtUfzkyzNUxCPQvYqbEdig/edit#gid=850885346">in the configurable spreadsheet</a> we allow you to place some yearly probability on the economy stagnating and growing at 0% per year thereafter. Even with a 20% chance of stagnation by 2053 (higher than realistic), under Cotra timelines the model still generates a 2.1 percentage point increase in 30-year rates.</p><p><strong>Recent market movements.</strong> Real rates have increased around two percentage points since the start of 2022, with the <a href="https://fred.stlouisfed.org/series/DFII30">30-year real rate</a> moving from -0.4% to 1.6%, approximately the pre-COVID level. This is a large enough move to merit discussion. While this rise in long-term real rates could reflect changing market expectations for timelines, it seems much more plausible that high inflation, the Russia-Ukraine war, and monetary policy tightening have together worked to drive up short-term real rates and <a href="https://www.federalreserve.gov/econres/notes/feds-notes/tips-from-tips-update-and-discussions-20190521.html">the risk premium on long-term real rates</a>.</p><h1><strong>IX. Financial markets are the most powerful information aggregators produced by the universe (so far)</strong></h1><p>Should we update on the fact that markets are not expecting very short timelines?</p><p>Probably!</p><p>As a prior, we think that <a href="https://twitter.com/ESYudkowsky/status/1455949250320240641">market efficiency is reasonable</a>. We do not try to provide a full defense of the efficient markets hypothesis (EMH) in this piece, given that it has been debated ad nauseam elsewhere, but here is a scaffolding of what such an argument would look like.</p><p>Loosely, the EMH says that the current price of any security incorporates all public information about it, and as such, <a href="https://twitter.com/ESYudkowsky/status/1278486324018704384">you should not expect to systematically make money by trading securities</a>.</p><p>This is simply a no-arbitrage condition, and certainly no more radical than supply and demand: if something is over- or under-priced, you&#8217;ll take action based on that belief until you no longer believe it. In other words, <a href="https://twitter.com/ESYudkowsky/status/1426612130975870981">you&#8217;ll buy and sell it until you think the price is right</a>. Otherwise, an <a href="https://yudkowsky.tumblr.com/writing/inexploitability">unexploited opportunity</a> for profit would be left on the table, and <a href="https://www.lesswrong.com/posts/h24JGbmweNpWZfBkM/markets-are-anti-inductive">there are no free lunches when the market is in equilibrium</a>.</p><p>As a corollary, the current price of a security should be the best available risk-adjusted predictor of its future price. Notice we didn&#8217;t say that the price is equal to the &#8220;correct&#8221; fundamental value. In fact, the current price is almost certainly wrong. What we did say is that it is the best guess, i.e. no one knows whether it should be higher or lower.</p><p>Testing this hypothesis is difficult, in the same way that testing any equilibrium condition is difficult. Not only is the equilibrium always changing, but there is also the joint hypothesis problem that <a href="https://www.jstor.org/stable/2325486">Fama (1970)</a> outlined: comparing actual asset prices to &#8220;correct&#8221; theoretical asset prices means you are simultaneously testing whatever asset pricing model you choose, alongside the EMH.</p><p>Strictly speaking, then, it makes little sense to talk about &#8220;testing&#8221; the EMH. Rather, the question is how quickly prices converge to the limit of market efficiency. In other words, how fast is information diffusion? Our position is that for most things, this is pretty fast!</p><p>Here are a few heuristics that support our position:</p><ol><li><p>For our purposes, the earlier evidence on the link between real rates and growth is a highly relevant example of market efficiency.</p></li><li><p>There are notable examples of markets seeming to be eerily good at forecasting hard-to-anticipate events:</p><ol><li><p>In the wake of the Challenger explosion, despite no definitive public information being released, <a href="http://wisdomofcrowds.blogspot.com/2009/12/stock-market-reaction-to-challenger.html">the market</a> <a href="https://slate.com/business/2003/08/the-disaster-market.html">seems to have identified</a> which firm was responsible.</p></li><li><p><a href="https://www.sciencedirect.com/science/article/pii/S0929119914000546#f0005">Economist Armen Alchian observed</a> that the stock price of lithium producers spiked 461% following the public announcement of the first hydrogen bomb tests in 1954, while the prices of producers of other radioactive metals were flat. He circulated a paper within RAND, where he was working, identifying lithium as the material used in the tests, before the paper was suppressed by leadership, who were apparently aware that lithium was indeed used. The market was prescient even though <em>zero</em> public information was released about lithium&#8217;s usage.</p></li></ol></li></ol><p>Remember: if real interest rates are wrong, <em>all</em> financial assets are mispriced. If real interest rates &#8220;should&#8221; rise three percentage points or more, that is easily hundreds of billions of dollars&#8217; worth of revaluations. It is unlikely that sharp market participants are leaving billions of dollars on the table.</p><h1><strong>X. 
If markets are not efficient, you could be earning alpha and philanthropists could be borrowing</strong></h1><p>While our prior in favor of efficiency is fairly strong, the market could be currently<em> </em>failing to anticipate transformative AI, due to various <a href="https://scholar.harvard.edu/files/shleifer/files/limitsofarbitrage.pdf">limits to arbitrage</a>.</p><p>However, if you do believe the market is currently wrong about the probability of short timelines, then we now argue there are two courses of action you should consider taking:</p><ol><li><p>Bet on real rates rising (&#8220;get rich or die trying&#8221;)</p></li><li><p>Borrow today, including in order to fund philanthropy (&#8220;impatient philanthropy&#8221;)</p></li></ol><h2><strong>1. Bet on real rates rising (&#8220;get rich or die trying&#8221;)</strong></h2><p>Under the logic argued above, if you genuinely believe that AI timelines are short, then you should consider putting your money where your mouth is: bet that real rates will rise when the market updates, and potentially earn a lot of money if markets correct. Shorting (or going underweight) government debt is the simplest way of expressing this view.</p><p>Indeed, AI safety researcher Paul Christiano has <a href="https://forum.effectivealtruism.org/posts/KdxGwxwY3t7iw9xjB/three-impacts-of-machine-intelligence?commentId=aYCNP5PgDYsZpxsbX#aYCNP5PgDYsZpxsbX">written publicly</a> that he is (or was) short 30-year government bonds.</p><p>If short timelines are your true belief in your heart of hearts, and not merely a <a href="https://www.lesswrong.com/posts/CqyJzDZWvGhhFJ7dY/belief-in-belief">belief in a belief</a>, then you should seriously consider how much money you could earn here and what you could do with those resources.</p><p><strong>Implementing the trade.</strong> For retail investors, betting against treasuries via ETFs is perhaps simplest. Such trades can be done easily with retail brokers, like Schwab.</p><p>(i) For example, one could simply short the <a href="https://www.pimco.com/en-us/investments/etf/15-year-us-tips-index-exchange-traded-fund/">LTPZ ETF</a>, which holds long-term real US government debt (effective duration: 20 years).</p><p>(ii) Alternatively, if you would prefer to avoid engaging in shorting yourself, there are ETFs which will do the shorting for you, with nominal bonds: <a href="https://www.proshares.com/our-etfs/leveraged-and-inverse/tbf">TBF</a> is an ETF which is short 20+ year treasuries (duration: 18 years); <a href="https://www.proshares.com/our-etfs/leveraged-and-inverse/tbt">TBT</a> is the same, but levered 2x; and <a href="https://www.proshares.com/our-etfs/leveraged-and-inverse/ttt">TTT</a> is the same, but levered 3x. There are a number of <a href="https://etfdb.com/etfdb-category/inverse-bonds/">other similar options</a>. 
Because these ETFs do the shorting for you, all you need to do is purchase shares of the ETFs.</p><p><strong>Back-of-the-envelope estimate.</strong> A rough estimate of how much money is on the table, just from shorting the US treasury bond market <em>alone</em>, suggests there is easily $1 trillion in value at stake from betting that rates will rise.</p><ul><li><p>In response to a 1 percentage point rise in interest rates, the price of a bond falls in percentage terms by its &#8220;<em><a href="https://www.blackrock.com/fp/documents/understanding_duration.pdf">duration</a></em>&#8221;, to a first-order approximation.</p></li><li><p>The average value-weighted duration of (privately-held) US treasuries <a href="https://www.brookings.edu/blog/up-front/2022/07/27/projecting-the-structure-of-us-treasury-debt/">is approximately 4 years</a>.</p></li><li><p>So, to a first-order approximation, if rates rise by 3 percentage points, then the value of treasuries will fall by 12% (that is, 3 &#215; 4).</p></li><li><p>The market cap of (privately-held) treasuries <a href="https://www.dallasfed.org/research/econdata/govdebt#data">is approximately $17 trillion</a>.</p></li><li><p>Thus, if rates rise by 3 percentage points, then the total value of treasuries can be expected to fall by $2.04 trillion (that is, 12% &#215; $17 trillion).</p></li><li><p>Slightly more than half (55%) of the interest rate sensitivity of the treasury market <a href="https://www.brookings.edu/blog/up-front/2022/07/27/projecting-the-structure-of-us-treasury-debt/">comes from</a> bonds with maturity beyond 10 years. Assuming that the 3 percentage point rise occurs only at this horizon, and rounding down, we arrive at the $1 trillion estimate.</p></li></ul>
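<p>The arithmetic above in a few lines of code (the inputs are the post&#8217;s round numbers, so treat the output as an order-of-magnitude figure):</p><pre><code># Back-of-the-envelope: value at stake from a rise in real rates,
# using the first-order duration approximation.
dr = 0.03                # assumed rise in real rates: 3 percentage points
duration = 4.0           # avg value-weighted duration of privately-held treasuries
market_cap = 17e12       # privately-held treasury market, ~$17 trillion
long_end_share = 0.55    # share of rate sensitivity beyond 10-year maturities

price_drop = duration * dr                    # ~12% fall, to first order
total_loss = price_drop * market_cap          # ~$2 trillion
long_end_loss = long_end_share * total_loss   # ~$1.1 trillion, round down to $1T

print(f"{price_drop:.0%} fall in treasury values, ${total_loss / 1e12:.2f}T in total, "
      f"${long_end_loss / 1e12:.1f}T of it at the long end")</code></pre>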
<p>Alternatively, returning to the LTPZ ETF with its duration of 20 years: a 3 percentage point rise in rates would cause its value to fall by 60%. Using the 3x levered TTT with its duration of 18 years, a 3 percentage point rise in rates would imply a mouth-watering cumulative return of 162%.</p><p>While fully fleshing out the trade analysis is beyond the scope of this post, this illustration gives an idea of how large the possibilities are.</p><p>The alternative to this order-of-magnitude estimate would be to build a complete bond pricing model to estimate more precisely the expected returns of shorting treasuries. This would need to take into account e.g. the convexity of price changes with interest rate movements, the varied maturities of outstanding bonds, and the different varieties of instruments issued by the Treasury. Further refinements would include trading derivatives (e.g. interest rate futures) instead of shorting bonds directly, for capital efficiency, and using leverage to increase expected returns.</p><p>Additionally, the analysis could be extended beyond the US government debt market, again since changes to real interest rates would plausibly impact the price of <em>every</em> asset: stocks, commodities, real estate, <em>everything</em>.</p><p>(If you would be interested in fully scoping out possible trades, we would be interested in talking.)</p><p><strong>Trade risk and foom risk.</strong> We want to be clear that &#8211; unless you are risk neutral, can borrow without penalty at the risk-free rate, or believe in short timelines with 100% probability &#8211; such a bet would not be a free lunch: this is not an &#8220;arbitrage&#8221; in the technical sense of a <a href="https://www.lesswrong.com/posts/Afwaj6sGfxcQZrmYF/chapter-4-the-efficient-market-hypothesis">risk-free profit</a>. One risk is that the market moves in the other direction in the short term, before correcting, and that you are unable to roll over your position for liquidity reasons.</p><p>The other risk &#8211; one that could motivate not making this bet &#8211; is that the market, for some unspecified reason, never has a chance to correct, because (1) transformative AI ends up <em>unaligned</em> and (2) humanity&#8217;s conversion into paperclips occurs <em>overnight</em>. This would prevent the market from ever &#8220;waking up&#8221;.</p><p>However, to be clear, expecting this specific scenario requires both:</p><ol><li><p>Buying into specific stories about how takeoff will occur: specifically, Yudkowskian <a href="https://www.lesswrong.com/tag/the-hanson-yudkowsky-ai-foom-debate">foom</a>-type scenarios with fast takeoff.</p></li><li><p>Having a lot of skepticism about the optimization forces pushing financial markets towards informational efficiency.</p></li></ol><p>If you want to refuse to bet that real rates will rise, you should be sure that your beliefs are actually congruent with both of these requirements. Additionally, as we will see, the second suggestion in this section (&#8220;impatient philanthropy&#8221;) is not affected by the possibility of foom scenarios.</p><h2><strong>2. Borrow today, including in order to fund philanthropy (&#8220;impatient philanthropy&#8221;)</strong></h2><p>If prevailing interest rates are lower than your subjective discount rate &#8211; which is the case if you think markets are underestimating prospects for transformative AI &#8211; then simple cost-benefit analysis says you should save less, or even borrow, today.</p><p><strong>An illustrative example.</strong> As an extreme example to illustrate this argument, imagine that you think there is a 50% chance that humanity will be extinct next year, and that otherwise you will have the same income next year as you do this year with certainty. Suppose the market real interest rate is 0%. That means that if you borrow $10 today, then in expectation you only need to pay back $5, since 50% of the time you expect to be dead.</p><p>It is only if the market real rate is 100% &#8211; so that your $10 loan requires paying back $20 next year, or exactly $10 in expectation &#8211; that you are indifferent about borrowing. If the market real rate is less than 100%, then you want to borrow.</p>
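<p>The indifference point drops out of a one-line expected-value calculation (the numbers are the post&#8217;s illustrative ones):</p><pre><code>p_extinct = 0.5   # illustrative: 50% chance of extinction next year
loan = 10.0

# expected repayment on a loan at real rate r: (1 - p) * (1 + r) * loan
for r in (0.0, 1.0):
    expected = (1 - p_extinct) * (1 + r) * loan
    print(f"real rate {r:.0%}: expect to repay ${expected:.2f} on a ${loan:.0f} loan")

# indifference: (1 - p) * (1 + r) = 1, i.e. r = p / (1 - p)
print(f"break-even real rate: {p_extinct / (1 - p_extinct):.0%}")</code></pre>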
<p>More generally, if interest rates are &#8220;too low&#8221; from your perspective, then on the margin this should encourage you to borrow, or at least to save less.</p><p>Note that, unlike the logic above for trading, this logic is not affected by whether or not the market will &#8220;correct&#8221; and real rates will rise before everyone dies.</p><p><strong>Borrowing to fund philanthropy today.</strong> While you may want to borrow today simply to fund wild parties, a natural alternative is: borrow today, locking in &#8220;too low&#8221; interest rates, in order to fund philanthropy today. For example: to fund AI safety work.</p><p>We can call this strategy &#8220;impatient philanthropy&#8221;, in analogy to the concept of &#8220;<a href="https://80000hours.org/podcast/episodes/phil-trammell-patient-philanthropy/">patient philanthropy</a>&#8221;.</p><p>This is not a call for philanthropists to radically rethink their cost-benefit analyses. Instead, we merely point out: ensure that your financial planning properly accounts for any difference between <em>your</em> discount rate and <em>the market real rate at which you can borrow</em>. You should not be using the market real rate to do your financial planning. If you have a higher effective discount rate due to your AI timelines, that could imply that you should be borrowing today to fund philanthropic work.</p><p><strong>Relationship to patient philanthropy.</strong> The logic here has a similar flavor to <a href="https://80000hours.org/podcast/episodes/phil-trammell-patient-philanthropy/">Phil Trammell&#8217;s &#8220;patient philanthropy&#8221; argument</a> (<a href="https://globalprioritiesinstitute.org/wp-content/uploads/Trammell-Dynamic-Public-Good-Provision-under-Time-Preference-Heterogeneity.pdf">Trammell 2021</a>) &#8211; but with the sign flipped. Longtermist philanthropists with a zero discount rate, who live in a world with a positive real interest rate, should be willing to save all of their resources for a long time to earn that interest, rather than spending those resources today on philanthropic projects. Short-timeliners have a <em>higher</em> discount rate than the market, and <a href="https://80000hours.org/podcast/episodes/phil-trammell-patient-philanthropy/#what-about-groups-who-have-a-particular-sense-of-urgency-014046">therefore</a> should be <em>im</em>patient philanthropists.</p><p>(The point here is not an exact analog of Trammell 2021, because that paper considers strategic game-theoretic interactions and also takes the real rate as exogenous; here, the considerations are not strategic, and the endogeneity of the real rate is the critical point.)</p><h1><strong>XI. Conclusion: outside views vs. inside views &amp; future work</strong></h1><p>We do not claim to have special technical insight into forecasting the likely timeline for the development of transformative artificial intelligence: we do not present an inside view on AI timelines.</p><p>However, we do think that market efficiency provides a powerful <em>outside view</em> for forecasting AI timelines and for making financial decisions. Based on prevailing real interest rates, the market seems to be strongly rejecting timelines of less than ten years, and does not seem to be placing particularly high odds on the development of transformative AI even 30-50 years from now.</p><p>We argue that market efficiency is a reasonable benchmark, and consequently, this forecast serves as a useful prior for AI timelines. 
If markets are wrong, on the other hand, then there is an enormous amount of money on the table from betting that real interest rates will rise. In either case, this market-based approach offers a useful framework: either for forecasting timelines, or for asset allocation.</p><p><strong>Opportunities for future work.</strong> We could have put 1000 more hours into the empirical side or the model, but, TIABPNAJA (this is a blog post, not a journal article). Future work we would be interested in collaborating on or seeing includes:</p><ol><li><p>More careful empirical analyses of the relationship between real rates and growth. In particular, (1) analysis of data samples with larger variation in growth rates (e.g. the Industrial Revolution, China, or the East Asian Tigers), where a credible measure of <em>real</em> interest rates can be used; and (2) <em>causally identified</em> estimates of the relationship between real rates and growth, rather than correlations. Measuring historical real rates is the key challenge, and the main reason why we have not tried to address these here.</p></li><li><p>Any empirical analysis of how real rates vary with changing existential risk. Measuring changes in existential risk is the key challenge.</p></li><li><p>Alternative quantitative models of the relationship between real interest rates and growth/x-risk, with <a href="https://arxiv.org/abs/2201.10673">alternative preference specifications</a>, incomplete markets, or disaster risk.</p></li><li><p>Tests of market forecasting ability at longer time horizons for any outcome of significance; and comparisons of market efficiency at shorter versus longer time horizons.</p></li><li><p>Creation of sufficiently liquid genuine <em>market</em> instruments for directly measuring outcomes we care about, like long-horizon GDP growth: e.g. GDP swaps, GDP-linked bonds, or binary GDP prediction markets. (We emphasize <em>market</em> instruments to distinguish them from forecasting platforms like Metaculus or play-money sites like Manifold Markets, where the forceful logic of financial market efficiency simply <a href="https://basilhalperin.com/essays/metaculus-monetary-policy.html">does not hold</a>.)</p></li><li><p>An analysis of the most capital-efficient way to bet on short AI timelines and the possible expected returns (&#8220;the greatest trade of all time&#8221;).</p></li><li><p>Analysis of the informational content of infinitely-lived assets: e.g. the discount rates embedded in land prices and rental contracts. There is an existing literature related to this topic: <a href="https://academic.oup.com/qje/article/130/1/1/2337985">[1]</a>, <a href="https://faculty.haas.berkeley.edu/amir/discussion/Discussion_Slides_Very_long_run_risk.pdf">[2]</a>, <a href="https://www.sciencedirect.com/science/article/abs/pii/S0165176522002919?dgcid=raven_sd_via_email">[3]</a>, <a href="https://www.sciencedirect.com/science/article/pii/S0197397507000574?via%3Dihub">[4]</a>, <a href="https://academic.oup.com/ej/article/128/613/1820/5088336">[5]</a>, <a href="https://onlinelibrary.wiley.com/doi/10.1002/jae.2867">[6]</a>, <a href="https://academic.oup.com/rfs/article/34/8/3527/6187965">[7]</a>.</p><ul><li><p>This literature estimates <em>risky, nominal</em> discount rates embedded in rental contracts out as far as 1000 years, and finds surprisingly low estimates &#8211; certainly less than 10%. This is potentially extremely useful information, though this literature is not without caveats. 
Among many other things, we cannot presume informational efficiency in land/rental markets the way we can in financial markets, given severe frictions in those markets (e.g. the inability to short sell).</p></li></ul></li></ol><div><hr></div><p>Thanks especially to <a href="https://www.forourposterity.com/">Leopold Aschenbrenner</a>, <a href="https://thegoodblog.substack.com/">Nathan Barnard</a>, <a href="https://www.linkedin.com/in/jackson-w-barkstrom/">Jackson Barkstrom</a>, <a href="https://joel-becker.com/">Joel Becker</a>, <a href="https://danicaratelli.github.io/">Daniele Caratelli</a>, James Chartouni, <a href="https://www.tamaybesiroglu.com/">Tamay Besiroglu</a>, <a href="https://economics.mit.edu/people/phd-students/joel-flynn">Joel Flynn</a>, <a href="https://jamesrhowevi.me/">James Howe</a>, <a href="https://www.linkedin.com/in/charles-hyland/?originalSubdomain=au">Chris Hyland</a>, <a href="https://stephenmalina.com/">Stephen Malina</a>, <a href="https://twitter.com/ptr_mcl">Peter McLaughlin</a>, <a href="https://www.jacksonmejia.com/">Jackson Mejia</a>, <a href="https://www.lauranicolae.com/">Laura Nicolae</a>, <a href="https://www.linkedin.com/in/sam-henry-lazarus-1a6617150/">Sam Lazarus</a>, <a href="https://lehrer.substack.com/">Elliot Lehrer</a>, <a href="https://economics.mit.edu/people/phd-students/jett-pettus">Jett Pettus</a>, <a href="https://brettongoods.substack.com/">Pradyumna Prasad</a>, <a href="http://www.tejassubramaniam.com/">Tejas Subramaniam</a>, <a href="https://karthiktadepalli.com/">Karthik Tadepalli</a>, <a href="https://philiptrammell.com/">Phil Trammell</a>, and participants at <a href="https://forum.effectivealtruism.org/posts/L3WPuztkSMohBTWqZ/etgp-2022-materials-feedback-and-lessons-for-2023">ETGP 2022</a> for very useful conversations on this topic and/or feedback on drafts.</p><div><hr></div><h2><strong>Appendix 1. Against using stock prices to forecast AI timelines</strong></h2><p><a href="https://basilhalperin.com/essays/stocks-forecasting-timelines.html">Link to separate post</a></p><h2><strong>Appendix 2. Explaining Tyler Cowen&#8217;s Third Law</strong></h2><p><a href="https://basilhalperin.com/essays/cowens-third-law.html">Link to separate post</a></p><h2><strong>Appendix 3. Asset pricing under existential risk: a literature review</strong></h2><p><a href="https://docs.google.com/document/d/1sVkVxhQ179XNieu1zAiwzDsFec5r6UQn0dTKJjgCbOk/edit">Link to Google Doc</a></p><h2><strong>Appendix 4. Supplementary Figures</strong></h2><p><a href="https://docs.google.com/document/d/1CWw4t_SzQKDaEhiXGxZBtZQ7vkLddL2aGGqbqVibI-g/edit?usp=sharing">Link to Google Doc</a></p>]]></content:encoded></item></channel></rss>