<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Performance Optimization on Mia Heidenstedt</title><link>https://heidenstedt.org/tags/performance-optimization/</link><description>Recent
content
in Performance Optimization on Mia Heidenstedt</description><generator>
Hugo</generator><language>en</language><lastBuildDate>Thu, 16 Apr 2026 07:56:47 +0000</lastBuildDate><atom:link href="https://heidenstedt.org/tags/performance-optimization/index.xml" rel="self" type="application/rss+xml"/><item><title>Hyper Text Compression: Shrinking Wikipedia to 10.7% of its Size</title><link>https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/</link><pubDate>Mon, 11 Aug 2025 12:23:19 +0000</pubDate><guid>https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/</guid><description><![CDATA[<p>
      <em>Best viewed on the <a href="https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/">original page</a>, where extended functionality like the
    footnote helper is available.</em>
    </p><p>This is a super cool leaderboard for lossless text compression via NLP (and yes, that includes AI)! The top solution manages to compress the first GB of the English Wikipedia to a whopping 10.7% of its original size, including the compression program itself!</p>
<p><a href="https://www.mattmahoney.net/dc/text.html">Hyper Text Compression: Shrinking Wikipedia to 10.7% of its Size!</a></p>
<h2 id="automatic-tldr-by-gemini-25-pro"><a href="https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/#automatic-tldr-by-gemini-25-pro">Automatic TLDR by Gemini 2.5 Pro:</a></h2><p>This page describes the <strong>Large Text Compression Benchmark</strong>, an open competition that ranks lossless data compression programs. The primary goal is to encourage research in artificial intelligence (AI) and natural language processing (NLP) by treating text compression as a language modeling problem.</p>
<hr>
<h3 id="benchmark-overview"><a href="https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/#benchmark-overview">Benchmark Overview</a></h3><ul>
<li><strong>Test Data</strong>: The benchmark uses the first $10^9$ bytes (1 GB) of an English Wikipedia XML dump from March 3, 2006, known as <code>enwik9</code>.</li>
<li><strong>Ranking Metric</strong>: Programs are ranked solely by the <strong>total size</strong>, which is the sum of the compressed <code>enwik9</code> file size and the size of the zipped decompressor program. A smaller total size is better.</li>
<li><strong>Secondary Information</strong>: Data such as compression/decompression speed and memory usage are provided for informational purposes but do not influence the rankings.</li>
<li><strong>Goal</strong>: The benchmark&rsquo;s main purpose is not to find the best general-purpose compressor but to push the boundaries of data modeling, a fundamental challenge in both AI and compression.</li>
</ul>
<hr>
<h3 id="key-findings-and-algorithms"><a href="https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/#key-findings-and-algorithms">Key Findings and Algorithms</a></h3><p>The results table shows a wide variety of compression programs, ranked from the best compression ratio to the worst. A clear trend emerges from the top-performing entries:</p>
<ul>
<li><strong>Dominance of AI Models</strong>: The highest-ranking compressors, such as <strong>nncp</strong> and <strong>cmix</strong>, utilize sophisticated AI-based algorithms. These include neural network models like <strong>Transformers (Tr)</strong> and <strong>Long Short-Term Memory (LSTM)</strong>, as well as advanced <strong>Context Mixing (CM)</strong> techniques. These methods excel at modeling the complex patterns in natural language text, resulting in superior compression ratios.</li>
<li><strong>Trade-offs</strong>: There is a significant trade-off between compression ratio, speed, and memory. The top-ranked AI-driven compressors are extremely slow and require vast amounts of memory (often many gigabytes) and, in some cases, specialized hardware like GPUs.</li>
<li><strong>Traditional Algorithms</strong>: More conventional algorithms like <strong>Lempel-Ziv (LZ)</strong>, <strong>Burrows-Wheeler Transform (BWT)</strong>, and <strong>Prediction by Partial Match (PPM)</strong> are found further down the list. While they are generally much faster and use less memory, they cannot achieve the same level of compression as the leading AI models on this specific text-based task.</li>
</ul>
<hr>
<h3 id="hutter-prize"><a href="https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/#hutter-prize">Hutter Prize</a></h3><p>The benchmark is closely related to the <strong>Hutter Prize</strong>, which offers prize money for open-source compression improvements on the same Wikipedia data: originally <code>enwik8</code> (the first $10^8$ bytes) and, since 2020, <code>enwik9</code>. The prize imposes specific hardware and time constraints, encouraging practical advancements in the field.</p>
]]></description><content:encoded><![CDATA[<p>
      <em>Best viewed on the <a href="https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/">original page</a>, where extended functionality like the
    footnote helper is available.</em>
    </p><p>This is a super cool leaderboard for lossless text compression via NLP (and yes, that includes AI)! The top solution manages to compress the first GB of the English Wikipedia to a whopping 10.7% of its original size, including the compression program itself!</p>
<p><a href="https://www.mattmahoney.net/dc/text.html">Hyper Text Compression: Shrinking Wikipedia to 10.7% of its Size!</a></p>
<h2 id="automatic-tldr-by-gemini-25-pro"><a href="https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/#automatic-tldr-by-gemini-25-pro">Automatic TLDR by Gemini 2.5 Pro:</a></h2><p>This page describes the <strong>Large Text Compression Benchmark</strong>, an open competition that ranks lossless data compression programs. The primary goal is to encourage research in artificial intelligence (AI) and natural language processing (NLP) by treating text compression as a language modeling problem.</p>
<hr>
<h3 id="benchmark-overview"><a href="https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/#benchmark-overview">Benchmark Overview</a></h3><ul>
<li><strong>Test Data</strong>: The benchmark uses the first $10^9$ bytes (1 GB) of an English Wikipedia XML dump from March 3, 2006, known as <code>enwik9</code>.</li>
<li><strong>Ranking Metric</strong>: Programs are ranked solely by the <strong>total size</strong>, which is the sum of the compressed <code>enwik9</code> file size and the size of the zipped decompressor program. A smaller total size is better.</li>
<li><strong>Secondary Information</strong>: Data such as compression/decompression speed and memory usage are provided for informational purposes but do not influence the rankings.</li>
<li><strong>Goal</strong>: The benchmark&rsquo;s main purpose is not to find the best general-purpose compressor but to push the boundaries of data modeling, a fundamental challenge in both AI and compression.</li>
</ul>
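<p>The ranking metric above is a simple sum, which a few lines of Go can make concrete. This is only a sketch: the function names <code>TotalSize</code> and <code>PercentOfOriginal</code> are made up for illustration and are not part of the benchmark.</p>

```go
package main

// enwik9Size is the size of the benchmark input in bytes (10^9).
const enwik9Size = 1_000_000_000

// TotalSize returns the ranking metric: the compressed enwik9 size plus
// the size of the zipped decompressor, both in bytes. Smaller is better.
func TotalSize(compressedBytes, zippedDecompressorBytes int64) int64 {
	return compressedBytes + zippedDecompressorBytes
}

// PercentOfOriginal expresses a total size as a percentage of enwik9.
func PercentOfOriginal(totalBytes int64) float64 {
	return 100 * float64(totalBytes) / float64(enwik9Size)
}
```

<p>With hypothetical inputs of 106,800,000 bytes of compressed data and a 200,000-byte zipped decompressor, <code>TotalSize</code> returns 107,000,000 bytes, which <code>PercentOfOriginal</code> reports as roughly 10.7%. These two input numbers are invented to match the headline figure and are not taken from the leaderboard.</p>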
<hr>
<h3 id="key-findings-and-algorithms"><a href="https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/#key-findings-and-algorithms">Key Findings and Algorithms</a></h3><p>The results table shows a wide variety of compression programs, ranked from the best compression ratio to the worst. A clear trend emerges from the top-performing entries:</p>
<ul>
<li><strong>Dominance of AI Models</strong>: The highest-ranking compressors, such as <strong>nncp</strong> and <strong>cmix</strong>, utilize sophisticated AI-based algorithms. These include neural network models like <strong>Transformers (Tr)</strong> and <strong>Long Short-Term Memory (LSTM)</strong>, as well as advanced <strong>Context Mixing (CM)</strong> techniques. These methods excel at modeling the complex patterns in natural language text, resulting in superior compression ratios.</li>
<li><strong>Trade-offs</strong>: There is a significant trade-off between compression ratio, speed, and memory. The top-ranked AI-driven compressors are extremely slow and require vast amounts of memory (often many gigabytes) and, in some cases, specialized hardware like GPUs.</li>
<li><strong>Traditional Algorithms</strong>: More conventional algorithms like <strong>Lempel-Ziv (LZ)</strong>, <strong>Burrows-Wheeler Transform (BWT)</strong>, and <strong>Prediction by Partial Match (PPM)</strong> are found further down the list. While they are generally much faster and use less memory, they cannot achieve the same level of compression as the leading AI models on this specific text-based task.</li>
</ul>
<hr>
<h3 id="hutter-prize"><a href="https://heidenstedt.org/links/ai-powered-text-compression-shrinking-wikipedia-to-107-of-its-size/#hutter-prize">Hutter Prize</a></h3><p>The benchmark is closely related to the <strong>Hutter Prize</strong>, which offers prize money for open-source compression improvements on the same Wikipedia data: originally <code>enwik8</code> (the first $10^8$ bytes) and, since 2020, <code>enwik9</code>. The prize imposes specific hardware and time constraints, encouraging practical advancements in the field.</p>
]]></content:encoded></item><item><title>Releasing: GoQueueBench</title><link>https://heidenstedt.org/posts/2025/releasing-goqueuebench/</link><pubDate>Tue, 25 Mar 2025 15:44:38 +0000</pubDate><guid>https://heidenstedt.org/posts/2025/releasing-goqueuebench/</guid><description><![CDATA[<p>
      <em>Best viewed on the <a href="https://heidenstedt.org/posts/2025/releasing-goqueuebench/">original page</a>, where extended functionality like the
    footnote helper is available.</em>
    </p><p>While working on <a href="https://github.com/i5heu/ouroboros-db">OuroborosDB</a>, I noticed that I needed a very fast queue because of a rather unusual architectural design decision.<br>
I am trying to build the network module in such a way that I can test its behavior completely deterministically while &ldquo;simulating&rdquo; entire clusters in a single process.</p>
<p>So I built a test prototype of my global network queue with Go&rsquo;s channels and noticed that it was a major performance bottleneck. After writing two different ring buffer queue implementations, it became clear that some queues behave completely differently under different congestion levels and core counts, some so unpredictably that I did not want to use them in my project.</p>
<p>This prompted me to take a relatively large chunk out of my free time and write a suite that benchmarks the different queue implementations I built under different conditions and scores them based on their performance and predictability.</p>
<p>The result of this work is <a href="https://github.com/i5heu/GoQueueBench">GoQueueBench</a>.</p>
<p>These are the results of the benchmark suite:</p>
<table>
<thead>
<tr>
<th>Implementation</th>
<th>Overall Score</th>
<th>Throughput Light Load</th>
<th>Throughput Heavy Load</th>
<th>Throughput Average</th>
<th>Stability Ratio</th>
<th>Homogeneity Factor</th>
<th>Uncertainty</th>
<th>Total Tests</th>
</tr>
</thead>
<tbody>
<tr>
<td>VortexQueue</td>
<td><strong>11341466</strong></td>
<td>6926449</td>
<td><strong>5502925</strong></td>
<td><strong>8776309</strong></td>
<td><strong>1.15</strong></td>
<td>0.87</td>
<td><strong>0.25</strong></td>
<td>681</td>
</tr>
<tr>
<td>LightningQueue</td>
<td>9631771</td>
<td>6638213</td>
<td>4627690</td>
<td>6036728</td>
<td>0.99</td>
<td><strong>0.95</strong></td>
<td>0.31</td>
<td>681</td>
</tr>
<tr>
<td>FastMPMCQueue</td>
<td>9384067</td>
<td>6870924</td>
<td>4598620</td>
<td>6070151</td>
<td>0.96</td>
<td>0.93</td>
<td>0.28</td>
<td>681</td>
</tr>
<tr>
<td>OptimizedMPMCQueue</td>
<td>9105262</td>
<td>6436385</td>
<td>4379823</td>
<td>5838555</td>
<td>0.97</td>
<td>0.94</td>
<td>0.32</td>
<td>681</td>
</tr>
<tr>
<td>OptimizedMPMCQueueSharded</td>
<td>8130197</td>
<td>6369891</td>
<td>3834140</td>
<td>6781865</td>
<td>0.84</td>
<td>0.88</td>
<td>0.39</td>
<td>681</td>
</tr>
<tr>
<td>MultiHeadQueue</td>
<td>7391203</td>
<td>4363332</td>
<td>3492068</td>
<td>5558849</td>
<td>1.12</td>
<td>0.91</td>
<td>0.36</td>
<td>681</td>
</tr>
<tr>
<td>BasicMPMCQueue</td>
<td>5599252</td>
<td>4370889</td>
<td>2669612</td>
<td>3667715</td>
<td>0.89</td>
<td>0.93</td>
<td>0.30</td>
<td>681</td>
</tr>
<tr>
<td>Golang Buffered Channel</td>
<td>5312485</td>
<td>6667828</td>
<td>2760985</td>
<td>4312720</td>
<td>0.54</td>
<td>0.82</td>
<td>0.66</td>
<td>681</td>
</tr>
<tr>
<td>FastMPMCQueueTicket</td>
<td>3229780</td>
<td><strong>7705164</strong></td>
<td>1203924</td>
<td>5803821</td>
<td>0.21</td>
<td>0.64</td>
<td>1.19</td>
<td>681</td>
</tr>
</tbody>
</table>
<p>Please note that I built the package so that all queues adhere to the same interface and can be swapped out easily.</p>
]]></description><content:encoded><![CDATA[<p>
      <em>Best viewed on the <a href="https://heidenstedt.org/posts/2025/releasing-goqueuebench/">original page</a>, where extended functionality like the
    footnote helper is available.</em>
    </p><p>While working on <a href="https://github.com/i5heu/ouroboros-db">OuroborosDB</a>, I noticed that I needed a very fast queue because of a rather unusual architectural design decision.<br>
I am trying to build the network module in such a way that I can test its behavior completely deterministically while &ldquo;simulating&rdquo; entire clusters in a single process.</p>
<p>So I built a test prototype of my global network queue with Go&rsquo;s channels and noticed that it was a major performance bottleneck. After writing two different ring buffer queue implementations, it became clear that some queues behave completely differently under different congestion levels and core counts, some so unpredictably that I did not want to use them in my project.</p>
<p>This prompted me to take a relatively large chunk out of my free time and write a suite that benchmarks the different queue implementations I built under different conditions and scores them based on their performance and predictability.</p>
<p>The result of this work is <a href="https://github.com/i5heu/GoQueueBench">GoQueueBench</a>.</p>
<p>These are the results of the benchmark suite:</p>
<table>
<thead>
<tr>
<th>Implementation</th>
<th>Overall Score</th>
<th>Throughput Light Load</th>
<th>Throughput Heavy Load</th>
<th>Throughput Average</th>
<th>Stability Ratio</th>
<th>Homogeneity Factor</th>
<th>Uncertainty</th>
<th>Total Tests</th>
</tr>
</thead>
<tbody>
<tr>
<td>VortexQueue</td>
<td><strong>11341466</strong></td>
<td>6926449</td>
<td><strong>5502925</strong></td>
<td><strong>8776309</strong></td>
<td><strong>1.15</strong></td>
<td>0.87</td>
<td><strong>0.25</strong></td>
<td>681</td>
</tr>
<tr>
<td>LightningQueue</td>
<td>9631771</td>
<td>6638213</td>
<td>4627690</td>
<td>6036728</td>
<td>0.99</td>
<td><strong>0.95</strong></td>
<td>0.31</td>
<td>681</td>
</tr>
<tr>
<td>FastMPMCQueue</td>
<td>9384067</td>
<td>6870924</td>
<td>4598620</td>
<td>6070151</td>
<td>0.96</td>
<td>0.93</td>
<td>0.28</td>
<td>681</td>
</tr>
<tr>
<td>OptimizedMPMCQueue</td>
<td>9105262</td>
<td>6436385</td>
<td>4379823</td>
<td>5838555</td>
<td>0.97</td>
<td>0.94</td>
<td>0.32</td>
<td>681</td>
</tr>
<tr>
<td>OptimizedMPMCQueueSharded</td>
<td>8130197</td>
<td>6369891</td>
<td>3834140</td>
<td>6781865</td>
<td>0.84</td>
<td>0.88</td>
<td>0.39</td>
<td>681</td>
</tr>
<tr>
<td>MultiHeadQueue</td>
<td>7391203</td>
<td>4363332</td>
<td>3492068</td>
<td>5558849</td>
<td>1.12</td>
<td>0.91</td>
<td>0.36</td>
<td>681</td>
</tr>
<tr>
<td>BasicMPMCQueue</td>
<td>5599252</td>
<td>4370889</td>
<td>2669612</td>
<td>3667715</td>
<td>0.89</td>
<td>0.93</td>
<td>0.30</td>
<td>681</td>
</tr>
<tr>
<td>Golang Buffered Channel</td>
<td>5312485</td>
<td>6667828</td>
<td>2760985</td>
<td>4312720</td>
<td>0.54</td>
<td>0.82</td>
<td>0.66</td>
<td>681</td>
</tr>
<tr>
<td>FastMPMCQueueTicket</td>
<td>3229780</td>
<td><strong>7705164</strong></td>
<td>1203924</td>
<td>5803821</td>
<td>0.21</td>
<td>0.64</td>
<td>1.19</td>
<td>681</td>
</tr>
</tbody>
</table>
<p>Please note that I built the package so that all queues adhere to the same interface and can be swapped out easily.</p>
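<p>To illustrate what a shared, swappable queue interface can look like in Go, here is a minimal sketch. The <code>Queue</code> interface, its method names, and both implementations are hypothetical and written for this example only; GoQueueBench's actual interface and queues may differ. The ring buffer here is protected by a mutex for simplicity and is not one of the benchmarked lock-free designs.</p>

```go
package main

import "sync"

// Queue is a hypothetical common interface for bounded queues.
type Queue[T any] interface {
	TryEnqueue(v T) bool   // reports false if the queue is full
	TryDequeue() (T, bool) // reports false if the queue is empty
}

// ChanQueue adapts a buffered channel to the Queue interface.
type ChanQueue[T any] struct{ ch chan T }

func NewChanQueue[T any](capacity int) *ChanQueue[T] {
	return &ChanQueue[T]{ch: make(chan T, capacity)}
}

func (q *ChanQueue[T]) TryEnqueue(v T) bool {
	select {
	case q.ch <- v:
		return true
	default: // buffer is full, do not block
		return false
	}
}

func (q *ChanQueue[T]) TryDequeue() (T, bool) {
	select {
	case v := <-q.ch:
		return v, true
	default: // buffer is empty, do not block
		var zero T
		return zero, false
	}
}

// RingQueue is a mutex-protected ring buffer with fixed capacity.
type RingQueue[T any] struct {
	mu    sync.Mutex
	buf   []T
	head  int // index of the oldest element
	count int // number of stored elements
}

func NewRingQueue[T any](capacity int) *RingQueue[T] {
	return &RingQueue[T]{buf: make([]T, capacity)}
}

func (q *RingQueue[T]) TryEnqueue(v T) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.count == len(q.buf) {
		return false
	}
	// The write position wraps around the end of the slice.
	q.buf[(q.head+q.count)%len(q.buf)] = v
	q.count++
	return true
}

func (q *RingQueue[T]) TryDequeue() (T, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	var zero T
	if q.count == 0 {
		return zero, false
	}
	v := q.buf[q.head]
	q.buf[q.head] = zero // clear the slot so the value can be collected
	q.head = (q.head + 1) % len(q.buf)
	q.count--
	return v, true
}
```

<p>Because both types satisfy <code>Queue[T]</code>, benchmark or application code can hold a <code>Queue[int]</code> and receive either <code>NewChanQueue[int](1024)</code> or <code>NewRingQueue[int](1024)</code> without any other change, which is the kind of swap the note above describes.</p>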
]]></content:encoded></item></channel></rss>