<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>OuroborosDB on Mia Heidenstedt</title><link>https://heidenstedt.org/tags/ouroborosdb/</link><description>Recent
content
in OuroborosDB on Mia Heidenstedt</description><generator>
Hugo</generator><language>en</language><lastBuildDate>Sun, 06 Jul 2025 15:49:51 +0000</lastBuildDate><atom:link href="https://heidenstedt.org/tags/ouroborosdb/index.xml" rel="self" type="application/rss+xml"/><item><title>Releasing: GoQueueBench</title><link>https://heidenstedt.org/posts/2025/releasing-goqueuebench/</link><pubDate>Tue, 25 Mar 2025 15:44:38 +0000</pubDate><guid>https://heidenstedt.org/posts/2025/releasing-goqueuebench/</guid><description><![CDATA[<p>
      <em>Best viewed on the <a href="https://heidenstedt.org/posts/2025/releasing-goqueuebench/">original page</a>, where extended functionality like the
    footnote helper is available.</em>
    </p><p>While working on <a href="https://github.com/i5heu/ouroboros-db">OuroborosDB</a>, I noticed that I needed a very fast queue for a rather unique architectural design decision.<br>
I am trying to build the network module in such a way that I can test its behavior completely deterministically while &ldquo;simulating&rdquo; entire clusters in a single process.</p>
<p>So I built a test prototype of my global network queue with Go&rsquo;s channels and noticed that it was a major performance bottleneck. After writing two different ring buffer queue implementations, it became clear that some queues behave completely differently under different congestion levels and core counts, some so unpredictably that I just did not want to use them in my project.</p>
<p>This prompted me to take a relatively large chunk out of my free time and write a suite that benchmarks the different queue implementations I built under different conditions and scores them based on their performance and predictability.</p>
<p>The result of this work is <a href="https://github.com/i5heu/GoQueueBench">GoQueueBench</a>.</p>
<p>These are the results of the benchmark suite:</p>
<table>
<thead>
<tr>
<th>Implementation</th>
<th>Overall Score</th>
<th>Throughput Light Load</th>
<th>Throughput Heavy Load</th>
<th>Throughput Average</th>
<th>Stability Ratio</th>
<th>Homogeneity Factor</th>
<th>Uncertainty</th>
<th>Total Tests</th>
</tr>
</thead>
<tbody>
<tr>
<td>VortexQueue</td>
<td><strong>11341466</strong></td>
<td>6926449</td>
<td><strong>5502925</strong></td>
<td><strong>8776309</strong></td>
<td><strong>1.15</strong></td>
<td>0.87</td>
<td><strong>0.25</strong></td>
<td>681</td>
</tr>
<tr>
<td>LightningQueue</td>
<td>9631771</td>
<td>6638213</td>
<td>4627690</td>
<td>6036728</td>
<td>0.99</td>
<td><strong>0.95</strong></td>
<td>0.31</td>
<td>681</td>
</tr>
<tr>
<td>FastMPMCQueue</td>
<td>9384067</td>
<td>6870924</td>
<td>4598620</td>
<td>6070151</td>
<td>0.96</td>
<td>0.93</td>
<td>0.28</td>
<td>681</td>
</tr>
<tr>
<td>OptimizedMPMCQueue</td>
<td>9105262</td>
<td>6436385</td>
<td>4379823</td>
<td>5838555</td>
<td>0.97</td>
<td>0.94</td>
<td>0.32</td>
<td>681</td>
</tr>
<tr>
<td>OptimizedMPMCQueueSharded</td>
<td>8130197</td>
<td>6369891</td>
<td>3834140</td>
<td>6781865</td>
<td>0.84</td>
<td>0.88</td>
<td>0.39</td>
<td>681</td>
</tr>
<tr>
<td>MultiHeadQueue</td>
<td>7391203</td>
<td>4363332</td>
<td>3492068</td>
<td>5558849</td>
<td>1.12</td>
<td>0.91</td>
<td>0.36</td>
<td>681</td>
</tr>
<tr>
<td>BasicMPMCQueue</td>
<td>5599252</td>
<td>4370889</td>
<td>2669612</td>
<td>3667715</td>
<td>0.89</td>
<td>0.93</td>
<td>0.30</td>
<td>681</td>
</tr>
<tr>
<td>Golang Buffered Channel</td>
<td>5312485</td>
<td>6667828</td>
<td>2760985</td>
<td>4312720</td>
<td>0.54</td>
<td>0.82</td>
<td>0.66</td>
<td>681</td>
</tr>
<tr>
<td>FastMPMCQueueTicket</td>
<td>3229780</td>
<td><strong>7705164</strong></td>
<td>1203924</td>
<td>5803821</td>
<td>0.21</td>
<td>0.64</td>
<td>1.19</td>
<td>681</td>
</tr>
</tbody>
</table>
<p>Please note that I built the package so that all queues adhere to the same interface and can be swapped out easily.</p>
]]></description><content:encoded><![CDATA[<p>
      <em>Best viewed on the <a href="https://heidenstedt.org/posts/2025/releasing-goqueuebench/">original page</a>, where extended functionality like the
    footnote helper is available.</em>
    </p><p>While working on <a href="https://github.com/i5heu/ouroboros-db">OuroborosDB</a>, I noticed that I needed a very fast queue for a rather unique architectural design decision.<br>
I am trying to build the network module in such a way that I can test its behavior completely deterministically while &ldquo;simulating&rdquo; entire clusters in a single process.</p>
<p>So I built a test prototype of my global network queue with Go&rsquo;s channels and noticed that it was a major performance bottleneck. After writing two different ring buffer queue implementations, it became clear that some queues behave completely differently under different congestion levels and core counts, some so unpredictably that I just did not want to use them in my project.</p>
<p>This prompted me to take a relatively large chunk out of my free time and write a suite that benchmarks the different queue implementations I built under different conditions and scores them based on their performance and predictability.</p>
<p>The result of this work is <a href="https://github.com/i5heu/GoQueueBench">GoQueueBench</a>.</p>
<p>These are the results of the benchmark suite:</p>
<table>
<thead>
<tr>
<th>Implementation</th>
<th>Overall Score</th>
<th>Throughput Light Load</th>
<th>Throughput Heavy Load</th>
<th>Throughput Average</th>
<th>Stability Ratio</th>
<th>Homogeneity Factor</th>
<th>Uncertainty</th>
<th>Total Tests</th>
</tr>
</thead>
<tbody>
<tr>
<td>VortexQueue</td>
<td><strong>11341466</strong></td>
<td>6926449</td>
<td><strong>5502925</strong></td>
<td><strong>8776309</strong></td>
<td><strong>1.15</strong></td>
<td>0.87</td>
<td><strong>0.25</strong></td>
<td>681</td>
</tr>
<tr>
<td>LightningQueue</td>
<td>9631771</td>
<td>6638213</td>
<td>4627690</td>
<td>6036728</td>
<td>0.99</td>
<td><strong>0.95</strong></td>
<td>0.31</td>
<td>681</td>
</tr>
<tr>
<td>FastMPMCQueue</td>
<td>9384067</td>
<td>6870924</td>
<td>4598620</td>
<td>6070151</td>
<td>0.96</td>
<td>0.93</td>
<td>0.28</td>
<td>681</td>
</tr>
<tr>
<td>OptimizedMPMCQueue</td>
<td>9105262</td>
<td>6436385</td>
<td>4379823</td>
<td>5838555</td>
<td>0.97</td>
<td>0.94</td>
<td>0.32</td>
<td>681</td>
</tr>
<tr>
<td>OptimizedMPMCQueueSharded</td>
<td>8130197</td>
<td>6369891</td>
<td>3834140</td>
<td>6781865</td>
<td>0.84</td>
<td>0.88</td>
<td>0.39</td>
<td>681</td>
</tr>
<tr>
<td>MultiHeadQueue</td>
<td>7391203</td>
<td>4363332</td>
<td>3492068</td>
<td>5558849</td>
<td>1.12</td>
<td>0.91</td>
<td>0.36</td>
<td>681</td>
</tr>
<tr>
<td>BasicMPMCQueue</td>
<td>5599252</td>
<td>4370889</td>
<td>2669612</td>
<td>3667715</td>
<td>0.89</td>
<td>0.93</td>
<td>0.30</td>
<td>681</td>
</tr>
<tr>
<td>Golang Buffered Channel</td>
<td>5312485</td>
<td>6667828</td>
<td>2760985</td>
<td>4312720</td>
<td>0.54</td>
<td>0.82</td>
<td>0.66</td>
<td>681</td>
</tr>
<tr>
<td>FastMPMCQueueTicket</td>
<td>3229780</td>
<td><strong>7705164</strong></td>
<td>1203924</td>
<td>5803821</td>
<td>0.21</td>
<td>0.64</td>
<td>1.19</td>
<td>681</td>
</tr>
</tbody>
</table>
<p>Please note that I built the package so that all queues adhere to the same interface and can be swapped out easily.</p>
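<p>To illustrate what swapping queues behind one interface can look like, here is a minimal sketch. The <code>Queue</code> interface, its method names, and the two implementations <code>ChanQueue</code> and <code>MutexRing</code> are hypothetical and written only for this example; they are not copied from GoQueueBench, whose actual interface may look different.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// Queue is a hypothetical common interface for this sketch only.
type Queue[T any] interface {
	// TryEnqueue adds v and reports false if the queue is full.
	TryEnqueue(v T) bool
	// TryDequeue removes the oldest element and reports false if the queue is empty.
	TryDequeue() (T, bool)
}

// ChanQueue wraps a buffered Go channel.
type ChanQueue[T any] struct{ ch chan T }

func NewChanQueue[T any](capacity int) *ChanQueue[T] {
	return &ChanQueue[T]{ch: make(chan T, capacity)}
}

func (q *ChanQueue[T]) TryEnqueue(v T) bool {
	select {
	case q.ch <- v:
		return true
	default: // buffer is full
		return false
	}
}

func (q *ChanQueue[T]) TryDequeue() (T, bool) {
	select {
	case v := <-q.ch:
		return v, true
	default: // buffer is empty
		var zero T
		return zero, false
	}
}

// MutexRing is a simple ring buffer guarded by a mutex.
type MutexRing[T any] struct {
	mu    sync.Mutex
	buf   []T
	head  int // index of the oldest element
	count int // number of stored elements
}

func NewMutexRing[T any](capacity int) *MutexRing[T] {
	return &MutexRing[T]{buf: make([]T, capacity)}
}

func (q *MutexRing[T]) TryEnqueue(v T) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.count == len(q.buf) {
		return false
	}
	q.buf[(q.head+q.count)%len(q.buf)] = v
	q.count++
	return true
}

func (q *MutexRing[T]) TryDequeue() (T, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	var zero T
	if q.count == 0 {
		return zero, false
	}
	v := q.buf[q.head]
	q.buf[q.head] = zero
	q.head = (q.head + 1) % len(q.buf)
	q.count--
	return v, true
}

// roundTrip tries to enqueue 0..n-1 and then drains the queue.
// It only depends on the Queue interface, not on a concrete type.
func roundTrip(q Queue[int], n int) []int {
	for i := 0; i < n; i++ {
		q.TryEnqueue(i)
	}
	var out []int
	for {
		v, ok := q.TryDequeue()
		if !ok {
			return out
		}
		out = append(out, v)
	}
}

func main() {
	for _, q := range []Queue[int]{NewChanQueue[int](4), NewMutexRing[int](4)} {
		fmt.Println(roundTrip(q, 6))
	}
}
```

<p>Because <code>roundTrip</code> only knows the interface, a benchmark written in this style can run unchanged against every implementation.</p>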
]]></content:encoded></item><item><title>Release: OuroborosDB Data Storage Calculator</title><link>https://heidenstedt.org/posts/2024/release-ouroborosdb-data-storage-calculator/</link><pubDate>Fri, 01 Nov 2024 11:17:10 +0200</pubDate><guid>https://heidenstedt.org/posts/2024/release-ouroborosdb-data-storage-calculator/</guid><description><![CDATA[<p>
      <em>Best viewed on the <a href="https://heidenstedt.org/posts/2024/release-ouroborosdb-data-storage-calculator/">original page</a>, where extended functionality like the
    footnote helper is available.</em>
    </p><p>I forgot to mention that I released a new tool for my project <a href="https://github.com/i5heu/ouroboros-db">ouroboros-db</a>, so here it is:</p>
<p>The &ldquo;OuroborosDB Overhead Calculator&rdquo; is a little tool that helps you calculate the overhead of erasure coding systems while also considering the overhead of indexes and the blocks themselves. It is kinda fun to play around with, so give it a try!</p>
<p><a href="https://i5heu.github.io/ouroboros-db-overhead-calculator/">https://i5heu.github.io/ouroboros-db-overhead-calculator/</a></p>
<video width="100%" controls>
  <source src="https://heidenstedt.org/posts/2024/release-ouroborosdb-data-storage-calculator/demo.webm" type="video/webm">
  Your browser does not support the video tag.
</video>]]></description><content:encoded><![CDATA[<p>
      <em>Best viewed on the <a href="https://heidenstedt.org/posts/2024/release-ouroborosdb-data-storage-calculator/">original page</a>, where extended functionality like the
    footnote helper is available.</em>
    </p><p>I forgot to mention that I released a new tool for my project <a href="https://github.com/i5heu/ouroboros-db">ouroboros-db</a>, so here it is:</p>
<p>The &ldquo;OuroborosDB Overhead Calculator&rdquo; is a little tool that helps you calculate the overhead of erasure coding systems while also considering the overhead of indexes and the blocks themselves. It is kinda fun to play around with, so give it a try!</p>
<p><a href="https://i5heu.github.io/ouroboros-db-overhead-calculator/">https://i5heu.github.io/ouroboros-db-overhead-calculator/</a></p>
<video width="100%" controls>
  <source src="https://heidenstedt.org/posts/2024/release-ouroborosdb-data-storage-calculator/demo.webm" type="video/webm">
  Your browser does not support the video tag.
</video>]]></content:encoded></item><item><title>Ouroboros DB Dev Journal: Erasure Coding</title><link>https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/</link><pubDate>Tue, 27 Aug 2024 17:55:05 +0200</pubDate><guid>https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/</guid><description><![CDATA[<p>
      <em>Best viewed on the <a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/">original page</a>, where extended functionality like the
    footnote helper is available.</em>
    </p><p>This is a Dev Journal for the Ouroboros DB Project.<br>
I try to write down my thoughts and ideas in a somewhat structured way so that I have a reference for later. I publish them to give this information a chance to help others and to get feedback from the community.<br>
If you have feedback, you can write to me on <a href="https://mastodon.social/@heidenstedt">Mastodon</a>.</p>
<p>Please note that this is only a Dev Journal and not a blog post, so it may be a bit unstructured and less polished than a blog post, including typos and other errors.</p>
<h2 id="tldr"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#tldr">TL;DR:</a></h2><p>Today I worked on refactoring the architecture and on the feasibility of erasure coding for <a href="https://github.com/i5heu/ouroboros-db">Ouroboros DB</a>. I had some concerns about the potential index size and the overhead that would result from it. As it turns out, it is not as bad as I thought, but I need a DHT for the index.</p>
<h2 id="architecture"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#architecture">Architecture</a></h2><p>I think it would be super handy to be able to run parts of data pipelines (e.g. turning files into chunked, compressed, encrypted and then erasure coded blocks) in a distributed manner. For safety, and because I want to be able to run trustless nodes that can&rsquo;t see the raw data, the file chunking, compression and encryption would have to happen on the server the client is currently talking to. It would also be possible to pre-chunk the file on the client and upload it to different nodes, but then the question is whether the chunking algorithm is easy to port to browser JS or WASM.</p>
<h3 id="modules"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#modules">Modules</a></h3><p>The code needs to be refactored to implement <code>erasure coding</code>. Maybe we can even stop storing the <code>chunk</code> itself entirely, since the data is already in the <code>parity block</code>s.</p>
<p>It is probably best to add the <code>erasure coding</code> to the <code>StoreDataPipeline</code> and to add the needed <code>erasure coding</code> metadata to <code>ChunkData</code>.</p>
<p>ASCII Art of the new architecture (click to get the .txt file):</p>
<a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/architecture.txt" target="_blank">
<p><div class="imageLoadingWrap"><img
      loading="lazy"
      src="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_720x0_resize_q85_h2_lanczos_3.webp"
      alt="ASCII Art of the new architecture"
      title="ASCII Art of the new architecture"
      height="871"
      width="3030"
      srcset='/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_100x0_resize_q85_h2_lanczos_3.webp 100w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_200x0_resize_q85_h2_lanczos_3.webp 200w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_300x0_resize_q85_h2_lanczos_3.webp 300w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_400x0_resize_q85_h2_lanczos_3.webp 400w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_500x0_resize_q85_h2_lanczos_3.webp 500w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_600x0_resize_q85_h2_lanczos_3.webp 600w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_700x0_resize_q85_h2_lanczos_3.webp 700w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_800x0_resize_q85_h2_lanczos_3.webp 800w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_900x0_resize_q85_h2_lanczos_3.webp 900w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1000x0_resize_q85_h2_lanczos_3.webp 1000w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1100x0_resize_q85_h2_lanczos_3.webp 1100w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1200x0_resize_q85_h2_lanczos_3.webp 1200w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1300x0_resize_q85_h2_lanczos_3.webp 
1300w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1400x0_resize_q85_h2_lanczos_3.webp 1400w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1500x0_resize_q85_h2_lanczos_3.webp 1500w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1600x0_resize_q85_h2_lanczos_3.webp 1600w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1700x0_resize_q85_h2_lanczos_3.webp 1700w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1800x0_resize_q85_h2_lanczos_3.webp 1800w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1900x0_resize_q85_h2_lanczos_3.webp 1900w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2000x0_resize_q85_h2_lanczos_3.webp 2000w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2100x0_resize_q85_h2_lanczos_3.webp 2100w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2200x0_resize_q85_h2_lanczos_3.webp 2200w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2300x0_resize_q85_h2_lanczos_3.webp 2300w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2400x0_resize_q85_h2_lanczos_3.webp 2400w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2500x0_resize_q85_h2_lanczos_3.webp 2500w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2600x0_resize_q85_h2_lanczos_3.webp 
2600w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2700x0_resize_q85_h2_lanczos_3.webp 2700w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2800x0_resize_q85_h2_lanczos_3.webp 2800w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2900x0_resize_q85_h2_lanczos_3.webp 2900w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_3000x0_resize_q85_h2_lanczos_3.webp 3000w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_3030x0_resize_q100_h2_lanczos_3.webp 3030w'
      sizes="(max-width: 672px) calc(100vw - 32px), (max-width: 736px) 640px, 624px"
    ><div class="imageLoading"></div>
</div></p>
</a>
<h2 id="erasure-coding"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#erasure-coding">Erasure Coding</a></h2><p>For now I think it is okay to make the erasure coding a simple n6 k3, which would result in an average <code>parity block</code> size of 327,680 Bytes or 0.31MB. A chunk that is 1.25MB on average would then take up about 1.875MB. This is a 50% overhead for the erasure coding, which I think is quite okay.</p>
<p>For the goal of storing 100TB, which is about 400M chunks, we would need 2,400,000,000 <code>parity block</code>s, i.e. 2.4 billion.</p>
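<p>The arithmetic behind these numbers can be written down as a small Go sketch. All values are taken from the text above; the constant and function names are made up for this illustration. The sizes line up when 1MB is read as 2^20 bytes: 327,680 Bytes is exactly a quarter of a 1.25MB chunk, so with these figures four of the six blocks are enough to carry the data of one chunk.</p>

```go
package main

import "fmt"

// Figures taken from the text above; the names are hypothetical.
const (
	chunkBytes     int64 = 1_310_720   // average chunk size (1.25MB)
	blockBytes     int64 = 327_680     // average parity block size
	blocksPerChunk int64 = 6           // n6: blocks stored per chunk
	chunkCount     int64 = 400_000_000 // chunk count used for the 100TB goal
)

// storedBytesPerChunk is the space one chunk takes up after erasure coding.
func storedBytesPerChunk() int64 { return blockBytes * blocksPerChunk }

// blocksNeededForChunk is the number of blocks whose combined size equals one chunk.
func blocksNeededForChunk() int64 { return chunkBytes / blockBytes }

// codingOverhead is the extra space relative to the chunk size (0.5 means 50%).
func codingOverhead() float64 {
	return float64(storedBytesPerChunk())/float64(chunkBytes) - 1
}

// totalBlocks is the number of parity blocks for all chunks.
func totalBlocks() int64 { return chunkCount * blocksPerChunk }

func main() {
	fmt.Println("stored bytes per chunk:", storedBytesPerChunk())        // 1966080, i.e. 1.875MB
	fmt.Println("blocks that equal one chunk:", blocksNeededForChunk())  // 4
	fmt.Printf("erasure coding overhead: %.0f%%\n", 100*codingOverhead()) // 50%
	fmt.Println("parity blocks in total:", totalBlocks())                // 2400000000
}
```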
<p>Consider the following metadata overhead for the erasure coding:</p>
<details>
  <summary>Parity Meta Data</summary>
    This is the metadata for a <code>parity block</code>:
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;parityHash&#34;</span>: <span style="color:#e6db74">&#34;aeae379a6e857728e44164267fdb7a0e27b205d757cc19899586c89dbb221930f1813d02ff93a661859bc17065eac4d6edf3c38a034e6283a84754d52917e5b0&#34;</span>,
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;chunkHash&#34;</span>: <span style="color:#e6db74">&#34;aeae379a6e857728e44164267fdb7a0e27b205d757cc19899586c89dbb221930f1813d02ff93a661859bc17065eac4d6edf3c38a034e6283a84754d52917e5b0&#34;</span>,
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;sizeByte&#34;</span>: <span style="color:#ae81ff">433000</span>,
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;lastChecked&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;reblanceLog&#34;</span>: [
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>],
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;userLog&#34;</span>: [
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>],
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;this is for the KV Key&#34;</span>: <span style="color:#e6db74">&#34;aeae379a6e857728e44164267fdb7a0e27b205d757cc19899586c89dbb221930f1813d02ff93a661859bc17065eac4d6edf3c38a034e6283a84754d52917e5b0&#34;</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div></details>
<p>We see that the <code>parity block metadata</code> has under 2373 Bytes of overhead, which is about 0.72% of the <code>parity block</code> size (very good). If one node had to store the entire hash table, we would be looking at a size of 5.18TB, which means that we need a DHT for this to work. The same is true for the Events, which introduce more storage overhead (more in <a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#dht">DHT</a>).</p>
<h3 id="dht"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#dht">DHT</a></h3><p>We need additional DHT Metadata like this:</p>
<details>
  <summary>DHT Meta Data</summary>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;parityHash&#34;</span>: <span style="color:#e6db74">&#34;aeae379a6e857728e44164267fdb7a0e27b205d757cc19899586c89dbb221930f1813d02ff93a661859bc17065eac4d6edf3c38a034e6283a84754d52917e5b0&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;storingNodes&#34;</span>: [
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;node&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;lastValidated&#34;</span>: <span style="color:#ae81ff">1724760126282</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;node&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;lastValidated&#34;</span>: <span style="color:#ae81ff">1724760126282</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;node&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;lastValidated&#34;</span>: <span style="color:#ae81ff">1724760126282</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;node&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;lastValidated&#34;</span>: <span style="color:#ae81ff">1724760126282</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;node&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;lastValidated&#34;</span>: <span style="color:#ae81ff">1724760126282</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>  ]
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div></details>
<p>This example meta data has a size of 674 Bytes, which is about 0.21% of the <code>parity block</code> size. For 2.4 billion <code>parity block</code>s, the <code>DHT meta data</code> would require an additional 1.47TB of storage.</p>
<h3 id="overhead"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#overhead">Overhead</a></h3><p>The <code>DHT meta data</code> in combination with the <code>parity block meta data</code> adds up to a total <code>erasure coding metadata</code> overhead of 0.93% relative to the <code>erasure coding parity block size</code>. At 100TB and 2.4 billion <code>parity block</code>s, the entire metadata overhead would be 6.65TB.</p>
<p>Adding this to the 50% overhead from the <code>erasure coding</code>, we would have a total overhead of 56.65% relative to the raw data size. This assumes an average utilization of each <code>chunk</code> of 1,310,720 Bytes, which has been achieved with 10MB files of random binary data. This would result in a HDD to raw data ratio of 43.35% (which is pretty good for an n6 k3 config with self-healing capabilities) for the following configuration:</p>
<ul>
<li>100TB of raw data</li>
<li>data deduplication via BuzzHash into chunks</li>
<li>chunk compression via Zstd</li>
<li>chunk encryption via AES256</li>
<li>each chunk becomes 6 parity blocks of which 3 can be lost without data loss</li>
</ul>
<h2 id="conclusion"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#conclusion">Conclusion</a></h2><p>I have written a Google Sheets spreadsheet for these calculations, which you can find here: <a href="https://docs.google.com/spreadsheets/d/12Ad4vvA0dLSOffDLz6gMkkIJkYNZE70gY0wI1Qqkg8c/edit?usp=sharing">ouroboros-db Overhead Calculator</a>.</p>
<p>Though this spreadsheet is neat, I discovered that I need some surface plotting to find the best configuration for the erasure coding.<br>
Sadly there is no way to do this in Google Sheets, so I will need to write a small browser app for it&hellip; maybe I will use Svelte for it, I haven&rsquo;t used it in a while now.</p>
<h2 id="further-reading"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#further-reading">Further Reading</a></h2><p>A good overview of what erasure coding is about: <a href="https://transactional.blog/blog/2024-erasure-coding">Erasure Coding for Distributed Systems</a><br>
See also the <a href="https://news.ycombinator.com/item?id=41361281">HN comments</a></p>
]]></description><content:encoded><![CDATA[<p>
      <em>Best viewed on the <a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/">original page</a>, where extended functionality like the
    footnote helper is available.</em>
    </p><p>This is a Dev Journal for the Ouroboros DB Project.<br>
I try to write down my thoughts and ideas in a somewhat structured way to have them as a reference for later, and I publish them to give this information a chance to help others and to get feedback from the community.<br>
If you have feedback, you can write to me on <a href="https://mastodon.social/@heidenstedt">Mastodon</a>.</p>
<p>Please note that this is only a Dev Journal and not a Blog Post, so it may be a bit unstructured and less polished than a Blog Post, including typos and other errors.</p>
<h2 id="tldr"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#tldr">TL;DR:</a></h2><p>Today I worked on refactoring the architecture and on the feasibility of erasure coding for <a href="https://github.com/i5heu/ouroboros-db">Ouroboros DB</a>. I had some concerns regarding the potential index size and the overhead that would result from it. As it turns out, it is not as bad as I thought, but I need a DHT for the index.</p>
<h2 id="architecture"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#architecture">Architecture</a></h2><p>I think it would be super handy to be able to run parts of data pipelines (e.g. turning files into chunked, compressed, encrypted and then erasure-coded blocks) in a distributed manner. For safety, and because I want to be able to run trustless nodes that can&rsquo;t see the raw data, the file chunking, compression and encryption would have to happen on the server the client currently speaks to. It would also be possible to pre-chunk the file on the client and upload it to different nodes, although the question here is whether it is easy to port the chunking algorithm to browser JS or WASM.</p>
<h3 id="modules"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#modules">Modules</a></h3><p>Refactoring of the code is needed to implement <code>erasure coding</code>. Maybe we can even get rid of the <code>chunk</code> entirely as a stored thing, since we already have the data in the <code>parity block</code>s.</p>
<p>It is maybe best to add the <code>erasure coding</code> to the <code>StoreDataPipeline</code> and to add the needed <code>erasure coding</code> metadata to <code>ChunkData</code>.</p>
<p>ASCII Art of the new architecture (click to get the .txt file):</p>
<a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/architecture.txt" target="_blank">
<p><div class="imageLoadingWrap"><img
      loading="lazy"
      src="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_720x0_resize_q85_h2_lanczos_3.webp"
      alt="ASCII Art of the new architecture"
      title="ASCII Art of the new architecture"
      height="871"
      width="3030"
      srcset='/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_100x0_resize_q85_h2_lanczos_3.webp 100w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_200x0_resize_q85_h2_lanczos_3.webp 200w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_300x0_resize_q85_h2_lanczos_3.webp 300w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_400x0_resize_q85_h2_lanczos_3.webp 400w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_500x0_resize_q85_h2_lanczos_3.webp 500w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_600x0_resize_q85_h2_lanczos_3.webp 600w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_700x0_resize_q85_h2_lanczos_3.webp 700w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_800x0_resize_q85_h2_lanczos_3.webp 800w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_900x0_resize_q85_h2_lanczos_3.webp 900w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1000x0_resize_q85_h2_lanczos_3.webp 1000w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1100x0_resize_q85_h2_lanczos_3.webp 1100w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1200x0_resize_q85_h2_lanczos_3.webp 1200w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1300x0_resize_q85_h2_lanczos_3.webp 
1300w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1400x0_resize_q85_h2_lanczos_3.webp 1400w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1500x0_resize_q85_h2_lanczos_3.webp 1500w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1600x0_resize_q85_h2_lanczos_3.webp 1600w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1700x0_resize_q85_h2_lanczos_3.webp 1700w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1800x0_resize_q85_h2_lanczos_3.webp 1800w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_1900x0_resize_q85_h2_lanczos_3.webp 1900w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2000x0_resize_q85_h2_lanczos_3.webp 2000w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2100x0_resize_q85_h2_lanczos_3.webp 2100w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2200x0_resize_q85_h2_lanczos_3.webp 2200w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2300x0_resize_q85_h2_lanczos_3.webp 2300w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2400x0_resize_q85_h2_lanczos_3.webp 2400w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2500x0_resize_q85_h2_lanczos_3.webp 2500w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2600x0_resize_q85_h2_lanczos_3.webp 
2600w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2700x0_resize_q85_h2_lanczos_3.webp 2700w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2800x0_resize_q85_h2_lanczos_3.webp 2800w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_2900x0_resize_q85_h2_lanczos_3.webp 2900w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_3000x0_resize_q85_h2_lanczos_3.webp 3000w,/posts/2024/ouroboros-db-dev-journal-erasure-coding/Architecture_hu8ec97601451ce298d71e413e7198ac49_68065_3030x0_resize_q100_h2_lanczos_3.webp 3030w'
      sizes="(max-width: 672px) calc(100vw - 32px), (max-width: 736px) 640px, 624px"
    ><div class="imageLoading"></div>
</div></p>
</a>
<h2 id="erasure-coding"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#erasure-coding">Erasure Coding</a></h2><p>For now I think it is okay to make the erasure coding a simple n6 k3, which would result in a <code>parity block</code> size of 327,680 Bytes, or about 0.31MB, on average. This would result in about 1.875MB stored per chunk, where a chunk is 1.25MB on average. That is a 50% overhead for the erasure coding, which I think is quite okay.</p>
<p>For the goal of storing 100TB, which is about 400M chunks, we would need 2,400,000,000 <code>parity block</code>s, i.e. 2.4 billion.</p>
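<p>The block and overhead figures above can be re-derived with a few lines of arithmetic. This is only a sketch: the names are made up for illustration, and the 400M chunk count is taken from the text as given rather than derived.</p>

```python
# Figures quoted in the text; MiB is used because 1.25 * 1024 * 1024 = 1,310,720.
MIB = 1024 * 1024
CHUNK_BYTES = 1_310_720        # average chunk size (1.25MB)
BLOCK_BYTES = 327_680          # average parity block size
BLOCKS_PER_CHUNK = 6           # blocks written per chunk
CHUNK_COUNT = 400_000_000      # chunk count assumed in the text for 100TB


def stored_bytes_per_chunk() -> int:
    """Bytes on disk for one chunk after erasure coding."""
    return BLOCK_BYTES * BLOCKS_PER_CHUNK


def erasure_overhead() -> float:
    """Extra storage relative to the chunk size (0.5 means 50%)."""
    return stored_bytes_per_chunk() / CHUNK_BYTES - 1


def total_parity_blocks() -> int:
    """Number of parity blocks for the whole data set."""
    return CHUNK_COUNT * BLOCKS_PER_CHUNK


if __name__ == "__main__":
    print(stored_bytes_per_chunk() / MIB)   # MiB stored per chunk
    print(f"{erasure_overhead():.0%}")      # erasure coding overhead
    print(f"{total_parity_blocks():,}")     # parity blocks in total
```

<p>Six blocks of 327,680 Bytes come to 1,966,080 Bytes, which is the 1.875MB per chunk and the 50% on top of the 1.25MB chunk mentioned above.</p>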
<p>Consider the following metadata overhead for the erasure coding:</p>
<details>
  <summary>Parity Meta Data</summary>
    This is the Meta Data for a <code>parity block</code>:
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;parityHash&#34;</span>: <span style="color:#e6db74">&#34;aeae379a6e857728e44164267fdb7a0e27b205d757cc19899586c89dbb221930f1813d02ff93a661859bc17065eac4d6edf3c38a034e6283a84754d52917e5b0&#34;</span>,
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;chunkHash&#34;</span>: <span style="color:#e6db74">&#34;aeae379a6e857728e44164267fdb7a0e27b205d757cc19899586c89dbb221930f1813d02ff93a661859bc17065eac4d6edf3c38a034e6283a84754d52917e5b0&#34;</span>,
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;sizeByte&#34;</span>: <span style="color:#ae81ff">433000</span>,
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;lastChecked&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;reblanceLog&#34;</span>: [
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>],
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;userLog&#34;</span>: [
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;time&#34;</span>: <span style="color:#ae81ff">1724760126282</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">&#34;from&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>],
</span></span><span style="display:flex;"><span><span style="color:#f92672">&#34;this is for the KV Key&#34;</span>: <span style="color:#e6db74">&#34;aeae379a6e857728e44164267fdb7a0e27b205d757cc19899586c89dbb221930f1813d02ff93a661859bc17065eac4d6edf3c38a034e6283a84754d52917e5b0&#34;</span>
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div></details>
<p>We see that the <code>parity block metadata</code> has under 2373 Bytes of overhead, which is about 0.72% of the <code>parity block</code> size (very good). If one node needed to store the entire hash table, we would be looking at a size of 5.18TB, which means that we need a DHT for this to work. This is also the case for the Events, which introduce more storage overhead (more in <a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#dht">DHT</a>).</p>
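<p>As a rough cross-check of these numbers, the per-block ratio and the total index size can be computed as follows. This is a sketch with hypothetical names. It assumes that "TB" is meant as a binary unit (2 to the power of 40 Bytes), since that is the interpretation under which the 5.18TB figure comes out.</p>

```python
TIB = 2 ** 40                       # assumption: "TB" in the text means 2^40 Bytes
BLOCK_BYTES = 327_680               # average parity block size
PARITY_BLOCKS = 2_400_000_000       # parity blocks needed for 100TB
PARITY_META_BYTES = 2373            # upper bound for one parity metadata record


def meta_ratio(meta_bytes: int, block_bytes: int = BLOCK_BYTES) -> float:
    """Metadata size as a fraction of the parity block size."""
    return meta_bytes / block_bytes


def total_meta_tib(meta_bytes: int, blocks: int = PARITY_BLOCKS) -> float:
    """Total metadata size in TiB if every block carries one record."""
    return meta_bytes * blocks / TIB


if __name__ == "__main__":
    print(f"{meta_ratio(PARITY_META_BYTES):.2%}")       # share of a parity block
    print(f"{total_meta_tib(PARITY_META_BYTES):.2f}")   # TiB for the whole index
```

<p>With decimal units (10 to the power of 12 Bytes per TB) the same record count would come to roughly 5.7TB, so the choice of unit matters when comparing against disk sizes.</p>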
<h3 id="dht"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#dht">DHT</a></h3><p>We need additional DHT Metadata like this:</p>
<details>
  <summary>DHT Meta Data</summary>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-json" data-lang="json"><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;parityHash&#34;</span>: <span style="color:#e6db74">&#34;aeae379a6e857728e44164267fdb7a0e27b205d757cc19899586c89dbb221930f1813d02ff93a661859bc17065eac4d6edf3c38a034e6283a84754d52917e5b0&#34;</span>,
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">&#34;storingNodes&#34;</span>: [
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;node&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;lastValidated&#34;</span>: <span style="color:#ae81ff">1724760126282</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;node&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;lastValidated&#34;</span>: <span style="color:#ae81ff">1724760126282</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;node&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;lastValidated&#34;</span>: <span style="color:#ae81ff">1724760126282</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;node&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;lastValidated&#34;</span>: <span style="color:#ae81ff">1724760126282</span>
</span></span><span style="display:flex;"><span>    },
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;node&#34;</span>: <span style="color:#e6db74">&#34;2deb000b57bfac9d72c14d4ed967b572&#34;</span>,
</span></span><span style="display:flex;"><span>      <span style="color:#f92672">&#34;lastValidated&#34;</span>: <span style="color:#ae81ff">1724760126282</span>
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>  ]
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div></details>
<p>This example meta data has a size of 674 Bytes, which is about 0.21% of the <code>parity block</code> size. For 2.4 billion <code>parity block</code>s, the <code>DHT meta data</code> would require an additional 1.47TB of storage.</p>
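<p>The 674 Bytes can be reproduced by serializing a record shaped like the example. The sketch below uses Python's standard <code>json</code> module. The helper names are hypothetical, and the compact variant is only there to show how much of the size is whitespace.</p>

```python
import json

PARITY_HASH = (
    "aeae379a6e857728e44164267fdb7a0e27b205d757cc19899586c89dbb221930"
    "f1813d02ff93a661859bc17065eac4d6edf3c38a034e6283a84754d52917e5b0"
)
NODE_ID = "2deb000b57bfac9d72c14d4ed967b572"


def dht_record(storing_nodes: int = 5) -> dict:
    """Build a DHT metadata record with the same fields as the example."""
    return {
        "parityHash": PARITY_HASH,
        "storingNodes": [
            {"node": NODE_ID, "lastValidated": 1724760126282}
            for _ in range(storing_nodes)
        ],
    }


def pretty_size(record: dict) -> int:
    """Size in Bytes when pretty-printed with two-space indentation."""
    return len(json.dumps(record, indent=2).encode("utf-8"))


def compact_size(record: dict) -> int:
    """Size in Bytes without any optional whitespace."""
    return len(json.dumps(record, separators=(",", ":")).encode("utf-8"))


if __name__ == "__main__":
    record = dht_record()
    print(pretty_size(record), compact_size(record))
```

<p>Pretty-printed with two-space indentation, the record comes out at the 674 Bytes used above. A compact or binary encoding would be smaller, so the 0.21% is best read as an upper estimate for this record shape.</p>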
<h3 id="overhead"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#overhead">Overhead</a></h3><p>The <code>DHT meta data</code> in combination with the <code>parity block meta data</code> adds up to a total <code>erasure coding metadata</code> overhead of 0.93% relative to the <code>erasure coding parity block size</code>. At 100TB and 2.4 billion <code>parity block</code>s, the entire metadata overhead would be 6.65TB.</p>
<p>Adding this to the 50% overhead from the <code>erasure coding</code>, we would have a total overhead of 56.65% relative to the raw data size. This assumes an average utilization of each <code>chunk</code> of 1,310,720 Bytes, which has been achieved with 10MB files of random binary data. This would result in a HDD to raw data ratio of 43.35% (which is pretty good for an n6 k3 config with self-healing capabilities) for the following configuration:</p>
<ul>
<li>100TB of raw data</li>
<li>data deduplication via BuzzHash into chunks</li>
<li>chunk compression via Zstd</li>
<li>chunk encryption via AES256</li>
<li>each chunk becomes 6 parity blocks of which 3 can be lost without data loss</li>
</ul>
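<p>Putting the pieces together, the totals for this configuration can be sketched like this. The names are hypothetical, and "TB" is again assumed to be a binary unit, as that is what reproduces the 6.65TB.</p>

```python
TIB = 2 ** 40                    # assumption: "TB" in the text means 2^40 Bytes
RAW_DATA_TIB = 100               # raw data to store
BLOCK_BYTES = 327_680            # average parity block size
PARITY_BLOCKS = 2_400_000_000    # parity blocks for the raw data
PARITY_META_BYTES = 2373         # parity block metadata per block
DHT_META_BYTES = 674             # DHT metadata per block
ERASURE_OVERHEAD = 0.50          # overhead of the erasure coding itself


def combined_meta_ratio() -> float:
    """Parity plus DHT metadata as a fraction of one parity block."""
    return (PARITY_META_BYTES + DHT_META_BYTES) / BLOCK_BYTES


def combined_meta_tib() -> float:
    """Total metadata in TiB across all parity blocks."""
    return (PARITY_META_BYTES + DHT_META_BYTES) * PARITY_BLOCKS / TIB


def total_overhead() -> float:
    """Erasure coding overhead plus metadata, relative to the raw data."""
    return ERASURE_OVERHEAD + combined_meta_tib() / RAW_DATA_TIB


if __name__ == "__main__":
    print(f"{combined_meta_ratio():.2%}")   # metadata share per parity block
    print(f"{combined_meta_tib():.2f}")     # metadata in TiB
    print(f"{total_overhead():.2%}")        # total overhead on raw data
```

<p>Changing <code>PARITY_BLOCKS</code> or the per-record sizes shows quickly how sensitive the total is to the block count, which is the kind of exploration the spreadsheet below is meant for.</p>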
<h2 id="conclusion"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#conclusion">Conclusion</a></h2><p>I have written a Google Sheets spreadsheet for these calculations, which you can find here: <a href="https://docs.google.com/spreadsheets/d/12Ad4vvA0dLSOffDLz6gMkkIJkYNZE70gY0wI1Qqkg8c/edit?usp=sharing">ouroboros-db Overhead Calculator</a>.</p>
<p>Though this spreadsheet is neat, I discovered that I need some surface plotting to find the best configuration for the erasure coding.<br>
Sadly there is no way to do this in Google Sheets, so I will need to write a small browser app for it&hellip; maybe I will use Svelte for it, I haven&rsquo;t used it in a while now.</p>
<h2 id="further-reading"><a href="https://heidenstedt.org/posts/2024/ouroboros-db-dev-journal-erasure-coding/#further-reading">Further Reading</a></h2><p>A good overview of what erasure coding is about: <a href="https://transactional.blog/blog/2024-erasure-coding">Erasure Coding for Distributed Systems</a><br>
See also the <a href="https://news.ycombinator.com/item?id=41361281">HN comments</a></p>
]]></content:encoded></item></channel></rss>