• chevron_right

      Beginners Guide for Rename Command in Linux

      Linux TLDR · Friday, 30 December, 2022 - 03:03 edit

    • chevron_right

      How to Change History File Location in Linux

      Linux TLDR · Friday, 30 December, 2022 - 03:02 edit

    In this article, you will learn how to use another file instead of the traditional “~/.bash_history” file to store your command history.

    #linux #linuxfan #linuxuser #systemadministrator #ubuntu #debian #dev #devops #webdevelopment #webdeveloper #programmer #programming #python #java #javascript #c #programmingmemes #linuxmemes #memes #memes #cat #funnycats

    • Nu chevron_right

      A new protocol and tool for PNG file attachments

      pubsub.kikeriki.at / null_program · Friday, 31 December, 2021 - 22:17 · 19 minutes

    <p>When my articles include diagrams to illustrate a concept, such as a <a href="/blog/2020/12/31/">state machine</a>, I will check the <a href="https://graphviz.org/">Graphviz</a>, <a href="https://graphviz.org/">gnuplot</a>, or SVG source into source control alongside the image in case I need to make changes in the future. Sometimes I even make the image itself a link to its source file. I’ve thought it would be convenient if the raster image somehow contained its own source as metadata so that they don’t get separated. I looked around and wasn’t satisfied with the solutions I found, so I wrote one: <strong><a href="https://github.com/skeeto/scratch/tree/master/pngattach">pngattach</a></strong>.</p> <p>My approach introduces a new private chunk type <code class="language-plaintext highlighter-rouge">atCh</code> (“attachment”) which contains a file name, a flag to indicate if the attachment is compressed, and an optionally DEFLATE-compressed blob of file contents. I tried to follow the spirit of PNG chunk formatting, but without the constraints I hoped to avoid. A single PNG can contain multiple attachments, e.g. source file, Makefile, README, license file, etc. The protocol places constraints on the file names to keep it simple and to avoid shenanigans: no control bytes (anything below ASCII space), no directories, and cannot start with a period (no special hidden files). If that’s too constraining, you could attach a ZIP or TAR.</p> <h3 id="png-chunk-format">PNG chunk format</h3> <p>PNG files begin with a fixed 8-byte header followed by of a series of chunks. Each chunk has an 8-byte header and 4-byte footer. The chunk header is a 32-bit big endian chunk length (not counting header or footer) and a 4-byte tag identifying its type. The length allows implementations to skip chunks it doesn’t recognize.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LLLL TTTT ...chunk... CCCC </code></pre></div></div> <p>The footer is a big endian CRC-32 checksum of the 4-byte type tag and the chunk body itself.</p> <p>Chunk tags are interpreted as 4 ASCII characters, where the capitalization of each letter encodes 4 additional boolean flags. The flags in my tag, <code class="language-plaintext highlighter-rouge">atCh</code>, indicate it’s a non-critical private chunk which doesn’t depend on the image data.</p> <p>PNG always ends with a zero-length <code class="language-plaintext highlighter-rouge">IEND</code> chunk, which works out to a kind of 12-byte constant footer.</p> <h3 id="existing-chunk-types">Existing chunk types</h3> <p>The PNG standard currently defines three kinds of chunks for storing text metadata: <code class="language-plaintext highlighter-rouge">tEXt</code>, <code class="language-plaintext highlighter-rouge">iTXt</code>, <code class="language-plaintext highlighter-rouge">zTXt</code>. The first is limited to Latin-1 with LF newlines, and so cannot store UTF-8 source text. The latter two were introduced in the PNG 1.2 specification (November 1999), and allow (only) UTF-8 content with LF newlines. All three have a 1 to 79-byte Latin-1 “key” field, and the latter two some additional fields describing the language of the text.</p> <p>The key field is null-terminated, making it 80 bytes maximum when treated as a null-terminated string. I believe this constraint exists to aid implementations, which can rely on this hard upper limit for the key lengths they’re expected to handle. Otherwise a key could have been up to 4GiB in length.</p> <p>I had considered using part of the key as a file name, prefixed with a custom namespace (ex. <code class="language-plaintext highlighter-rouge">attachment:FILENAME</code>) to distinguish it from other text chunks. However, I didn’t like the constraints this placed on the file name, plus I wanted to support arbitrary file content, not limited to a particular subformat.</p> <p>As prior art, there’s a draw.io/diagrams.net format which embeds a source string without file name. The source string is encoded in base64 (i.e. unconstrained by PNG), wrapped in XML, then incorrectly encoded as an <code class="language-plaintext highlighter-rouge">iTXt</code> chunk. The XML alone was enough to keep me away from using this format.</p> <h3 id="pngattach-details">pngattach details</h3> <p>In my attachment protocol, the file name is an arbitrary length, null-terminated byte string (preferably UTF-8), much like a key field, with the previously-mentioned anti-shenanigans restrictions. The file name is followed by a byte, 0 or 1, indicating if the content is compressed using PNG’s officially-supported compression format. The rest is the arbitrary content bytes, which presumably the recipient will know how to use.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LLLL atCh example.txt 0 F ...contents... CCCC </code></pre></div></div> <p>I expect any experienced programmer could write a basic attachment extractor in their language of choice inside of 30 or so minutes. Hooking up a DEFLATE library for decompression would be the most difficult part.</p> <p>Since it supports multiple attachments and behaves like an archive format, my tool supports flags much like <code class="language-plaintext highlighter-rouge">tar</code>: <code class="language-plaintext highlighter-rouge">-c</code> to create attachments (default and implicit), <code class="language-plaintext highlighter-rouge">-t</code> to list attachments, and <code class="language-plaintext highlighter-rouge">-x</code> to extract attachments. PNG data is always passed on standard input and standard output.</p> <p>For example, to render a Graphviz diagram and attach the source all at once:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ dot -Tpng graph.dot | pngattach graph.dot &gt;graph.png </code></pre></div></div> <p>Later on someone might extract it and tweak it, like so (<code class="language-plaintext highlighter-rouge">-v</code> verbose, lists files as they’re extracted, like <code class="language-plaintext highlighter-rouge">tar</code>):</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ pngattach -xv &lt;graph.png graph.dot $ vi graph.dot $ dot -Tpng graph.dot &gt;graph.png </code></pre></div></div> <p>Like <code class="language-plaintext highlighter-rouge">tar</code>, it can also write attachments to standard output with <code class="language-plaintext highlighter-rouge">-O</code>. For example, to re-render the image as an SVG:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ pngattach -xO &lt;graph.png | dot -Tsvg &gt;graph.svg </code></pre></div></div> <p>Strictly processing standard input to output, rather than taking the input as an argument, is something I’ve been trying lately. I’m pretty happy with my <a href="/blog/2020/08/01/">command line design</a> for <code class="language-plaintext highlighter-rouge">pngattach</code>. The real test will happen in the future, when I’ve forgotten the details and have to figure it out again from my own documentation.</p> <p>Curiously, lots of common software refuses to handle PNGs containing large chunks, and so your PNG may not display if you attach a file even as small as a few MiB. A defense against denial of service?</p> <h3 id="example-png">Example PNG</h3> <p>I haven’t gone back and embedded attachments in any older articles, but I may do so in future articles. If you wanted to try it out for yourself, either with my tool or writing your own for fun, this PNG contains a compressed attachment:</p> <p><img src="/img/atch-test.png" alt="" /></p> <p>I produced it like so (with the help of <a href="https://www.imagemagick.org/">ImageMagick</a>):</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ echo P3 1 1 1 0 1 0 | convert ppm:- resize 200 png:- | pngattach message.txt &gt;atch-test.png </code></pre></div></div> <h3 id="error-handling-addendum">Error handling (addendum)</h3> <p>Another technique I’ve been trying is Go-style error value returns in C programs, where the errors-as-values are <code class="language-plaintext highlighter-rouge">const char *</code> pointers to static string buffers. The contents contain an error message to be displayed to the user, and errors may be wrapped in more context (what file, what operation, etc.) as the stack unwinds. A null pointer means no error, i.e. nil. I’ve used this extensively in <code class="language-plaintext highlighter-rouge">pngattach</code>. Examples of the style:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kt">int</span> <span class="o">*</span><span class="n">p</span><span class="p">;</span> <span class="k">if</span> <span class="p">(</span><span class="n">nelem</span> <span class="o">&gt;</span> <span class="p">(</span><span class="kt">size_t</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="o">/</span><span class="k">sizeof</span><span class="p">(</span><span class="o">*</span><span class="n">p</span><span class="p">))</span> <span class="p">{</span> <span class="k">return</span> <span class="s">"out of memory"</span><span class="p">;</span> <span class="c1">// overflow</span> <span class="p">}</span> <span class="n">p</span> <span class="o">=</span> <span class="n">malloc</span><span class="p">(</span><span class="n">nelem</span><span class="o">*</span><span class="nf">sizeof</span><span class="p">(</span><span class="o">*</span><span class="n">p</span><span class="p">));</span> <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">p</span><span class="p">)</span> <span class="p">{</span> <span class="k">return</span> <span class="s">"out of memory"</span><span class="p">;</span> <span class="p">}</span> <span class="c1">// ...</span> <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">fwrite</span><span class="p">(</span><span class="n">buf</span><span class="p">,</span> <span class="n">len</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">stdout</span><span class="p">))</span> <span class="p">{</span> <span class="n">free</span><span class="p">(</span><span class="n">p</span><span class="p">);</span> <span class="k">return</span> <span class="s">"write error"</span><span class="p">;</span> <span class="p">}</span> </code></pre></div></div> <p>An <code class="language-plaintext highlighter-rouge">errwrap()</code> function builds a new error string in a static buffer. This simple solution wouldn’t work in a multi-threaded program, but that’s not the case here. Mine toggles between two static buffers so that it can wrap recursively.</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span> <span class="nf">errwrap</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">pre</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">post</span><span class="p">)</span> <span class="p">{</span> <span class="k">static</span> <span class="kt">char</span> <span class="n">errtmp</span><span class="p">[</span><span class="mi">2</span><span class="p">][</span><span class="mi">256</span><span class="p">],</span> <span class="n">i</span><span class="p">;</span> <span class="kt">int</span> <span class="n">n</span> <span class="o">=</span> <span class="n">i</span> <span class="o">=</span> <span class="o">!</span><span class="n">i</span><span class="p">;</span> <span class="c1">// toggle between two static buffers</span> <span class="n">snprintf</span><span class="p">(</span><span class="n">errtmp</span><span class="p">[</span><span class="n">n</span><span class="p">],</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">errtmp</span><span class="p">[</span><span class="n">n</span><span class="p">]),</span> <span class="s">"%s: %s"</span><span class="p">,</span> <span class="n">pre</span><span class="p">,</span> <span class="n">post</span><span class="p">);</span> <span class="k">return</span> <span class="n">errtmp</span><span class="p">[</span><span class="n">n</span><span class="p">];</span> <span class="p">}</span> </code></pre></div></div> <p>Then I can do stuff like:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kt">FILE</span> <span class="o">*</span><span class="n">f</span> <span class="o">=</span> <span class="n">fopen</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="s">"wb"</span><span class="p">);</span> <span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">f</span><span class="p">)</span> <span class="p">{</span> <span class="k">return</span> <span class="n">errwrap</span><span class="p">(</span><span class="s">"failed to open file"</span><span class="p">,</span> <span class="n">path</span><span class="p">);</span> <span class="p">}</span> </code></pre></div></div> <p>And that can keep being wrapped on the way up:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">err</span> <span class="o">=</span> <span class="n">png_write</span><span class="p">(</span><span class="n">path</span><span class="p">);</span> <span class="k">if</span> <span class="p">(</span><span class="n">err</span><span class="p">)</span> <span class="p">{</span> <span class="k">return</span> <span class="n">errwrap</span><span class="p">(</span><span class="s">"writing PNG"</span><span class="p">,</span> <span class="n">err</span><span class="p">);</span> <span class="p">}</span> </code></pre></div></div> <p>So that ultimately the user sees something like:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pngattach: writing PNG: failed to open file: example.png </code></pre></div></div> <p>That’s always printed by a single error printout block at the top level, where all errors are ultimately routed.</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span> <span class="o">**</span><span class="n">argv</span><span class="p">)</span> <span class="p">{</span> <span class="c1">// ...</span> <span class="n">err</span> <span class="o">=</span> <span class="n">run</span><span class="p">(</span><span class="n">options</span><span class="p">);</span> <span class="k">if</span> <span class="p">(</span><span class="n">err</span><span class="p">)</span> <span class="p">{</span> <span class="n">fprintf</span><span class="p">(</span><span class="n">stderr</span><span class="p">,</span> <span class="s">"pngattach: %s</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">err</span><span class="p">);</span> <span class="k">return</span> <span class="mi">1</span><span class="p">;</span> <span class="p">}</span> <span class="k">return</span> <span class="mi">0</span><span class="p">;</span> <span class="p">}</span> </code></pre></div></div>
    • Nu chevron_right

      Some sanity for C and C++ development on Windows

      pubsub.kikeriki.at / null_program · Thursday, 30 December, 2021 - 23:25 · 15 minutes

    <p>A hard reality of C and C++ software development on Windows is that there has never been a good, native C or C++ standard library implementation for the platform. A standard library should abstract over the underlying host facilities in order to ease portable software development. On Windows, C and C++ is so poorly hooked up to operating system interfaces that most portable or mostly-portable software — programs which work perfectly elsewhere — are subtly broken on Windows, particularly outside of the English-speaking world. The reasons are almost certainly political, originally motivated by vendor lock-in, than technical, which adds insult to injury. This article is about what’s wrong, how it’s wrong, and some easy techniques to deal with it in portable software.</p> <p>There are <a href="/blog/2016/06/13/">multiple C implementations</a>, so how could they all be bad, even the <a href="/blog/2018/04/13/">early ones</a>? Microsoft’s C runtime has defined how the standard library should work on the platform, and everyone else followed along for the sake of compatibility. I’m excluding <a href="https://www.cygwin.com/">Cygwin</a> and its major fork, <a href="https://www.msys2.org/">MSYS2</a>, despite not inheriting any of these flaws. They change so much that they’re effectively whole new platforms, not truly “native” to Windows.</p> <p>In practice, C++ standard libraries are implemented on top of a C standard library, which is why C++ shares the same problems. CPython dodges these issues: Though written in C, on Windows it bypasses the broken C standard library and directly calls the proprietary interfaces. Other language implementations, such “gc” Go, simply aren’t built on C at all, and instead do things correctly in the first place — the behaviors the C runtimes should have had all along.</p> <p>If you’re just working on one large project, bypassing the C runtime isn’t such a big deal, and you’re likely already doing so to access important platform functionality. You don’t really even need a C runtime. However, if you write many small programs, <a href="https://github.com/skeeto/scratch">as I do</a>, writing the same special Windows support for each one ends up being most of the work, and honestly makes properly supporting Windows not worth the trouble. I end up just accepting the broken defaults most of the time.</p> <p>Before diving into the details, if you’re looking for a quick-and-easy solution for the Mingw-w64 toolchain, <a href="/blog/2020/05/15/">including w64devkit</a>, which magically makes your C and C++ console programs behave well on Windows, I’ve put together a “library” named <strong><a href="https://github.com/skeeto/scratch/tree/master/libwinsane">libwinsane</a></strong>. It solves all problems discussed in this article, except for one. No source changes required, simply link it into your program.</p> <h3 id="what-exactly-is-broken">What exactly is broken?</h3> <p>The Windows API comes in two flavors: narrow with an “A” (“ANSI”) suffix, and wide (Unicode, UTF-16) with a “W” suffix. The former is the legacy API, where an active <em>code page</em> maps 256 bytes onto (up to) 256 specific characters. On typical machines configured for European languages, this means <a href="https://en.wikipedia.org/wiki/Windows-1252">code page 1252</a>. <a href="http://simonsapin.github.io/wtf-8/">Roughly speaking</a>, Windows internally uses UTF-16, and calls through the narrow interface use the active code page to translate the narrow strings to wide strings. The result is that calls through the narrow API have limited access to the system.</p> <p>The UTF-8 encoding was invented in 1992 and standardized by January 1993. UTF-8 was adopted by the unix world over the following years due to <a href="/blog/2017/10/06/#what-is-utf-8">its backwards-compatibility</a> with its existing interfaces. Programs could read and write Unicode data, access Unicode paths, pass Unicode arguments, and get and set Unicode environment variables without needing to change anything. Today UTF-8 has become the dominant text encoding format in the world, in large part due to the world wide web.</p> <p>In July 1993, Microsoft introduced the wide Windows API with the release of Windows NT 3.1, placing all their bets on UCS-2 (later UTF-16) rather than UTF-8. This turned out to be a mistake, since <a href="http://utf8everywhere.org/">UTF-16 is inferior to UTF-8 in practically every way</a>, though admittedly some problems weren’t so obvious at the time.</p> <p>The major problem: <strong>The C and C++ standard libraries only hook up to the narrow Windows interfaces</strong>. The standard library, and therefore typical portable software on Windows, cannot handle anything but ASCII. The effective result is that these programs:</p> <ul> <li>Cannot accept non-ASCII arguments</li> <li>Cannot get/set non-ASCII environment variables</li> <li>Cannot access non-ASCII paths</li> <li>Cannot read and write non-ASCII on a console</li> </ul> <p>Doing any of these requires calling proprietary functions, treating Windows as a special target. It’s part of what makes correctly porting software to Windows so painful.</p> <p>The sensible solution would have been for the C runtime to speak UTF-8 and connect to the wide API. Alternatively, the narrow API could have been changed over to UTF-8, phasing out the old code page concept. In theory this is what the UTF-8 “code page” is about, though it doesn’t always work. There would have been compatibility problems with abruptly making such a change, but until very recently, <em>this wasn’t even an option</em>. Why couldn’t there be a switch I could flip to get sane behavior that works like every other platform?</p> <h3 id="how-to-mostly-fix-unicode-support">How to mostly fix Unicode support</h3> <p>In 2019, Microsoft introduced a feature to allow programs to <a href="https://docs.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page">request UTF-8 as their active code page on start</a>, along with supporting UTF-8 on more narrow API functions. This is like the magic switch I wanted, except that it involves embedding some ugly XML into your binary in a particular way. At least it’s now an option.</p> <p>For Mingw-w64, that means writing a resource file like so:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;winuser.h&gt; CREATEPROCESS_MANIFEST_RESOURCE_ID RT_MANIFEST "utf8.xml" </code></pre></div></div> <p>Compiling it with <code class="language-plaintext highlighter-rouge">windres</code>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ windres -o manifest.o manifest.rc </code></pre></div></div> <p>Then linking that into your program. Amazingly it mostly works! Programs can access Unicode arguments, Unicode environment variables, and Unicode paths, including with <code class="language-plaintext highlighter-rouge">fopen</code>, just as it’s worked on other platforms for decades. Since the active code page is set at load time, it happens before <code class="language-plaintext highlighter-rouge">argv</code> is constructed (from <code class="language-plaintext highlighter-rouge">GetCommandLineA</code>), which is why that works out.</p> <p>Alternatively you could create a “side-by-side assembly” placing that XML in a file with the same name as your EXE but with <code class="language-plaintext highlighter-rouge">.manifest</code> suffix (after the <code class="language-plaintext highlighter-rouge">.exe</code> suffix), then placing that next to your EXE. Just be mindful that there’s a “side-by-side” cache (WinSxS), and so it might not immediately pick up your changes.</p> <p>What <em>doesn’t</em> work is console input and output since the console is external to the process, and so isn’t covered by the process’s active code page. It must be configured separately using a proprietary call:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">SetConsoleOutputCP</span><span class="p">(</span><span class="n">CP_UTF8</span><span class="p">);</span> </code></pre></div></div> <p>Annoying, but at least it’s not <em>that</em> painful. This only covers output, though, meaning programs can only print UTF-8. Unfortunately <a href="https://github.com/microsoft/terminal/issues/4551#issuecomment-585487802">UTF-8 input still doesn’t work</a>, and setting the input code page doesn’t do anything despite reporting success:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">SetConsoleCP</span><span class="p">(</span><span class="n">CP_UTF8</span><span class="p">);</span> <span class="c1">// doesn't work</span> </code></pre></div></div> <p>If you care about reading interactive Unicode input, you’re <a href="/blog/2020/05/04/">stuck bypassing the C runtime</a> since it’s still broken.</p> <h3 id="text-stream-translation">Text stream translation</h3> <p>Another long-standing issue is that C and C++ on Windows has distinct “text” and “binary” streams, which it inherited from DOS. Mainly this means automatic newline conversion between CRLF and LF. The C standard explicitly allows for this, though unix-like platforms have never actually distinguished between text and binary streams.</p> <p>The standard also specifies that standard input, output, and error are all open as text streams, and there’s no portable method to change the stream mode to binary — a serious deficiency with the standard. On unix-likes this doesn’t matter, but on Windows it means programs can’t read or write binary data on standard streams without calling a non-standard function. It also means reading and writing standard streams is slow, <a href="/blog/2021/12/04/">frequently a bottleneck</a> unless I route around it.</p> <p>Personally, I like <a href="/blog/2020/06/29/">writing binary data to standard output</a>, <a href="/blog/2020/11/24/">including video</a>, and sometimes <a href="/blog/2017/07/02/">binary filters</a> that also read binary input. I do it so often that in probably half my C programs I have this snippet in <code class="language-plaintext highlighter-rouge">main</code> just so they work correctly on Windows:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="cp">#ifdef _WIN32 </span> <span class="kt">int</span> <span class="nf">_setmode</span><span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">);</span> <span class="n">_setmode</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mh">0x8000</span><span class="p">);</span> <span class="n">_setmode</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mh">0x8000</span><span class="p">);</span> <span class="cp">#endif </span></code></pre></div></div> <p>That incantation sets standard input and output in the C runtime to binary mode without the need to include a header, making it compact, simple, and self-contained.</p> <p>This built-in newline translation, along with the Windows standard text editor, Notepad, <a href="https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/">lagging decades behind</a>, meant that many other programs, including Git, grew their own, annoying, newline conversion <a href="https://github.com/skeeto/w64devkit/issues/10">misfeatures</a> that cause <a href="https://github.com/skeeto/binitools/commit/2efd690c3983856c9633b0be66d57483491d1e10">other problems</a>.</p> <h3 id="libwinsane">libwinsane</h3> <p>I introduced libwinsane at the beginning of the article, which fixes all this simply by being linked into a program. It includes the magic XML manifest <code class="language-plaintext highlighter-rouge">.rsrc</code> section, configures the console for UTF-8 output, and sets standard streams to binary before <code class="language-plaintext highlighter-rouge">main</code> (via a GCC constructor). I called it a “library”, but it’s actually a single object file. It can’t be a static library since it must be linked into the program despite not actually being referenced by the program.</p> <p>So normally this program:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include &lt;stdio.h&gt; #include &lt;string.h&gt; </span> <span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span> <span class="o">**</span><span class="n">argv</span><span class="p">)</span> <span class="p">{</span> <span class="kt">char</span> <span class="o">*</span><span class="n">arg</span> <span class="o">=</span> <span class="n">argv</span><span class="p">[</span><span class="n">argc</span><span class="o">-</span><span class="mi">1</span><span class="p">];</span> <span class="kt">size_t</span> <span class="n">len</span> <span class="o">=</span> <span class="n">strlen</span><span class="p">(</span><span class="n">arg</span><span class="p">);</span> <span class="n">printf</span><span class="p">(</span><span class="s">"%zu %s</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">len</span><span class="p">,</span> <span class="n">arg</span><span class="p">);</span> <span class="p">}</span> </code></pre></div></div> <p>Compiled and run:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C:\&gt;cc -o example example.c C:\&gt;example π 1 p </code></pre></div></div> <p>As usual, the Unicode argument is silently mangled into one byte. Linked with libwinsane, it just works like everywhere else:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C:\&gt;gcc -o example example.c libwinsane.o C:\&gt;example π 2 π </code></pre></div></div> <p>If you’re maintaining a substantial program, you probably want to copy and integrate the necessary parts of libwinsane into your project and build, rather than always link against this loose object file. This is more for convenience and for succinctly capturing the concept. You may even want to <a href="https://github.com/skeeto/hastyhex/blob/master/hastyhex.c#L220">enable ANSI escape processing</a> in your version.</p>
    • Nu chevron_right

      Fast CSV processing with SIMD

      pubsub.kikeriki.at / null_program · Saturday, 4 December, 2021 - 01:13 · 26 minutes

    <p><em>This article was discussed <a href="https://news.ycombinator.com/item?id=29439403">on Hacker News</a>.</em></p> <p>I recently learned of <a href="https://github.com/dbro/csvquote">csvquote</a>, a tool that encodes troublesome <a href="https://datatracker.ietf.org/doc/html/rfc4180">CSV</a> characters such that unix tools can correctly process them. It reverses the encoding at the end of the pipeline, recovering the original input. The original implementation handles CSV quotes using the straightforward, naive method. However, there’s a better approach that is not only simpler, but around 3x faster on modern hardware. Even more, there’s yet another approach using SIMD intrinsics, plus some bit twiddling tricks, which increases the processing speed by an order of magnitude. <a href="https://github.com/skeeto/scratch/tree/master/csvquote"><strong>My csvquote implementation</strong></a> includes both approaches.</p> <!--more--> <h3 id="background">Background</h3> <p>Records in CSV data are separated by line feeds, and fields are separated by commas. Fields may be quoted.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>aaa,bbb,ccc xxx,"yyy",zzz </code></pre></div></div> <p>Fields containing a line feed (U+000A), quotation mark (U+0022), or comma (U+002C), must be quoted, otherwise they would be ambiguous with the CSV formatting itself. Quoted quotation marks are turned into a pair of quotes. For example, here are two records with two fields apiece:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"George Herman ""Babe"" Ruth","1919–1921, 1923, 1926" "Frankenstein; or, The Modern Prometheus",Mary Shelley </code></pre></div></div> <p>A CSV-unaware tool splitting on commas and line feeds (ex. <code class="language-plaintext highlighter-rouge">awk</code>) would process these records improperly. So csvquote translates quoted line feeds into record separators (U+001E) and commas into unit separators (U+001F). These control characters rarely appear in normal text data, and can be trivially processed in UTF-8-encoded text without decoding or encoding. The above records become:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"George Herman ""Babe"" Ruth","1919–1921\x1f 1923\x1f 1926" "Frankenstein;\x1eor\x1f The Modern Prometheus",Mary Shelley </code></pre></div></div> <p>I’ve used <code class="language-plaintext highlighter-rouge">\x1e</code> and <code class="language-plaintext highlighter-rouge">\x1f</code> here to illustrate the control characters.</p> <p>The data is exactly the same length since it’s a straight byte-for-byte replacement. Quotes are left entirely untouched. The challenge is parsing the quotes to track whether the two special characters fall inside or outside pairs of quotes.</p> <h3 id="state-machine-improvements">State machine improvements</h3> <p>The original csvquote walks the input a byte at a time and is in one of three states:</p> <ol> <li>Outside quotes (initial state)</li> <li>Inside quotes</li> <li>On a possibly “escaped” quote (the first <code class="language-plaintext highlighter-rouge">"</code> in a <code class="language-plaintext highlighter-rouge">""</code>)</li> </ol> <p>Since <a href="/blog/2020/12/31/">I love state machines so much</a>, here it is translated into a switch-based state machine:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Return the next state given an input character.</span> <span class="kt">int</span> <span class="nf">next</span><span class="p">(</span><span class="kt">int</span> <span class="n">state</span><span class="p">,</span> <span class="kt">int</span> <span class="n">c</span><span class="p">)</span> <span class="p">{</span> <span class="k">switch</span> <span class="p">(</span><span class="n">state</span><span class="p">)</span> <span class="p">{</span> <span class="k">case</span> <span class="mi">1</span><span class="p">:</span> <span class="k">return</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'"'</span> <span class="o">?</span> <span class="mi">2</span> <span class="o">:</span> <span class="mi">1</span><span class="p">;</span> <span class="k">case</span> <span class="mi">2</span><span class="p">:</span> <span class="k">return</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'"'</span> <span class="o">?</span> <span class="mi">3</span> <span class="o">:</span> <span class="mi">2</span><span class="p">;</span> <span class="k">case</span> <span class="mi">3</span><span class="p">:</span> <span class="k">return</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'"'</span> <span class="o">?</span> <span class="mi">2</span> <span class="o">:</span> <span class="mi">1</span><span class="p">;</span> <span class="p">}</span> <span class="p">}</span> </code></pre></div></div> <p><a href="/img/csv/csv3.dot"><img src="/img/csv/csv3.png" alt="" /></a></p> <p>The real program also has more conditions for potentially making a replacement. It’s an awful lot of <a href="/blog/2017/10/06/">performance-killing branching</a>.</p> <p>However, this <a href="https://vimeo.com/644068002">context</a> is about finding “in” and “out” — not validating the CSV — so the “escape” state is unnecessary. I need only match up pairs of quotes. An “escaped” quote can be considered terminating a quoted region and immediately starting a new quoted region. That’s means there’s just the first two states in a trivial arrangement:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">next</span><span class="p">(</span><span class="kt">int</span> <span class="n">state</span><span class="p">,</span> <span class="kt">int</span> <span class="n">c</span><span class="p">)</span> <span class="p">{</span> <span class="k">switch</span> <span class="p">(</span><span class="n">state</span><span class="p">)</span> <span class="p">{</span> <span class="k">case</span> <span class="mi">1</span><span class="p">:</span> <span class="k">return</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'"'</span> <span class="o">?</span> <span class="mi">2</span> <span class="o">:</span> <span class="mi">1</span><span class="p">;</span> <span class="k">case</span> <span class="mi">2</span><span class="p">:</span> <span class="k">return</span> <span class="n">c</span> <span class="o">==</span> <span class="sc">'"'</span> <span class="o">?</span> <span class="mi">1</span> <span class="o">:</span> <span class="mi">2</span><span class="p">;</span> <span class="p">}</span> <span class="p">}</span> </code></pre></div></div> <p><a href="/img/csv/csv2.dot"><img src="/img/csv/csv2.png" alt="" /></a></p> <p>Since the text can be processed as bytes, there are only 256 possible inputs. With 2 states and 256 inputs, this state machine, <em>with</em> replacement machinery, can be implemented with a 512-byte table and <em>no branches</em>. Here’s the table initialization:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">unsigned</span> <span class="kt">char</span> <span class="n">table</span><span class="p">[</span><span class="mi">2</span><span class="p">][</span><span class="mi">256</span><span class="p">];</span> <span class="kt">void</span> <span class="nf">init</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span> <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="mi">256</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span> <span class="n">table</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span> <span class="n">table</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">i</span><span class="p">;</span> <span class="p">}</span> <span class="n">table</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="sc">'\n'</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x1e</span><span class="p">;</span> <span class="n">table</span><span class="p">[</span><span class="mi">1</span><span class="p">][</span><span class="sc">','</span><span class="p">]</span> <span class="o">=</span> <span class="mh">0x1f</span><span class="p">;</span> <span class="p">}</span> </code></pre></div></div> <p>In the first state, characters map onto themselves. In the second state, characters map onto their replacements. This is the <em>entire</em> encoder and decoder:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">encode</span><span class="p">(</span><span class="kt">unsigned</span> <span class="kt">char</span> <span class="o">*</span><span class="n">buf</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">len</span><span class="p">)</span> <span class="p">{</span> <span class="kt">int</span> <span class="n">state</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="k">for</span> <span class="p">(</span><span class="kt">size_t</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">len</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span> <span class="n">state</span> <span class="o">^=</span> <span class="p">(</span><span class="n">buf</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">==</span> <span class="sc">'"'</span><span class="p">);</span> <span class="n">buf</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">table</span><span class="p">[</span><span class="n">state</span><span class="p">][</span><span class="n">buf</span><span class="p">[</span><span class="n">i</span><span class="p">]];</span> <span class="p">}</span> <span class="p">}</span> </code></pre></div></div> <p>Well, strictly speaking, the decoder need not process quotes. By my benchmark (<code class="language-plaintext highlighter-rouge">csvdump</code> in my implementation) this processes at ~1 GiB/s on my laptop — 3x faster than the original. However, there’s still low-hanging fruit to be picked!</p> <h3 id="simd-and-twos-complement">SIMD and two’s complement</h3> <p>Any decent SIMD implementation is going to make use of masking. Find the quotes, compute a mask over quoted regions, compute another mask for replacement matches, combine the masks, then use that mask to blend the input with the replacements. Roughly:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>quotes = find_quoted_regions(input) linefeeds = input == '\n' commas = input == ',' output = blend(input, '\n', quotes &amp; linefeeds) output = blend(output, ',', quotes &amp; commas) </code></pre></div></div> <p>The hard part is computing the quote mask, and also somehow handle quoted regions straddling SIMD chunks (not pictured), <em>and</em> do all that without resorting to slow byte-at-time operations. Fortunately there are some bitwise tricks that can resolve each issue.</p> <p>Imagine I load 32 bytes into a SIMD register (e.g. AVX2), and I compute a 32-bit mask where each bit corresponds to one byte. If that byte contains a quote, the corresponding bit is set.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"George Herman ""Babe"" Ruth","1 10000000000000011000011000001010 </code></pre></div></div> <p>That last/lowest 1 corresponds to the beginning of a quoted region. For my mask, I’d like to set all bits following that bit. I can do this by subtracting 1.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"George Herman ""Babe"" Ruth","1 10000000000000011000011000001001 </code></pre></div></div> <p>Using the <a href="https://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan">Kernighan technique</a> I can also remove this bit from the original input by ANDing them together.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"George Herman ""Babe"" Ruth","1 10000000000000011000011000001000 </code></pre></div></div> <p>Now I’m left with a new bottom bit. If I repeat this, I build up layers of masks, one for each input quote.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>10000000000000011000011000001001 10000000000000011000011000000111 10000000000000011000010111111111 10000000000000011000001111111111 10000000000000010111111111111111 10000000000000001111111111111111 01111111111111111111111111111111 </code></pre></div></div> <p>Remember how I use XOR in the state machine above to toggle between states? If I XOR all these together, I toggle the quotes on and off, building up quoted regions:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"George Herman ""Babe"" Ruth","1 01111111111111100111100111110001 </code></pre></div></div> <p>However, for reasons I’ll explain shortly, it’s critical that the opening quote is included in this mask. If I XOR the pre-subtracted value with the mask when I compute the mask, I can toggle the remaining quotes on and off such that the opening quotes are included. Here’s my function:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">uint32_t</span> <span class="nf">find_quoted_regions</span><span class="p">(</span><span class="kt">uint32_t</span> <span class="n">x</span><span class="p">)</span> <span class="p">{</span> <span class="kt">uint32_t</span> <span class="n">r</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="k">while</span> <span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="p">{</span> <span class="n">r</span> <span class="o">^=</span> <span class="n">x</span><span class="p">;</span> <span class="n">r</span> <span class="o">^=</span> <span class="n">x</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span> <span class="n">x</span> <span class="o">&amp;=</span> <span class="n">x</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span> <span class="p">}</span> <span class="k">return</span> <span class="n">r</span><span class="p">;</span> <span class="p">}</span> </code></pre></div></div> <p>Which gives me exactly what I want:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>"George Herman ""Babe"" Ruth","1 11111111111111101111101111110011 </code></pre></div></div> <p>It’s important that the opening quote is included because it means a region that begins on the last byte will have that last bit set. I can use that last bit to determine if the next chunk begins in a quoted state. If a region begins in a quoted state, I need only NOT the whole result to reverse the quoted regions.</p> <p>How can I “sign extend” a 1 into all bits set, or do nothing for zero? Negate it!</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>uint32_t carry = -(prev &amp; 1); uint32_t quotes = find_quoted_regions(input) ^ carry; // ... prev = quotes; </code></pre></div></div> <p>That takes care of computing quoted regions and chaining them between chunks. The loop will unfortunately cause branch prediction penalties if the input has lots of quotes, but I couldn’t find a way around this.</p> <p>However, I’ve made a serious mistake. I’m using <code class="language-plaintext highlighter-rouge">_mm256_movemask_epi8</code> and it puts the first byte in the lowest bit. Doh! That means it looks like this:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1","htuR ""ebaB"" namreH egroeG" 01010000011000011000000000000001 </code></pre></div></div> <p>There’s no efficient way to flip the bits around, so I just need to find a way to work in the other direction. To flip the bits to the left of a set bit, negate it.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00000000000000000000000010000000 = +0x00000080 11111111111111111111111110000000 = -0x00000080 </code></pre></div></div> <p>Unlike before, this keeps the original bit set, so I need to XOR the original value into the input to flip the quotes. This is as simple as initializing to the input rather than zero. The new loop:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">uint32_t</span> <span class="nf">find_quoted_regions</span><span class="p">(</span><span class="kt">uint32_t</span> <span class="n">x</span><span class="p">)</span> <span class="p">{</span> <span class="kt">uint32_t</span> <span class="n">r</span> <span class="o">=</span> <span class="n">x</span><span class="p">;</span> <span class="k">while</span> <span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="p">{</span> <span class="n">r</span> <span class="o">^=</span> <span class="o">-</span><span class="n">x</span> <span class="o">^</span> <span class="n">x</span><span class="p">;</span> <span class="n">x</span> <span class="o">&amp;=</span> <span class="n">x</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span> <span class="p">}</span> <span class="k">return</span> <span class="n">r</span><span class="p">;</span> <span class="p">}</span> </code></pre></div></div> <p>The result:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1","htuR ""ebaB"" namreH egroeG" 11001111110111110111111111111111 </code></pre></div></div> <p>The carry now depends on the high bit rather than the low bit:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>uint32_t carry = -(prev &gt;&gt; 31); </code></pre></div></div> <h3 id="reversing-movemask">Reversing movemask</h3> <p>The next problem: for reasons I don’t understand, AVX2 does not include the inverse of <code class="language-plaintext highlighter-rouge">_mm256_movemask_epi8</code>. Converting the bit-mask back into a byte-mask requires some clever shuffling. Fortunately <a href="https://stackoverflow.com/questions/21622212/how-to-perform-the-inverse-of-mm256-movemask-epi8-vpmovmskb">I’m not the first to have this problem</a>, and so I didn’t have to figure it out from scratch.</p> <p>First fill the 32-byte register with repeated copies of the 32-bit mask.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>abcdabcdabcdabcdabcdabcdabcdabcd </code></pre></div></div> <p>Shuffle the bytes so that the first 8 register bytes have the same copy of the first bit-mask byte, etc.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>aaaaaaaabbbbbbbbccccccccdddddddd </code></pre></div></div> <p>In byte 0, I care only about bit 0, in byte 1 I care only about the bit 1, … in byte N I care only about bit <code class="language-plaintext highlighter-rouge">N%8</code>. I can pre-compute a mask to isolate each of these bits and produce a proper byte-wise mask from the bit-mask. Fortunately all this isn’t too bad: four instructions instead of the one I had wanted. It looks like a lot of code, but it’s really only a few instructions.</p> <h3 id="results">Results</h3> <p>In my benchmark, which includes randomly occurring quoted fields, the SIMD version processes at ~4 GiB/s — 10x faster than the original. I haven’t profiled, but I expect mispredictions on the bit-mask loop are the main obstacle preventing the hypothetical 32x speedup.</p> <p>My version also optionally rejects inputs containing the two special control characters since the encoding would be irreversible. This is implemented in SIMD when available, and it slows processing by around 10%.</p> <h3 id="followup-pclmulqdq">Followup: PCLMULQDQ</h3> <p>Geoff Langdale and others have <a href="https://lists.sr.ht/~skeeto/public-inbox/%3CCABwTFSrDpNkmJs6TpkAfofcZq6e8YWaJUur20xZBz7mDBnvQ2w%40mail.gmail.com%3E">graciously pointed out PCLMULQDQ</a>, which can <a href="https://wunkolo.github.io/post/2020/05/pclmulqdq-tricks/">compute the quote masks using carryless multiplication</a> (<a href="https://branchfree.org/2019/03/06/code-fragment-finding-quote-pairs-with-carry-less-multiply-pclmulqdq/">also</a>) entirely in SIMD and without a loop. I haven’t yet quite worked out exactly how to apply it, but it should be much faster.</p>
    • Nu chevron_right

      OpenBSD's pledge and unveil from Python

      pubsub.kikeriki.at / null_program · Wednesday, 15 September, 2021 - 02:46 · 23 minutes

    <p><em>This article was discussed <a href="https://news.ycombinator.com/item?id=28535255">on Hacker News</a>.</em></p> <p>Years ago, OpenBSD gained two new security system calls, <a href="https://man.openbsd.org/pledge.2"><code class="language-plaintext highlighter-rouge">pledge(2)</code></a> (originally <a href="https://www.openbsd.org/papers/tame-fsec2015/mgp00001.html"><code class="language-plaintext highlighter-rouge">tame(2)</code></a>) and <a href="https://man.openbsd.org/unveil.2"><code class="language-plaintext highlighter-rouge">unveil</code></a>. In both, an application surrenders capabilities at run-time. The idea is to perform initialization like usual, then drop capabilities before handling untrusted input, limiting unwanted side effects. This feature is applicable even where type safety isn’t an issue, such as Python, where a program might still get tricked into accessing sensitive files or making network connections when it shouldn’t. So how can a Python program access these system calls?</p> <p>As <a href="/blog/2021/06/29/">discussed previously</a>, it’s quite easy to access C APIs from Python through its <a href="https://docs.python.org/3/library/ctypes.html"><code class="language-plaintext highlighter-rouge">ctypes</code></a> package, and this is no exception. In this article I show how to do it. Here’s the full source if you want to dive in: <a href="https://github.com/skeeto/scratch/tree/master/misc/openbsd.py"><strong><code class="language-plaintext highlighter-rouge">openbsd.py</code></strong></a>.</p> <!--more--> <p>I’ve chosen these extra constraints:</p> <ul> <li> <p>As extra safety features, unnecessary for correctness, attempts to call these functions on systems where they don’t exist will silently do nothing, as though they succeeded. They’re provided as a best effort.</p> </li> <li> <p>Systems other than OpenBSD may support these functions, now or in the future, and it would be nice to automatically make use of them when available. This means no checking for OpenBSD specifically but instead <em>feature sniffing</em> for their presence.</p> </li> <li> <p>The interfaces should be Pythonic as though they were implemented in Python itself. Raise exceptions for errors, and accept strings since they’re more convenient than bytes.</p> </li> </ul> <p>For reference, here are the function prototypes:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">pledge</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">promises</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">execpromises</span><span class="p">);</span> <span class="kt">int</span> <span class="nf">unveil</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">path</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">permissions</span><span class="p">);</span> </code></pre></div></div> <p>The <a href="https://flak.tedunangst.com/post/string-interfaces">string-oriented interface of <code class="language-plaintext highlighter-rouge">pledge</code></a> will make this a whole lot easier to implement.</p> <h3 id="finding-the-functions">Finding the functions</h3> <p>The first step is to grab functions through <code class="language-plaintext highlighter-rouge">ctypes</code>. Like a lot of Python documentation, this area is frustratingly imprecise and under-documented. I want to grab a handle to the already-linked libc and search for either function. However, getting that handle is a little different on each platform, and in the process I saw four different exceptions, only one of which is documented.</p> <p>I came up with passing None to <code class="language-plaintext highlighter-rouge">ctypes.CDLL</code>, which ultimately just passes <code class="language-plaintext highlighter-rouge">NULL</code> to <a href="https://man.openbsd.org/dlopen.3"><code class="language-plaintext highlighter-rouge">dlopen(3)</code></a>. That’s really all I wanted. Currently on Windows this is a TypeError. Once the handle is in hand, try to access the <code class="language-plaintext highlighter-rouge">pledge</code> attribute, which will fail with AttributeError if it doesn’t exist. In the event of any exception, just assume the behavior isn’t available. If found, I also define the function prototype for <code class="language-plaintext highlighter-rouge">ctypes</code>.</p> <div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">_pledge</span> <span class="o">=</span> <span class="bp">None</span> <span class="k">try</span><span class="p">:</span> <span class="n">_pledge</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">CDLL</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">use_errno</span><span class="o">=</span><span class="bp">True</span><span class="p">).</span><span class="n">pledge</span> <span class="n">_pledge</span><span class="p">.</span><span class="n">restype</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_int</span> <span class="n">_pledge</span><span class="p">.</span><span class="n">argtypes</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_char_p</span><span class="p">,</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_char_p</span> <span class="k">except</span> <span class="nb">Exception</span><span class="p">:</span> <span class="n">_pledge</span> <span class="o">=</span> <span class="bp">None</span> </code></pre></div></div> <p>Catching a broad Exception isn’t great, but it’s the best we can do since the documentation is incomplete. From this block I’ve seen TypeError, AttributeError, FileNotFoundError, and OSError. I wouldn’t be surprised if there are more possibilities, and I don’t want to risk missing them.</p> <p>Note that I’m catching Exception rather than using a bare <code class="language-plaintext highlighter-rouge">except</code>. My code will not catch KeyboardInterrupt nor SystemExit. This is deliberate, and I never want to catch these.</p> <p>The same story for <code class="language-plaintext highlighter-rouge">unveil</code>:</p> <div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">_unveil</span> <span class="o">=</span> <span class="bp">None</span> <span class="k">try</span><span class="p">:</span> <span class="n">_unveil</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">CDLL</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">use_errno</span><span class="o">=</span><span class="bp">True</span><span class="p">).</span><span class="n">unveil</span> <span class="n">_unveil</span><span class="p">.</span><span class="n">restype</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_int</span> <span class="n">_unveil</span><span class="p">.</span><span class="n">argtypes</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_char_p</span><span class="p">,</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_char_p</span> <span class="k">except</span> <span class="nb">Exception</span><span class="p">:</span> <span class="n">_unveil</span> <span class="o">=</span> <span class="bp">None</span> </code></pre></div></div> <h3 id="pythonic-wrappers">Pythonic wrappers</h3> <p>The next and final step is to wrap the low-level call in an interface that hides their C and <code class="language-plaintext highlighter-rouge">ctypes</code> nature.</p> <p>Python strings must be encoded to bytes before they can be passed to C functions. Rather than make the caller worry about this, we’ll let them pass friendly strings and have the wrapper do the conversion. Either may also be <code class="language-plaintext highlighter-rouge">NULL</code>, so None is allowed.</p> <div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">pledge</span><span class="p">(</span><span class="n">promises</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">],</span> <span class="n">execpromises</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]):</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">_pledge</span><span class="p">:</span> <span class="k">return</span> <span class="c1"># unimplemented </span> <span class="n">r</span> <span class="o">=</span> <span class="n">_pledge</span><span class="p">(</span><span class="bp">None</span> <span class="k">if</span> <span class="n">promises</span> <span class="ow">is</span> <span class="bp">None</span> <span class="k">else</span> <span class="n">promises</span><span class="p">.</span><span class="n">encode</span><span class="p">(),</span> <span class="bp">None</span> <span class="k">if</span> <span class="n">execpromises</span> <span class="ow">is</span> <span class="bp">None</span> <span class="k">else</span> <span class="n">execpromises</span><span class="p">.</span><span class="n">encode</span><span class="p">())</span> <span class="k">if</span> <span class="n">r</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">:</span> <span class="n">errno</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">get_errno</span><span class="p">()</span> <span class="k">raise</span> <span class="nb">OSError</span><span class="p">(</span><span class="n">errno</span><span class="p">,</span> <span class="n">os</span><span class="p">.</span><span class="n">strerror</span><span class="p">(</span><span class="n">errno</span><span class="p">))</span> </code></pre></div></div> <p>As usual, a return of -1 means there was an error, in which case we fetch <code class="language-plaintext highlighter-rouge">errno</code> and raise the appropriate OSError.</p> <p><code class="language-plaintext highlighter-rouge">unveil</code> works a little differently since the first argument is a path. Python functions that accept paths, such as <code class="language-plaintext highlighter-rouge">open</code>, generally accept either strings or bytes. On unix-like systems, <a href="https://simonsapin.github.io/wtf-8/">paths are fundamentally bytestrings</a> and not necessarily Unicode, so it’s necessary to accept bytes. Since strings are nearly always more convenient, they take both. The <code class="language-plaintext highlighter-rouge">unveil</code> wrapper here will do the same. If it’s a string, encode it, otherwise pass it straight through.</p> <div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">unveil</span><span class="p">(</span><span class="n">path</span><span class="p">:</span> <span class="n">Union</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">bytes</span><span class="p">,</span> <span class="bp">None</span><span class="p">],</span> <span class="n">permissions</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]):</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">_unveil</span><span class="p">:</span> <span class="k">return</span> <span class="c1"># unimplemented </span> <span class="n">r</span> <span class="o">=</span> <span class="n">_unveil</span><span class="p">(</span><span class="n">path</span><span class="p">.</span><span class="n">encode</span><span class="p">()</span> <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">path</span><span class="p">,</span> <span class="nb">str</span><span class="p">)</span> <span class="k">else</span> <span class="n">path</span><span class="p">,</span> <span class="bp">None</span> <span class="k">if</span> <span class="n">permissions</span> <span class="ow">is</span> <span class="bp">None</span> <span class="k">else</span> <span class="n">permissions</span><span class="p">.</span><span class="n">encode</span><span class="p">())</span> <span class="k">if</span> <span class="n">r</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">:</span> <span class="n">errno</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">get_errno</span><span class="p">()</span> <span class="k">raise</span> <span class="nb">OSError</span><span class="p">(</span><span class="n">errno</span><span class="p">,</span> <span class="n">os</span><span class="p">.</span><span class="n">strerror</span><span class="p">(</span><span class="n">errno</span><span class="p">))</span> </code></pre></div></div> <p>That’s it!</p> <h3 id="trying-it-out">Trying it out</h3> <p>Let’s start with <code class="language-plaintext highlighter-rouge">unveil</code>. Initially a process has access to the whole file system with the usual restrictions. On the first call to <code class="language-plaintext highlighter-rouge">unveil</code> it’s immediately restricted to some subset of the tree. Each call reveals a little more until a final <code class="language-plaintext highlighter-rouge">NULL</code> which locks it in place for the rest of the process’s existence.</p> <p>Suppose a program has been tricked into accessing your shell history, perhaps by mishandling a path:</p> <div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">hackme</span><span class="p">():</span> <span class="k">try</span><span class="p">:</span> <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">pathlib</span><span class="p">.</span><span class="n">Path</span><span class="p">.</span><span class="n">home</span><span class="p">()</span> <span class="o">/</span> <span class="s">".bash_history"</span><span class="p">):</span> <span class="k">print</span><span class="p">(</span><span class="s">"You've been hacked!"</span><span class="p">)</span> <span class="k">except</span> <span class="nb">FileNotFoundError</span><span class="p">:</span> <span class="k">print</span><span class="p">(</span><span class="s">"Blocked by unveil."</span><span class="p">)</span> <span class="n">hackme</span><span class="p">()</span> </code></pre></div></div> <p>If you’re a Bash user, this prints:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>You've been hacked! </code></pre></div></div> <p>Using our new feature to restrict the program’s access first:</p> <div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># restrict access to static program data </span><span class="n">unveil</span><span class="p">(</span><span class="s">"/usr/share"</span><span class="p">,</span> <span class="s">"r"</span><span class="p">)</span> <span class="n">unveil</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span> <span class="n">hackme</span><span class="p">()</span> </code></pre></div></div> <p>On OpenBSD this now prints:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Blocked by unveil. </code></pre></div></div> <p>Working just as it should!</p> <p>With <code class="language-plaintext highlighter-rouge">pledge</code> we declare what abilities we’d like to keep by supplying a list of promises, <em>pledging</em> to use only those abilities afterward. A common case is the <code class="language-plaintext highlighter-rouge">stdio</code> promise which allows reading and writing of open files, but not <em>opening</em> files. A program might open its log file, then drop the ability to open files while retaining the ability to write to its log.</p> <p>An invalid or unknown promise is an error. Does that work?</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; pledge("doesntexist", None) OSError: [Errno 22] Invalid argument </code></pre></div></div> <p>So far so good. How about the functionality itself?</p> <div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pledge</span><span class="p">(</span><span class="s">"stdio"</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span> <span class="n">hackme</span><span class="p">()</span> </code></pre></div></div> <p>The program is instantly killed when making the disallowed system call:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Abort trap (core dumped) </code></pre></div></div> <p>If you want something a little softer, include the <code class="language-plaintext highlighter-rouge">error</code> promise:</p> <div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pledge</span><span class="p">(</span><span class="s">"stdio error"</span><span class="p">,</span> <span class="bp">None</span><span class="p">)</span> <span class="n">hackme</span><span class="p">()</span> </code></pre></div></div> <p>Instead it’s an exception, which will be a lot easier to debug when it comes to Python, so you probably always want to use it.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>OSError: [Errno 78] Function not implemented </code></pre></div></div> <p>The core dump isn’t going to be much help to a Python program, so you probably always want to use this promise. In general you need to be extra careful about <code class="language-plaintext highlighter-rouge">pledge</code> in complex runtimes like Python’s which may reasonably need to do many arbitrary, undocumented things at any time.</p>
    • Nu chevron_right

      Billions of Code Name Permutations in 32 bits

      pubsub.kikeriki.at / null_program · Tuesday, 14 September, 2021 - 21:06 · 36 minutes

    <p>My friend over at Possibly Wrong <a href="https://possiblywrong.wordpress.com/2021/09/13/code-name-generator/">created a code name generator</a>. By coincidence I happened to be thinking about code names myself while recently replaying <a href="https://en.wikipedia.org/wiki/XCOM:_Enemy_Within"><em>XCOM: Enemy Within</em></a> (2012/2013). The game generates a random code name for each mission, and I wondered how often it repeats. The <a href="https://www.ufopaedia.org/index.php/Mission_Names_(EU2012)">UFOpaedia page on the topic</a> gives the word lists: 53 adjectives and 76 nouns, for a total of 4028 possible code names. A typical game has around 60 missions, and if code names are generated naively on the fly, then per the birthday paradox around half of all games will see a repeated mission code name! Fortunately this is easy to avoid, and the particular configuration here lends itself to an interesting implementation.</p> <p>Mission code names are built using “<em>adjective</em> <em>noun</em>”. Some examples from the game’s word list:</p> <ul> <li>Fading Hammer</li> <li>Fallen Jester</li> <li>Hidden Crown</li> </ul> <p>To generate a code name, we could select a random adjective and a random noun, but as discussed it wouldn’t take long for a collision. The naive approach is to keep a database of previously-generated names, and to consult this database when generating new names. That works, but there’s an even better solution: use a random permutation. Done well, we don’t need to keep track of previous names, and the generator won’t repeat until it’s exhausted all possibilities.</p> <p>Further, the total number of possible code names, 4028, is suspiciously shy of 4,096, a power of two (<code class="language-plaintext highlighter-rouge">2**12</code>). That makes designing and implementing an efficient permutation that much easier.</p> <h3 id="a-linear-congruential-generator">A linear congruential generator</h3> <p>A classic, obvious solution is a <a href="/blog/2019/11/19/">linear congruential generator</a> (LCG). A full-period, 12-bit LCG is nothing more than a permutation of the numbers 0 to 4,095. When generating names, we can skip over the extra 68 values and pretend it’s a permutation of 4,028 elements. An LCG is constructed like so:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>f(n) = (f(n-1)*A + C) % M </code></pre></div></div> <p>Typically the seed is used for <code class="language-plaintext highlighter-rouge">f(0)</code>. M is selected based on the problem space or implementation efficiency, and usually a power of two. In this case it will be 4,096. Then there are some rules for choosing A and C.</p> <p>Simply choosing a random <code class="language-plaintext highlighter-rouge">f(0)</code> per game isn’t great. The code name order will always be the same, and we’re only choosing where in the cycle to start. It would be better to vary the permutation itself, which we can do by also choosing unique A and C constants per game.</p> <p>Choosing C is easy: It must be relatively prime with M, i.e. it must be odd. Since it’s addition modulo M, there’s no reason to choose <code class="language-plaintext highlighter-rouge">C &gt;= M</code> since the results are identical to a smaller C. If we think of C as a 12-bit integer, 1 bit is locked in, and the other 11 bits are free to vary:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>xxxxxxxxxxx1 </code></pre></div></div> <p>Choosing A is more complicated: must be odd, <code class="language-plaintext highlighter-rouge">A-1</code> must be divisible by 4, and <code class="language-plaintext highlighter-rouge">A-1</code> should be divisible by 8 (better results). Again, thinking of this in terms of a 12-bit number, this locks in 3 bits and leaves 9 bits free:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>xxxxxxxxx101 </code></pre></div></div> <p>This ensures all the <em>must</em> and <em>should</em> properties of A.</p> <p>Finally <code class="language-plaintext highlighter-rouge">0 &lt;= f(0) &lt; M</code>. Because of modular arithmetic larger, values are redundant, and all possible values are valid since the LCG, being full-period, will cycle through all of them. This is just choosing the starting point in a particular permutation cycle. As a 12-bit number, all 12 bits are free:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>xxxxxxxxxxxx </code></pre></div></div> <p>That’s <code class="language-plaintext highlighter-rouge">9 + 11 + 12 = 32</code> free bits to fill randomly: again, how incredibly convenient! Every 32-bit integer defines some unique code name permutation… <em>almost</em>. Any 32-bit descriptor where <code class="language-plaintext highlighter-rouge">f(0) &gt;= 4028</code> will collide with at least one other due to skipping, and so around 1.7% of the state space is redundant. A small loss that should shrink with slightly better word list planning. I don’t think anyone will notice.</p> <h3 id="slice-and-dice">Slice and dice</h3> <p><a href="/blog/2020/12/31/">I love compact state machines</a>, and this is an opportunity to put one to good use. My code name generator will be just one function:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">uint32_t</span> <span class="nf">codename</span><span class="p">(</span><span class="kt">uint32_t</span> <span class="n">state</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">buf</span><span class="p">);</span> </code></pre></div></div> <p>This takes one of those 32-bit permutation descriptors, writes the first code name to <code class="language-plaintext highlighter-rouge">buf</code>, and returns a descriptor for another permutation that starts with the next name. All we have to do is keep track of that 32-bit number and we’ll never need to worry about repeating code names until all have been exhausted.</p> <p>First, lets extract A, C, and <code class="language-plaintext highlighter-rouge">f(0)</code>, which I’m calling S. The low bits are A, middle bits are C, and high bits are S. Note the OR with 1 and 5 to lock in the hard-set bits.</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">long</span> <span class="n">a</span> <span class="o">=</span> <span class="p">(</span><span class="n">state</span> <span class="o">&lt;&lt;</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">5</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mh">0xfff</span><span class="p">;</span> <span class="c1">// 9 bits</span> <span class="kt">long</span> <span class="n">c</span> <span class="o">=</span> <span class="p">(</span><span class="n">state</span> <span class="o">&gt;&gt;</span> <span class="mi">8</span> <span class="o">|</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mh">0xfff</span><span class="p">;</span> <span class="c1">// 11 bits</span> <span class="kt">long</span> <span class="n">s</span> <span class="o">=</span> <span class="n">state</span> <span class="o">&gt;&gt;</span> <span class="mi">20</span><span class="p">;</span> <span class="c1">// 12 bits</span> </code></pre></div></div> <p>Next iterate the LCG until we have a number in range:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">do</span> <span class="p">{</span> <span class="n">s</span> <span class="o">=</span> <span class="p">(</span><span class="n">s</span><span class="o">*</span><span class="n">a</span> <span class="o">+</span> <span class="n">c</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mh">0xfff</span><span class="p">;</span> <span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">s</span> <span class="o">&gt;=</span> <span class="mi">4028</span><span class="p">);</span> </code></pre></div></div> <p>Once we have an appropriate LCG state, compute the adjective/noun indexes and build a code name:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="n">s</span> <span class="o">%</span> <span class="mi">53</span><span class="p">;</span> <span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="n">s</span> <span class="o">/</span> <span class="mi">53</span><span class="p">;</span> <span class="n">sprintf</span><span class="p">(</span><span class="n">buf</span><span class="p">,</span> <span class="s">"%s %s"</span><span class="p">,</span> <span class="n">adjvs</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">nouns</span><span class="p">[</span><span class="n">j</span><span class="p">]);</span> </code></pre></div></div> <p>Finally assemble the next 32-bit state. Since A and C don’t change, these are passed through while the old S is masked out and replaced with the new S.</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">return</span> <span class="p">(</span><span class="n">state</span> <span class="o">&amp;</span> <span class="mh">0xfffff</span><span class="p">)</span> <span class="o">|</span> <span class="p">(</span><span class="kt">uint32_t</span><span class="p">)</span><span class="n">s</span><span class="o">&lt;&lt;</span><span class="mi">20</span><span class="p">;</span> </code></pre></div></div> <p>Putting it all together:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">adjvs</span><span class="p">[]</span> <span class="o">=</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">};</span> <span class="k">static</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">nouns</span><span class="p">[]</span> <span class="o">=</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">};</span> <span class="kt">uint32_t</span> <span class="nf">codename</span><span class="p">(</span><span class="kt">uint32_t</span> <span class="n">state</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">buf</span><span class="p">)</span> <span class="p">{</span> <span class="kt">long</span> <span class="n">a</span> <span class="o">=</span> <span class="p">(</span><span class="n">state</span> <span class="o">&lt;&lt;</span> <span class="mi">3</span> <span class="o">|</span> <span class="mi">5</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mh">0xfff</span><span class="p">;</span> <span class="c1">// 9 bits</span> <span class="kt">long</span> <span class="n">c</span> <span class="o">=</span> <span class="p">(</span><span class="n">state</span> <span class="o">&gt;&gt;</span> <span class="mi">8</span> <span class="o">|</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mh">0xfff</span><span class="p">;</span> <span class="c1">// 11 bits</span> <span class="kt">long</span> <span class="n">s</span> <span class="o">=</span> <span class="n">state</span> <span class="o">&gt;&gt;</span> <span class="mi">20</span><span class="p">;</span> <span class="c1">// 12 bits</span> <span class="k">do</span> <span class="p">{</span> <span class="n">s</span> <span class="o">=</span> <span class="p">(</span><span class="n">s</span><span class="o">*</span><span class="n">a</span> <span class="o">+</span> <span class="n">c</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mh">0xfff</span><span class="p">;</span> <span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">s</span> <span class="o">&gt;=</span> <span class="n">COUNTOF</span><span class="p">(</span><span class="n">adjvs</span><span class="p">)</span><span class="o">*</span><span class="n">COUNTOF</span><span class="p">(</span><span class="n">nouns</span><span class="p">));</span> <span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="n">s</span> <span class="o">%</span> <span class="n">COUNTOF</span><span class="p">(</span><span class="n">adjvs</span><span class="p">);</span> <span class="kt">int</span> <span class="n">j</span> <span class="o">=</span> <span class="n">s</span> <span class="o">/</span> <span class="n">COUNTOF</span><span class="p">(</span><span class="n">adjvs</span><span class="p">);</span> <span class="n">sprintf</span><span class="p">(</span><span class="n">buf</span><span class="p">,</span> <span class="s">"%s %s"</span><span class="p">,</span> <span class="n">adjvs</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="n">nouns</span><span class="p">[</span><span class="n">j</span><span class="p">]);</span> <span class="k">return</span> <span class="p">(</span><span class="n">state</span> <span class="o">&amp;</span> <span class="mh">0xfffff</span><span class="p">)</span> <span class="o">|</span> <span class="p">(</span><span class="kt">uint32_t</span><span class="p">)</span><span class="n">s</span><span class="o">&lt;&lt;</span><span class="mi">20</span><span class="p">;</span> <span class="p">}</span> </code></pre></div></div> <p>The caller just needs to generate an initial 32-bit integer. Any 32-bit integer is valid — even zero — so this could just be, say, the unix epoch (<code class="language-plaintext highlighter-rouge">time(2)</code>), but adjacent values will have similar-ish permutations. I intentionally placed S in the high bits, which are least likely to vary, since it only affects where the cycle begins, while A and C have a much more dramatic impact and so are placed at more variable locations.</p> <p>Regardless, it would be better to hash such an input so that adjacent time values map to distant states. It also helps hide poorer (less random) choices for A multipliers. I happen to have <a href="/blog/2018/07/31/">designed some great functions for exactly this purpose</a>. Here’s one of my best:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">uint32_t</span> <span class="nf">hash32</span><span class="p">(</span><span class="kt">uint32_t</span> <span class="n">x</span><span class="p">)</span> <span class="p">{</span> <span class="n">x</span> <span class="o">+=</span> <span class="mh">0x3243f6a8U</span><span class="p">;</span> <span class="n">x</span> <span class="o">^=</span> <span class="n">x</span> <span class="o">&gt;&gt;</span> <span class="mi">15</span><span class="p">;</span> <span class="n">x</span> <span class="o">*=</span> <span class="mh">0xd168aaadU</span><span class="p">;</span> <span class="n">x</span> <span class="o">^=</span> <span class="n">x</span> <span class="o">&gt;&gt;</span> <span class="mi">15</span><span class="p">;</span> <span class="n">x</span> <span class="o">*=</span> <span class="mh">0xaf723597U</span><span class="p">;</span> <span class="n">x</span> <span class="o">^=</span> <span class="n">x</span> <span class="o">&gt;&gt;</span> <span class="mi">15</span><span class="p">;</span> <span class="k">return</span> <span class="n">x</span><span class="p">;</span> <span class="p">}</span> </code></pre></div></div> <p>This would be perfectly reasonable for generating all possible names in a random order:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">uint32_t</span> <span class="n">state</span> <span class="o">=</span> <span class="n">hash32</span><span class="p">(</span><span class="n">time</span><span class="p">(</span><span class="mi">0</span><span class="p">));</span> <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="mi">4028</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span> <span class="kt">char</span> <span class="n">buf</span><span class="p">[</span><span class="mi">32</span><span class="p">];</span> <span class="n">state</span> <span class="o">=</span> <span class="n">codename</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">buf</span><span class="p">);</span> <span class="n">puts</span><span class="p">(</span><span class="n">buf</span><span class="p">);</span> <span class="p">}</span> </code></pre></div></div> <p>To further help cover up poorer A multipliers, it’s better for the word list to be pre-shuffled in its static storage. If that underlying order happens to show through, at least it will be less obvious (i.e. not in alphabetical order). Shuffling the string list in my source is just a few keystrokes in Vim, so this is easy enough.</p> <h3 id="robustness">Robustness</h3> <p>If you’re set on making the <code class="language-plaintext highlighter-rouge">codename</code> function easier to use such that consumers don’t need to think about hashes, you could “encode” and “decode” the descriptor going in an out of the function:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">uint32_t</span> <span class="nf">codename</span><span class="p">(</span><span class="kt">uint32_t</span> <span class="n">state</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">buf</span><span class="p">)</span> <span class="p">{</span> <span class="n">state</span> <span class="o">+=</span> <span class="mh">0x3243f6a8U</span><span class="p">;</span> <span class="n">state</span> <span class="o">^=</span> <span class="n">state</span> <span class="o">&gt;&gt;</span> <span class="mi">17</span><span class="p">;</span> <span class="n">state</span> <span class="o">*=</span> <span class="mh">0x9e485565U</span><span class="p">;</span> <span class="n">state</span> <span class="o">^=</span> <span class="n">state</span> <span class="o">&gt;&gt;</span> <span class="mi">16</span><span class="p">;</span> <span class="n">state</span> <span class="o">*=</span> <span class="mh">0xef1d6b47U</span><span class="p">;</span> <span class="n">state</span> <span class="o">^=</span> <span class="n">state</span> <span class="o">&gt;&gt;</span> <span class="mi">16</span><span class="p">;</span> <span class="c1">// ...</span> <span class="n">state</span> <span class="o">=</span> <span class="p">(</span><span class="n">state</span> <span class="o">&amp;</span> <span class="mh">0xfffff</span><span class="p">)</span> <span class="o">|</span> <span class="p">(</span><span class="kt">uint32_t</span><span class="p">)</span><span class="n">s</span><span class="o">&lt;&lt;</span><span class="mi">20</span><span class="p">;</span> <span class="n">state</span> <span class="o">^=</span> <span class="n">state</span> <span class="o">&gt;&gt;</span> <span class="mi">16</span><span class="p">;</span> <span class="n">state</span> <span class="o">*=</span> <span class="mh">0xeb00ce77U</span><span class="p">;</span> <span class="n">state</span> <span class="o">^=</span> <span class="n">state</span> <span class="o">&gt;&gt;</span> <span class="mi">16</span><span class="p">;</span> <span class="n">state</span> <span class="o">*=</span> <span class="mh">0x88ccd46dU</span><span class="p">;</span> <span class="n">state</span> <span class="o">^=</span> <span class="n">state</span> <span class="o">&gt;&gt;</span> <span class="mi">17</span><span class="p">;</span> <span class="n">state</span> <span class="o">-=</span> <span class="mh">0x3243f6a8U</span><span class="p">;</span> <span class="k">return</span> <span class="n">state</span><span class="p">;</span> <span class="p">}</span> </code></pre></div></div> <p>This permutes the state coming in, and reverses that permutation on the way out (read: inverse hash). This breaks up similar starting points.</p> <h3 id="a-random-access-code-name-permutation">A random-access code name permutation</h3> <p>Of course this isn’t the only way to build a permutation. I recently picked up another trick: <a href="https://andrew-helmer.github.io/permute/">Kensler permutation</a>. The key insight is cycle-walking, allowing for random-access to a permutation of a smaller domain (e.g. 4,028 elements) through permutation of a larger domain (e.g. 4096 elements).</p> <p>Here’s such a code name generator built around a bespoke 12-bit xorshift-multiply permutation. I used 4 “rounds” since xorshift-multiply is less effective the smaller the permutation.</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Generate the nth code name for this seed.</span> <span class="kt">void</span> <span class="nf">codename_n</span><span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="n">buf</span><span class="p">,</span> <span class="kt">uint32_t</span> <span class="n">seed</span><span class="p">,</span> <span class="kt">int</span> <span class="n">n</span><span class="p">)</span> <span class="p">{</span> <span class="kt">uint32_t</span> <span class="n">i</span> <span class="o">=</span> <span class="n">n</span><span class="p">;</span> <span class="k">do</span> <span class="p">{</span> <span class="n">i</span> <span class="o">^=</span> <span class="n">i</span> <span class="o">&gt;&gt;</span> <span class="mi">6</span><span class="p">;</span> <span class="n">i</span> <span class="o">^=</span> <span class="n">seed</span> <span class="o">&gt;&gt;</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">*=</span> <span class="mh">0x325</span><span class="p">;</span> <span class="n">i</span> <span class="o">&amp;=</span> <span class="mh">0xfff</span><span class="p">;</span> <span class="n">i</span> <span class="o">^=</span> <span class="n">i</span> <span class="o">&gt;&gt;</span> <span class="mi">6</span><span class="p">;</span> <span class="n">i</span> <span class="o">^=</span> <span class="n">seed</span> <span class="o">&gt;&gt;</span> <span class="mi">8</span><span class="p">;</span> <span class="n">i</span> <span class="o">*=</span> <span class="mh">0x3f5</span><span class="p">;</span> <span class="n">i</span> <span class="o">&amp;=</span> <span class="mh">0xfff</span><span class="p">;</span> <span class="n">i</span> <span class="o">^=</span> <span class="n">i</span> <span class="o">&gt;&gt;</span> <span class="mi">6</span><span class="p">;</span> <span class="n">i</span> <span class="o">^=</span> <span class="n">seed</span> <span class="o">&gt;&gt;</span> <span class="mi">16</span><span class="p">;</span> <span class="n">i</span> <span class="o">*=</span> <span class="mh">0xa89</span><span class="p">;</span> <span class="n">i</span> <span class="o">&amp;=</span> <span class="mh">0xfff</span><span class="p">;</span> <span class="n">i</span> <span class="o">^=</span> <span class="n">i</span> <span class="o">&gt;&gt;</span> <span class="mi">6</span><span class="p">;</span> <span class="n">i</span> <span class="o">^=</span> <span class="n">seed</span> <span class="o">&gt;&gt;</span> <span class="mi">24</span><span class="p">;</span> <span class="n">i</span> <span class="o">*=</span> <span class="mh">0x85b</span><span class="p">;</span> <span class="n">i</span> <span class="o">&amp;=</span> <span class="mh">0xfff</span><span class="p">;</span> <span class="n">i</span> <span class="o">^=</span> <span class="n">i</span> <span class="o">&gt;&gt;</span> <span class="mi">6</span><span class="p">;</span> <span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">i</span> <span class="o">&gt;=</span> <span class="n">COUNTOF</span><span class="p">(</span><span class="n">adjvs</span><span class="p">)</span><span class="o">*</span><span class="n">COUNTOF</span><span class="p">(</span><span class="n">nouns</span><span class="p">));</span> <span class="kt">int</span> <span class="n">a</span> <span class="o">=</span> <span class="n">i</span> <span class="o">%</span> <span class="n">COUNTOF</span><span class="p">(</span><span class="n">adjvs</span><span class="p">);</span> <span class="kt">int</span> <span class="n">b</span> <span class="o">=</span> <span class="n">i</span> <span class="o">/</span> <span class="n">COUNTOF</span><span class="p">(</span><span class="n">adjvs</span><span class="p">);</span> <span class="n">snprintf</span><span class="p">(</span><span class="n">buf</span><span class="p">,</span> <span class="mi">22</span><span class="p">,</span> <span class="s">"%s %s"</span><span class="p">,</span> <span class="n">adjvs</span><span class="p">[</span><span class="n">a</span><span class="p">],</span> <span class="n">nouns</span><span class="p">[</span><span class="n">b</span><span class="p">]);</span> <span class="p">}</span> </code></pre></div></div> <p>While this is more flexible, avoids poorer permutations, and doesn’t have state space collisions, I still have a soft spot for my LCG-based state machine generator.</p> <h3 id="source-code">Source code</h3> <p>You can find the complete, working source code with both generators here: <a href="https://github.com/skeeto/scratch/tree/master/misc/codename.c"><strong><code class="language-plaintext highlighter-rouge">codename.c</code></strong></a>. I used <a href="https://en.wikipedia.org/wiki/Secret_Service_code_name">real US Secret Service code names</a> for my word list. Some sample outputs:</p> <ul> <li>PLASTIC HUMMINGBIRD</li> <li>BLACK VENUS</li> <li>SILENT SUNBURN</li> <li>BRONZE AUTHOR</li> <li>FADING MARVEL</li> </ul>
    • Nu chevron_right

      Test cross-architecture without leaving home

      pubsub.kikeriki.at / null_program · Saturday, 21 August, 2021 - 23:59 · 15 minutes

    <p>I like to test my software across different environments, on <a href="/blog/2020/05/15/">strange platforms</a>, and with <a href="/blog/2018/04/13/">alternative implementations</a>. Each has its own quirks and oddities that can shake bugs out earlier. C is particularly good at this since it has such a wide selection of compilers and runs on everything. For instance I count at least 7 distinct C compilers in Debian alone. One advantage of <a href="/blog/2017/03/30/">writing portable software</a> is access to a broader testing environment, and it’s one reason I prefer to target standards rather than specific platforms.</p> <p>However, I’ve long struggled with architecture diversity. My work and testing has been almost entirely on x86, with ARM as a distant second (Raspberry Pi and friends). Big endian hosts are particularly rare. However, I recently learned a trick for quickly and conveniently accessing many different architectures without even leaving my laptop: <a href="https://wiki.debian.org/QemuUserEmulation">QEMU User Emulation</a>. Debian and its derivatives support this very well and require almost no setup or configuration.</p> <!--more--> <h3 id="cross-compilation-example">Cross-compilation Example</h3> <p>While there are many options, my main cross-testing architecture has been PowerPC. It’s 32-bit big endian, while I’m generally working on 64-bit little endian, which is exactly the sort of mismatch I’m going for. I use a Debian-supplied cross-compiler and qemu-user tools. The <a href="https://en.wikipedia.org/wiki/Binfmt_misc">binfmt</a> support is especially slick, so that’s how I usually use it.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># apt install gcc-powerpc-linux-gnu qemu-user-binfmt </code></pre></div></div> <p><code class="language-plaintext highlighter-rouge">binfmt_misc</code> is a kernel module that teaches Linux how to recognize arbitrary binary formats. For instance, there’s a Wine binfmt so that Linux programs can transparently <code class="language-plaintext highlighter-rouge">exec(3)</code> Windows <code class="language-plaintext highlighter-rouge">.exe</code> binaries. In the case of QEMU User Mode, binaries for foreign architectures are loaded into a QEMU virtual machine configured in user mode. In user mode there’s no guest operating system, and instead the virtual machine translates guest system calls to the host operating system.</p> <p>The first package gives me <code class="language-plaintext highlighter-rouge">powerpc-linux-gnu-gcc</code>. The prefix is the <a href="https://wiki.debian.org/Multiarch/Tuples">architecture tuple</a> describing the instruction set and system ABI. To try this out, I have a little test program that inspects its execution environment:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include &lt;stdio.h&gt; </span> <span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span> <span class="kt">char</span> <span class="o">*</span><span class="n">w</span> <span class="o">=</span> <span class="s">"?"</span><span class="p">;</span> <span class="k">switch</span> <span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="p">))</span> <span class="p">{</span> <span class="k">case</span> <span class="mi">1</span><span class="p">:</span> <span class="n">w</span> <span class="o">=</span> <span class="s">"8"</span><span class="p">;</span> <span class="k">break</span><span class="p">;</span> <span class="k">case</span> <span class="mi">2</span><span class="p">:</span> <span class="n">w</span> <span class="o">=</span> <span class="s">"16"</span><span class="p">;</span> <span class="k">break</span><span class="p">;</span> <span class="k">case</span> <span class="mi">4</span><span class="p">:</span> <span class="n">w</span> <span class="o">=</span> <span class="s">"32"</span><span class="p">;</span> <span class="k">break</span><span class="p">;</span> <span class="k">case</span> <span class="mi">8</span><span class="p">:</span> <span class="n">w</span> <span class="o">=</span> <span class="s">"64"</span><span class="p">;</span> <span class="k">break</span><span class="p">;</span> <span class="p">}</span> <span class="kt">char</span> <span class="o">*</span><span class="n">b</span> <span class="o">=</span> <span class="s">"?"</span><span class="p">;</span> <span class="k">switch</span> <span class="p">(</span><span class="o">*</span><span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)(</span><span class="kt">int</span> <span class="p">[]){</span><span class="mi">1</span><span class="p">})</span> <span class="p">{</span> <span class="k">case</span> <span class="mi">0</span><span class="p">:</span> <span class="n">b</span> <span class="o">=</span> <span class="s">"big"</span><span class="p">;</span> <span class="k">break</span><span class="p">;</span> <span class="k">case</span> <span class="mi">1</span><span class="p">:</span> <span class="n">b</span> <span class="o">=</span> <span class="s">"little"</span><span class="p">;</span> <span class="k">break</span><span class="p">;</span> <span class="p">}</span> <span class="n">printf</span><span class="p">(</span><span class="s">"%s-bit, %s endian</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">w</span><span class="p">,</span> <span class="n">b</span><span class="p">);</span> <span class="p">}</span> </code></pre></div></div> <p>When I run this natively on x86-64:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ gcc test.c $ ./a.out 64-bit, little endian </code></pre></div></div> <p>Running it on PowerPC via QEMU:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ powerpc-linux-gnu-gcc -static test.c $ ./a.out 32-bit, big endian </code></pre></div></div> <p>Thanks to binfmt, I could execute it as though the PowerPC binary were a native binary. With just a couple of environment variables in the right place, I could pretend I’m developing on PowerPC — aside from emulation performance penalties of course.</p> <p>However, you might have noticed I pulled a sneaky on ya: <code class="language-plaintext highlighter-rouge">-static</code>. So far what I’ve shown only works with static binaries. There’s no dynamic loader available to run dynamically-linked binaries. Fortunately this is easy to fix in two steps. The first step is to install the dynamic linker for PowerPC:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># apt install libc6-powerpc-cross </code></pre></div></div> <p>The second is to tell QEMU where to find it since, unfortunately, it cannot currently do so on its own.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ export QEMU_LD_PREFIX=/usr/powerpc-linux-gnu </code></pre></div></div> <p>Now I can leave out the <code class="language-plaintext highlighter-rouge">-static</code>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ powerpc-linux-gnu-gcc test.c $ ./a.out 32-bit, big endian </code></pre></div></div> <p>A practical example: Remember <a href="https://github.com/skeeto/binitools">binitools</a>? I’m now ready to run its <a href="/blog/2019/01/25/">fuzz-generated test suite</a> on this cross-testing platform.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ git clone https://github.com/skeeto/binitools $ cd binitools/ $ make check CC=powerpc-linux-gnu-gcc ... PASS: 668/668 </code></pre></div></div> <p>Or if I’m going to be running <code class="language-plaintext highlighter-rouge">make</code> often:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ export CC=powerpc-linux-gnu-gcc $ make -e check </code></pre></div></div> <p>Recall: <a href="/blog/2017/08/20/">make’s <code class="language-plaintext highlighter-rouge">-e</code> flag</a> passes the environment through, so I don’t need to pass <code class="language-plaintext highlighter-rouge">CC=...</code> on the command line each time.</p> <p>When setting up a test suite for your own programs, consider how difficult it would be to run the tests under customized circumstances like this. The easier it is to run your tests, the more they’re going to be run. I’ve run into many projects with such overly-complex test builds that even enabling sanitizers in the tests suite was a pain, let alone cross-architecture testing.</p> <p>Dependencies? There might be a way to use <a href="https://wiki.debian.org/Multiarch/HOWTO">Debian’s multiarch support</a> to install these packages, but I haven’t been able to figure it out. You likely need to build dependencies yourself using the cross compiler.</p> <h3 id="testing-with-go">Testing with Go</h3> <p>None of this is limited to C (or even C++). I’ve also successfully used this to test Go libraries and programs cross-architecture. This isn’t nearly as important since it’s harder to write unportable Go than C — e.g. <a href="https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html">dumb pointer tricks</a> are literally labeled “unsafe”. However, Go (gc) trivializes cross-compilation and is statically compiled, so it’s incredibly simple. Once you’ve installed <code class="language-plaintext highlighter-rouge">qemu-user-binfmt</code> it’s entirely transparent:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ GOARCH=mips64 go test </code></pre></div></div> <p>That’s all there is to cross-platform testing. If for some reason binfmt doesn’t work (WSL) or you don’t want to install it, there’s just one extra step (package named <code class="language-plaintext highlighter-rouge">example</code>):</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ GOARCH=mips64 go test -c $ qemu-mips64-static example.test </code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">-c</code> option builds a test binary but doesn’t run it, instead allowing you to choose where and how to run it.</p> <p>It even works <a href="/blog/2021/06/29/">with cgo</a> — if you’re willing to jump through the same hoops as with C of course:</p> <div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">package</span> <span class="n">main</span> <span class="c">// #include &lt;stdint.h&gt;</span> <span class="c">// uint16_t v = 0x1234;</span> <span class="c">// char *hi = (char *)&amp;v + 0;</span> <span class="c">// char *lo = (char *)&amp;v + 1;</span> <span class="k">import</span> <span class="s">"C"</span> <span class="k">import</span> <span class="s">"fmt"</span> <span class="k">func</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Printf</span><span class="p">(</span><span class="s">"%02x %02x</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="o">*</span><span class="n">C</span><span class="o">.</span><span class="n">hi</span><span class="p">,</span> <span class="o">*</span><span class="n">C</span><span class="o">.</span><span class="n">lo</span><span class="p">)</span> <span class="p">}</span> </code></pre></div></div> <p>With <code class="language-plaintext highlighter-rouge">go run</code> on x86-64:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ CGO_ENABLED=1 go run example.go 34 12 </code></pre></div></div> <p>Via QEMU User Mode:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ export CGO_ENABLED=1 $ export GOARCH=mips64 $ export CC=mips64-linux-gnuabi64-gcc $ export QEMU_LD_PREFIX=/usr/mips64-linux-gnuabi64 $ go run example.go 12 34 </code></pre></div></div> <p>I was pleasantly surprised how well this all works.</p> <h3 id="one-dimension">One dimension</h3> <p>Despite the variety, all these architectures are still “running” the same operating system, Linux, and so they only vary on one dimension. For most programs primarily targeting x86-64 Linux, PowerPC Linux is practically the same thing, while x86-64 OpenBSD is foreign territory despite sharing an architecture and ABI (<a href="/blog/2016/11/17/">System V</a>). Testing across operating systems still requires spending the time to install, configure, and maintain these extra hosts. That’s an article for another time.</p>