      Some sanity for C and C++ development on Windows

      pubsub.kikeriki.at / null_program · Thursday, 30 December, 2021 - 23:25 · 15 minutes

    <p>A hard reality of C and C++ software development on Windows is that there has never been a good, native C or C++ standard library implementation for the platform. A standard library should abstract over the underlying host facilities in order to ease portable software development. On Windows, C and C++ are so poorly hooked up to operating system interfaces that most portable or mostly-portable software (programs which work perfectly elsewhere) is subtly broken on Windows, particularly outside of the English-speaking world. The reasons are almost certainly political, originally motivated by vendor lock-in, rather than technical, which adds insult to injury. This article is about what’s wrong, how it’s wrong, and some easy techniques to deal with it in portable software.</p> <p>There are <a href="/blog/2016/06/13/">multiple C implementations</a>, so how could they all be bad, even the <a href="/blog/2018/04/13/">early ones</a>? Microsoft’s C runtime has defined how the standard library should work on the platform, and everyone else followed along for the sake of compatibility. I’m excluding <a href="https://www.cygwin.com/">Cygwin</a> and its major fork, <a href="https://www.msys2.org/">MSYS2</a>, even though they do not inherit any of these flaws. They change so much that they’re effectively whole new platforms, not truly “native” to Windows.</p> <p>In practice, C++ standard libraries are implemented on top of a C standard library, which is why C++ shares the same problems. CPython dodges these issues: Though written in C, on Windows it bypasses the broken C standard library and directly calls the proprietary interfaces.
Other language implementations, such as “gc” Go, simply aren’t built on C at all, and instead do things correctly in the first place: the behaviors the C runtimes should have had all along.</p> <p>If you’re just working on one large project, bypassing the C runtime isn’t such a big deal, and you’re likely already doing so to access important platform functionality. You don’t really even need a C runtime. However, if you write many small programs, <a href="https://github.com/skeeto/scratch">as I do</a>, writing the same special Windows support for each one ends up being most of the work, and honestly makes properly supporting Windows not worth the trouble. I end up just accepting the broken defaults most of the time.</p> <p>Before diving into the details, if you’re looking for a quick-and-easy solution for the Mingw-w64 toolchain, <a href="/blog/2020/05/15/">including w64devkit</a>, that magically makes your C and C++ console programs behave well on Windows, I’ve put together a “library” named <strong><a href="https://github.com/skeeto/scratch/tree/master/libwinsane">libwinsane</a></strong>. It solves all problems discussed in this article, except for one. No source changes are required: simply link it into your program.</p> <h3 id="what-exactly-is-broken">What exactly is broken?</h3> <p>The Windows API comes in two flavors: narrow with an “A” (“ANSI”) suffix, and wide (Unicode, UTF-16) with a “W” suffix. The former is the legacy API, where an active <em>code page</em> maps 256 bytes onto (up to) 256 specific characters. On typical machines configured for European languages, this means <a href="https://en.wikipedia.org/wiki/Windows-1252">code page 1252</a>. <a href="http://simonsapin.github.io/wtf-8/">Roughly speaking</a>, Windows internally uses UTF-16, and calls through the narrow interface use the active code page to translate the narrow strings to wide strings.
The result is that calls through the narrow API have limited access to the system.</p> <p>The UTF-8 encoding was invented in 1992 and standardized by January 1993. UTF-8 was adopted by the unix world over the following years due to <a href="/blog/2017/10/06/#what-is-utf-8">its backwards-compatibility</a> with its existing interfaces. Programs could read and write Unicode data, access Unicode paths, pass Unicode arguments, and get and set Unicode environment variables without needing to change anything. Today UTF-8 has become the dominant text encoding format in the world, in large part due to the world wide web.</p> <p>In July 1993, Microsoft introduced the wide Windows API with the release of Windows NT 3.1, placing all their bets on UCS-2 (later UTF-16) rather than UTF-8. This turned out to be a mistake, since <a href="http://utf8everywhere.org/">UTF-16 is inferior to UTF-8 in practically every way</a>, though admittedly some problems weren’t so obvious at the time.</p> <p>The major problem: <strong>The C and C++ standard libraries only hook up to the narrow Windows interfaces</strong>. The standard library, and therefore typical portable software on Windows, cannot handle anything but ASCII. The effective result is that these programs:</p> <ul> <li>Cannot accept non-ASCII arguments</li> <li>Cannot get/set non-ASCII environment variables</li> <li>Cannot access non-ASCII paths</li> <li>Cannot read and write non-ASCII on a console</li> </ul> <p>Doing any of these requires calling proprietary functions, treating Windows as a special target. It’s part of what makes correctly porting software to Windows so painful.</p> <p>The sensible solution would have been for the C runtime to speak UTF-8 and connect to the wide API. Alternatively, the narrow API could have been changed over to UTF-8, phasing out the old code page concept. In theory this is what the UTF-8 “code page” is about, though it doesn’t always work. 
There would have been compatibility problems with abruptly making such a change, but until very recently, <em>this wasn’t even an option</em>. Why couldn’t there be a switch I could flip to get sane behavior that works like every other platform?</p> <h3 id="how-to-mostly-fix-unicode-support">How to mostly fix Unicode support</h3> <p>In 2019, Microsoft introduced a feature to allow programs to <a href="https://docs.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page">request UTF-8 as their active code page on start</a>, along with supporting UTF-8 on more narrow API functions. This is like the magic switch I wanted, except that it involves embedding some ugly XML into your binary in a particular way. At least it’s now an option.</p> <p>For Mingw-w64, that means writing a resource file like so:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;winuser.h&gt;
CREATEPROCESS_MANIFEST_RESOURCE_ID RT_MANIFEST "utf8.xml"
</code></pre></div></div> <p>Compiling it with <code class="language-plaintext highlighter-rouge">windres</code>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ windres -o manifest.o manifest.rc
</code></pre></div></div> <p>Then linking that into your program. Amazingly it mostly works! Programs can access Unicode arguments, Unicode environment variables, and Unicode paths, including with <code class="language-plaintext highlighter-rouge">fopen</code>, just as it’s worked on other platforms for decades.
Since the active code page is set at load time, it happens before <code class="language-plaintext highlighter-rouge">argv</code> is constructed (from <code class="language-plaintext highlighter-rouge">GetCommandLineA</code>), which is why that works out.</p> <p>Alternatively you could create a “side-by-side assembly” placing that XML in a file with the same name as your EXE but with <code class="language-plaintext highlighter-rouge">.manifest</code> suffix (after the <code class="language-plaintext highlighter-rouge">.exe</code> suffix), then placing that next to your EXE. Just be mindful that there’s a “side-by-side” cache (WinSxS), and so it might not immediately pick up your changes.</p> <p>What <em>doesn’t</em> work is console input and output since the console is external to the process, and so isn’t covered by the process’s active code page. It must be configured separately using a proprietary call:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">SetConsoleOutputCP</span><span class="p">(</span><span class="n">CP_UTF8</span><span class="p">);</span> </code></pre></div></div> <p>Annoying, but at least it’s not <em>that</em> painful. This only covers output, though, meaning programs can only print UTF-8. 
Unfortunately <a href="https://github.com/microsoft/terminal/issues/4551#issuecomment-585487802">UTF-8 input still doesn’t work</a>, and setting the input code page doesn’t do anything despite reporting success:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">SetConsoleCP</span><span class="p">(</span><span class="n">CP_UTF8</span><span class="p">);</span> <span class="c1">// doesn't work</span> </code></pre></div></div> <p>If you care about reading interactive Unicode input, you’re <a href="/blog/2020/05/04/">stuck bypassing the C runtime</a> since it’s still broken.</p> <h3 id="text-stream-translation">Text stream translation</h3> <p>Another long-standing issue is that C and C++ on Windows has distinct “text” and “binary” streams, which it inherited from DOS. Mainly this means automatic newline conversion between CRLF and LF. The C standard explicitly allows for this, though unix-like platforms have never actually distinguished between text and binary streams.</p> <p>The standard also specifies that standard input, output, and error are all open as text streams, and there’s no portable method to change the stream mode to binary — a serious deficiency with the standard. On unix-likes this doesn’t matter, but on Windows it means programs can’t read or write binary data on standard streams without calling a non-standard function. It also means reading and writing standard streams is slow, <a href="/blog/2021/12/04/">frequently a bottleneck</a> unless I route around it.</p> <p>Personally, I like <a href="/blog/2020/06/29/">writing binary data to standard output</a>, <a href="/blog/2020/11/24/">including video</a>, and sometimes <a href="/blog/2017/07/02/">binary filters</a> that also read binary input. 
I do it so often that in probably half my C programs I have this snippet in <code class="language-plaintext highlighter-rouge">main</code> just so they work correctly on Windows:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="cp">#ifdef _WIN32</span>
    <span class="kt">int</span> <span class="nf">_setmode</span><span class="p">(</span><span class="kt">int</span><span class="p">,</span> <span class="kt">int</span><span class="p">);</span>
    <span class="n">_setmode</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mh">0x8000</span><span class="p">);</span>
    <span class="n">_setmode</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mh">0x8000</span><span class="p">);</span>
    <span class="cp">#endif</span>
</code></pre></div></div> <p>That incantation sets standard input and output in the C runtime to binary mode without the need to include a header, making it compact, simple, and self-contained.</p> <p>This built-in newline translation, along with the Windows standard text editor, Notepad, <a href="https://devblogs.microsoft.com/commandline/extended-eol-in-notepad/">lagging decades behind</a>, meant that many other programs, including Git, grew their own, annoying, newline conversion <a href="https://github.com/skeeto/w64devkit/issues/10">misfeatures</a> that cause <a href="https://github.com/skeeto/binitools/commit/2efd690c3983856c9633b0be66d57483491d1e10">other problems</a>.</p> <h3 id="libwinsane">libwinsane</h3> <p>I introduced libwinsane at the beginning of the article, which fixes all this simply by being linked into a program. It includes the magic XML manifest <code class="language-plaintext highlighter-rouge">.rsrc</code> section, configures the console for UTF-8 output, and sets standard streams to binary before <code class="language-plaintext highlighter-rouge">main</code> (via a GCC constructor).
I called it a “library”, but it’s actually a single object file. It can’t be a static library since it must be linked into the program despite not actually being referenced by the program.</p> <p>So normally this program:</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span> <span class="o">**</span><span class="n">argv</span><span class="p">)</span>
<span class="p">{</span>
    <span class="kt">char</span> <span class="o">*</span><span class="n">arg</span> <span class="o">=</span> <span class="n">argv</span><span class="p">[</span><span class="n">argc</span><span class="o">-</span><span class="mi">1</span><span class="p">];</span>
    <span class="kt">size_t</span> <span class="n">len</span> <span class="o">=</span> <span class="n">strlen</span><span class="p">(</span><span class="n">arg</span><span class="p">);</span>
    <span class="n">printf</span><span class="p">(</span><span class="s">"%zu %s</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">len</span><span class="p">,</span> <span class="n">arg</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div> <p>Compiled and run:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C:\&gt;cc -o example example.c
C:\&gt;example π
1 p
</code></pre></div></div> <p>As usual, the Unicode argument is silently mangled into one byte.
Linked with libwinsane, it just works like everywhere else:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C:\&gt;gcc -o example example.c libwinsane.o
C:\&gt;example π
2 π
</code></pre></div></div> <p>If you’re maintaining a substantial program, you probably want to copy and integrate the necessary parts of libwinsane into your project and build, rather than always link against this loose object file. This is more for convenience and for succinctly capturing the concept. You may even want to <a href="https://github.com/skeeto/hastyhex/blob/master/hastyhex.c#L220">enable ANSI escape processing</a> in your version.</p>
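<p>If you do copy the relevant parts into your own project, the runtime half might look roughly like the sketch below. It is an illustration only, not libwinsane’s actual source: the function name <code class="language-plaintext highlighter-rouge">console_setup</code> is hypothetical, failures are merely counted rather than handled, and the UTF-8 manifest still has to be linked in separately as a resource, as described earlier. On non-Windows hosts the function attempts nothing.</p>

```c
#ifdef _WIN32
#include <windows.h>
#include <io.h>      /* _setmode */
#include <fcntl.h>   /* _O_BINARY */
#ifndef ENABLE_VIRTUAL_TERMINAL_PROCESSING
#define ENABLE_VIRTUAL_TERMINAL_PROCESSING 0x0004  /* absent from older headers */
#endif
#endif

/* Hypothetical helper; the name is not taken from libwinsane.
 * Best effort: returns the number of steps that reported failure.
 * On non-Windows hosts nothing is attempted, so the result is 0. */
int console_setup(void)
{
    int failed = 0;
#ifdef _WIN32
    /* Ask the console to interpret program output as UTF-8. */
    if (!SetConsoleOutputCP(CP_UTF8)) failed++;

    /* Put standard input and output in binary mode (no CRLF translation). */
    if (_setmode(0, _O_BINARY) == -1) failed++;
    if (_setmode(1, _O_BINARY) == -1) failed++;

    /* Enable ANSI escape processing, but only when standard output is
     * actually a console; GetConsoleMode fails for files and pipes. */
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode;
    if (GetConsoleMode(out, &mode)) {
        if (!SetConsoleMode(out, mode | ENABLE_VIRTUAL_TERMINAL_PROCESSING))
            failed++;
    }
#endif
    return failed;
}
```

<p>Calling such a function first thing in <code class="language-plaintext highlighter-rouge">main</code> is the simplest integration. With GCC, a wrapper marked <code class="language-plaintext highlighter-rouge">__attribute__((constructor))</code> would instead run it before <code class="language-plaintext highlighter-rouge">main</code>, which is the mechanism the article attributes to libwinsane.</p>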

      More DLL fun with w64devkit: Go, assembly, and Python

      pubsub.kikeriki.at / null_program · Tuesday, 29 June, 2021 - 21:50 · 31 minutes

    <p>My previous article explained <a href="/blog/2021/05/31/">how to work with dynamic-link libraries (DLLs) using w64devkit</a>. These techniques also apply to other circumstances, including languages and ecosystems outside of C and C++. In particular, <a href="/blog/2020/05/15/">w64devkit</a> is a great complement to Go and reliably fulfills all the needs of <a href="https://golang.org/cmd/cgo/">cgo</a> (Go’s C interop) and can even bootstrap Go itself. As before, this article is in large part an exercise in capturing practical information I’ve picked up over time.</p> <h3 id="go-bootstrap-and-cgo">Go: bootstrap and cgo</h3> <p>The primary Go implementation, confusingly <a href="https://golang.org/doc/faq#What_compiler_technology_is_used_to_build_the_compilers">named “gc”</a>, is an <a href="/blog/2020/01/21/">incredible piece of software engineering</a>. This is apparent when building the Go toolchain itself, a process that is fast, reliable, easy, and simple. It was originally written in C, but was re-written in Go starting with Go 1.5. The C compiler in w64devkit can build the original C implementation, which can then be used to bootstrap any more recent version. It’s so easy that I personally never use official binary releases and always bootstrap from source.</p> <p>You will need the Go 1.4 source, <a href="https://dl.google.com/go/go1.4-bootstrap-20171003.tar.gz">go1.4-bootstrap-20171003.tar.gz</a>. This “bootstrap” tarball is the last Go 1.4 release plus a few additional bugfixes. You will also need the source of the actual version of Go you want to use, such as Go 1.16.5 (latest version as of this writing).</p> <p>Start by building Go 1.4 using w64devkit. On Windows, Go is built using a batch script and no special build system is needed.
Since it shouldn’t be invoked with the BusyBox ash shell, I use <a href="/blog/2021/02/08/"><code class="language-plaintext highlighter-rouge">cmd.exe</code></a> explicitly.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ tar xf go1.4-bootstrap-20171003.tar.gz
$ mv go/ bootstrap
$ (cd bootstrap/src/ &amp;&amp; cmd /c make)
</code></pre></div></div> <p>In about 30 seconds you’ll have a fully-working Go 1.4 toolchain. Next use it to build the desired toolchain. You can move this new toolchain after it’s built if necessary.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ export GOROOT_BOOTSTRAP="$PWD/bootstrap"
$ tar xf go1.16.5.src.tar.gz
$ (cd go/src/ &amp;&amp; cmd /c make)
</code></pre></div></div> <p>At this point you can delete the bootstrap toolchain. You probably also want to put Go on your PATH.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ rm -rf bootstrap/
$ printf 'PATH="$PATH;%s/go/bin"\n' "$PWD" &gt;&gt;~/.profile
$ source ~/.profile
</code></pre></div></div> <p>Not only is Go now available, so is the full power of cgo. (Including <a href="https://dave.cheney.net/2016/01/18/cgo-is-not-go">its costs</a> if used.)</p> <h3 id="vim-suggestions">Vim suggestions</h3> <p>Since w64devkit is oriented so much around Vim, here’s my personal Vim configuration for Go. I don’t need or want fancy plugins, just access to <code class="language-plaintext highlighter-rouge">goimports</code> and a couple of corrections to Vim’s built-in Go support (<code class="language-plaintext highlighter-rouge">[[</code> and <code class="language-plaintext highlighter-rouge">]]</code> navigation). The included <code class="language-plaintext highlighter-rouge">ctags</code> understands Go, so tags navigation works the same as it does with C.
<code class="language-plaintext highlighter-rouge">\i</code> saves the current buffer, runs <code class="language-plaintext highlighter-rouge">goimports</code>, and populates the quickfix list with any errors. Similarly <code class="language-plaintext highlighter-rouge">:make</code> invokes <code class="language-plaintext highlighter-rouge">go build</code> and, as expected, populates the quickfix list.</p> <div class="language-vim highlighter-rouge"><div class="highlight"><pre class="highlight"><code>autocmd <span class="nb">FileType</span> <span class="k">go</span> <span class="k">setlocal</span> <span class="nb">makeprg</span><span class="p">=</span><span class="k">go</span>\ build autocmd <span class="nb">FileType</span> <span class="k">go</span> <span class="nb">map</span> <span class="p">&lt;</span><span class="k">silent</span><span class="p">&gt;</span> <span class="p">&lt;</span><span class="k">buffer</span><span class="p">&gt;</span> <span class="p">&lt;</span>leader<span class="p">&gt;</span><span class="k">i</span> <span class="se"> \</span> <span class="p">:</span><span class="k">update</span> \<span class="p">|</span> <span class="se"> \</span> <span class="p">:</span><span class="k">cexpr</span> <span class="nb">system</span><span class="p">(</span><span class="s2">"goimports -w "</span> <span class="p">.</span> <span class="nb">expand</span><span class="p">(</span><span class="s2">"%"</span><span class="p">))</span> \<span class="p">|</span> <span class="se"> \</span> <span class="p">:</span><span class="k">silent</span> <span class="k">edit</span><span class="p">&lt;</span><span class="k">cr</span><span class="p">&gt;</span> autocmd <span class="nb">FileType</span> <span class="k">go</span> <span class="nb">map</span> <span class="p">&lt;</span><span class="k">buffer</span><span class="p">&gt;</span> <span class="p">[[</span> <span class="se"> \</span> ?^\<span class="p">(</span>func\\<span class="p">|</span>var\\<span class="p">|</span><span 
class="nb">type</span>\\<span class="p">|</span><span class="k">import</span>\\<span class="p">|</span>package\<span class="p">)</span>\<span class="p">&gt;&lt;</span><span class="k">cr</span><span class="p">&gt;</span> autocmd <span class="nb">FileType</span> <span class="k">go</span> <span class="nb">map</span> <span class="p">&lt;</span><span class="k">buffer</span><span class="p">&gt;</span> <span class="p">]]</span> <span class="se"> \</span> /^\<span class="p">(</span>func\\<span class="p">|</span>var\\<span class="p">|</span><span class="nb">type</span>\\<span class="p">|</span><span class="k">import</span>\\<span class="p">|</span>package\<span class="p">)</span>\<span class="p">&gt;&lt;</span><span class="k">cr</span><span class="p">&gt;</span> </code></pre></div></div> <p>Go only comes with <code class="language-plaintext highlighter-rouge">gofmt</code> but <code class="language-plaintext highlighter-rouge">goimports</code> is just one command away, so there’s little excuse not to have it:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ go install golang.org/x/tools/cmd/goimports@latest </code></pre></div></div> <p>Thanks to GOPROXY, all Go dependencies are accessible without (or before) installing Git, so this tool installation works with nothing more than w64devkit and a bootstrapped Go toolchain.</p> <h3 id="cgo-dlls">cgo DLLs</h3> <p>The intricacies of cgo are beyond the scope of this article, but the gist is that a Go source file contains C source in a comment followed by <code class="language-plaintext highlighter-rouge">import "C"</code>. The imported <code class="language-plaintext highlighter-rouge">C</code> object provides access to C types and functions. Go functions marked with an <code class="language-plaintext highlighter-rouge">//export</code> comment, as well as the commented C code, are accessible to C. 
The latter means we can use Go to implement a C interface in a DLL, and the caller will have no idea they’re actually talking to Go.</p> <p>To illustrate, here’s a little C interface. To keep it simple, I’ve specifically sidestepped some more complicated issues, particularly involving memory management.</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Which DLL am I running?</span>
<span class="kt">int</span> <span class="nf">version</span><span class="p">(</span><span class="kt">void</span><span class="p">);</span>

<span class="c1">// Generate 64 bits from a CSPRNG.</span>
<span class="kt">unsigned</span> <span class="kt">long</span> <span class="kt">long</span> <span class="nf">rand64</span><span class="p">(</span><span class="kt">void</span><span class="p">);</span>

<span class="c1">// Compute the Euclidean norm.</span>
<span class="kt">float</span> <span class="nf">dist</span><span class="p">(</span><span class="kt">float</span> <span class="n">x</span><span class="p">,</span> <span class="kt">float</span> <span class="n">y</span><span class="p">);</span>
</code></pre></div></div> <p>Here’s a C implementation which I’m calling “version 1”.</p> <div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include &lt;math.h&gt;
#include &lt;windows.h&gt;
#include &lt;ntsecapi.h&gt;
</span>
<span class="kr">__declspec</span><span class="p">(</span><span class="n">dllexport</span><span class="p">)</span>
<span class="kt">int</span> <span class="nf">version</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
    <span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>

<span class="kr">__declspec</span><span class="p">(</span><span class="n">dllexport</span><span class="p">)</span>
<span class="kt">unsigned</span> <span class="kt">long</span> <span class="kt">long</span> <span class="nf">rand64</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
    <span class="kt">unsigned</span> <span class="kt">long</span> <span class="kt">long</span> <span class="n">x</span><span class="p">;</span>
    <span class="n">RtlGenRandom</span><span class="p">(</span><span class="o">&amp;</span><span class="n">x</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">x</span><span class="p">));</span>
    <span class="k">return</span> <span class="n">x</span><span class="p">;</span>
<span class="p">}</span>

<span class="kr">__declspec</span><span class="p">(</span><span class="n">dllexport</span><span class="p">)</span>
<span class="kt">float</span> <span class="nf">dist</span><span class="p">(</span><span class="kt">float</span> <span class="n">x</span><span class="p">,</span> <span class="kt">float</span> <span class="n">y</span><span class="p">)</span>
<span class="p">{</span>
    <span class="k">return</span> <span class="n">sqrtf</span><span class="p">(</span><span class="n">x</span><span class="o">*</span><span class="n">x</span> <span class="o">+</span> <span class="n">y</span><span class="o">*</span><span class="n">y</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div> <p>As discussed in the previous article, each function is exported using <code class="language-plaintext highlighter-rouge">__declspec</code> so that they’re available for import. As before:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cc -shared -Os -s -o hello1.dll hello1.c
</code></pre></div></div> <p>Side note: This could be trivially converted into a C++ implementation just by adding <code class="language-plaintext highlighter-rouge">extern "C"</code> to each declaration.
It disables C++ features like name mangling, and follows the C ABI so that the C++ functions appear as C functions. Compiling the C++ DLL is exactly the same.</p> <p>Suppose we wanted to implement this in Go instead of C. We already have all the tools needed to do so. Here’s a Go implementation, “version 2”:</p> <div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">package</span> <span class="n">main</span> <span class="k">import</span> <span class="s">"C"</span> <span class="k">import</span> <span class="p">(</span> <span class="s">"crypto/rand"</span> <span class="s">"encoding/binary"</span> <span class="s">"math"</span> <span class="p">)</span> <span class="c">//export version</span> <span class="k">func</span> <span class="n">version</span><span class="p">()</span> <span class="n">C</span><span class="o">.</span><span class="kt">int</span> <span class="p">{</span> <span class="k">return</span> <span class="m">2</span> <span class="p">}</span> <span class="c">//export rand64</span> <span class="k">func</span> <span class="n">rand64</span><span class="p">()</span> <span class="n">C</span><span class="o">.</span><span class="n">ulonglong</span> <span class="p">{</span> <span class="k">var</span> <span class="n">buf</span> <span class="p">[</span><span class="m">8</span><span class="p">]</span><span class="kt">byte</span> <span class="n">rand</span><span class="o">.</span><span class="n">Read</span><span class="p">(</span><span class="n">buf</span><span class="p">[</span><span class="o">:</span><span class="p">])</span> <span class="n">r</span> <span class="o">:=</span> <span class="n">binary</span><span class="o">.</span><span class="n">LittleEndian</span><span class="o">.</span><span class="n">Uint64</span><span class="p">(</span><span class="n">buf</span><span class="p">[</span><span class="o">:</span><span class="p">])</span> <span class="k">return</span> <span class="n">C</span><span 
class="o">.</span><span class="n">ulonglong</span><span class="p">(</span><span class="n">r</span><span class="p">)</span> <span class="p">}</span> <span class="c">//export dist</span> <span class="k">func</span> <span class="n">dist</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span> <span class="n">C</span><span class="o">.</span><span class="n">float</span><span class="p">)</span> <span class="n">C</span><span class="o">.</span><span class="n">float</span> <span class="p">{</span> <span class="k">return</span> <span class="n">C</span><span class="o">.</span><span class="n">float</span><span class="p">(</span><span class="n">math</span><span class="o">.</span><span class="n">Sqrt</span><span class="p">(</span><span class="kt">float64</span><span class="p">(</span><span class="n">x</span><span class="o">*</span><span class="n">x</span> <span class="o">+</span> <span class="n">y</span><span class="o">*</span><span class="n">y</span><span class="p">)))</span> <span class="p">}</span> <span class="k">func</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span> <span class="p">}</span> </code></pre></div></div> <p>Note the use of C types for all arguments and return values. The <code class="language-plaintext highlighter-rouge">main</code> function is required since this is the main package, but it will never be called. The DLL is built like so:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ go build -buildmode=c-shared -o hello2.dll hello2.go </code></pre></div></div> <p>Without the <code class="language-plaintext highlighter-rouge">-o</code> option, the DLL will lack an extension. This works fine since it’s mostly only convention on Windows, but it may be confusing without it.</p> <p>What if we need an import library? This will be required when linking with the MSVC toolchain. 
In the previous article we asked Binutils to generate one using <code class="language-plaintext highlighter-rouge">--out-implib</code>. For Go we have to handle this ourselves via <code class="language-plaintext highlighter-rouge">gendef</code> and <code class="language-plaintext highlighter-rouge">dlltool</code>.</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ gendef hello2.dll
$ dlltool -l hello2.lib -d hello2.def
</code></pre></div></div> <p>The only way anyone upgrading would know version 2 was implemented in Go is that the DLL is a lot bigger (a few MB vs. a few kB) since it now contains an entire Go runtime.</p> <h3 id="nasm-assembly-dll">NASM assembly DLL</h3> <p>We could also go the other direction and implement the DLL using plain assembly. It won’t even require linking against a C runtime.</p> <p>w64devkit includes two assemblers: GAS (Binutils), which is used by GCC, and NASM, which has <a href="https://elronnd.net/writ/2021-02-13_att-asm.html">friendlier syntax</a>. I prefer the latter whenever possible, which is exactly why I included NASM in the distribution.
So here’s how I implemented “version 3” in NASM assembly.</p> <div class="language-nasm highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nf">bits</span> <span class="mi">64</span> <span class="nf">section</span> <span class="nv">.text</span> <span class="nf">global</span> <span class="nb">Dl</span><span class="nv">lMainCRTStartup</span> <span class="nf">export</span> <span class="nb">Dl</span><span class="nv">lMainCRTStartup</span> <span class="nl">DllMainCRTStartup:</span> <span class="nf">mov</span> <span class="nb">eax</span><span class="p">,</span> <span class="mi">1</span> <span class="nf">ret</span> <span class="nf">global</span> <span class="nv">version</span> <span class="nf">export</span> <span class="nv">version</span> <span class="nl">version:</span> <span class="nf">mov</span> <span class="nb">eax</span><span class="p">,</span> <span class="mi">3</span> <span class="nf">ret</span> <span class="nf">global</span> <span class="nv">rand64</span> <span class="nf">export</span> <span class="nv">rand64</span> <span class="nl">rand64:</span> <span class="nf">rdrand</span> <span class="nb">rax</span> <span class="nf">ret</span> <span class="nf">global</span> <span class="nb">di</span><span class="nv">st</span> <span class="nf">export</span> <span class="nb">di</span><span class="nv">st</span> <span class="nl">dist:</span> <span class="nf">mulss</span> <span class="nv">xmm0</span><span class="p">,</span> <span class="nv">xmm0</span> <span class="nf">mulss</span> <span class="nv">xmm1</span><span class="p">,</span> <span class="nv">xmm1</span> <span class="nf">addss</span> <span class="nv">xmm0</span><span class="p">,</span> <span class="nv">xmm1</span> <span class="nf">sqrtss</span> <span class="nv">xmm0</span><span class="p">,</span> <span class="nv">xmm0</span> <span class="nf">ret</span> </code></pre></div></div> <p>The <code class="language-plaintext highlighter-rouge">global</code> directive is common in NASM assembly and 
causes the named symbol to have the external linkage needed when linking the DLL. The <code class="language-plaintext highlighter-rouge">export</code> directive is Windows-specific and is equivalent to <code class="language-plaintext highlighter-rouge">dllexport</code> in C.</p> <p>Every DLL must have an entrypoint, usually named <code class="language-plaintext highlighter-rouge">DllMainCRTStartup</code>. The return value indicates if the DLL successfully loaded. So far this has been handled automatically by the C implementation, but at this low level we must define it explicitly.</p> <p>Here’s how to assemble and link the DLL:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ nasm -fwin64 -o hello3.o hello3.s $ ld -shared -s -o hello3.dll hello3.o </code></pre></div></div> <h3 id="call-the-dlls-from-python">Call the DLLs from Python</h3> <p>Python has a nice, built-in C interop, <code class="language-plaintext highlighter-rouge">ctypes</code>, that allows Python to call arbitrary C functions in shared libraries, including DLLs, without writing C to glue it together. 
To tie this all off, here’s a Python program that loads all of the DLLs above and invokes each of the functions:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">ctypes</span> <span class="k">def</span> <span class="nf">load</span><span class="p">(</span><span class="n">version</span><span class="p">):</span> <span class="n">hello</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">CDLL</span><span class="p">(</span><span class="sa">f</span><span class="s">"./hello</span><span class="si">{</span><span class="n">version</span><span class="si">}</span><span class="s">.dll"</span><span class="p">)</span> <span class="n">hello</span><span class="p">.</span><span class="n">version</span><span class="p">.</span><span class="n">restype</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_int</span> <span class="n">hello</span><span class="p">.</span><span class="n">version</span><span class="p">.</span><span class="n">argtypes</span> <span class="o">=</span> <span class="p">()</span> <span class="n">hello</span><span class="p">.</span><span class="n">dist</span><span class="p">.</span><span class="n">restype</span> <span class="o">=</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_float</span> <span class="n">hello</span><span class="p">.</span><span class="n">dist</span><span class="p">.</span><span class="n">argtypes</span> <span class="o">=</span> <span class="p">(</span><span class="n">ctypes</span><span class="p">.</span><span class="n">c_float</span><span class="p">,</span> <span class="n">ctypes</span><span class="p">.</span><span class="n">c_float</span><span class="p">)</span> <span class="n">hello</span><span class="p">.</span><span class="n">rand64</span><span class="p">.</span><span class="n">restype</span> <span class="o">=</span> 
<span class="n">ctypes</span><span class="p">.</span><span class="n">c_ulonglong</span> <span class="n">hello</span><span class="p">.</span><span class="n">rand64</span><span class="p">.</span><span class="n">argtypes</span> <span class="o">=</span> <span class="p">()</span> <span class="k">return</span> <span class="n">hello</span> <span class="k">for</span> <span class="n">hello</span> <span class="ow">in</span> <span class="n">load</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="n">load</span><span class="p">(</span><span class="mi">2</span><span class="p">),</span> <span class="n">load</span><span class="p">(</span><span class="mi">3</span><span class="p">):</span> <span class="k">print</span><span class="p">(</span><span class="s">"version"</span><span class="p">,</span> <span class="n">hello</span><span class="p">.</span><span class="n">version</span><span class="p">())</span> <span class="k">print</span><span class="p">(</span><span class="s">"rand "</span><span class="p">,</span> <span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">hello</span><span class="p">.</span><span class="n">rand64</span><span class="p">():</span><span class="mi">016</span><span class="n">x</span><span class="si">}</span><span class="s">"</span><span class="p">)</span> <span class="k">print</span><span class="p">(</span><span class="s">"dist "</span><span class="p">,</span> <span class="n">hello</span><span class="p">.</span><span class="n">dist</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span> </code></pre></div></div> <p>After loading the DLL with <code class="language-plaintext highlighter-rouge">CDLL</code> the program defines each function prototype so that Python knows how to call it. 
Unfortunately it’s not possible to build Python with w64devkit, so you’ll also need to install the standard CPython distribution in order to run it. Here’s the output:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ python finale.py version 1 rand b011ea9bdbde4bdf dist 5.0 version 2 rand f7c86ff06ae3d1a2 dist 5.0 version 3 rand 2a35a05b0482c898 dist 5.0 </code></pre></div></div> <p>That output is the result of four different languages interfacing in one process: C, Go, x86-64 assembly, and Python. Pretty neat if you ask me!</p>
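<p>A side note on why those prototype declarations matter: without a declared <code class="language-plaintext highlighter-rouge">restype</code>, ctypes assumes a function returns a C <code class="language-plaintext highlighter-rouge">int</code>, so a 64-bit result like the one from <code class="language-plaintext highlighter-rouge">rand64</code> would lose its upper half. Here’s a minimal sketch of that effect. Since the hello DLLs may not be at hand, it assumes the C runtime’s <code class="language-plaintext highlighter-rouge">strtoull</code> as a stand-in (loaded from <code class="language-plaintext highlighter-rouge">ucrtbase</code> on Windows, or from the running process elsewhere). The <code class="language-plaintext highlighter-rouge">parse_hex</code> helper is hypothetical and not part of the programs above, and the sample value is simply borrowed from the output shown earlier.</p>

```python
import ctypes
import sys

# Assumption: a C runtime exporting strtoull is reachable this way:
# ucrtbase on Windows 10+, the symbols of the running process elsewhere.
libc = ctypes.CDLL("ucrtbase" if sys.platform == "win32" else None)


def parse_hex(text):
    """Hypothetical helper: parse a hex string via the C strtoull()."""
    return libc.strtoull(text.encode(), None, 16)


# With no prototype, ctypes treats the return value as a C int, so only
# the low 32 bits of the 64-bit result survive. (c_int is already the
# default; it is set here only to make that explicit.)
libc.strtoull.restype = ctypes.c_int
truncated = parse_hex("b011ea9bdbde4bdf")

# Declaring the prototype makes ctypes read the full 64-bit result.
libc.strtoull.restype = ctypes.c_ulonglong
libc.strtoull.argtypes = (ctypes.c_char_p, ctypes.c_void_p, ctypes.c_int)
full = parse_hex("b011ea9bdbde4bdf")

print("no prototype:", f"{truncated & 0xffffffff:08x}")
print("prototype:   ", f"{full:016x}")
```

<p>The same reasoning applies to <code class="language-plaintext highlighter-rouge">dist</code>: floating point values come back in a different register than integers, so ctypes has to be told to expect a <code class="language-plaintext highlighter-rouge">c_float</code>.</p>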

      gtkmm and gloox

      Stefan · pubsub.movim.eu / xmpp-messenger · Saturday, 21 November, 2020 - 16:00

    Over the past few months I have been working more with C and libstrophe.

    The prototype of eagle is written in C with libstrophe. That already works quite well.

    Still, I would now like to take another look at C++ for comparison. Some time ago I started writing an XMPP bot, the Hawkbit Bot. Then I ran into a problem with TLS and did not pursue it any further.

    It currently works again with Debian GNU/Linux Buster. The connection to the server works.

    So, next attempt...

    Sparrow, an XMPP client for XEP-0060: Publish-Subscribe, is what I have wanted to write all along. For now I have created the project setup: Automake, a bit of Boost, and gtkmm and gloox are integrated.

    Next up: just start programming away...

    #xmpp #libstrophe #gtkmm #cpp #gloox