On Nim


1 Static deployment

Guide: https://scripter.co/nim-deploying-static-binaries/

  • install musl
  • POC: nim c --gcc.exe:musl-gcc --gcc.linkerexe:musl-gcc --passL:--static nimprogram

The guide then uses config.nims to add musl builds and handle static OpenSSL.
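A minimal sketch of what such a config.nims could look like, covering only the musl switches from the POC line above. The define name musl is my assumption for illustration, and static OpenSSL is not handled here at all:

```nim
# config.nims, placed next to nimprogram.nim
# Assumption: builds opt in with a user-chosen define, here -d:musl.
when defined(musl):
  # Same settings as the command-line POC, expressed as NimScript switches.
  switch("gcc.exe", "musl-gcc")
  switch("gcc.linkerexe", "musl-gcc")
  switch("passL", "--static")
```

With that file in place, nim c -d:musl nimprogram should behave like the longer POC command, while a plain nim c nimprogram keeps using the default toolchain.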

2 Scripting

In certain environments like system administration with numerous servers and teams of admins, scripts have incredible practical advantages over binaries.

Compiled languages in turn have a lot to offer to scripts; for example, they can expose kernel and libc APIs that scripting languages are too willfully generic to incorporate. Did you know you can swap two files in a single syscall with renameat2? Did you know you can open a directory that definitely isn't a symlink with O_DIRECTORY|O_NOFOLLOW and then use the resulting fd to anchor further operations on this directory, even if it moves for some reason while you're working on it?
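As a rough illustration of the directory-fd trick, here is a Linux-only sketch. The bindings are written by hand (c_open, c_openat, c_read and c_close are names I made up for this note) so that it doesn't depend on which constants a given std/posix exports, and openDirNoSymlink and readAt are hypothetical helpers, not an existing API:

```nim
import std/os

# Hand-written bindings to libc; only the C names on the right are real.
var
  O_RDONLY {.importc, header: "<fcntl.h>".}: cint
  O_DIRECTORY {.importc, header: "<fcntl.h>".}: cint
  O_NOFOLLOW {.importc, header: "<fcntl.h>".}: cint

proc c_open(path: cstring; flags: cint): cint {.
  importc: "open", header: "<fcntl.h>", varargs.}
proc c_openat(dirfd: cint; path: cstring; flags: cint): cint {.
  importc: "openat", header: "<fcntl.h>", varargs.}
proc c_read(fd: cint; buf: pointer; count: csize_t): int {.
  importc: "read", header: "<unistd.h>".}
proc c_close(fd: cint): cint {.
  importc: "close", header: "<unistd.h>".}

proc openDirNoSymlink(path: string): cint =
  ## Opens `path` only if it is a real directory, not a symlink to one.
  result = c_open(path.cstring, O_RDONLY or O_DIRECTORY or O_NOFOLLOW)
  if result < 0: raiseOSError(osLastError())

proc readAt(dirfd: cint; name: string): string =
  ## Reads up to 4 KiB of `name`, resolved relative to `dirfd` rather
  ## than to whatever path the directory currently has.
  let fd = c_openat(dirfd, name.cstring, O_RDONLY or O_NOFOLLOW)
  if fd < 0: raiseOSError(osLastError())
  defer: discard c_close(fd)
  var buf = newString(4096)
  let n = c_read(fd, addr buf[0], csize_t(buf.len))
  if n < 0: raiseOSError(osLastError())
  buf.setLen(n)
  result = buf

when isMainModule:
  let base = getTempDir() / "dirfd_demo"
  removeDir(base)
  removeDir(base & "_moved")
  createDir(base)
  writeFile(base / "note.txt", "still here")
  let dfd = openDirNoSymlink(base)
  moveDir(base, base & "_moved")   # the directory moves while we hold the fd
  echo readAt(dfd, "note.txt")     # the fd still reaches the file
  discard c_close(dfd)
  removeDir(base & "_moved")
```

renameat2 could be bound the same way, but it needs a newer glibc, so I've left it out of the sketch.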

Scripts in compiled languages mainly need some form of compilation cache, but the tooling is rarely all the way there. D has 'rdmd' for #! lines, but it isn't aware of dub, so using third-party libraries requires that you use 'dub' instead. That is a bit slower, requires some extra configuration in comments in the file, and can make potentially unwanted network connections of its own.

Rust has a series of abandoned scripting options, starting with cargo-script and cargo-eval. They

  • easily perform unwanted network communication
  • recompile scripts unnecessarily when dependencies update, which can result in unexpected slow runs of the script
  • are not that smart about caching: you can't have a public_html/siteone/index.rs and a public_html/sitetwo/index.rs that both use cargo-eval, as the 'index' names will collide and both scripts will refer to the last-compiled binary between the two of them.

It's quite a depressing record.

How does Nim fare? First, get Nim fully installed with its tools; choosenim from the official download page should do this for you. Then grab nimcr for scripting, and vec just to show that libraries can be used naturally.

# nimble install nimcr
# nimble install vec

With that in place, let's try

2.1 a 'hello world'

#! /usr/bin/env nimcr
echo "Hello, world!"

Which runs just as you'd expect, and (after an initial compilation) is about 11 times faster than similar Python3.

Yes, just printing something out "is I/O bound" and therefore CPU performance "doesn't matter", but scripting languages have so much overhead just finding their libraries and examining the system they're running on that it becomes noticeable.

2.2 A script using a nimble-provided library

#! /usr/bin/env nimcr
import vec

let q = vec2f(1.0, 2.0)
echo q.normal

Outputs:

(x: -0.8944271802902222, y: 0.4472135901451111)

At the initial cost of having to provide dependencies as a separate step with nimble, you avoid the much more severe and harder-to-predict costs of network communication and dependency upgrades that the D and Rust solutions have.

2.3 A CGI script

#! /home/example/bin/nimcr
import vec, strformat

echo "Content-type: text"
echo ""
let q = vec2f(1.0, 2.0)
echo q.normal

# an .htaccess
AddHandler cgi-script .nim

#! /bin/bash
. /home/example/.bashrc
exec ~/.nimble/bin/nimcr "$@"

This is a lot messier than a normal script would be, and it's entirely due to hosting and having to restore a bunch of shell environment settings on this server before running nimcr. This all does work however (at the time of testing):

# curl nim-lang.moe/hello.nim
(x: -0.8944271802902222, y: 0.4472135901451111)

3 Nim on CentOS 6

CentOS 6 was first released in mid 2011 and reached end of life in November 2020, so it's not that surprising that Nim doesn't work out of the box on it.

A convenient choosenim script can manage multiple versions of Nim and upgrade it easily after a new release.

# curl https://nim-lang.org/choosenim/init.sh -sSf | sh
choosenim-init: Downloading choosenim-0.6.0_linux_amd64
/tmp/choosenim-0.6.0_linux_amd64: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/choosenim-0.6.0_linux_amd64)
/tmp/choosenim-0.6.0_linux_amd64: /lib64/libc.so.6: version `GLIBC_2.16' not found (required by /tmp/choosenim-0.6.0_linux_amd64)
/tmp/choosenim-0.6.0_linux_amd64: /lib64/libc.so.6: version `GLIBC_2.15' not found (required by /tmp/choosenim-0.6.0_linux_amd64)

But CentOS 6's glibc is too old, so we can't use it. Let's just get the latest source archive from nim-lang.org:

# wget https://nim-lang.org/download/nim-1.4.0.tar.xz
# tar xvf nim-1.4.0.tar.xz
# cd nim-1.4.0
# ./build.sh
# OS: linux
# CPU: amd64
gcc -w -fmax-errors=3 -O3 -fno-strict-aliasing -fno-ident -Ic_code -c c_code/1_2/stdlib_assertions.nim.c -o c_code/1_2/stdlib_assertions.nim.o
In file included from c_code/1_2/stdlib_assertions.nim.c:8:
c_code/nimbase.h:542: error: redefinition of typedef ‘NIM_STATIC_ASSERT_AUX’
c_code/nimbase.h:323: note: previous declaration of ‘NIM_STATIC_ASSERT_AUX’ was here
# 

But CentOS 6's gcc is too old, so we can't compile from the pregenerated C sources. Right, so what's the latest CentOS? 8? Let's use its version of gcc.

root# yum install centos-release-scl
root# yum install devtoolset-8

And back to Nim source:

# scl enable devtoolset-8 bash
# ./build.sh
# bin/nim c koch
# ./koch boot -d:release
# ./koch tools
# echo 'PATH=$PATH:'$PWD/bin >> ~/.bashrc

Great! But, erm, Nim is going to continuously need that newer gcc in order to compile anything, even after Nim itself is built.

As a quick reload of this session will demonstrate:

# nim r hello
...
Error: execution of an external program failed: 'gcc -c  -w -fmax-errors=3 ...'

One option is to always enter scl enable devtoolset-8 bash before compiling with Nim, and that'll work just fine actually. Since I want to use Nim with scripts, I'm going to go a little further and just source the scl 'enable' script in my .bashrc:

# echo '. /opt/rh/devtoolset-8/enable' >> ~/.bashrc

So everything works now?

# nim r --hints:off hello
Hello, world!

That works.

# nimble refresh
could not import: X509_check_host

Goddamnit … I spent quite a while looking for a Nim-only solution to this (Nim invites this by being distractingly easy to dig into. Hey it has a -d:nimDisableCertificateValidation , hey here's the list of OpenSSL versions it wants, hey–), but a newer OpenSSL is just required.

# wget https://ftp.openssl.org/source/old/1.1.1/openssl-1.1.1.tar.gz
# tar xvf openssl-1.1.1.tar.gz
# cd openssl-1.1.1
# ./config --prefix=/usr
# make
root# make install

That fixes it.

4 Running examples with a clean directory

Consider the following directory listing:

./  ../  copy.nim  ti1.nim

If you run nim c -r ti1 successfully, what will the new directory look like?

It will contain the new binary:

./  ../  copy.nim  ti1*  ti1.nim

If you have a lot of little programs because you're learning the language, these unnecessary binaries can get annoying. Fortunately there's an alternate mode that doesn't clutter the current directory:

$ ls
./  ../  copy.nim  ti1.nim
$ nim r --hints:off --warnings:off ti1
(1, 2)
((1, 2), (1, 2))
$ ls
./  ../  copy.nim  ti1.nim
$

That's nim r file for short; the extra flags there are just for cleaner output.

4.1 Caveat

(This is fixed in 1.5.x devel)

nim r caching is by the basename of the file (nim c -r and nimcr are more robust than this), so you can occasionally run into some frustrating issues if you reuse names.

Consider:

$ cat 1/hello.nim 
echo "1"
$ cat 2/hello.nim 
echo "2"
$ ( cd 1; nim r --hints:off hello )
1
$ ( cd 2; nim r --hints:off hello )
1
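The collision above is what any cache keyed on the bare file name will do. A minimal sketch of the difference follows; it assumes nothing about how nim r, nimcr or cargo-eval are actually implemented, and cacheKeyByName and cacheKeyByPath are hypothetical helpers:

```nim
import std/[os, hashes, strutils]

proc cacheKeyByName(path: string): string =
  ## Keys the cached binary on the file's base name only.
  splitFile(path).name

proc cacheKeyByPath(path: string): string =
  ## Keeps the readable name but adds a hash of the whole path.
  let p = normalizedPath(path)
  splitFile(p).name & "_" & toHex(hash(p))

when isMainModule:
  # Same base name in two directories: one shared slot versus two.
  echo cacheKeyByName("1/hello.nim") == cacheKeyByName("2/hello.nim")
  echo cacheKeyByPath("1/hello.nim") == cacheKeyByPath("2/hello.nim")
```

A real tool would want to hash the absolute path (and probably the compiler flags too), but the idea is the same.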

5 Inline testing

Nim has very good support for dedicated files of tests, but how about inline testing?

The D language has a unit testing feature that keeps tests close to the source they're testing, and special compilation options that make it easy to run these tests without performing other unwanted activities. Because Nim doesn't have this feature, you'll tend to find (even in sources accompanying the Nim compiler itself) ad-hoc tests like:

func double*(n: float): float = n * 2

when isMainModule:
  doAssert(2.double == 4)

This is a file that can be import'ed for its double function, or can be run as a program in its own right for that assertion. This kind of testing obviously falls apart as soon as you want to test a file that's already a program in its own right, with other actions that already will be happening when isMainModule.

But Nim can still get very close to D's feature:

func double*(n: float): float = n * 2

when defined(unittests):
  import unittest
  suite "doubling":
    test "double 2": check(2.double == 4)
    test "double -2": check(-2.double == (-4))

when isMainModule and not defined(unittests):
  echo "but then this ^^^ is required"

Build with -d:unittests (for example nim r -d:unittests file) to run the suite; without that define the file compiles as the ordinary program.

6 That isn't tuple initialization

What is the output of this program?

let (a, b) = (1, 2)
echo (a, b)

let x, y = (1, 2)
echo (x, y)

It is:

(1, 2)
((1, 2), (1, 2))

The second let does not deconstruct (1, 2), but rather is a shorthand for assigning it to both variables. This is similar to the more familiar var x, y: int syntax that gives two variables the same type.

Nim now has an EachIdentIsTuple warning for the above case. There's no such warning for the following program. In it, are the printed addresses the same?

type
  Odd = object
    id: int

proc `=copy`(dest: var Odd; source: Odd) {.error.}

let x, y = Odd(id: 2)
echo x
echo y
echo cast[int](x.unsafeAddr)
echo cast[int](y.unsafeAddr)

They are not. Sample output:

(id: 2)
(id: 2)
4347016
4347024

This syntax then is not constructing a single object for both variables, but is just a shorthand for the following:

let x = Odd(id: 2)
let y = Odd(id: 2)

7 Are those function arguments or a tuple?

What's the output of this program?

echo(1, 2)
echo (1, 2)
echo 1, 2

It is:

12
(1, 2)
12
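
The difference is in how the call is parsed, not in echo itself. A small sketch with a made-up proc, record, that only notes how many arguments it was given (the varargs[string, `$`] signature is my stand-in for echo's stringifying behaviour):

```nim
var seen: seq[int]

proc record(args: varargs[string, `$`]) =
  ## Remembers how many arguments each call received.
  seen.add args.len

record(1, 2)    # parenthesis directly after the name: two arguments
record (1, 2)   # space first: command syntax with ONE argument, a tuple
record 1, 2     # command syntax with two arguments

doAssert seen == @[2, 1, 2]
```

So the space before the parenthesis is what turns an argument list into a tuple.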

8 Default returns

What does the following program print at runtime?

type
  Node = ref object
    case kind: range[0..1]
    of 0: onedata: int
    of 1: twodata: bool

proc get[T](node: Node): T =
  case node.kind
  of 0:
    when T is string: return $node.onedata
    else: assert(false)
  of 1:
    when T is string: return $node.twodata
    else:
      var a: int
      for n in 0 ..< node.kind:
        a.inc n

echo get[seq[int]](Node(kind: 1, twodata: true))

It prints @[], the default value of a seq[int].
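
The same thing in miniature: every proc with a return type has an implicit, zero-initialized result, and any path that falls off the end returns it. A sketch with two hypothetical helpers, one that silently returns the default and one that makes "nothing found" visible with Option:

```nim
import std/options

proc firstEven(xs: openArray[int]): int =
  ## Falls off the end when nothing matches, so the implicit `result`
  ## (zero for an int) is what the caller gets.
  for x in xs:
    if x mod 2 == 0: return x

proc firstEvenOpt(xs: openArray[int]): Option[int] =
  ## Same search, but a miss is distinguishable from a real 0.
  for x in xs:
    if x mod 2 == 0: return some(x)
  return none(int)

doAssert firstEven([1, 3, 5]) == 0      # no even number, yet 0 comes back
doAssert firstEven([0, 1]) == 0         # indistinguishable from a real hit
doAssert firstEvenOpt([1, 3, 5]).isNone
doAssert firstEvenOpt([0, 1]) == some(0)
```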

9 Constant style

Why can't the following program compile?

import std/[net, strformat]

const PORT = 4444

var
  server, client: Socket
  address: string

server = newSocket()
server.setSockOpt(OptReusePort, true)
server.bindAddr(Port(PORT))
server.listen

while true:
  server.accept(client)
  let (address, port) = client.getPeerAddr
  echo &"Client connected from: {address}:{port}"
  client.send "Hello, world!\n"
  client.close

If you didn't see it, does the error help?

/path/to/style1.nim(11, 21) Error: attempting to call routine: 'Port'
  found 'PORT' of kind 'const'
  found 'nativesockets.Port [declared in /home/jfondren/.choosenim/toolchains/nim-1.4.0/lib/pure/nativesockets.nim(52, 3)]' of kind 'type'

The problem is that the constant PORT conflicts with the type Port: Nim identifiers are case-insensitive (and ignore underscores) after the first character, so PORT and Port are the same name. The convention of putting constants in all caps doesn't really suit Nim; all it has done here is give the reader the illusion that the code has no name conflict.
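
A minimal sketch of the identifier rule, using a made-up constant maxRetries: case and underscores are ignored everywhere except for the case of the first character.

```nim
const maxRetries = 3

# All of these spell the same identifier as maxRetries.
doAssert max_retries == 3
doAssert maxretries == 3
doAssert mAXRETRIES == 3

# The first character is compared exactly, so this is a different name.
doAssert not declared(MaxRetries)
```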

What other options are there?

9.1 fake namespacing in the style of C function names

const cfgPort = 4444

This seems to be the route followed in Nim internals and in the stdlib. For File I/O there are fmRead, fmWrite. For networking there's that OptReusePort option.

9.2 awkwardly renaming things when you notice a conflict

const PORT_NO = 4444

This was my first impulse, before I wondered what the point was again of putting this in all caps.

9.3 real namespacing with pure enums?

type
  Config {.pure.} = enum
    Port = 4444
...
server.bindAddr(Port(Config.Port))

This would work for the exact code above, but just from reading the manual on enums it should be clear how limiting and annoying this would be in practice. You'd have to reorder your definitions if changing one's value changed its position in the enum, and you're limited to integer values. Also, even 'pure' enum names remain accessible without the Config. prefix whenever they don't conflict with another identifier:

type
  Example {.pure.} = enum
    Apple = 1
    Orange = 2

let Apple = 10

echo Apple   # output: 10
echo Orange  # not an error. output: Orange

9.4 real namespacing with a constant object

type
  configtype = object
    port: int
    greeting: string

const Config = configtype(
  greeting: "Hello, world!\n",
  port: 4444,
)

This occurred to me later, but it seems like a completely satisfactory solution, and one that lends itself to other uses of the object type.
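
For completeness, a sketch of how the server code would read with this object. It repeats the definitions so it can run on its own, and imports std/nativesockets only to have the Port type at hand without opening a socket:

```nim
import std/nativesockets   # only for the Port type

type
  configtype = object
    port: int
    greeting: string

const Config = configtype(
  greeting: "Hello, world!\n",
  port: 4444,
)

# `Config.port` is a field access and `Port(...)` is the type conversion,
# so the two names no longer collide.
let listenPort = Port(Config.port)
doAssert listenPort.int == 4444
doAssert Config.greeting == "Hello, world!\n"
```

In the original program the two uses would be server.bindAddr(Port(Config.port)) and client.send Config.greeting.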

'Fake namespacing' still seems like the best option.

10 Why doesn't this regex match?

What is the output of this program?

import re

if "What the heck" =~ re"the":
  echo "Found it"
else:
  echo "Didn't find it"

Here's how some other languages answer this question:

$ echo "What the heck" | awk '/the/{print "Found it"}'
Found it
$ ruby -le 'puts "Found it" if "What the heck" =~ /the/' 2>/dev/null
Found it
$ perl -le 'print "Found it" if "What the heck" =~ /the/'
Found it
$ if [[ "What the heck" =~ "the" ]]; then echo Found it; fi
Found it

iex(1)> "What the heck" =~ ~r/the/  # Elixir
true

If you browse https://www.rosettacode.org/wiki/Regular_expressions you can find a few more languages whose examples clearly behave the same way.

Nim's output is:

Didn't find it

Nim effectively has a ^ start-of-string anchor at the beginning of any regex used with this syntax. You'll probably want code more like:

import re

if "What the heck".contains re"the":
  echo "Found it"
else:
  echo "Didn't find it"

11 Indentation: this ain't Python

What does the following Python output at runtime?

greeting = "Hello"
    .upper()
    .lower()
print(greeting)

It outputs an error:

  File "dotty.py", line 2
    .upper()
    ^
IndentationError: unexpected indent

To continue a complete-looking line onto another line, Python needs the line to look incomplete again, with \ to-be-continued markers.

In Nim though, the equivalent code runs without error:

import strutils

let greeting = "Hello"
  .toUpper
  .toLower
echo greeting

12 You can't import self

Consider this Rust, slightly adapted from its reference manual:

use std::option::Option::{Some, None};
use std::collections::hash_map::{self, HashMap, my_fake_submodule::*};

There are two shorthands deployed here: multiple imports after a path, and importing the parent of the multiple imports. Members of a module (Some) are also imported with the same syntax that would import a whole module. In full the above does:

use std::option::Option::Some;
use std::option::Option::None;
use std::collections::hash_map;
use std::collections::hash_map::HashMap;
use std::collections::hash_map::my_fake_submodule::*;

The nearest possible Nim equivalent to the last example is:

from std/option/Option import Some
from std/option/Option import None
from std/collections/hash_map import nil
from std/collections/hash_map import HashMap
import std/collections/hash_map/my_fake_submodule

This exhibits the Python from M import E selective-import syntax, to get Some and None. And just like in Python it could be abbreviated to import both of those at the same time:

from std/option/Option import Some, None

Nim's from M import nil inhibits importing all exports from a module, the default; so the third line would let the importing code refer to hash_map.whatever but not resolve a bare whatever to something in the hash_map module. The third line is implied by the fourth line of course: importing "nothing and then only one thing" is the same as "importing only the one thing".

Finally Nim's import M imports everything from the module (that is exported by the module).

The nearest possible Nim equivalent to the first Rust example then is:

from std/option/Option import Some, None
from std/collections/hash_map import HashMap
import std/collections/hash_map/my_fake_submodule

The selective imports really foul up this example, and they're not usually appropriate for Nim because Nim lacks OOP's classes or FP's typeclasses as a way to implicitly add a bunch of imports. If you from vectormath import vec3 you get that one constructor but you don't get all the operator overloads that let you do anything with the object.
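
Since the module paths in this section are fake, here is a sketch of the same import forms against real standard library modules. The choice of modules is mine, picked only because they make the effects easy to check:

```nim
import std/[strutils, sequtils]          # several modules after one path
from std/math import nil                 # nothing unqualified; math.sqrt only
from std/rationals import initRational   # one constructor, no operators

let half = initRational(1, 2)

doAssert "a,b".split(',').mapIt(it.toUpperAscii) == @["A", "B"]
doAssert math.sqrt(4.0) == 2.0
doAssert not compiles(sqrt(4.0))         # the bare name was not imported
doAssert not compiles(half + half)       # `+` for Rational was not imported
doAssert rationals.`+`(half, half).num == 1   # still reachable when qualified
```

The fourth assertion is the vec3 problem in miniature: the selective import hands you a value you can construct but not do arithmetic on.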

Nim also doesn't have Python's need for restrictive imports as it imports from a module only what is exported rather than everything, and as name conflicts are much less of a bother in Nim.

So for Nim this is slightly more idiomatic:

import std/option/Option
import std/collections/hash_map
import std/collections/hash_map/my_fake_submodule

Which would work given these paths:

std/option/Option.nim
std/collections/hash_map.nim
std/collections/hash_map/my_fake_submodule.nim

(These paths still aren't great in two ways: std/foo usually imports from Nim's standard library rather than local paths, and capitalized modules can conflict with capitalized types. But this does work and it resembles the Rust example.)

Can the 'more idiomatic' version be as short as the original Rust? Nim doesn't have self, so this doesn't work:

import std/option/Option
import std/collections/hash_map/[self, my_fake_submodule]

But by splitting one level up, you can name the module and its descendants together:

import std/option/Option
import std/collections/[hash_map, hash_map/my_fake_submodule]

You can also slightly neaten this with indentation:

import
  std/option/Option,  # <-- required comma
  std/collections/[hash_map, hash_map/my_fake_submodule]

Note that this syntax doesn't nest, so this (thankfully) doesn't work:

import std/[option/Option, collections/[hash_map, hash_map/my_fake_submodule]]

I'll just have to go on living somehow without duplicating those sprawling multi-leveled imports that I see sometimes in Rust.

The original code that I wanted to write, that prompted these notes, was

import maxminddb/[self, node]

To import the general point-of-contact maxminddb module, and also one of its less-useful internal modules. This is neatly written:

import maxminddb, maxminddb/node

12.1 Cursed alternative

From IRC:

import
  std/option/Option,
  std/collections/hash_map/["../hash_map", my_fake_submodule]

13 Documentation complaints

https://nim-lang.org/docs/with.html isn't present in 'Standard library' listing.

https://nim-lang.github.io/Nim/testament.html isn't linked from https://nim-lang.org/docs/unittest.html

The previous 'regex match' question isn't clearly answered by https://nim-lang.org/docs/re.html

https://github.com/nim-lang/Nim/issues/2042 valgrind-apparent leak behavior isn't documented with the FFI.

14 Good documents

15 Versus Rust

15.1 I was ashamed of my code

I wrote a tool in Rust to search hashes on a server for very commonly used passwords. This tool was fast, it got the job done, and it was satisfying to write. When discussing it with a coworker who is technically knowledgeable and has written a bit of code in 'sysadmin' languages like Perl, Python, and Ruby, I went to copy and paste some of the Rust code, to show it off … and I hesitated. It was only at this moment, months into using Rust, that I became aware of just how much of my code's visual space was taken up by matters unrelated to the business logic: to the parts of the code that I wanted to show off, or indeed to the behavior of the code that I might want someone to review. So much of Rust is hash.chars().next().unwrap(), CString::new(hash.into_bytes()).unwrap(), key.clone().into_string().unwrap(), which in many other languages would translate, respectively, to hash[0], hash, and key.

I decided not to show that code off, and after that day I wrote very little new Rust. I went looking for a language whose code I wouldn't mind other people seeing.

That code's at chrestomathy/pwcheck.

15.2 Noisy code is unsafe code

Consider the following:

testarea.add artifact
testarea.add exposed_hazardous_materials
doAssert not testarea.anyIt(it.isHuman)

Without context you don't quite know what's going on, but if you're reviewing this and want to know if humans can possibly coincide with exposed hazardous materials, you can plainly see that the check comes after those materials are added, and not before. Would this be easier to see if the code were twice as long and filled with unrelated concerns like whether the hazardous materials have a UTF-8 name, whether the artifact had to be copied, whether the isHuman check consumed the testarea contents iterator? No, at best, all of this noise would just mean that the reviewer takes a little bit longer to confirm what's going on, as the new code has to be discounted. At less than best, the reviewer is confused by the unrelated code and erroneously determines that an unsafe situation is safe because an .unwrap() would panic first–which actually only panics on bad UTF-8.

Domain experts being able to see what's going on, and being able to object to unsafe practices like potentially exposing humans to hazardous materials, is a pretty big deal. It doesn't matter how much memory safety your Crypto Wallet has if it's shipped without anyone noticing that the default password is always accepted, even after a non-default one is set.

15.3 Rust is too hard

15.3.1 Intellectual satisfaction

Suppose you have a struct that contains String values only because you punted initially on working out the lifetime of the struct, and just cloned any strings that would go into it. Later, you come back around and decide to purge these unnecessary copies from your program. You change String to &'a str, you add a <'a> to the struct, and after a little while the job is done. You feel a sense of intellectual satisfaction, and your program is slightly more efficient.

That "sense of intellectual satisfaction"? That's your brain telling you that Rust is too hard. A trivial refactor shouldn't make you feel that way. "Too hard" doesn't mean "beyond one's capabilities": most people would balk at a ten pound laptop as being too heavy, but that's only a little bit heavier than a gallon of milk.

15.3.2 Camel languages

I once joked at work, "ha ha just serious"ly, that we could replace Perl with OCaml, and it'd be fine since they're both "camel languages". I got a look.

Consider SKS Keyserver Network Under Attack:

There are powerful technical and social factors inhibiting further keyserver development.

The first technical and social factor is: this shit is written in OCaml. The second technical and social factor is: literally nobody in the community feels qualified to change something that's written in OCaml.

15.3.3 (Don't) write one that other people will throw away

Andrew Kelley (of Zig) has a line that goes something like "if your code is slower than C code would be, someone will rewrite your code into C." And wouldn't you like to write code that people will use, rather than throw away and rewrite?

I think this is pretty defensible for library code, which Andrew seemed to be focused on (he also objected to GC as a barrier to code reuse). If you see a bunch of different projects in a bunch of different languages using the same JSON parsing library, it's not going to be a pure Python library.

A similar concern that I have is "if my code is too hard for coworkers to maintain, someone will throw it away after I leave and rewrite it". I want to write code that people will keep using, and not just throw away. I don't want people to check out my code to make some minor change, see the language the code is written in, and then curse the people who let me get away with it. So I write in Nim instead of Rust.

The relative simplicity of the language is a feature. I can imagine introducing Nim to embedded/microcontroller firmware developers that only know/use C99 now. I cannot imagine introducing Rust - while a great language, it is C++ level complexity - and that is not always warranted.

link

16 What did I just read?

Nothing much. These are just some notes about Nim as they occur to me.

Author: J Fondren

Created: 2020-12-31 Thu 11:46
