On Nim
Table of Contents
- 1. Static deployment
- 2. Scripting
- 3. Nim on CentOS 6
- 4. Running examples with a clean directory
- 5. Inline testing
- 6. That isn't tuple initialization
- 7. Are those function arguments or a tuple?
- 8. Default returns
- 9. Constant style
- 10. Why doesn't this regex match?
- 11. Indentation: this ain't Python
- 12. You can't import self
- 13. Documentation complaints
- 14. Good documents
- 15. Versus Rust
- 16. What did I just read?
1 Static deployment
Guide: https://scripter.co/nim-deploying-static-binaries/
- install musl
- POC:
nim c --gcc.exe:musl-gcc --gcc.linkerexe:musl-gcc --passL:--static nimprogram
The guide then uses config.nims to add musl builds and handle static OpenSSL.
2 Scripting
In certain environments like system administration with numerous servers and teams of admins, scripts have incredible practical advantages over binaries.
Compiled languages in turn have a lot to offer to scripts. For example, they can expose kernel and libc APIs that scripting languages are too willfully generic to incorporate. Did you know you can swap two files in a single syscall with renameat2? Did you know you can open a directory that definitely isn't a symlink with O_DIRECTORY|O_NOFOLLOW and then use the resulting fd to anchor further operations on this directory, even if it moves for some reason while you're working on it?
Compiled languages have a lot to offer here, and mainly need some form of compilation cache to work as scripts, but the tooling is rarely all the way there. D has 'rdmd' for #! lines, but it's not aware of dub, so using third-party libraries requires that you use 'dub' instead. That's a bit slower, requires some extra configuration in comments in the file, and can make potentially unwanted network connections of its own.
Rust has a series of abandoned scripting options, starting with cargo-script and cargo-eval. They
- easily perform unwanted network communication
- recompile scripts unnecessarily when dependencies update, which can result in unexpected slow runs of the script
- are not that smart about caching: you can't have a public_html/siteone/index.rs and a public_html/sitetwo/index.rs that both use cargo-eval, as the 'index' names will collide and both scripts will run whichever of the two binaries was compiled last.
It's quite a depressing record.
How does Nim fare? First, get Nim fully installed with its tools; choosenim from the official download page should do this for you.
Then, grab nimcr for scripting, and vec just to show that libraries can be used naturally.
    # nimble install nimcr
    # nimble install vec
With that in place, let's try
2.1 a 'hello world'
    #! /usr/bin/env nimcr
    echo "Hello, world!"
Which runs just as you'd expect, and (after an initial compilation) is about 11 times faster than similar Python3.
Yes, just printing something out "is I/O bound" and therefore CPU performance "doesn't matter", but scripting languages have so much overhead just finding their libraries and examining the system they're running on that it becomes noticeable.
2.2 A script using a nimble-provided library
    #! /usr/bin/env nimcr
    import vec

    let q = vec2f(1.0, 2.0)
    echo q.normal
Outputs:
(x: -0.8944271802902222, y: 0.4472135901451111)
At the initial cost of having to provide dependencies as a separate step with nimble, you lose the much more severe and harder to predict cost of network communication and dependency upgrades that the D and Rust solutions have.
2.3 A CGI script
    #! /home/example/bin/nimcr
    import vec, strformat

    # a CGI response is a header block, a blank line, then the body
    echo "Content-type: text/plain"
    echo ""
    let q = vec2f(1.0, 2.0)
    echo q.normal

    # an .htaccess
    AddHandler cgi-script .nim

    #! /bin/bash
    . /home/example/.bashrc
    exec ~/.nimble/bin/nimcr "$@"
This is a lot messier than a normal script would be, and it's entirely due to hosting and having to restore a bunch of shell environment settings on this server before running nimcr. This all does work however (at the time of testing):
    # curl nim-lang.moe/hello.nim
    (x: -0.8944271802902222, y: 0.4472135901451111)
3 Nim on CentOS 6
CentOS 6 was first released in mid-2011, and its end of life is November 2020. So it's not that surprising that Nim doesn't work out of the box on it.
A convenient choosenim script can manage multiple versions of Nim and upgrade it easily after a new release.
    # curl https://nim-lang.org/choosenim/init.sh -sSf | sh
    choosenim-init: Downloading choosenim-0.6.0_linux_amd64
    /tmp/choosenim-0.6.0_linux_amd64: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/choosenim-0.6.0_linux_amd64)
    /tmp/choosenim-0.6.0_linux_amd64: /lib64/libc.so.6: version `GLIBC_2.16' not found (required by /tmp/choosenim-0.6.0_linux_amd64)
    /tmp/choosenim-0.6.0_linux_amd64: /lib64/libc.so.6: version `GLIBC_2.15' not found (required by /tmp/choosenim-0.6.0_linux_amd64)
But CentOS 6's glibc is too old, so we can't use it. Let's just get the latest source archive from nim-lang.org:
    # wget https://nim-lang.org/download/nim-1.4.0.tar.xz
    # tar xvf nim-1.4.0.tar.xz
    # cd nim-1.4.0
    # ./build.sh
    # OS: linux
    # CPU: amd64
    gcc -w -fmax-errors=3 -O3 -fno-strict-aliasing -fno-ident -Ic_code -c c_code/1_2/stdlib_assertions.nim.c -o c_code/1_2/stdlib_assertions.nim.o
    In file included from c_code/1_2/stdlib_assertions.nim.c:8:
    c_code/nimbase.h:542: error: redefinition of typedef ‘NIM_STATIC_ASSERT_AUX’
    c_code/nimbase.h:323: note: previous declaration of ‘NIM_STATIC_ASSERT_AUX’ was here
    #
But CentOS 6's gcc is too old, so we can't compile from the pregenerated C sources. Right, so what's the latest CentOS? 8? Let's use its version of gcc.
    root# yum install centos-release-scl
    root# yum install devtoolset-8
And back to Nim source:
    # scl enable devtoolset-8 bash
    # ./build.sh
    # bin/nim c koch
    # ./koch boot -d:release
    # ./koch tools
    # echo 'PATH=$PATH:'$PWD/bin >> ~/.bashrc
Great! But, erm, Nim is going to continuously need that newer gcc in order to compile anything, even after Nim itself is built.
As a quick reload of this session will demonstrate:
    # nim r hello
    ...
    Error: execution of an external program failed: 'gcc -c -w -fmax-errors=3 ...'
One option is to always enter scl enable devtoolset-8 bash before compiling with Nim, and that'll work just fine actually. Since I want to use Nim with scripts, I'm going to go a little farther and just source the scl 'enable' script in my .bashrc:
# echo '. /opt/rh/devtoolset-8/enable' >> ~/.bashrc
So everything works now?
    # nim r --hints:off hello
    Hello, world!
That works.
    # nimble refresh
    could not import: X509_check_host
Goddamnit … I spent quite a while looking for a Nim-only solution to this (Nim invites this by being distractingly easy to dig into. Hey, it has a -d:nimDisableCertificateValidation, hey, here's the list of OpenSSL versions it wants, hey...), but a newer OpenSSL is just required.
    # wget https://ftp.openssl.org/source/old/1.1.1/openssl-1.1.1.tar.gz
    # tar xvf openssl-1.1.1.tar.gz
    # cd openssl-1.1.1
    # ./config --prefix=/usr
    # make
    root# make install
That fixes it.
4 Running examples with a clean directory
Consider the following directory listing:
./ ../ copy.nim ti1.nim
If you run nim c -r ti1 successfully, what will the new directory look like?
It will contain the new binary:
./ ../ copy.nim ti1* ti1.nim
If you have a lot of little programs because you're learning the language, these unnecessary binaries can get annoying. Fortunately there's an alternate mode that doesn't clutter the current directory:
    $ ls
    ./ ../ copy.nim ti1.nim
    $ nim r --hints:off --warnings:off ti1
    (1, 2)
    ((1, 2), (1, 2))
    $ ls
    ./ ../ copy.nim ti1.nim
    $
That's nim r file for short; the extra flags there are just for cleaner output.
4.1 Caveat
(This is fixed in 1.5.x devel)
nim r caching is by the basename of the file (nim c -r and nimcr are more robust than this), so you can occasionally run into some frustrating issues if you reuse names.
Consider:
    $ cat 1/hello.nim
    echo "1"
    $ cat 2/hello.nim
    echo "2"
    $ ( cd 1; nim r --hints:off hello )
    1
    $ ( cd 2; nim r --hints:off hello )
    1
5 Inline testing
Nim has very good support for dedicated files of tests, but how about inline testing?
The D language has a unit testing feature that keeps tests close to the source they're testing, and special compilation options that make it easy to run these tests without performing other unwanted activities. Because Nim doesn't have this feature, you'll tend to find (even in sources accompanying the Nim compiler itself) ad-hoc tests like:
    func double*(n: float): float = n * 2

    when isMainModule:
      doAssert(2.double == 4)
This is a file that can be import'ed for its double function, or can be run as a program in its own right for that assertion. This kind of testing obviously falls apart as soon as you want to test a file that's already a program in its own right, with other actions that will already be happening when isMainModule.
But Nim can still get very close to D's feature, with tests that are only compiled in when the program is built with -d:unittests:
    func double*(n: float): float = n * 2

    when defined(unittests):
      import unittest
      suite "doubling":
        test "double 2":
          check(2.double == 4)
        test "double -2":
          check(-2.double == (-4))

    when isMainModule and not defined(unittests):
      echo "but then this ^^^ is required"
6 That isn't tuple initialization
What is the output of this program?
    let (a, b) = (1, 2)
    echo (a, b)
    let x, y = (1, 2)
    echo (x, y)
It is:
    (1, 2)
    ((1, 2), (1, 2))
The second let does not deconstruct (1, 2), but rather is a shorthand for assigning it to both variables. This is similar to the more familiar var x, y: int syntax that gives two variables the same type.
Nim now has an EachIdentIsTuple warning for the above case. There's no such warning for the following program. In it, are the printed addresses the same?
    type Odd = object
      id: int

    proc `=copy`(dest: var Odd; source: Odd) {.error.}

    let x, y = Odd(id: 2)
    echo x
    echo y
    echo cast[int](x.unsafeAddr)
    echo cast[int](y.unsafeAddr)
They are not. Sample output:
    (id: 2)
    (id: 2)
    4347016
    4347024
This syntax then is not constructing a single object for both variables, but is just a shorthand for the following:
    let x = Odd(id: 2)
    let y = Odd(id: 2)
7 Are those function arguments or a tuple?
What's the output of this program?
    echo(1, 2)
    echo (1, 2)
    echo 1, 2
It is:
    12
    (1, 2)
    12
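The difference is the space: echo(1, 2) is an ordinary call with two arguments, while echo (1, 2) uses command syntax and passes one tuple. A minimal sketch of the same thing, using a hypothetical show proc (not part of the original notes) that takes its arguments the way echo does and returns them stringified:

```nim
# `show` is a hypothetical helper for illustration only. Like echo, it
# accepts any number of arguments and stringifies each one with `$`.
proc show(args: varargs[string, `$`]): seq[string] =
  @args

let call = show(1, 2)     # ordinary call: two arguments
let spaced = show (1, 2)  # command syntax: a single tuple argument

doAssert call == @["1", "2"]
doAssert spaced == @["(1, 2)"]
```

The third form, echo 1, 2, is command syntax with two arguments, which Nim only accepts where the call is a statement.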
8 Default returns
What does the following program print at runtime?
    type
      Node = ref object
        case kind: range[0..1]
        of 0: onedata: int
        of 1: twodata: bool

    proc get[T](node: Node): T =
      case node.kind
      of 0:
        when T is string:
          return $node.onedata
        else:
          assert(false)
      of 1:
        when T is string:
          return $node.twodata
        else:
          var a: int
          for n in 0 ..< node.kind:
            a.inc n

    echo get[seq[int]](Node(kind: 1, twodata: true))
It prints @[], the default value of a seq[int].
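The mechanism behind this is Nim's implicit result variable: a proc that falls off the end without returning hands back result, which starts out as the type's default value. A smaller sketch of the same trap, with a hypothetical firstEven proc that is not from the program above:

```nim
# Every proc with a return type has an implicit `result` variable. It
# starts at the type's default value and is what the proc returns if no
# branch returns explicitly.
proc firstEven(xs: seq[int]): int =
  for x in xs:
    if x mod 2 == 0:
      return x
  # falling off the end returns `result`, which is still 0

doAssert firstEven(@[1, 4, 5]) == 4
doAssert firstEven(@[1, 3, 5]) == 0  # indistinguishable from a real 0
```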
9 Constant style
Why can't the following program compile?
    import std/[net, strformat]

    const PORT = 4444

    var
      server, client: Socket
      address: string

    server = newSocket()
    server.setSockOpt(OptReusePort, true)
    server.bindAddr(Port(PORT))
    server.listen
    while true:
      server.accept(client)
      let (address, port) = client.getPeerAddr
      echo &"Client connected from: {address}:{port}"
      client.send "Hello, world!\n"
      client.close
If you didn't see it, does the error help?
    /path/to/style1.nim(11, 21) Error: attempting to call routine: 'Port'
      found 'PORT' of kind 'const'
      found 'nativesockets.Port [declared in /home/jfondren/.choosenim/toolchains/nim-1.4.0/lib/pure/nativesockets.nim(52, 3)]' of kind 'type'
The problem is that the constant PORT conflicts with the type Port (Port(PORT) is a type conversion, not a function call), because Nim is style insensitive past the first character of an identifier. So the convention of putting constants in all caps doesn't really suit Nim: all it has done here is give the reader the illusion that the code has no name conflict.
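To make the rule concrete, here is a small sketch of how identifier equality behaves. The names maxRetries, Limit and limit are made up for the example:

```nim
# Two identifiers are the same if their first characters match exactly
# and the rest match after ignoring case and underscores.
let maxRetries = 3
doAssert max_retries == 3  # same identifier as maxRetries
doAssert maxRETRIES == 3   # also the same identifier

# The first character stays case sensitive, so these are distinct names.
let Limit = 1
let limit = 2
doAssert Limit != limit
```

By the same rule, PORT and Port differ only after the first character, so they are one name.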
What other options are there?
9.1 fake namespacing in the style of C function names
const cfgPort = 4444
This seems to be the route followed in Nim internals and in the stdlib. For File I/O there are fmRead, fmWrite. For networking there's that OptReusePort option.
9.2 awkwardly renaming things when you notice a conflict
const PORT_NO = 4444
This was my first impulse, before I wondered what the point was again of putting this in all caps.
9.3 real namespacing with pure enums?
    type Config {.pure.} = enum
      Port = 4444
    ...
    server.bindAddr(Port(Config.Port))
This would work for the exact code above, but just by reading through the manual on Enums it should be clear how limiting and annoying this would be in practice. You'd have to reorder your definitions if changing one's value caused it to change its order in the enum, and you're limited to integer values. Also, when even 'pure' enum names don't conflict with another identifier, they'll still be accessible without the Config. prefix:
    type Example {.pure.} = enum
      Apple = 1
      Orange = 2

    let Apple = 10
    echo Apple   # output: 10
    echo Orange  # not an error. output: Orange
9.4 real namespacing with a constant object
    type configtype = object
      port: int
      greeting: string

    const Config = configtype(
      greeting: "Hello, world!\n",
      port: 4444,
    )
This occurred to me later, but it seems like a completely satisfactory solution, and one that lends itself to other uses of the object type.
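A sketch of how the server code would then read its settings. ConfigType is a renamed stand-in for the type above, and std/nativesockets is imported here only to get the Port type without opening a socket:

```nim
import std/nativesockets  # only for the Port type in this sketch

type ConfigType = object
  port: int
  greeting: string

const Config = ConfigType(
  greeting: "Hello, world!\n",
  port: 4444,
)

# Config.port is a field access, so it no longer collides with the
# Port type when converting.
doAssert Port(Config.port) == Port(4444)
doAssert Config.greeting.len == 14
```

In the original program the bind line would become server.bindAddr(Port(Config.port)).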
'Fake namespacing' still seems like the best option.
10 Why doesn't this regex match?
What is the output of this program?
    import re

    if "What the heck" =~ re"the":
      echo "Found it"
    else:
      echo "Didn't find it"
Here's how some other languages answer this question:
    $ echo "What the heck" | awk '/the/{print "Found it"}'
    Found it
    $ ruby -le 'puts "Found it" if "What the heck" =~ /the/' 2>/dev/null
    Found it
    $ perl -le 'print "Found it" if "What the heck" =~ /the/'
    Found it
    $ if [[ "What the heck" =~ "the" ]]; then echo Found it; fi
    Found it
    iex(1)> "What the heck" =~ ~r/the/  # Elixir
    true
If you look through https://www.rosettacode.org/wiki/Regular_expressions you can find a few more languages whose examples clearly behave the same way.
Nim's output is:
Didn't find it
Nim effectively has a ^ start-of-string anchor at the beginning of any regex used with this syntax. You'll probably want code more like:
    import re

    if "What the heck".contains re"the":
      echo "Found it"
    else:
      echo "Didn't find it"
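The =~ template is defined in terms of match, which only tries the pattern at the start of the string. A short sketch contrasting it with the unanchored procs in the same module, assuming the PCRE library that re depends on is available at runtime:

```nim
import std/re

let s = "What the heck"

# match (and so =~) only tries the pattern at the start of the string.
doAssert not s.match(re"the")
doAssert s.match(re"What")     # a prefix match is enough
doAssert s.match(re".*the")    # a leading .* lets it reach "the"

# contains and find search the whole string.
doAssert s.contains(re"the")
doAssert s.find(re"the") == 5  # index where the match starts
```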
11 Indentation: this ain't Python
What does the following Python output at runtime?
    greeting = "Hello"
        .upper()
        .lower()
    print(greeting)
It outputs an error:
      File "dotty.py", line 2
        .upper()
        ^
    IndentationError: unexpected indent
To continue a complete-looking line onto another line, Python needs the line to look incomplete again, with \ to-be-continued markers (or enclosing parentheses).
In Nim though, the equivalent code runs without error:
    import strutils

    let greeting = "Hello"
      .toUpperAscii
      .toLowerAscii
    echo greeting
12 You can't import self
Consider this Rust, slightly adapted from its reference manual:
    use std::option::Option::{Some, None};
    use std::collections::hash_map::{self, HashMap, my_fake_submodule::*};
There are two shorthands deployed here: multiple imports after a path, and importing the parent of the multiple imports. Members of a module (Some) are also imported with the same syntax that would import a whole module. In full the above does:
    use std::option::Option::Some;
    use std::option::Option::None;
    use std::collections::hash_map;
    use std::collections::hash_map::HashMap;
    use std::collections::hash_map::my_fake_submodule::*;
The nearest possible Nim equivalent to the last example is:
    from std/option/Option import Some
    from std/option/Option import None
    from std/collections/hash_map import nil
    from std/collections/hash_map import HashMap
    import std/collections/hash_map/my_fake_submodule
This exhibits the Python from M import E selective-import syntax, to get Some and None. And just like in Python it could be abbreviated to import both of those at the same time:
from std/option/Option import Some, None
Nim's from M import nil inhibits the default behavior of importing all exports from a module, so the third line would let the importing code refer to hash_map.whatever but not resolve a bare whatever to something in the hash_map module. The third line is implied by the fourth line of course: importing "nothing and then only one thing" is the same as "importing only the one thing".
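The hash_map module here is fictional, so as a sketch of the same behavior against a real module, here is from M import nil applied to std/strutils:

```nim
from std/strutils import nil

# Qualified access works.
doAssert strutils.toUpperAscii("abc") == "ABC"

# The bare name does not resolve, and neither does method-call syntax.
doAssert not compiles(toUpperAscii("abc"))
doAssert not compiles("abc".toUpperAscii)
```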
Finally Nim's import M imports everything from the module (that is exported by the module).
The nearest possible Nim equivalent to the first Rust example then is:
    from std/option/Option import Some, None
    from std/collections/hash_map import HashMap
    import std/collections/hash_map/my_fake_submodule
The selective imports really foul up this example, and they're not usually appropriate for Nim because Nim lacks OOP's classes or FP's typeclasses as a way to implicitly add a bunch of imports. If you from vectormath import vec3 you get that one constructor but you don't get all the operator overloads that let you do anything with the object.
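The same effect can be sketched with a standard library module in place of vectormath. Importing only the constructor from std/tables gives you a table you can't conveniently use:

```nim
from std/tables import initTable  # the constructor and nothing else

var t = initTable[string, int]()

# The procs that operate on the table were not imported.
doAssert not compiles(t.len)
doAssert not compiles(t.hasKey("a"))

# Full qualification still reaches them, at a cost in readability.
tables.`[]=`(t, "a", 1)
doAssert tables.len(t) == 1
doAssert tables.hasKey(t, "a")
```

A plain import std/tables makes all of these available with the usual t["a"] = 1 and t.len spellings.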
Nim also doesn't have Python's need for restrictive imports as it imports from a module only what is exported rather than everything, and as name conflicts are much less of a bother in Nim.
So for Nim this is slightly more idiomatic:
    import std/option/Option
    import std/collections/hash_map
    import std/collections/hash_map/my_fake_submodule
Which would work given these paths:
    std/option/Option.nim
    std/collections/hash_map.nim
    std/collections/hash_map/my_fake_submodule.nim
(These paths still aren't great in two ways: std/foo usually imports from Nim's standard library rather than local paths, and capitalized modules can conflict with capitalized types. But this does work and it resembles the Rust example.)
Can the 'more idiomatic' version be as short as the original Rust? Nim doesn't have self, so this doesn't work:
    import std/option/Option
    import std/collections/hash_map/[self, my_fake_submodule]
But by splitting one level up, you can name the module and its descendants together:
    import std/option/Option
    import std/collections/[hash_map, hash_map/my_fake_submodule]
You can also slightly neaten this with indentation:
    import
      std/option/Option,  # <-- required comma
      std/collections/[hash_map, hash_map/my_fake_submodule]
Note that this syntax doesn't nest, so this (thankfully) doesn't work:
import std/[option/Option, collections/[hash_map, hash_map/my_fake_submodule]]
I'll just have to go on living somehow without duplicating those sprawling multi-leveled imports that I see sometimes in Rust.
The original code that I wanted to write, that prompted these notes, was
import maxminddb/[self, node]
To import the general point-of-contact maxminddb module, and also one of its less-useful internal modules. This is neatly written:
import maxminddb, maxminddb/node
12.1 Cursed alternative
From IRC:
    import
      std/option/Option,
      std/collections/hash_map/["../hash_map", my_fake_submodule]
13 Documentation complaints
https://nim-lang.org/docs/with.html isn't present in 'Standard library' listing.
https://nim-lang.github.io/Nim/testament.html isn't linked from https://nim-lang.org/docs/unittest.html
The previous 'regex match' question isn't clearly answered by https://nim-lang.org/docs/re.html
https://github.com/nim-lang/Nim/issues/2042 valgrind-apparent leak behavior isn't documented with the FFI.
14 Good documents
15 Versus Rust
15.1 I was ashamed of my code
I wrote a tool in Rust to search hashes on a server for very commonly used passwords. This tool was fast, it got the job done, and it was satisfying to write. When discussing it with a coworker who is technically knowledgeable and has written a bit of code in 'sysadmin' languages like Perl, Python, and Ruby, I went to copy and paste some of the Rust code, to show it off … and I hesitated. It was only at this moment, months into using Rust, that I became aware of just how much of my code's visual space was taken up by matters unrelated to the business logic: unrelated to the parts of the code that I wanted to show off, or indeed to the behavior of the code that I might want someone to review. So much of Rust is hash.chars().next().unwrap(), CString::new(hash.into_bytes()).unwrap(), key.clone().into_string().unwrap(), which in many other languages would translate, respectively, to hash[0], hash, and key.
I decided not to show that code off, and after that day I wrote very little new Rust. I went looking for a language whose code I wouldn't mind other people seeing.
That code's at chrestomathy/pwcheck.
15.2 Noisy code is unsafe code
Consider the following:
    testarea.add artifact
    testarea.add exposed_hazardous_materials
    doAssert not testarea.anyIt(it.isHuman)
Without context you don't quite know what's going on, but if you're reviewing this and want to know if humans can possibly coincide with exposed hazardous materials, you can plainly see that the check comes after those materials are added, and not before. Would this be easier to see if the code were twice as long and filled with unrelated concerns like whether the hazardous materials have a UTF-8 name, whether the artifact had to be copied, whether the isHuman check consumed the testarea contents iterator? No. At best, all of this noise would just mean that the reviewer takes a little bit longer to confirm what's going on, as the new code has to be discounted. At less than best, the reviewer is confused by the unrelated code and erroneously determines that an unsafe situation is safe because an .unwrap() would panic first, when it actually only panics on bad UTF-8.
Domain experts being able to see what's going on, and to be able to object to unsafe practices like potentially exposing humans to hazardous materials, is a pretty big deal. It doesn't matter how much memory safety your Crypto Wallet has if it's shipped without anyone noticing that the default password is always accepted, even after a non-default is set.
15.3 Rust is too hard
15.3.1 Intellectual satisfaction
Suppose you have a struct that contains String values only because you punted initially on working out the lifetime of the struct, and just cloned any strings that would go into it. Later, you come back around and decide to purge these unnecessary copies from your program. You change String to &'a str, you add a <'a> to the struct, and after a little while the job is done. You feel a sense of intellectual satisfaction, and your program is slightly more efficient.
That "sense of intellectual satisfaction"? That's your brain telling you that Rust is too hard. A trivial refactor shouldn't make you feel that way. "Too hard" doesn't mean "beyond one's capabilities": most people would balk at a ten pound laptop as being too heavy, but that's only a little bit heavier than a gallon of milk.
15.3.2 Camel languages
I once joked at work, "ha ha just serious"ly, that we could replace Perl with OCaml, and it'd be fine since they're both "camel languages". I got a look.
Consider SKS Keyserver Network Under Attack:
There are powerful technical and social factors inhibiting further keyserver development.
The first technical and social factor is: this shit is written in OCaml. The second technical and social factor is: literally nobody in the community feels qualified to change something that's written in OCaml.
15.3.3 (Don't) write one that other people will throw away
Andrew Kelley (of Zig) has a line that goes something like "if your code is slower than C code would be, someone will rewrite your code into C." And wouldn't you like to write code that people will use, rather than throw away and rewrite?
I think this is pretty defensible for library code, which Andrew seemed to be focused on (he also objected to GC as a barrier to code reuse). If you see a bunch of different projects in a bunch of different languages using the same JSON parsing library, it's not going to be a pure Python library.
A similar concern that I have is "if my code is too hard for coworkers to maintain, someone will throw it away after I leave and rewrite it". I want to write code that people will keep using, and not just throw away. I don't want people to check out my code to make some minor change, see the language the code is written in, and then curse the people who let me get away with it. So I write in Nim instead of Rust.
The relative simplicity of the language is a feature. I can imagine introducing Nim to embedded/microcontroller firmware developers that only know/use C99 now. I cannot imagine introducing Rust - while a great language, it is C++ level complexity - and that is not always warranted.
16 What did I just read?
Nothing much. These are just some notes about Nim as they occur to me.