Support for embedded is indeed getting into focus. I hope to release an article soon about how the '--gc:regions' switch works (which is misnamed; it's a way to do memory management, not a GC...), and in the longer run the destructor-based runtime will ensure that memory is freed sooner than a GC would, for memory-constrained devices.
Having said that, "embedded" is a wide field and sometimes one cannot afford a heap at all, whether garbage-collected or managed manually. But even then Nim can be used: you still get array bounds checking, Ada-inspired strong type checking, plus the metaprogramming goodies.
Great! My interest is in bare metal environments where you typically don't use a heap. I'm not familiar with the `--gc:regions` switch, but I think a good start could be a guide on how to implement simple non-garbage collected programs.
Looks like they concluded that having the compiler automatically choose an appropriate region, but silently leak the memory when that fails, doesn't work well? Kind of like GC by reference counting without a cycle collector.
Couldn't the compiler just reject code that would need to allocate in the global region, unless the programmer explicitly annotates it to make it global? It says the SML type system is too weak for this, but perhaps Nim's type system is sufficient?
I'm not really following you. 'discard' is an enforced, explicit statement that you are throwing away information/the result of a computation. That's as far from "nice if you are programming alone" as it can get.
Huh? What does its status as an explicit language construct have to do with the situations where it’s useful for design?
I’m saying that when you program alone, you know when to use discard on your otherwise value-returning function. Other people don’t, and the presence of a return type actually suggests the opposite: that you should invoke that proc for its return value.
> “That's against "nice if you are programming alone" as much as it can get.”
I don’t understand this claim. Nothing about the formal definition of a language is for or against being “nice if you program alone”; rather, it’s a matter of what patterns of usage it encourages or facilitates.
It’s like “C++ without exceptions”. The formal implementation is just some factoid of the language, but the usage that arises around discard is a bad anti-pattern in terms of communicating intended usage and whether / when to rely on side-effects.
I agree that 'discard' is usually a code smell, but how would implicitly ignoring the result be any better? It wouldn't be better at all, and that means Nim's discard feature is rather well designed as it improves the status quo.
Why is implicitly ignoring the result the option you’re comparing with? Instead, create language features that encourage you to separate side-effectful functions from value-returning ones.
Also many languages use a very standard convention of assigning underscore to parts of a result value to be ignored, and discard has no clear advantages over this in my mind.
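For comparison, here's a tiny Rust sketch of that underscore convention (the `write_log` function is made up purely for illustration):

```rust
// Hypothetical logging function whose return value we often don't care about.
fn write_log(msg: &str) -> usize {
    // Pretend this is "bytes written".
    msg.len()
}

fn main() {
    // The underscore binding is the conventional "I'm throwing this away
    // on purpose" marker, analogous to Nim's `discard`, but expressed at
    // the binding site rather than as a dedicated statement.
    let _ = write_log("hello");
}
```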
That's true and I wouldn't use the Option type for that either. There is a switch that warns about variables that are not initialized explicitly (including the 'result' variable) and an RFC to make this switch non-optional. https://github.com/nim-lang/Nim/issues/7917
Compiling to C vs using LLVM is a complex design tradeoff. For example, the Posix standard specifies a C interface. Quote: "The stat structure shall contain at least the following members..." This means you can wrap 'struct stat' in Nim once and be done with it (since the mapping to C is by generating C code), whereas for, e.g., Rust you need to wrap it once for every different OS (and potentially even for every OS version+CPU combination), since the offsets and sizes can differ. So yes, porting to other platforms really is easier when you have a C code generator.
There are other ways to address this issue. You can do what Swift does and call into libclang so that you can pull out the structure definitions, for instance.
Furthermore, if your frontend doesn't know about structure layout then you forgo some important systems language features. For example, you can't implement sizeof as a compile time constant.
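For instance, here's a minimal Rust sketch of what frontend-known layout buys you; the `Point` type is just an illustration:

```rust
use std::mem::size_of;

// Because the frontend knows the struct's layout, its size is a
// compile-time constant, usable in const contexts such as array lengths.
#[repr(C)]
struct Point {
    x: i32,
    y: i32,
}

const POINT_SIZE: usize = size_of::<Point>();

fn main() {
    // Array length fixed at compile time from the type's size.
    let buffer = [0u8; POINT_SIZE];
    println!("{} {}", POINT_SIZE, buffer.len());
}
```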
> There are other ways to address this issue. You can do what Swift does and call into libclang so that you can pull out the structure definitions, for instance.
Sure and you only need to call libclang for each different platform. Or is it on every platform? ;-)
It would be physically impossible for a call to that function to work with a single definition of the struct stat, because the set of fields (and their ordering) and struct size differ between platforms. So that libc wrapper must provide separate definitions of the struct for each platform.
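A heavily simplified Rust sketch of what that per-platform duplication looks like; the field lists here are hypothetical stand-ins, and the real definitions (e.g. in the `libc` crate) are far larger:

```rust
use std::mem::size_of;

// One Rust-side definition per platform, selected by cfg; a single
// definition cannot match every C ABI, because field types, ordering,
// and padding differ.
#[cfg(target_os = "linux")]
#[repr(C)]
struct Stat {
    st_dev: u64,
    st_ino: u64,
    // ... more fields, in Linux's order
}

#[cfg(target_os = "macos")]
#[repr(C)]
struct Stat {
    st_dev: i32, // different type *and* position than on Linux
    st_mode: u16,
    // ... more fields, in Darwin's order
}

fn main() {
    // The size the frontend computes depends on which definition
    // was selected for the current target.
    println!("{}", size_of::<Stat>());
}
```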
Ah, sorry! I was thinking about things like "the size of this type is different per-platform," not "every single platform has a different definition of this struct."
I still think that, given Cargo, the one-time cost makes it worth it, but after seeing my error, I think the point makes more sense. Thank you for being patient. :)
Good questions. I'm afraid the documentation is seriously out of date about the GC: It used to implement a variant of "trial deletion" so that "never scans the whole heap" used to refer to the fact that it doesn't use traditional mark&sweep, but only scans the subgraph(s) leading to the cycles. Of course you can always create a "subgraph" that spans the whole heap, so even for trial deletion it is a dubious claim.
Since version 0.14 (iirc), however, Nim uses a traditional mark&sweep backup GC to collect cycles. Reason: trial deletion is much slower in practice.
For all versions of the GC, the stack is scanned conservatively, as precise stack scanning is simply too expensive when C is your main target.
That has been my experience too: all the extra work and logic you need for detecting cycles and trial deletion (because the algorithm is complicated) is so expensive that regular mark&sweep beats it.
But to ask a pointed question, doesn't that mean Nim gets the worst of both worlds? You have both the overhead of updating reference counts and the (relatively long) garbage collection pauses. I guess if the programmer codes in such a way that no cyclic garbage is created it is not a problem because the gc will never be triggered. But how common is that in practice? Does the language make it easy to avoid cycles?
> But to ask a pointed question, doesn't that mean Nim gets the worst of both worlds? You have both the overhead of updating reference counts and the (relatively long) garbage collection pauses.
There's a patch for the collector to make the M&S incremental, to avoid pauses for this step too. Of course, whether a deferred RC'ing GC with an incremental cycle collector works better in practice than a more conventional generational incremental GC (or just an incremental GC) is an open question. :-)
Nim has garbage collected pointers (ref) and raw pointers (ptr). You can use 'ptr' to break up cycles manually and disable the M&S. I suppose it's very much comparable to Swift except that Nim's RC overhead is much lower since it's deferred RC.
> And there's no strong sense of design and elegance. It very much comes off as "here's a bunch of ideas thrown together with the restriction that they all must somehow compile to C".
But C is Turing-complete, so there is no restriction here. It's also not a "bunch of ideas thrown together" any more than Haskell or Scala or Rust is. Nim focuses on 3 things:
* Metaprogramming via a hygienic AST-based macro system.
* An effect system to make the type system simpler to use. (The compiler infers an awful lot of useful information for you, the docgen shows the results for everybody to look at.)
* GC'ed thread local heaps where each GC can comply with soft-realtime requirements. Hard realtime is being worked on.
Any modern statically typed language also needs generics, closures, inheritance, exceptions (or workarounds of the same complexity), so yeah, Nim has these too, which instantly makes it a big language. That doesn't mean it's a "bunch of ideas thrown together".
> It doesn't have memory safety as a core goal.
It does and it is reasonably safe (yeah yeah yeah we don't check for 'nil' properly yet, give me a break) with quite some improvements in the pipeline.
My apologies then. I didn't mean to come off rude; elegance is subjective I suppose.
I was under the impression that the only safe pointers (references) must be GC'd. So if you want/need to avoid that (say, for perf), you're stuck using regular unsafe pointers. Is that inaccurate?
The manual even says that just calling printf is actually unsafe as the cstring could be GC'd (but it probably won't). Maybe Nim should have an unsafe directive to make sure any unsafe user code is clearly marked?
I also coulda sworn there was some part of the manual or site that was discussing a feature with a limitation due to the difficulty of expressing it in C. Maybe I imagined it. Sorry.
> The manual even says that just calling printf is actually unsafe as the cstring could be GC'd (but it probably won't).
The manual tries very hard to mention corner cases since it's evolving into a proper spec. Calling printf is safe, it's C functions that take ownership of the char* pointer that are inherently unsafe.
> Maybe Nim should have an unsafe directive to make sure any unsafe user code is clearly marked?
That's a common misunderstanding; Nim doesn't need this. Instead of an ``unsafe`` keyword, Nim has ``addr`` and ``cast`` as keywords; these are the unsafe building blocks of the language. There are other corner cases that introduce unsafety, but they are all known and can be solved in time.
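For contrast, a small sketch of how roughly equivalent building blocks look in Rust, where they sit behind an explicit `unsafe` block:

```rust
fn main() {
    let x: u32 = 42;

    // Taking a raw pointer (roughly Nim's `addr`) is itself allowed in
    // safe code...
    let p = &x as *const u32;

    // ...but dereferencing it, or reinterpreting bits (roughly Nim's
    // `cast`), must be fenced off behind an explicit `unsafe` block.
    let y = unsafe { *p };
    let bits: u32 = unsafe { std::mem::transmute::<f32, u32>(1.0) };

    assert_eq!(y, 42);
    assert_eq!(bits, 0x3F80_0000); // IEEE 754 bit pattern of 1.0f32
}
```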
> I also coulda sworn there was some part of the manual or site that was discussing a feature with a limitation due to the difficulty of expressing it in C.
Well these things certainly are everywhere, but the issue here is not the difficulty of expressing it in C, but a desire to have very good C interop. Rust actually has all the same design constraints even though Rust builds on top of LLVM because the constraints come from the "systems programming" problem domain: Control over memory layouts, the stressed difference between heap and stack, etc. LLVM's IR is very close to C, so what is difficult to map to C is difficult to map to LLVM too.
Compiling to C is not tricky, it is horrible and we don't do it for the fun of it! That said, clang has lots of options to tame C's undefined behaviour. Guess what, you can enable these too for C code generated by Nim.
> That said, clang has lots of options to tame C's undefined behaviour. Guess what, you can enable these too for C code generated by Nim.
They aren't really designed for performance though. For example, `-fsanitize-undefined-trap-on-error -fsanitize=null` emits explicit comparisons against null for every pointer load instead of catching SEGV like (for example) Java or Go do.
From what you're saying it sounds like maximum performance plus memory safety isn't a design goal for Nim. That's totally reasonable. It does mean that Nim and Rust have very different aims, however, and comparisons between the two need to take this into account.
-fstack-check in GCC guarantees that the guard page will be hit (as long as there is one), assuming no undefined behaviour like buffer overflows. It has a negligible cost and should really be enabled by default. It only has to add a one byte write per uninitialized page on the stack, and a less efficient dynamic alloca due to probes.
Well I did my homework. When you find another language that does it in a somewhat similar fashion, I'll happily change the website. ;-) I didn't think Rust counts, but since it's constantly changing, I will have a fresh look at it.
pcwalton's remark is excellent but "automated proof technology" is not a well defined term. What I mean by this is that it goes beyond what a traditional type checker can do. I don't think Rust can do exactly the same things via its borrow checking and its iterators, but I might be wrong. Note that the very same analysis also proves your index bounds are correct.
Nim's disjoint checker is so experimental that its docs are indeed very terse and we only have a couple of test cases for now. That said, the disjoint checking is restricted to the 'parallel' statement, so its complexity only affects this language construct and not the whole language. You can think of it as a macro that does additional checking.
> Every location of the form a[i] and a[i..j] and dest where dest is part of the pattern dest = spawn f(...) has to be provably disjoint. This is called the disjoint check.
The type system guarantees disjointness when necessary: a mutable reference `&mut` is guaranteed to be the only way to access the data it points to at any given point in time and iterators over mutable references preserve this guarantee, so disjointness-for-writing is automatic.
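A small sketch of that guarantee in action, using `split_at_mut` to hand out two provably disjoint mutable views of one array:

```rust
fn main() {
    let mut data = [1, 2, 3, 4];

    // split_at_mut returns two non-overlapping &mut slices; the borrow
    // checker guarantees they cannot alias, so writes through each are
    // statically disjoint, with no runtime check needed.
    let (left, right) = data.split_at_mut(2);
    left[0] += 10;
    right[0] += 100;

    assert_eq!(data, [11, 2, 103, 4]);
}
```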
> Every other complex location loc that is used in a spawned proc (spawn f(loc)) has to be immutable for the duration of the parallel section. This is called the immutability check. Currently it is not specified what exactly "complex location" means. We need to make this an optimization!
Rust generalises immutable to "safe to be used in parallel"; everything that is (truly) immutable satisfies this, but so do, for example, memory locations that can only be used with atomic CPU instructions, or values that are protected by a mutex. There's no way to get data races with such things, so they're safe to refer to in multiple threads.
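A minimal sketch of that generalisation: a `Mutex`-protected counter is mutable, yet still safe to share across threads, because all access is serialised by the lock:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Shared mutable state behind a Mutex: not immutable, but data-race-free,
// so the type system (via Sync) lets multiple threads refer to it.
fn count_to(n: usize) -> i32 {
    let counter = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // Each increment happens while holding the lock.
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let result = *counter.lock().unwrap();
    result
}

fn main() {
    println!("{}", count_to(4));
}
```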
> Every array access has to be provably within bounds. This is called the bounds check.
Rust's iterators give in-bounds automatically, but there's also no restriction about requiring bounds checks or not. (What does this rule offer Nim?)
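A small sketch of the difference, assuming nothing beyond the standard library:

```rust
fn main() {
    let xs = [10, 20, 30];

    // Indexing requires a bounds check on every access (and can panic)...
    let mut indexed_sum = 0;
    for i in 0..xs.len() {
        indexed_sum += xs[i];
    }

    // ...whereas the iterator cannot step out of bounds by construction,
    // so no per-element check is needed.
    let iter_sum: i32 = xs.iter().sum();

    assert_eq!(indexed_sum, iter_sum);
}
```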
> Slices are optimized so that no copy is performed. This optimization is not yet performed for ordinary slices outside of a parallel section. Slices are also special in that they currently do not support negative indexes!
I'm not sure what this means in the context of Nim, but passing around a Rust reference never does a copy (even into another thread).
(Disclaimer 2: it's not currently possible to pass a reference into another thread safely, but the standard library is designed to support it, the only missing piece is changing one piece of the type system, https://github.com/rust-lang/rfcs/pull/458 , to be able to guarantee safety.)