The non allocating D subset - D Programming Language Discussion Forum

June 01, 2013

The non allocating D subset

Posted by SomeDude

In the "Rust based provocation thread", I think Adam Ruppe's work went largely overlooked. He basically created a minimal D that runs on bare metal, thus proving that D can be used fruitfully on small embedded devices in place of C.

On Monday, 27 May 2013 at 15:45:04 UTC, Adam D. Ruppe wrote:
> On Monday, 27 May 2013 at 14:36:59 UTC, Dicebot wrote:
>> But issue is not creating minimal run-time, it is creating minimal one that still has most part of language usable.
>
> eh the question is what is "most"? Even my little 200 line thing has: functions, templates, scope closures, structs including operator overloading, static arrays, slices, pointers, __traits+ctfe, scope guards, switches, and more.
>
> I just now added basic classes and that wasn't too hard (copy/pasted some code from the real druntime for the typeinfo and so on).
>
> But it doesn't do AAs, throwing exceptions, dynamic arrays and other nice features. Looking at druntime's src, exceptions look hard, and while dynamic arrays, heap closures, and others can easily 'work', they will leak memory, so I don't think they will ever be really good without gc. Exceptions are doable though from what I can tell.
>
>
> Anyway I think this is enough to do some real programs and feel kinda nice. Surely no worse than C at least.

This is already a great proof of concept in my opinion.
But more than that, I believe that this minimal D is the sublanguage that could prove useful to game programmers in place of C++.

Basically it is a non allocating D subset. It retains most of the niceties of D without ever resorting to the GC, so it's a definite improvement over C++. There is absolutely no reason performance would suffer from programming in this D sublanguage. The only real hassle is that one still has to do manual memory management, but users of such a subset are used to that and prefer doing it themselves to using a GC anyway.

Following this idea, I believe a fairly large chunk of Phobos could be ported to compile with this minimal D sublanguage, and that one could use the allocating D and its added sugar on top of it. So in the end, the user could decide between working with the non allocating language (mostly embedded programmers and game makers), or benefit from the GC and all the associated niceties (the rest of us).

This would make D the truly universal language it was intended to be.

June 01, 2013

Re: The non allocating D subset

Posted by SomeDude
in reply to SomeDude

On Saturday, 1 June 2013 at 05:45:38 UTC, SomeDude wrote:
> This would make D the truly universal language it was intended to be.

This is a large undertaking, but I think there is no technical hurdle preventing it from succeeding. Basically it's only a matter of sweat. In fact I believe it has a much better chance of success than having a GC that solves all the problems, as we know that even after Sun and Microsoft have poured thousands of hours of engineering brainpower into it, the GC is never good enough for all purposes.

June 01, 2013

Re: The non allocating D subset

Posted by Dicebot
in reply to SomeDude

On Saturday, 1 June 2013 at 05:45:38 UTC, SomeDude wrote:
> In the "Rust based provocation thread", I think Adam Ruppe's work went largely overlooked.

Discussion has continued via e-mail and his efforts have not gone unnoticed :)
It is an important proof of concept that can be used to highlight problematic places, but to be truly useful such a subset needs to be verified by a compiler, for friendlier error messages at the very least.

June 01, 2013

Re: The non allocating D subset

Posted by Adam D. Ruppe
in reply to SomeDude

On Saturday, 1 June 2013 at 05:45:38 UTC, SomeDude wrote:
> Basically it is a non allocating D subset.

Not necessarily nonallocating, but it doesn't use a gc. I just updated the zip:

http://arsdnet.net/dcode/minimal.zip

If you make the program (only works on linux btw) and run it, you'll see a bunch of allocations fly by, but they are all done with malloc/free and their locations are pretty predictable.

The file minimal.d is a test program and you can see a lot of D works, including features like classes, exceptions, templates, structs, and delegates. (Heap allocated delegates should be banned but aren't, so if you do one built in it will leak. The helper file, memory.d, though contains a HeapDelegate struct that refcounts and frees it, so the concept is still usable.)

The other cool thing is since the library is so minimal, the generated executable is small too. Only about 30 KB with an empty main(), and no outside dependencies. A cool fact about that is you can compile it and run on bare metal (given a bootloader like grub) too, and it all just works.

You can also make "LIBC=yes" and depend on the C library, which makes things work better - there's a real malloc function there! - and adds about 10kb to the executable. That's probably a more realistic way to use it on the desktop at least than totally standalone.



But yeah, I haven't written any real code with this, but so far it seems to be pretty usable.





I also talked a while on the reddit thread last night about this, so let me copy/paste that here too:




Yes, certainly. And it wouldn't even necessarily be no array concats, just you wouldn't want to use the built-in ones.

Some features that use the gc in the real druntime don't necessarily have to. You'll need to be aware of this most of the time to free the memory in your app, but you can have a pretty good idea of when it will happen. One example is new class. If that mallocs, then as long as you match every new with a delete (or a call to free_obj() or whatever), you'll be fine, just like C++.
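To make the "match every new with a free" idea concrete, here is a rough sketch of how it could look on a stock druntime. It is not the code from minimal.zip: new_obj and the Enemy class are made-up names, free_obj only borrows the name mentioned above, and the helpers lean on core.lifetime.emplace, which the minimal runtime would not have.

```d
import core.lifetime : emplace;
import core.stdc.stdlib : free, malloc;

// Hypothetical example class.
class Enemy
{
    int health;
    this(int health) { this.health = health; }
}

// Hypothetical helper: construct a class instance in malloc'd memory
// instead of on the GC heap.
T new_obj(T, Args...)(Args args) if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    void* memory = malloc(size);
    assert(memory !is null, "out of memory");
    return emplace!T(memory[0 .. size], args);
}

// Hypothetical counterpart: every new_obj must be paired with one free_obj.
void free_obj(T)(T obj) if (is(T == class))
{
    destroy(obj);          // run the destructor, if any
    free(cast(void*) obj); // hand the block back to the C heap
}
```

Usage would be auto e = new_obj!Enemy(100); followed later by free_obj(e); with nothing in between touching the GC.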

I played with one I wasn't sure would work earlier, but now think it can: heap closures. scope delegates are easy, since they don't allocate, but heap closures allocate automatically and don't give much indication that they do.... but, if you are careful with it, the rules can be followed (if it accesses an outside scope and has its address taken/reference copied or passed to a function, it will automatically allocate), and you can manually call free(dg.ptr); when you're done with it.

I think it is probably safer to just disallow them, either by not implementing _d_allocmemory in druntime (thus if you accidentally use it, you'll get a linker error about the missing function), or, and this is tricky right now but not actually impossible, use compile time reflection to scan your methods and members for a non-scope delegate reference and throw an error.
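A minimal sketch of the compile time reflection route, assuming we only care about the direct fields of an aggregate. hasDelegateField, Plain and Callback are illustrative names; a real check would also have to look at nested types and at whether the delegate is scope, which this does not attempt.

```d
// Hypothetical check: true if any direct field of S has a delegate type.
template hasDelegateField(S)
{
    enum hasDelegateField = {
        bool found = false;
        // typeof(S.tupleof) is the compile-time list of field types.
        foreach (Field; typeof(S.tupleof))
        {
            static if (is(Field == delegate))
                found = true;
        }
        return found;
    }();
}

// Two example aggregates to run the check against.
struct Plain
{
    int a;
    char[] b;
}

struct Callback
{
    int id;
    void delegate() onDone;
}

// The check runs entirely at compile time, so a violation is a build error.
static assert(!hasDelegateField!Plain);
static assert(hasDelegateField!Callback);
```

A library could then put static assert(!hasDelegateField!T, "...") in front of any type it accepts, which gives a readable message instead of a linker error.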

If we do the latter, a heap delegate can actually be allowed in a fairly safe way, by wrapping it in a struct. Usage of HeapDelegate!T will be pretty obvious, so you aren't going to accidentally use it the way you might the more sugary built in. I have a proof of concept implemented in my local copy of minimal.d that automatically refcounts the delegate, freeing it when the last reference goes out of scope.

Array slices are ok the way I have them implemented now: the built in concat function is missing, so if you try a ~ b, it will be a linker error (including the source file and line number btw, easy enough to handle). No allocation there. The biggest risk is lifecycle management, and the rule there is you don't own slices (non-immutable ones at least). I'd like the compiler to implement a check on this, but right now it doesn't. Not a hard coding convention though.

Built in new array[] is not implemented, meaning it is a linker error, because they are indistinguishable from slices type-wise. (In theory it could be like classes, where you just know to manually free them, but if you have a char[] member, are you sure that was new'd or is it holding a slice someone else owns? Let's just avoid it entirely.)

But, this doesn't mean we can't have some of D's array convenience! In minimal.d, you can see a StackArray!T struct and maybe, not sure if I put it in that zip or not, a HeapArray!T struct. These types own their memory, stack of course going away with scope, and heap being automatically reference counted to call free() when all copies are gone, and overload a few operators for convenience:

alias this is to a slice function, so you can do char[] slice = myCustomArray; You can't change the original pointer through that slice, so no risk of it losing the memory.

The ~= operator is implemented too on the *Array containers. They know their length and their capacities, and you can append up to the capacity. (The HeapArray could also realloc() as needed, but right now I don't.) One important difference though with this and regular D arrays is in regular D:

string a = "hey"; string b = a; b ~= " man";
assert(a == "hey" && b == "hey man");

Appending to the second one doesn't change the first one. It may allocate as needed (see this for details: http://dlang.org/d-array-article.html )

Whereas with a HeapArray or StackArray, they share the same underlying data, so appending to one reference would append to all. I think that's OK though, because we have helper things like const to avoid that, and they are a custom type, so they are allowed to work differently than the built ins.
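For readers without the zip, a container along these lines can be sketched as below. This is an assumption-laden illustration, not the StackArray from minimal.d: the capacity is a template parameter here, and copying is disabled outright to sidestep the shared-append question just discussed. The @nogc attribute used to document the lack of allocation was added to D after this thread.

```d
// Hypothetical fixed-capacity array with inline storage.
struct StackArray(T, size_t capacity)
{
    private T[capacity] storage;
    private size_t used;

    // Assumption: forbid copies, so two values never alias the same data.
    @disable this(this);

    // View of the elements in use; never allocates.
    inout(T)[] slice() inout @nogc nothrow
    {
        return storage[0 .. used];
    }

    // Lets the container convert to T[] implicitly.
    alias slice this;

    // Append one element; fails loudly instead of growing.
    void opOpAssign(string op)(T value) @nogc nothrow
        if (op == "~")
    {
        assert(used < capacity, "StackArray is full");
        storage[used++] = value;
    }

    // Append a whole slice, again only up to the capacity.
    void opOpAssign(string op)(const(T)[] values) @nogc nothrow
        if (op == "~")
    {
        assert(values.length <= capacity - used, "StackArray is full");
        storage[used .. used + values.length] = values[];
        used += values.length;
    }
}
```

Because the slice points into the struct itself, it must not outlive the StackArray it came from, which is the same "you don't own slices" convention as above.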

Thankfully, btw, static arrays are a different type. They can be permitted with ease.

I didn't implement a ~ b. I think that one would be too easy to lose and either pointlessly malloc/free in the middle of an expression, or just forget to free entirely, but maybe it could be done too.

Another issue is strings. In phobos and druntime both, there's a lot of creating strings on the heap and returning them from functions. e.g. to!string(10) returns a brand new allocated string "10". We don't want that in our library, so it looks a little more like C, but slices make it easier to manage. The analogous function I wrote is

char[] intToString(int a, char[] buffer)

You pass it an area of pre-allocated buffer to write to. buffer.length tells it where it isn't allowed to continue (unlike a plain char* in C). It returns the slice of your own buffer that was actually used. So writeln(10) becomes:

char[16] buffer;
write(intToString(10, buffer), "\n");

a little more verbose, but there's no mystery there about the memory. intToString knows it only has 16 spaces to work with, you know exactly where it is going, no allocations, and the return value conveniently has the length used too, so we can pass it directly to another function. (as long as that function doesn't store the reference!)
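The body of intToString isn't shown in this post, so here is one plausible way to write it, as a sketch only. The signature comes from the text above; everything else is an assumption, including the choice to return an empty slice when the buffer is too small.

```d
// Hypothetical re-implementation; the version in minimal.zip may differ.
char[] intToString(int a, char[] buffer) @nogc nothrow
{
    // Work in unsigned so that int.min does not overflow on negation.
    const bool negative = a < 0;
    uint value = negative ? 0u - cast(uint) a : cast(uint) a;

    // Produce the digits right to left in a scratch area on the stack.
    char[11] tmp; // 10 digits for the largest magnitude, plus a sign
    size_t pos = tmp.length;
    do
    {
        tmp[--pos] = cast(char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    if (negative)
        tmp[--pos] = '-';

    const size_t len = tmp.length - pos;
    if (len > buffer.length)
        return null; // assumption: an empty slice signals "buffer too small"

    // Copy into the caller's buffer and return exactly the part used.
    buffer[0 .. len] = tmp[pos .. $];
    return buffer[0 .. len];
}
```

The caller keeps ownership of the buffer throughout, so the returned slice is only valid for as long as the buffer is.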

Built in AAs? Not implemented. But we could do a library AA just as easily, and thanks to overloaded operators, it would be pretty too.

Another issue is exceptions. They work, and must be classes. So where do you free them? I haven't tried this yet, but I'm pretty sure you can just do it when you catch() it, and no problem.

Well this is turning into a real beast of a comment, so let me sum up and finish: a lot of D can work without the GC. It will take some custom types and deliberately missing druntime functions to make pretty, but it leaves us with a language at least as usable as C++, with the same idea of no surprise/hidden allocations in there. There's still a question of having stuff we don't necessarily want like RTTI, but their impact can be minimized so I think it will be ok too. (Oh btw, since I have a custom druntime here, I added some runtime reflection the real D doesn't have yet. It rox, and came cheap, but you can still version it out.)

June 01, 2013

Re: The non allocating D subset

Posted by Adam D. Ruppe
in reply to Adam D. Ruppe

huh array literals don't work since they call an allocation function, even if all the data in them is all static. That's disappointing.

We can force it though with something like this:

template arrlit(T...) {
        static __gshared immutable int[] literal = [T];
        alias arrlit = literal;
}

        arr2 ~= arrlit!(1,2,3);

then it doesn't allocate (arr2 is a StackArray btw). The arrlit.ptr is the initialized data segment, which is a much more reasonable place for it than doing a runtime allocation.

June 01, 2013

Re: The non allocating D subset

Posted by Adam D. Ruppe
in reply to Adam D. Ruppe

BTW I think the compiler should redo array literals to work like this:

rewrite it into: array!T(elements....)

always try to CTFE it. Since it is a literal this should provide as much indication as enum that you want this.

If it works, put it out as static info in the exe.

If not just leave that call in place for the runtime to handle, either failing or allocating or whatever. It should be able to return a library type too, not necessarily a built in array type. e.g. it might return a HeapArrayLiteral which is incompatible with slice assignment, but can be assigned to a refcounted HeapArray. In this case, int[] a = [1,2,3]; is a compile time failure. Or it can return int[], whatever the lib wants.


Assoc arrays are the same thing. Rewrite it into assocarray!(K, V)(K[] keys, V[] values). Try to ctfe, if it works put that in the data segment (if possible, I guess the memory format of ctfe might not match.)

And if not just let the library do it.


While T[] declarations should remain just like they are now, since we need them as a primitive building block (even if the array!() returns a different type), I'd change V[K] declarations to just be sugar over AssocArray!(K, V).



This would kill more typeinfo usage too while adding to flexibility. idk how hard it is to implement in the compiler though. My guess is if it were easy, Walter would have already done it.

June 01, 2013

Re: The non allocating D subset

Posted by Adam D. Ruppe
in reply to Adam D. Ruppe

On Saturday, 1 June 2013 at 15:35:25 UTC, Adam D. Ruppe wrote:
> BTW I think the compiler should redo array literals to work like this:

lol then we'd probably complain about template bloat. the typeinfo+void* plan druntime uses is actually kinda nice, it avoids that and typeinfo is potentially super useful in places.

But I just want to be able to use custom types for all these language features!

June 01, 2013

Re: The non allocating D subset

Posted by Paulo Pinto
in reply to Adam D. Ruppe

Am 01.06.2013 17:42, schrieb Adam D. Ruppe:
> On Saturday, 1 June 2013 at 15:35:25 UTC, Adam D. Ruppe wrote:
>> BTW I think the compiler should redo array literals to work like this:
>
> lol then we'd probably complain about template bloat. the typeinfo+void*
> plan druntime uses is actually kinda nice, it avoids that and typeinfo
> is potentially super useful in places.
>
> But I just want to be able to use custom types for all these language
> features!

I get the feeling it starts to feel like Ada then. :)

June 01, 2013

Re: The non allocating D subset

Posted by SomeDude
in reply to Paulo Pinto

On Saturday, 1 June 2013 at 16:02:06 UTC, Paulo Pinto wrote:
> I get the feeling it starts to feel like Ada then. :)

Adam starts with Ada !

June 01, 2013

Re: The non allocating D subset

Posted by Adam D. Ruppe
in reply to Adam D. Ruppe

So I'm still playing with this and just realized the TypeInfos that are manually written in the real druntime could just as easily be auto-generated.

mixin(makeTypeInfo!(char[], ... more types as needed ... )());

private string makeTypeInfo(T...)() {
        if(__ctfe) { // this is here so we don't have runtime requirements of array append, etc, but can still use them in ctfe

        string code;
        foreach(t; T) {
                code ~= "class TypeInfo_" ~ t.mangleof ~ " : TypeInfo {
                        override string toString() const { return \""~t.stringof~"\"; }
                }";
        }
        return code;

        } else assert(0);
}



and so on for the other typeinfo methods. Or we could just copy/paste the typeinfos Walter wrote in the real druntime but meh.

What I'm actually doing with most of these is this:

class TypeInfo_A : TypeInfo {}
class TypeInfo_i : TypeInfo {}
[etc...]


So they are pretty much useless at runtime, but the compiler often outputs references to them, so if we don't have the definition, it won't actually link.


But just thinking about runtime reflection, if we wanted to expand it, I say this is the way to go. Literally build the runtime code out of the compile time information.

Currently, you can't do something like typeid(int).isIntegral(). There is no isIntegral function, and adding one means doing a lot of boring manual work.

But with the mixin, we can just throw it in with three lines and be done with it.
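As a rough sketch of what that could look like, the generator below adds an isIntegral method to each generated class. To keep it compilable against a stock druntime it uses a made-up MiniTypeInfo base class rather than the real TypeInfo, and it omits the if(__ctfe) guard from the earlier snippet for the same reason; both are assumptions, not the code in the custom runtime.

```d
// Hypothetical stand-in for druntime's TypeInfo.
abstract class MiniTypeInfo
{
    abstract string name() const;
    abstract bool isIntegral() const;
}

// Builds one class per type as a string, using only compile-time facts.
string makeTypeInfo(T...)()
{
    string code;
    foreach (t; T)
    {
        code ~= "final class MiniTypeInfo_" ~ t.mangleof ~ " : MiniTypeInfo {\n"
              ~ "    override string name() const { return \"" ~ t.stringof ~ "\"; }\n"
              ~ "    override bool isIntegral() const { return "
              ~ (__traits(isIntegral, t) ? "true" : "false") ~ "; }\n"
              ~ "}\n";
    }
    return code;
}

// Declares MiniTypeInfo_i, MiniTypeInfo_d and MiniTypeInfo_Aa.
mixin(makeTypeInfo!(int, double, char[])());
```

The answer to isIntegral is baked in as a literal true or false when the string is built, so the generated classes need no runtime support beyond virtual calls.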


I might actually do that here, just because I can. On a -version switch, of course, so you can easily opt out of the fat implementation.

Copyright © 1999-2021 by the D Language Foundation