DIP 1008 Preliminary Review Round 1 (page 3)

May 20, 2017

Re: DIP 1008 Preliminary Review Round 1

Posted by Stanislav Blinov
in reply to Jonathan M Davis

On Saturday, 20 May 2017 at 03:54:43 UTC, Jonathan M Davis wrote:

> Because of the issue of lifetimes, some language features simply cannot be implemented without the GC, and I don't see any point in trying to make it so that you can use all features of D without the GC. That simply won't work. By the very nature of the language, completely avoiding the GC means completely avoiding some features.

Even with the GC we have guns to shoot ourselves in the foot with: .destroy() and GC.free(). The GC itself is not an issue at all; the problem is the lack of choice in the language. The @nogc attribute alone is not enough.
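As a rough illustration of those two foot-guns, here is a minimal sketch. The `Resource` class and the variable names are made up for the example; only `destroy` and `core.memory.GC.free` are real library calls. The dangling references are deliberately never dereferenced.

```d
import core.memory : GC;

// Hypothetical class, used only to have something to finalize.
class Resource
{
    int handle = 1;
    ~this() { handle = 0; }
}

void main()
{
    auto owner = new Resource;
    auto other = owner;   // second reference to the same object
    destroy(owner);       // runs the destructor right now
    // `other` still refers to the finalized object; using it from here on
    // is a logic error that the compiler will not catch.

    int[] data = new int[](16);
    int[] view = data[4 .. 8];
    GC.free(data.ptr);    // the block is handed back to the GC immediately
    // Both `data` and `view` now dangle; reading view[0] here would be
    // undefined behavior even though the program uses the GC.
}
```

Neither call is rejected in ordinary code, which is the point: having a GC does not by itself remove manual lifetime mistakes.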

> D's dynamic arrays fundamentally require the GC, because they do not manage their own memory. They're just a pointer and a length and literally do not care what memory backs them. As long as all you're doing is slicing them and passing them around (i.e. restrict yourself to the range-based functions), then the GC is not involved, and doesn't need to be, but as soon as you concatenate or append, the GC has to be involved. For that not to be the case, dynamic arrays would have to manage their own memory (e.g. be ref-counted), which means that they could not be what they are now. A different data structure would be required.

That is not necessary. See my previous comment. We can amend the type system so it understands when it can't use the GC.

// Syntax is temporary, for illustration purposes. It is currently ambiguous with the language

int[] (nogc) myArray;

auto a = myArray ~ [1,2,3]; // error, cannot concatenate nogc and gc arrays.
auto b = myArray ~ [1,2,3] (myAllocator);
// implies compiler-generated:
// auto block = myAllocator.reallocate(myArray, (myArray.length + 3)*int.sizeof);
// handle out-of-memory, etc...
// int[] (nogc) result = (cast(int*)block.ptr)[0..myArray.length+3];
// return result;

Yes, verbose, and yes, ugly. Manual memory management is that. But flat-out forbidding users from using certain features is no less verbose and no less ugly.
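For comparison, this is roughly what has to be written by hand in today's D to get the same effect. It is only a sketch under assumptions: `concatMalloc` is a hypothetical helper, plain `malloc`/`free` stand in for `myAllocator`, and overflow of the size computation is not checked.

```d
import core.stdc.stdlib : malloc, free;

// Slicing never allocates, so it is fine under @nogc.
@nogc nothrow int[] firstHalf(int[] a) { return a[0 .. $ / 2]; }

// Appending needs the GC, so the compiler rejects it under @nogc.
static assert(!__traits(compiles, () @nogc { int[] a; a ~= 1; }));

// Hypothetical helper: concatenate into malloc'ed memory.
// The caller owns the result and must free(result.ptr).
@nogc nothrow
int[] concatMalloc(const(int)[] lhs, const(int)[] rhs)
{
    immutable total = lhs.length + rhs.length;
    if (total == 0)
        return null;
    auto p = cast(int*) malloc(total * int.sizeof);
    if (p is null)
        return null; // out of memory; the caller has to check
    int[] result = p[0 .. total];
    result[0 .. lhs.length] = lhs[];
    result[lhs.length .. $] = rhs[];
    return result;
}

void main() @nogc nothrow
{
    int[2] head = [10, 20];
    static immutable int[3] tail = [1, 2, 3];

    int[] joined = concatMalloc(head[], tail[]);
    scope (exit) free(joined.ptr);

    assert(joined.length == 5);
    assert(joined[0] == 10 && joined[4] == 3);
    assert(firstHalf(joined).length == 2);
}
```

The result is an ordinary `int[]`, so nothing in the type tells the caller that it must not be appended to with `~=` or that it has to be freed manually. That missing information is what the (nogc) qualifier above is meant to carry.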

> Similarly, stuff like closures require the GC. They need something to manage their memory. They're designed to be automatic.

They were designed long ago; perhaps that design needs revisiting. They absolutely do not *have* to manage their memory. It is convenient when they do, and very pleasant when working with the GC. But that makes them a niche feature at best. If the user is given a little bit more control over captures, we'd get more cases where allocation is not needed. If the user is given control of the allocation itself, even better, as it gives them back a feature taken away by the GC.

An explicit (nogc) requirement can be devised for closures too; we just need to put effort into that instead of silently ignoring it.
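One case where a capture already needs no allocation is a delegate passed to a `scope` parameter. A minimal sketch, where `apply` is a made-up function for the example:

```d
// `scope` promises that `apply` does not keep the delegate around,
// so the compiler can leave the captured variables on the caller's stack.
void apply(scope void delegate(int) @nogc dg, const(int)[] items) @nogc
{
    foreach (x; items)
        dg(x);
}

void main() @nogc
{
    int sum = 0;
    static immutable int[3] items = [1, 2, 3];

    // The literal captures `sum`, yet no closure is heap-allocated,
    // which is why this compiles inside a @nogc function.
    apply((int x) { sum += x; }, items[]);

    assert(sum == 6);
}
```

This covers only the non-escaping case. A delegate that outlives the enclosing function still gets a GC-allocated context, and that is the case where an explicit allocation choice would be needed.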

> Honestly, I think that this push for @nogc and manual memory management is toxic. Yes, we should strive to not require the GC where reasonable, but some things simply are going to require the GC to work well, and avoiding the GC very quickly gives you a lot of the problems that you have with languages like C and C++.
>
> For instance, at dconf, Atila talked about the D wrapper for Excel that he wrote. He decided to use @nogc and std.experimental.allocator, and not only did that make it much harder for him to come up with a good, workable design, it meant that he suddenly had to deal with memory corruption bugs that you simply never have with the GC. He felt like he was stuck programming in C++ again - only worse, because he had issues with valgrind that made it so that he couldn't effectively use it to locate his memory corruption problems.

That is *mostly* due to the lack of facilities in the language *and* the standard library. It's not written with manual memory management in mind and so does not provide any ready-made primitives for it, which means you have to write your own, which means you will have bugs. At least more bugs than you would have had if you had the help.

> The GC makes it far easier to write clean, memory-safe code. It is a _huge_ boon for us to have the GC. Yes, there are cases where you can't afford to use the GC, or you have to limit its use in order for your code to be as performant as it needs to be, but that's the exception, not the norm. And avoiding the GC comes at a real cost.

All is true except the last sentence. The cost should not be huge, but for that the language has to work with us. Explicitly, without any "special cases".

> And the reality of the matter is that using the GC has real benefits, and trying to avoid it comes at a real cost, much as a number of C++ programmers want to complain and deride as soon as they hear that D has a GC. And honestly, even having @nogc all over the place won't make many of them happy, because the GC is still in the language.

If people simply want to assume the GC is bad and turn away, let them. Neither the community nor the language will suffer from it.
OTOH, for people who do have legitimate nogc use cases, we should strive to keep as many language facilities available as possible.

On Saturday, 20 May 2017 at 02:05:21 UTC, Jonathan M Davis wrote:

> 2. This really isn't going to fix the @nogc problem with exceptions without either seriously overhauling how exceptions are generated and printed or by having less informative error messages. The problem is with how exception messages are generated. They take a string, and that pretty much means that either they're given a string literal (which can be @nogc but does not allow for customizing the error message with stuff like what the bad input was), or they're given a constructed string (usually by using format) - and that can't be @nogc.
>
> And you can't even create an alternate constructor to get around the problem. Everything relies on the msg member which is set by the constructor. Code that wants the message accesses msg directly, and when the exception is printed when it isn't caught, it's msg that is used. Not even overriding toString gets around the issue. For instance, this code
>
> class E : Exception
> {
>     this(int i, string file = __FILE__, size_t line = __LINE__)
>     {
>         super("no message", file, line);
>         _i = i;
>     }
>
>     override string toString()
>     {
>         import std.format;
>         return format("The value was %s", _i);
>     }
>
>     int _i;
> }
>
> void main()
> {
>     throw new E(42);
> }
>
> prints
>
> foo.E@foo.d(20): no message
> [...]

---
class E : Exception
{
    this(int i, string file = __FILE__, size_t line = __LINE__)
    {
        super("no message", file, line);
        _i = i;
    }

    override void toString(scope void delegate(in char[]) sink) const
    {
        import std.format;
        sink(format("The value was %s", _i));
    }

    int _i;
}

void main()
{
    throw new E(42);
}
---

prints "The value was 42".

Personally, I use a string literal for msg in the constructor, add some value members to the exception, and then override the above toString.

On Saturday, May 20, 2017 12:02:10 AM PDT Walter Bright via Digitalmars-d wrote:

> On 5/19/2017 8:54 PM, Jonathan M Davis via Digitalmars-d wrote:
> > And the reality of the matter is that using the GC has real benefits, and trying to avoid it comes at a real cost, much as a number of C++ programmers want to complain and deride as soon as they hear that D has a GC. And honestly, even having @nogc all over the place won't make many of them happy, because the GC is still in the language.
>
> Also, having a GC makes CTFE real nice.

Yeah, especially when you're not allowed to manually allocate memory in CTFE. :)

And given that Stefan thinks that ref is too hard to implement in newCTFE (IIRC from what he said at dconf anyway), I'd hate to think what it would take to allow something like malloc or free.

- Jonathan M Davis
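To make the CTFE point concrete, here is a small sketch. The `squares` function is invented for the example. It uses ordinary GC-style appending, and the same code runs unchanged at compile time:

```d
// An ordinary function that builds its result with GC appends.
int[] squares(int n)
{
    int[] result;
    foreach (i; 0 .. n)
        result ~= i * i;
    return result;
}

// Initializing an enum forces compile-time evaluation (CTFE).
enum table = squares(5);
static assert(table == [0, 1, 4, 9, 16]);

void main()
{
    // The same function also works at run time.
    assert(squares(3) == [0, 1, 4]);
}
```

No allocator has to be threaded through `squares`. A malloc-based version of the same function could not be evaluated this way, since CTFE does not allow manual allocation.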

On Saturday, 20 May 2017 at 13:06:01 UTC, Jonathan M Davis wrote:

> Yeah, especially when you're not allowed to manually allocate memory in CTFE. :)
>
> And given that Stefan thinks that ref is too hard to implement in newCTFE (IIRC from what he said at dconf anyway), I'd hate to think what it would take to allow something like malloc or free.

Did I say that?

ref is working.

unions and other ABI-related things will be tricky.

On Saturday, May 20, 2017 1:36:14 PM PDT Stefan Koch via Digitalmars-d wrote:

> On Saturday, 20 May 2017 at 13:06:01 UTC, Jonathan M Davis wrote:
> > Yeah, especially when you're not allowed to manually allocate memory in CTFE. :)
> >
> > And given that Stefan thinks that ref is too hard to implement in newCTFE (IIRC from what he said at dconf anyway), I'd hate to think what it would take to allow something like malloc or free.
>
> Did I say that?
>
> ref is working.

I thought you did, but I could have misremembered. Regardless, if it's working now, all the better. :)

> unions and other ABI-related things will be tricky.

unions are useful for certain things, but really, they're pretty terrible once you get beyond stuff like int and float. They _can_ be used safely for more complicated stuff, but it definitely does get tricky. And I'm sure that it's that much worse with CTFE. :(

- Jonathan M Davis

On Friday, 19 May 2017 at 15:45:28 UTC, Mike Parker wrote:

> Destroy!

> In catch blocks, e is regarded as scope so that it cannot escape the catch block.
> ...
> Code that needs to leak the thrown exception object can clone the object.

There's a contradiction here. Generic cloning cannot be implemented without storing exceptions for rethrowing later (in case member destructors throw). Furthermore:

> 2. Disallowing Exception objects with postblit fields.

What about fields with destructors? I detect a double-free. If we need to clone, we need postblits and destructors, or neither. It looks as if that clause is added specifically to deal with cloning. Yet no information on cloning implementation is provided.

On Saturday, 20 May 2017 at 14:59:37 UTC, Ola Fosheim Grøstad wrote:

> On Saturday, 20 May 2017 at 13:36:14 UTC, Stefan Koch wrote:
>> unions and other ABI-related things will be tricky.
>
> Isn't the unions issue quite easily solved by tagging behind the scenes?

Ah, tagging behind the scenes is an option; it comes with a runtime cost though. And tagging would disallow the tricky and very common use case of overlaying an int and a float.

This, I understand, is heavily used.

On Saturday, May 20, 2017 3:05:44 PM PDT Stefan Koch via Digitalmars-d wrote:

> On Saturday, 20 May 2017 at 14:59:37 UTC, Ola Fosheim Grøstad wrote:
> > On Saturday, 20 May 2017 at 13:36:14 UTC, Stefan Koch wrote:
> >> unions and other ABI-related things will be tricky.
> >
> > Isn't the unions issue quite easily solved by tagging behind the scenes?
>
> Ah, tagging behind the scenes is an option; it comes with a runtime cost though. And tagging would disallow the tricky and very common use case of overlaying an int and a float.
>
> This, I understand, is heavily used.

std.bitmanip definitely uses that sort of trick (it also does it between a static array of ubytes and integer types for stuff like swapping endianness).

- Jonathan M Davis
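For readers unfamiliar with the trick, a minimal sketch of the overlay follows. `FloatBits` is a made-up union for illustration and not how std.bitmanip is actually written; `nativeToBigEndian` and `bigEndianToNative` are the real Phobos functions. The bit pattern assumes IEEE 754 single precision, which is what D's float uses.

```d
import std.bitmanip : bigEndianToNative, nativeToBigEndian;

// Hypothetical union: both members share the same four bytes.
union FloatBits
{
    float f;
    uint u;
}

void main()
{
    FloatBits fb;
    fb.f = 1.0f;
    // Reading the member that was not written last reinterprets the bits.
    assert(fb.u == 0x3F80_0000);

    // std.bitmanip exposes the integer <-> byte array direction directly.
    ubyte[4] bytes = nativeToBigEndian(fb.u);
    ubyte[4] expected = [0x3F, 0x80, 0x00, 0x00];
    assert(bytes == expected);
    assert(bigEndianToNative!uint(bytes) == fb.u);
}
```

A hidden tag that only permits reading the most recently written member would reject the `fb.u` read above, which is the use case being discussed.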

On Saturday, 20 May 2017 at 15:05:44 UTC, Stefan Koch wrote:

> Ah, tagging behind the scenes is an option; it comes with a runtime cost though.

(I guess you meant compile-time cost)

> And tagging would disallow the tricky and very common use case of overlaying an int and a float.
>
> This, I understand, is heavily used.

So this is allowed by the language spec? Anyway, since compile-time code is harder to debug than run-time code, that seems to be a reasonable restriction. (C++ has this restriction as a general rule: one is only allowed to read from a union field if it was the most recently written to.)
