November 02, 2014
More papers on C bounds checking:

http://llvm.org/pubs/2006-05-24-SAFECode-BoundsCheck.html

Bounds checking on flight control software for Mars expedition:

http://ti.arc.nasa.gov/m/profile/ajvenet/pldi04.pdf



November 02, 2014
Walter Bright:

> D has writefln which does not have printf's issues. There's no reason to add a feature for printf.

The feature we are talking about is not just for D writeln, as I've tried to explain several times.

And D writeln is not verified at compile-time, which is silly for a language that tries to be reliable. (Rust's printing function is actually a macro, and it verifies the format string at compile-time when possible. That's the only good enough option for me for a modern statically compiled language.)


> When I look at my code,

As the designer of the language you have to look at code written by other people too! Because your D code is probably very different from mine. Take a look at Haskell code, Rust code, Erlang code, and learn new idioms and new paradigms. In the long run this will help D more than fixing a couple more bugs.


> it is very rare that I pass arguments to functions that would benefit from compile time checking.

To me this happens. It doesn't happen all the time. As usual it's not easy to quantify the frequency of such cases. (On the other hand, your "very rare" is unsupported by statistical evidence as well. Your judgement is probably better than mine, of course, and I respect your opinions.)


> For those that might, there's always a rethinking of the
> feature, such as with printf/writefln.

I regard D writefln as currently _broken_. D has static typing, templates and compile time execution, and yet such things are not used enough in one of the most common functions, the one to print on the console. Now even GCC catches many of those printf usage bugs at compile-time.

The desire for some compile-time enforcement of some contracts is not replaced by rethinking.

Bye,
bearophile
November 02, 2014
On Sat, Nov 01, 2014 at 06:04:21PM -0700, Walter Bright via Digitalmars-d wrote:
> On 11/1/2014 3:26 PM, bearophile wrote:
> >But you can run such compile-time tests only on template arguments, or on regular arguments of functions/constructors that are forced to run at compile-time. But for me this is _not_ enough. You can't implement the printf test example he shows (unless you turn the formatting string into a template argument of printf, this introduces template bloat
> 
> D has writefln which does not have printf's issues. There's no reason to add a feature for printf.
> 
> When I look at my code, it is very rare that I pass arguments to functions that would benefit from compile time checking. For those that might, there's always a rethinking of the feature, such as with printf/writefln.

I've been thinking about refactoring writefln (well, actually std.format.formattedWrite, which includes that and more) with compile-time validated format strings. I'd say that 90% of code that uses format strings uses a static format string, so there's no reason to force everyone to use runtime format strings as is currently done.

The advantages of compile-time format strings are:

1) Compile-time verification of format arguments -- passing the wrong number of arguments or arguments of mismatching type will force compilation failure. Currently, it will compile successfully but fail at runtime.

2) Minimize dependencies: the actual formatting routines needed for a particular format string can be determined at compile-time, so that only the code necessary to format that particular format string will be referenced in the generated code. This is particularly important w.r.t. function attributes: currently, you can't use std.string.format from nothrow or @nogc code, because parts of the formatting code may throw or allocate, even if your particular format string never actually reaches those parts. Analysing the format string at compile-time would enable us to decouple the @nogc/nothrow parts of format() from the allocating / throwing parts, and only pull in the latter when the format string requires it, thereby making format() usable from nothrow / @nogc code as long as your format string doesn't require allocation / formatting code that may throw.

3) Compile-time parsing of format string: instead of the runtime code parsing the format string every time, you do it only once at compile-time, and at runtime it's just a sequential list of calls to the respective formatting functions without the parsing overhead. This gives a slight performance boost. Granted, this is not that big a deal, but it's a nice side-benefit of having compile-time format strings.

The best part about doing this in D is that the same codebase can be used for processing both compile-time format strings and runtime format strings, so we can minimize code duplication; whereas if it were C++, you'd have to implement format() twice, once in readable code, and once as an unreadable tangle of C++ recursive templates.
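
To make points 1) and 3) a bit more concrete, here is a minimal sketch of the idea, not the actual std.format design. The names `countSpecs` and `checkedWritefln` are hypothetical. The format string becomes a template argument, an ordinary function that also runs under CTFE counts its specifiers, and a `static assert` rejects a mismatched argument count. It checks only the count, not the argument types.

```d
import std.stdio : writefln;

/// Hypothetical helper: counts the format specifiers in `fmt`,
/// treating "%%" as a literal percent sign. Usable at compile time and at runtime.
size_t countSpecs(string fmt) pure nothrow @safe
{
    size_t n = 0;
    for (size_t i = 0; i < fmt.length; ++i)
    {
        if (fmt[i] != '%')
            continue;
        if (i + 1 < fmt.length && fmt[i + 1] == '%')
        {
            ++i; // skip the escaped percent sign
            continue;
        }
        ++n;
    }
    return n;
}

/// Hypothetical wrapper: the format string is a template argument,
/// so the argument count is checked during compilation.
void checkedWritefln(string fmt, Args...)(Args args)
{
    static assert(countSpecs(fmt) == Args.length,
        "format string and argument count do not match");
    writefln(fmt, args);
}

void main()
{
    checkedWritefln!"%s is %d years old"("Ann", 42);
    // checkedWritefln!"%s is %d years old"("Ann"); // rejected at compile time
}
```

The same `countSpecs` function serves both the compile-time check and any runtime caller, which is the code-sharing point made above.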


T

-- 
Answer: Because it breaks the logical sequence of discussion. Question: Why is top posting bad?
November 02, 2014
Walter Bright:

Thank you for your answers.

>> D removes very little bound checks. No data flow is used for this.
>
> This is false.

Oh, good, what are the bound checks removed by the D front-end? I remember only one case (and I wrote the enhancement request for it). Recently I argued that we should add a little more removal of redundant bound checks. But probably the "GC and C++" mantra is more urgent than everything else.


> This is on purpose, because otherwise about half of what enums are used for would no longer be possible - such as bit flags.

On the other hand, we could argue that bit flags are a sufficiently different purpose to justify an annotation (as in C#) or a Phobos struct (as for the bitfields) that uses a mixin to implement them (there is a pull request for Phobos, but I don't know how good it is).


>> D module system has holes like Swiss cheese. And its design is rather simplistic.
>
> Oh come on.

ML modules are vastly more refined than D modules (and more refined than modules in most other languages). I am not asking to put ML-style modules in D (because ML modules are too complex for C++/Python programmers and probably even unnecessary given the kind of template-based generics that D has), but arguing that D modules are refined is unsustainable. (And I still hope Kenji fixes some of their larger holes.)


>>> - no implicit type conversions
>> D has a large part of the bad implicit type conversions of C.
>
> D has removed implicit conversions that result in data loss. Removing the rest would force programs to use casting instead, which is far worse.

This is a complex situation; there are several things that are suboptimal in D's management of implicit casts (one example is the signed/unsigned comparison situation). But I agree with you that this situation seems to ask for a middle ground solution. Yet there are functional languages without implicit casts (is Rust allowing implicit casts?); they use two kinds of casts, the safe and unsafe casts. I think the size casting that loses bits is still regarded as safe.


>>> - had a sane macro system
>> There's no macro system in D. Mixins are an improvement over the preprocessor,
>> but they lead to messy code.
>
> D doesn't have AST macros for very deliberate reasons, discussed at length here. It is not an oversight.

I am not asking for AST macros in D. I was just replying to a laundry list of things that C doesn't have (I was answering that D doesn't have them either).


>>> But I guess D already covers it...
>> D solves only part of the problems. And you have not listed several important
>> things. There's still a lot of way to go to have good enough system languages.
>
> D does more than any other system language.

Perhaps this is true (although Rust is more refined/safer regarding memory tracking), and that's why I am using D instead of other languages, despite all the problems. But fifteen years from now I hope to use something much better than D for system programming :-)

Bye,
bearophile
November 02, 2014
On 11/1/2014 5:56 PM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com> wrote:
> On Sunday, 2 November 2014 at 00:47:16 UTC, Walter Bright wrote:
>> On 11/1/2014 4:04 PM, "Ola Fosheim Grøstad"
>> <ola.fosheim.grostad+dlang@gmail.com>" wrote:
>>> Anyway, I believe you can turn on bound checks with some C-compilers if you want
>>> it,
>>
>> Won't work, because C arrays decay to pointers whenever passed to a function,
>> so you lose all hope of bounds checking except in the most trivial of cases.
>
> There are bounds-checking extensions to GCC.

Yup, -fbounds-check, and it only works for local arrays. Once the array is passed to a function, poof! no more bounds checking.

http://www.delorie.com/gnu/docs/gcc/gcc_13.html
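
For contrast, a rough sketch of why the decay problem does not arise with D slices: a slice carries its length with it when passed to a function, so the callee can still be bounds checked. `sumFirst` is a made-up helper, and the example assumes bounds checks are enabled, which is the default for non-release builds. Catching `RangeError` is done here only to demonstrate the check; it is not something production code should normally do.

```d
import core.exception : RangeError;

/// Hypothetical helper: sums the first `n` elements of the slice `a`.
/// The slice knows its own length, so a[i] is checked on every access.
int sumFirst(const(int)[] a, size_t n)
{
    int total = 0;
    foreach (i; 0 .. n)
        total += a[i];
    return total;
}

void main()
{
    int[4] data = [1, 2, 3, 4];

    // Passing data[] hands over pointer and length together.
    assert(sumFirst(data[], 4) == 10);

    // Asking for one element too many is caught inside the callee.
    bool caught = false;
    try
    {
        sumFirst(data[], 5);
    }
    catch (RangeError e)
    {
        caught = true;
    }
    assert(caught);
}
```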
November 02, 2014
On 11/1/2014 6:05 PM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com> wrote:
> On Sunday, 2 November 2014 at 00:56:37 UTC, Ola Fosheim Grøstad wrote:
>> On Sunday, 2 November 2014 at 00:47:16 UTC, Walter Bright wrote:
>>> On 11/1/2014 4:04 PM, "Ola Fosheim Grøstad"
>>> <ola.fosheim.grostad+dlang@gmail.com>" wrote:
>>>> Anyway, I believe you can turn on bound checks with some C-compilers if you
>>>> want
>>>> it,
>>>
>>> Won't work, because C arrays decay to pointers whenever passed to a function,
>>> so you lose all hope of bounds checking except in the most trivial of cases.
>>
>> There are bounds-checking extensions to GCC.


I proposed a C extension, too.

   http://www.drdobbs.com/architecture-and-design/cs-biggest-mistake/228701625
November 02, 2014
On Sat, Nov 01, 2014 at 05:53:00PM -0700, Walter Bright via Digitalmars-d wrote:
> On 11/1/2014 3:32 PM, bearophile wrote:
> >Paulo Pinto:
[...]
> >>- no implicit type conversions
> >D has a large part of the bad implicit type conversions of C.
> 
> D has removed implicit conversions that result in data loss. Removing the rest would force programs to use casting instead, which is far worse.
[...]

While D has removed *some* of the most egregious implicit conversions in C/C++, there's still room for improvement.

For example, D still has implicit conversion between signed and unsigned types, which is a source of bugs. I argue that using casts to convert between signed and unsigned is a good thing, because it highlights the fact that things might go wrong, whereas right now, the compiler happily accepts probably-wrong code like this:

	uint x;
	int y = -1;
	x = y;	// accepted with no error

D also allows direct assignment of non-character types to character types and vice versa, which is another source of bugs:

	int x = -1;
	dchar c = x; // accepted with no error

Again, requiring the use of a cast in this case is a good thing. It highlights an operation that may potentially produce wrong or unexpected results. It also self-documents the intent of the code, rather than leaving it as a trap for the unwary.
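
As a small illustration of what goes wrong in the signed/unsigned case, and of one checked alternative that already exists in Phobos, consider the sketch below. `fitsInUint` is a hypothetical helper written for this example; `std.conv.to` is the real library function, which checks the value at runtime and throws `ConvOverflowException` when it does not fit.

```d
import std.conv : to, ConvOverflowException;

/// Hypothetical helper: true if `v` converts to uint without changing its value.
bool fitsInUint(int v)
{
    try
    {
        auto u = to!uint(v); // checked conversion, throws if v is negative
        return true;
    }
    catch (ConvOverflowException)
    {
        return false;
    }
}

void main()
{
    int y = -1;
    uint x = y;            // implicit conversion, accepted silently
    assert(x == uint.max); // -1 has become 4294967295

    uint len = 3;
    assert(!(y < len));    // y is converted to uint, so "-1 < 3" is false

    assert(!fitsInUint(y)); // the checked conversion rejects -1
    assert(fitsInUint(7));
}
```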

On the other hand, D autopromotes arithmetic expressions involving sub-int quantities to int, thus requiring ugly casts everywhere such arithmetic is employed:

	byte a, b;
	a = b - a;	// Error: cannot implicitly convert expression (cast(int)b - cast(int)a) of type int to byte

You are forced to write this instead:

	byte a, b;
	a = cast(byte)(b - a);

I know the rationale is to prevent inadvertent overflow of byte values, but if we're going to be so paranoid about correctness, why not also require explicit casts for conversion between signed/unsigned, or between character and non-character values, which are just as error-prone? Besides, expressions like (b-a) can overflow for int values too, yet the compiler happily accepts them rather than promoting to long and requiring casts.
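
To illustrate the last point: plain int subtraction wraps around without any diagnostic, and detecting it requires an explicit checked operation. The sketch below uses `subs` from druntime's `core.checkedint`; `subOverflows` is a hypothetical wrapper added only for this example.

```d
import core.checkedint : subs;

/// Hypothetical wrapper: true if `a - b` does not fit in an int.
bool subOverflows(int a, int b)
{
    bool overflow = false;
    subs(a, b, overflow); // sets `overflow` when the result wraps
    return overflow;
}

void main()
{
    int a = int.min;
    int b = 1;

    // Accepted without complaint; the result wraps around.
    int wrapped = a - b;
    assert(wrapped == int.max);

    // The checked version reports the problem.
    assert(subOverflows(a, b));
    assert(!subOverflows(5, 3));
}
```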


T

-- 
What do you mean the Internet isn't filled with subliminal messages? What about all those buttons marked "submit"??
November 02, 2014
On Sun, Nov 02, 2014 at 01:25:23AM +0000, bearophile via Digitalmars-d wrote: [...]
> I regard D writefln as currently _broken_. D has static typing, templates and compile time execution, and yet such things are not used enough in one of the most common functions, the one to print on the console. Now even GCC catches many of those printf usage bugs at compile-time.
[...]

GCC verification of printf usage bugs is a hack. It's something hardcoded into the compiler that only works for printf formats. You cannot extend it to statically verify other types of formats you might want to also verify at compile-time.

While writefln can be improved (Andrei has preapproved my enhancement request to support compile-time format string, for example), there's no way to make such improvements to GCC's format checking short of modifying the compiler itself.


T

-- 
Real Programmers use "cat > a.out".
November 02, 2014
On 11/1/2014 6:25 PM, bearophile wrote:
> As the designer of the language you have to look at code written by other people
> too! Because your D code is probably very different from mine. Take a look at
> Haskell code, Rust code, Erlang code, and learn new idioms and new paradigms. In
> the long run this will help D more than fixing a couple more bugs.

I don't see the use cases, in mine or other code. There's a reason why people always trot out printf - it's about the only one. Designing a language feature around printf is a mistake.


>> it is very rare that I pass arguments to functions that would benefit from
>> compile time checking.
>
> To me this happens. It doesn't happen all the time. As usual it's not easy to
> quantify the frequency of such cases. (On the other hand your "very rare" is
> unsupported by statistical evidence as well. Your judgement is probably better
> than mine, of course and I respect your opinions).

I've considered the feature, and looked at code. It just doesn't happen very often.

All features have a cost/benefit to them. The costs are never zero. Laying on more and more features of minor benefit will destroy the language, and even you won't use it.


>> For those that might, there's always a rethinking of the
>> feature, such as with printf/writefln.
>
> I regard D writefln as currently _broken_.

Oh come on. writefln is typesafe and will not crash.

You could also write:

   formattedwrite!"the format string %s %d"(args ...)

if you like. The fact that nobody has bothered to suggests that it doesn't add much value over writefln().

November 02, 2014
> * D does the check function thing using compile time function execution
> to check template arguments.
>
> * D also has full compile time function execution - it's a very heavily
> used feature. It's mainly used for metaprogramming, introspection,
> checking of template arguments, etc. Someone has written a ray tracer
> that runs at compile time in D. D's compile time execution doesn't go as
> far as running external functions in DLLs.

The video has actually got me thinking about how we can expand CTFE's capabilities while also keeping it secure-ish.

As an example having blocks such as:

__ctfe {
	pragma(msg, __ctfe.ast.getModules());
}

This could output at compile time the names of all the modules that are currently being compiled.
The way I'm looking at it is that files act as they do now but will ignore __ctfe blocks unless that file was passed with e.g. -ctfe=mymodule.d

Of course, how we get symbols etc. into it is another thing altogether. Compiler plugin? Maybe. Or we do the dirty and go for extern support.

> * D has static assert, which runs the code at compile time, too. The
> space invaders won't run at compile time, because D's compile time code
> running doesn't call external functions in DLLs. I actually suspect that
> could be a problematic feature, because it allows the compiler to
> execute user supplied code which can do anything to your system - a
> great vector for supplying malware to an unsuspecting developer. The
> ascii_map function will work, however.

You really don't want arbitrary code to run with access to system libs. Agreed.

A __ctfe block could be rather interesting in that it can only exist at compile time and it is known it will execute only when it is passed via -ctfe.

It could also remove part of my need for livereload, where it creates a file specifically to tell the binary which modules are compiled in. Not to mention it gets round the whole "but how do you know it's the final compilation" question. It doesn't matter.

In the context of dub, to make it safe by default just require a --with-ctfe switch on e.g. build.

For people like me this would be really huge. Like ridiculously. But at the same time, I don't believe it's a good idea to make it so easy that we have people writing multi-threaded games that run at compile time.

Of course this does raise one question, about __traits compared to __ctfe.ast functionality.
It could be a bit of a double-up, but at the same time, you shouldn't be able to use __ctfe.ast outside of a __ctfe block. For reference, __traits is missing a LOT, to the point that I couldn't properly create a CTFE UML generator.

So, to recap: the suggestion is to allow __ctfe blocks that can run code at compile time and can utilise external code such as C functions. But to use them they must be specifically enabled on the compiler.
The purpose of having such functionality is generation of documentation, or registration of routes without any form of explicit registration.
Perhaps even going so far as to say, don't bother importing e.g. Cmsed if you use the @Route UDA on a function.
It needs to be refined a lot, but it could open up a lot of opportunities here.
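
A small part of this, finding @Route-tagged functions without explicit registration, can be approximated with today's __traits inside a single module. The sketch below is only an illustration: `Route`, `index`, `about`, `helper` and `routesOf` are made-up names, it has nothing to do with Cmsed's real API, and it does not cover the cross-module or external-code parts of the proposal.

```d
import std.stdio : writeln;

/// Hypothetical UDA marking a function as a route handler.
struct Route { string path; }

@Route("/")      void index() {}
@Route("/about") void about() {}
void helper() {} // no @Route, so it is skipped

/// Runs under CTFE: collects the paths of all @Route-tagged members of `mod`.
string[] routesOf(alias mod)()
{
    string[] result;
    foreach (name; __traits(allMembers, mod))
    {
        // Skip members (imports, packages, ...) whose attributes cannot be inspected.
        static if (__traits(compiles,
                __traits(getAttributes, __traits(getMember, mod, name))))
        {
            foreach (attr; __traits(getAttributes, __traits(getMember, mod, name)))
            {
                static if (is(typeof(attr) == Route))
                    result ~= attr.path;
            }
        }
    }
    return result;
}

void main()
{
    // `enum` forces the call to be evaluated at compile time.
    enum routes = routesOf!(mixin(__MODULE__))();
    static assert(routes.length == 2);
    writeln(routes);
}
```

This works only for the module it is instantiated with; listing every module in the build is exactly the kind of thing the proposed __ctfe.ast would be needed for.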