November 02, 2014
On 11/1/2014 6:41 PM, bearophile wrote:
> Walter Bright:
>
> Thank you for your answers.
>
>>> D removes very few bounds checks. No data-flow analysis is used for this.
>>
>> This is false.
>
> Oh, good, which bounds checks are removed by the D front-end?

It does some flow analysis based on previous bounds checks.


>> This is on purpose, because otherwise about half of what enums are used for
>> would no longer be possible - such as bit flags.
>
> On the other hand we could argue that bit flags are a sufficiently different
> purpose to justify an annotation (as in C#) or a mixin-based Phobos struct
> (like the one for bitfields) that implements them (there is a pull request
> for Phobos, but I don't know how good it is).

More annotations => more annoyance for programmers. Jonathan Blow characterizes this as "friction" and he's got a very good point. Programmers have a limited tolerance for friction, and D must be very careful not to step over the line into being a "bondage and discipline" language that nobody uses.


>>> D's module system has holes like Swiss cheese. And its design is rather
>>> simplistic.
>>
>> Oh come on.
>
> ML modules are vastly more refined than D modules (and more refined than modules
> in most other languages). I am not asking to put ML-style modules in D (because
> ML modules are too complex for C++/Python programmers and probably even
> unnecessary given the kind of template-based generics that D has), but
> arguing that D modules are refined is untenable. (And I still hope Kenji
> fixes some of their larger holes).

I didn't say they were "refined", whatever that means. I did take issue with your characterization. I don't buy the notion that more complex is better. Simple and effective is the sweet spot.


>>>> - no implicit type conversions
>>> D has a large part of the bad implicit type conversions of C.
>>
>> D has removed implicit conversions that result in data loss. Removing the rest
>> would force programs to use casting instead, which is far worse.
> This is a complex situation; there are several things that are suboptimal in
> D's management of implicit casts (one example is the signed/unsigned
> comparison situation).

It is not suboptimal. There are a lot of tradeoffs here, and it has been discussed extensively. D is at a reasonable optimum point for this. The implication that this was thoughtlessly thrown together against all reason is just not correct.


> I think the size casting that loses bits is still regarded as safe.

It is memory safe.

November 02, 2014
On Sunday, 2 November 2014 at 01:43:32 UTC, Walter Bright wrote:
>> There are bounds-checking extensions to GCC.
>
> Yup, -fbounds-check, and it only works for local arrays. Once the array is passed to a function, poof! no more bounds checking.

No.

Please read the links.

There are solutions that do full checking by validating every pointer access at runtime. And there are other solutions.
November 02, 2014
On 11/1/2014 11:13 PM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com> wrote:
> On Sunday, 2 November 2014 at 01:43:32 UTC, Walter Bright wrote:
>>> There are bounds-checking extensions to GCC.
>>
>> Yup, -fbounds-check, and it only works for local arrays. Once the array is
>> passed to a function, poof! no more bounds checking.
>
> No.
>
> Please read the links.
>
> There are solutions that do full checking by checking every pointer access at
> runtime. And there are other solutions.

Yeah, I looked at them. For example, http://www3.imperial.ac.uk/pls/portallive/docs/1/18619746.PDF has the money quote:

"The ’A’ series, which is a group of classic artificial benchmarks, and the ’B’ series, which is a selection of CPU-intensive real-world code, performed particularly poorly, ranging from several hundred to several thousand times slower."

This is not a solution. C has successfully resisted all attempts to add bounds checking.

November 02, 2014
On Sunday, 2 November 2014 at 06:39:14 UTC, Walter Bright wrote:
> This is not a solution. C has successfully resisted all attempts to add bounds checking.

That was a student project, but the paper presents an overview of techniques, which is why I linked to it. A realistic solution probably runs 10-50 times slower on regular hardware and is suitable for debugging, and you can probably improve it a lot with global semantic analysis.

To quote the Nasa paper's conclusion:

«We have shown in this paper that the array bound checking of large C programs can be performed with a high level of precision (around 80%) in nearly the same time as compilation. The key to achieve this result is the specialization of the analysis towards a particular family of software.»

So no, C has not resisted all attempts at adding bounds checking.

People are doing it.
November 02, 2014
On 02.11.2014 at 02:23, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com> wrote:
> More papers on C bounds checking:
>
> http://llvm.org/pubs/2006-05-24-SAFECode-BoundsCheck.html
>
> Bounds checking on flight control software for Mars expedition:
>
> http://ti.arc.nasa.gov/m/profile/ajvenet/pldi04.pdf

The amount of money that went into such (bad) design decisions...

And it won't stop bleeding as long as C and C++ exist.
November 02, 2014
On Sunday, 2 November 2014 at 07:29:25 UTC, Paulo Pinto wrote:
> The amount of money that went into such (bad) design decision...
>
> And it won't stop bleeding so long C and C++ exist.

Yes, that is true (if we ignore esoteric C dialects that add safer features). Ada is a better solution if you want reliable software.

On the plus side: the effort that goes into semantic analysis of C probably brings about some generally useful knowledge. But it is expensive, agreed.
November 02, 2014
On 11/2/2014 12:06 AM, "Ola Fosheim Grøstad" <ola.fosheim.grostad+dlang@gmail.com> wrote:
> On Sunday, 2 November 2014 at 06:39:14 UTC, Walter Bright wrote:
>> This is not a solution. C has successfully resisted all attempts to add bounds
>> checking.
>
> That was a student project, but the paper presented an overview of techniques
> which is why I linked to it.

Sorry, I had presumed you intended the links to be practical, workable solutions.


> A realistic solution probably runs 10-50 times
> slower on regular hardware and is suitable for debugging, and you can probably
> improve it a lot with global semantic analysis.
>
> To quote the Nasa paper's conclusion:
>
> «We have shown in this paper that the array bound checking of large C programs
> can be performed with a high level of precision (around 80%) in nearly the same
> time as compilation. The key to achieve this result is the specialization of the
> analysis towards a particular family of software.»
>
> So no, C has not resisted all attempts at adding bounds checking.
>
> People are doing it.

10 to 50 times slower is not a solution. If your app can stand such a degradation, it would be better off written in Python. If there were a practical solution for C, it would likely have been incorporated into clang and gcc.
November 02, 2014
On Sunday, 2 November 2014 at 08:22:08 UTC, Walter Bright wrote:
> 10 to 50 times slower is not a solution. If your app can stand such a degradation, it would be better off written in Python. If there was a practical solution for C, it likely would have been incorporated into clang and gcc.

Python is a dynamic language, so I don't think it is any more stable than C at runtime, but the consequences of a failure are less severe.

For a practical solution, this paper suggests, as a trade-off, checking bounds only when writing to an array:

http://www4.comp.polyu.edu.hk/~csbxiao/paper/2005/ITCC-05.pdf

There are also some proprietary C compilers for embedded programming that claim to support bounds checking, but I don't know how far they go or whether they require language extensions/restrictions.
November 02, 2014
On Sunday, 2 November 2014 at 08:39:26 UTC, Ola Fosheim Grøstad wrote:
> There are also some proprietary C compilers for embedded programming that claim to support bound checks, but I don't know how far they go or if they require language extensions/restrictions.

Btw, related to this is the efforts on bounded model checking:

http://llbmc.org/files/papers/VSTTE12.pdf

LLBMC apparently takes LLVM IR as input and checks the program using an SMT solver, basically the same type of solver that proof systems use.

This is of course a more challenging problem than array bounds alone, as it aims to check many properties, at the cost of placing limits on recursion depth, etc.:

- Arithmetic overflow and underflow
- Logic or arithmetic shift exceeding the bit-width
- Memory access at invalid addresses
- Invalid memory allocation
- Invalid memory de-allocation
- Overlapping memory regions in memcpy
- Memory leaks
- User defined assertions
- Insufficient specified bounds for the checker
- C assert()

November 02, 2014
On Sunday, 2 November 2014 at 01:28:15 UTC, H. S. Teoh via Digitalmars-d wrote:
>
> 1) Compile-time verification of format arguments -- passing the wrong
> number of arguments or arguments of mismatched type will force
> compilation failure. Currently, it compiles successfully but fails at
> runtime.

+1000! That would be awesome!

It would be a _great_ boost to productivity during the debugging phase, or when we are under pressure and can't do a thorough job on code coverage.

---
Paolo