August 11, 2012
Re: Which D features to emphasize for academic review article
On Friday, August 10, 2012 15:10:47 Walter Bright wrote:
> What can I say? I run across this repeatedly, and that's exactly why Phobos
> (with Don's help) has its own implementations, rather than simply calling
> the corresponding C ones.

I think that it's pretty typical for programmers to think that something like 
a standard library function is essentially bug-free - especially for an older 
language like C. And unless you see results that are clearly wrong or someone 
else points out the problem, I don't know why you'd ever think that there was 
one. I certainly had no clue that C implementations had issues with floating 
point arithmetic before it was pointed out here. Regardless though, it's great 
that D gets it right.

- Jonathan M Davis
August 11, 2012
Re: Which D features to emphasize for academic review article
On Friday, 10 August 2012 at 22:11:23 UTC, Walter Bright wrote:
> On 8/10/2012 8:31 AM, TJB wrote:
>> On Thursday, 9 August 2012 at 18:35:22 UTC, Walter Bright 
>> wrote:
>>> On 8/9/2012 10:40 AM, dsimcha wrote:
>>>> I'd emphasize the following:
>>>
>>> I'd like to add to that:
>>>
>>> 1. Proper support for 80 bit floating point types. Many 
>>> compilers' libraries
>>> have inaccurate 80 bit math functions, or don't implement 80 
>>> bit floats at
>>> all. 80 bit floats reduce the incidence of creeping roundoff 
>>> error.
>>
>> How unique to D is this feature?  Does this imply that things 
>> like BLAS and
>> LAPACK, random number generators, statistical distribution 
>> functions, and other
>> numerical software should be rewritten in pure D rather than 
>> calling out to
>> external C or Fortran codes?
>
> I attended a talk given by a physicist a few months ago where 
> he was using C transcendental functions. I pointed out to him 
> that those functions were unreliable, producing wrong bits in a 
> manner that suggested to me that they were internally 
> truncating to double precision.
>
> He expressed astonishment and told me I must be mistaken.
>
> What can I say? I run across this repeatedly, and that's 
> exactly why Phobos (with Don's help) has its own 
> implementations, rather than simply calling the corresponding C 
> ones.
>
> I encourage you to run your own tests, and draw your own 
> conclusions.

Hopefully this will help make the case that D is the best choice 
for numerical programmers. I want to do my part to convince 
economists.

Another reason to implement BLAS and LAPACK in pure D is that the 
old routines like dgemm, cgemm, sgemm, and zgemm (all defined for 
different types) seem ripe for templatization.
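A rough sketch of how that might look, assuming row-major flat arrays and a naive triple loop (the `gemm` name and signature here are hypothetical, not an existing API):

```d
import std.complex;

// Hypothetical sketch: one template replacing sgemm/dgemm/cgemm/zgemm.
// Computes C = alpha * A * B + beta * C, with A (m x k), B (k x n),
// C (m x n), stored row-major in flat arrays.
void gemm(T)(T alpha, const T[] a, const T[] b, T beta, T[] c,
             size_t m, size_t n, size_t k)
{
    foreach (i; 0 .. m)
        foreach (j; 0 .. n)
        {
            T sum = 0;
            foreach (p; 0 .. k)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * sum + beta * c[i * n + j];
        }
}
```

One definition instantiates for float, double, Complex!float, and Complex!double, e.g. `gemm!double(1.0, a, b, 0.0, c, m, n, k);` covering all four of the old routines.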

Almost thou convinceth me ...

TJB
August 11, 2012
Re: Which D features to emphasize for academic review article
Walter Bright wrote:
> It catches only a subset of these at compile time. I can craft 
> any number of ways of getting it to miss diagnosing it. 
> Consider this one:
>
>     float z;
>     if (condition1)
>          z = 5;
>     ... lotsa code ...
>     if (condition2)
>          z++;
>
> To diagnose this correctly, the static analyzer would have to 
> determine that condition1 produces the same result as 
> condition2, or not. This is impossible to prove. So the static 
> analyzer either gives up and lets it pass, or issues an 
> incorrect diagnostic. So our intrepid programmer is forced to 
> write:
>
>     float z = 0;
>     if (condition1)
>          z = 5;
>     ... lotsa code ...
>     if (condition2)
>          z++;

Yes, but that's not really an issue, since the compiler informs 
the coder of its limitation. You're simply forced to initialize 
the variable in this situation.


> Now, as it may turn out, for your algorithm the value "0" is an 
> out-of-range, incorrect value. Not a problem as it is a dead 
> assignment, right?
>
> But then the maintenance programmer comes along and changes 
> condition1 so it is not always the same as condition2, and now 
> the z++ sees the invalid "0" value sometimes, and a silent bug 
> is introduced.
>
> This bug will not remain undetected with the default NaN 
> initialization.

I had a debate on here a few months ago about the merits of 
default-to-NaN, and others brought up similar situations. But 
since we can write:

    float z = float.nan;
    ...

explicitly, then this could be thought of as a debugging feature 
available to the programmer. The problem I've always had with 
defaulting to NaN is that it's inconsistent with integer types, 
and while there may be merit to the idea of defaulting all types 
to NaN/Null, it's simply unavailable for half of the number 
spectrum. I can only speak for myself, but I much prefer 
consistency over anything else, because it means there are fewer 
discrepancies I need to remember when hacking things together. 
The inconsistency also steepens the learning curve.

More importantly, what we have now is code where bugs like the 
one you mentioned above are still possible with ints, but also 
easy to miss, since "the other number type" behaves differently 
and programmers may accidentally assume a NaN will propagate 
where it will not.
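For instance (a minimal sketch, with a hypothetical helper): a NaN propagates through arithmetic, but a comparison silently absorbs it, since every ordered comparison against NaN is false:

```d
import std.math : isNaN;

// For a NaN argument, *both* comparisons below are false, so the
// function returns false -- the branch behaves as though z were an
// ordinary value, and the NaN stops propagating right there.
bool positiveOrNonPositive(float z)
{
    return (z > 0) || (z <= 0);
}

void demo()
{
    float z;               // default-initialized to float.nan in D
    z += 1;                // arithmetic propagates the NaN: still NaN
    assert(isNaN(z));
    assert(!positiveOrNonPositive(z)); // NaN fails both comparisons
}
```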


> This is incorrect, as the optimizer is perfectly capable of 
> removing dead assignments like:
>
>    f = nan;
>    f = 0.0f;
>
> The first assignment is optimized away.

I thought there was some optimization by avoiding assignment, but 
IDK enough about memory at that level. Now I'm confused as to the 
point of 'float x = void' type annotations. :-\


> Whether you agree with it being a good feature or not, it is a 
> feature unique to D and merits discussion when talking about 
> D's suitability for numerical programming.

True, and I misspoke by saying it wasn't a "selling point". I 
only meant to raise issue with a feature that has been more of an 
annoyance than a boon to me personally. That said, I also agree 
that this thread was the wrong place to raise it.
August 11, 2012
Re: Which D features to emphasize for academic review article
On 8/10/2012 9:01 PM, F i L wrote:
> I had a debate on here a few months ago about the merits of default-to-NaN,
> and others brought up similar situations. But since we can write:
>
>      float z = float.nan;
>      ...

That is a good solution, but in my experience programmers just throw in an =0, 
as it is simple and fast, and they don't normally think about NaN's.

> explicitly, then this could be thought of as a debugging feature available to
> the programmer. The problem I've always had with defaulting to NaN is that it's
> inconsistent with integer types, and while there may be merit to the idea of
> defaulting all types to NaN/Null, it's simply unavailable for half of the number
> spectrum. I can only speak for myself, but I much prefer consistency over
> anything else because it means there's less discrepancies I need to remember
> when hacking things together. It also steepens the learning curve.

It's too bad that ints don't have a NaN value, but interestingly enough, 
valgrind does default initialize them to some internal NaN, making it a most 
excellent bug detector.


> More importantly, what we have now is code where bugs-- like the one you
> mentioned above --are still possible with Ints, but also easy to miss since "the
> other number type" behaves differently and programmers may accidentally assume a
> NaN will propagate where it will not.

Sadly, D has to map onto imperfect hardware :-(

We do have NaN values for chars (0xFF) and pointers (the vilified 'null'). 
Think how many bugs the latter has exposed, and then think of all the floating 
point code with no such obvious indicator of bad initialization.
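A small sketch of those defaults side by side (the function name is just for illustration):

```d
import std.math : isNaN;

// D picks the "most NaN-like" default each type's representation allows.
void demoDefaults()
{
    float f;    // float.init is NaN: any use is immediately suspect
    char c;     // char.init is 0xFF: an invalid UTF-8 code unit
    int* p;     // pointers default to null: dereference faults loudly
    int i;      // int has no NaN, so it falls back to 0

    assert(isNaN(f));
    assert(c == 0xFF);
    assert(p is null);
    assert(i == 0);
}
```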

> I thought there was some optimization by avoiding assignment, but IDK enough
> about memory at that level. Now I'm confused as to the point of 'float x = void'
> type annotations. :-\

It would be used where the static analysis is not able to detect that the 
initializer is dead.
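A minimal sketch of such a case (hypothetical helper): the buffer is written in full before any read, but the write happens in a loop the optimizer may not see through, so `= void` lets the programmer skip the default initialization explicitly:

```d
// '= void' suppresses even the NaN default, for when the programmer
// guarantees the first use is a write and the compiler's static
// analysis cannot prove the default assignment dead.
float[4] makeFilled()
{
    float[4] buf = void;   // no default initialization at all
    foreach (ref x; buf)
        x = 1.5f;          // every element written before any read
    return buf;
}
```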
August 11, 2012
Re: Which D features to emphasize for academic review article
On 8/10/2012 9:32 PM, Walter Bright wrote:
> On 8/10/2012 9:01 PM, F i L wrote:
>> I had a debate on here a few months ago about the merits of default-to-NaN,
>> and others brought up similar situations. But since we can write:
>>
>>      float z = float.nan;
>>      ...
>
> That is a good solution, but in my experience programmers just throw in an =0,
> as it is simple and fast, and they don't normally think about NaN's.

Let me amend that. I've never seen anyone use float.nan, or whatever NaN is in 
the language they were using. They always use =0. I doubt that yelling at them 
will change anything.
August 11, 2012
Re: Which D features to emphasize for academic review article
F i L wrote:
> Walter Bright wrote:
>> It catches only a subset of these at compile time. I can craft 
>> any number of ways of getting it to miss diagnosing it. 
>> Consider this one:
>>
>>    float z;
>>    if (condition1)
>>         z = 5;
>>    ... lotsa code ...
>>    if (condition2)
>>         z++;
>> 
>> [...]
>
> Yes, but that's not really an issue since the compiler informs 
> the coder of it's limitation. You're simply forced to 
> initialize the variable in this situation.

I just want to clarify something here. In C#, only class/struct 
fields are defaulted to a usable value. Locals have to be 
explicitly set before they're used. So, expanding on your 
example above:

    float z;
    if (condition1)
        z = 5;
    else
        z = 6; // 'else' required

    ... lotsa code ...
    if (condition2)
        z++;

On the first condition, without an 'else z = ...', or if the 
condition was removed at a later time, you'll get a compiler 
error and be forced to explicitly assign 'z' somewhere above 
before using it. So C# and D work in "similar" ways in this 
respect, except that C# catches these issues at compile time, 
whereas in D you need to:

  1. run the program
  2. get bad result
  3. hunt down bug

In C#, NaNs are "mostly" (citation needed) used to ensure fields 
are initialized in a constructor:

    class Foo
    {
        float f = float.NaN; // Can't use 'f' unless Foo is
                             // properly constructed.
    }
August 11, 2012
Re: Which D features to emphasize for academic review article
Walter Bright wrote:
> Sadly, D has to map onto imperfect hardware :-(
>
> We do have NaN values for chars (0xFF) and pointers (the 
> vilified 'null'). Think how many bugs the latter has exposed, 
> and then think of all the floating point code with no such 
> obvious indicator of bad initialization.

Yes, if 'int' had a NaN state it would be great. (Though I 
remember hearing about hardware that did support it... 
somewhere.)
August 11, 2012
Re: Which D features to emphasize for academic review article
On 8/10/2012 9:55 PM, F i L wrote:
> On the first condition, without an 'else z = ...', or if the condition was
> removed at a later time, then you'll get a compiler error and be forced to
> explicitly assign 'z' somewhere above using it. So C# and D work in "similar"
> ways in this respect except that C# catches these issues at compile-time,
> whereas in D you need to:
>
>    1. run the program
>    2. get bad result
>    3. hunt down bug

However, and I've seen this happen, people will satisfy the compiler complaint 
by initializing the variable to any old value (usually 0), because that value 
will never get used. Later, after other things change in the code, that value 
suddenly gets used, even though it may be an incorrect value for the use.
August 11, 2012
Re: Which D features to emphasize for academic review article
Walter Bright wrote:
> That is a good solution, but in my experience programmers just 
> throw in an =0, as it is simple and fast, and they don't 
> normally think about NaN's.

See! Programmers just want usable default values :-P


> It's too bad that ints don't have a NaN value, but 
> interestingly enough, valgrind does default initialize them to 
> some internal NaN, making it a most excellent bug detector.

I heard somewhere before there's actually an (Intel?) CPU which 
supports NaN ints... but maybe that's just hearsay.


> Sadly, D has to map onto imperfect hardware :-(
>
> We do have NaN values for chars (0xFF) and pointers (the 
> vilified 'null'). Think how many bugs the latter has exposed, 
> and then think of all the floating point code with no such 
> obvious indicator of bad initialization.

Ya, but I don't think pointers/refs and floats are comparable, 
because one has copy semantics and the other does not. 
Conceptually, pointers are only references to data, while numbers 
are actual data, so it makes sense that they would default to 
different things. Though if int did have a NaN value, I'm not 
sure which way I would side on this issue. I still think I would 
prefer having some level of compile-time indication of my errors, 
simply because it saves time when you're making something.


> It would be used where the static analysis is not able to 
> detect that the initializer is dead.

Good to know.


> However, and I've seen this happen, people will satisfy the 
> compiler complaint by initializing the variable to any old 
> value (usually 0), because that value will never get used. 
> Later, after other things change in the code, that value 
> suddenly gets used, even though it may be an incorrect value 
> for the use.

Maybe the perfect solution is to have the compiler initialize the 
value to NaN, but also, for the sake of productivity, do a bit of 
static analysis and give a compiler error when it can determine 
that your variable is being used before being assigned.

In fact, for the sake of consistency, you could always enforce 
that (compiler error) rule on every local variable, so even ints 
would be required to have explicit initialization before use.

I still prefer float class members to be defaulted to a usable 
value, for the sake of consistency with ints.
August 11, 2012
Re: Which D features to emphasize for academic review article
On Saturday, 11 August 2012 at 04:33:38 UTC, Walter Bright wrote:
> It's too bad that ints don't have a NaN value, but 
> interestingly enough, valgrind does default initialize them to 
> some internal NaN, making it a most excellent bug detector.

 The compiler could always keep flags recording whether variables 
have been used; while a flag is false, the variable is as good as 
NaN. The only downside is a performance hit, unless you mark it 
as a release binary. It really comes down to whether it's worth 
implementing, or whether it's considered too big a change (unless 
it's a flag you have to specially turn on).

example:

  int a;

  writeln(a++); // compile-time error, or throws an exception
                // at runtime (read access before being set)

internally translated as:

  int a;
  bool _is_a_used = false;

  if (!_is_a_used)
    throw new Exception("a not initialized before use!");
    // passing to functions will also throw the exception,
    // unless the parameter is 'out'
  writeln(a);

  ++a;
  _is_a_used = true;


> Sadly, D has to map onto imperfect hardware :-(

 Not so much imperfect hardware, just the imperfect 'human' 
variable.

> We do have NaN values for chars (0xFF) and pointers (the 
> villified 'null'). Think how many bugs the latter has exposed, 
> and then think of all the floating point code with no such 
> obvious indicator of bad initialization.