August 22, 2022
On Monday, 22 August 2022 at 11:41:33 UTC, Steven Schveighoffer wrote:
> On 8/21/22 12:51 PM, Walter Bright wrote:
>> Consider the following pattern, which doesn't appear frequently, but frequently enough:
>> 
>>      double x;
>>      if (condition) {
>>          x = 5;
>>          ...
>>      }
>>      ...               // (1)
>>      if (condition) {
>>         foo(x);
>>      }
>> 
>> Imagine there's a lot more code omitted which obscures the pattern. This code is correct.
>> 
>> Now, maintainer adds `bar(x)` at (1).
>> 
>> The scenarios:
>> 
>> 1. x is default initialized to NaN. bar(x) produces a NaN result on everything dependent on x. User knows there's a problem.
>
> How? No exception is thrown, no error occurs. Unless bar somehow checks for NaN, nothing happens. Just like it is for 0. Perhaps it saves the result of bar computation to something for later use. Then that now gets propagated to some other use, and then, deep somewhere, NaN appears in something that appears completely unrelated to bar or x. How do you trace it back?
>

Assuming this isn't a rhetorical question...
It's not as convenient as a segfault but at some point, the error becomes obvious. I would start there to inspect variables, identify the NaNs.
Then I would trace them in a debugger and go up the call chain until I find the location where it became NaN. Then I would identify the source which introduced the NaN and trace that back until I found its origin.

The advantage I see in NaN is that it's always (instead of only almost always) immediately obvious that it's wrong, whereas 0.0 can be valid or invalid, so you have to work out which it is, and that takes an extra step.
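
A minimal sketch of the propagation being discussed, assuming a hypothetical `bar` that does ordinary arithmetic on its argument (the thread never shows the real one). It only illustrates that a default-initialized `double` stays NaN through the computation, while a zero start value yields an ordinary-looking number:

```d
import std.math : isNaN;

// Hypothetical stand-in for the `bar` a maintainer adds at (1).
double bar(double x)
{
    return 2.0 * x + 1.0;
}

unittest
{
    double x;              // no initializer: double.init, which is NaN in D
    double y = bar(x);     // arithmetic on NaN yields NaN

    assert(isNaN(x));
    assert(isNaN(y));

    // With a zero start value the result is a plausible number,
    // so nothing in the value itself signals a missing assignment.
    double z = 0.0;
    assert(bar(z) == 1.0);
}
```

The `unittest` block runs with `dmd -unittest -main -run file.d`, where `file.d` is whatever the snippet is saved as.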

August 22, 2022
On 8/22/22 14:41, Steven Schveighoffer wrote:
> On 8/21/22 12:51 PM, Walter Bright wrote:
>> Consider the following pattern, which doesn't appear frequently, but frequently enough:
>>
>>      double x;
>>      if (condition) {
>>          x = 5;
>>          ...
>>      }
>>      ...               // (1)
>>      if (condition) {
>>         foo(x);
>>      }
>>
>> Imagine there's a lot more code omitted which obscures the pattern. This code is correct.
>>
>> Now, maintainer adds `bar(x)` at (1).
>>
>> The scenarios:
>>
>> 1. x is default initialized to NaN. bar(x) produces a NaN result on everything dependent on x. User knows there's a problem.
> 
> How? No exception is thrown, no error occurs. Unless bar somehow checks for NaN, nothing happens. Just like it is for 0. Perhaps it saves the result of bar computation to something for later use. Then that now gets propagated to some other use, and then, deep somewhere, NaN appears in something that appears completely unrelated to bar or x. How do you trace it back?

For example, suppose the user has a call stack where the called functions have these return values:

float.init == 0                             float.init == NaN

    13 <==  wrong result detected here  ==> NaN
    0                                       NaN
    39                                      NaN
    56                                      NaN
    9  <==  wrong initialization here   ==> NaN
    12                                      12
    0                                       0
    0                                       0

In which case do you find the cause faster?

> 
>> 2. x is default initialized to 0. bar(0) may exhibit problems, but these problems won't necessarily be noticed.
> 
> Just like NaN.

No. NaN is not a number, but zero is. Zero may be either a wrong or a right value; NaN is never a right result. To check whether zero is the right result you have to calculate it manually. With NaN, all you need to do is look at it.


> 
>> 3. compiler complains that `double x;` needs an initializer. To shut up the compiler, the user initializes it to 0, without putting much thought into it. bar(0) may exhibit problems, but these won't necessarily be noticed.
>>
>> 4. compiler complains that `double x;` needs an initializer. Coder just schlepps in a 0. Yes, this happens. Maintainer wastes time wondering why x is initialized to 0, as that may be a nonsense value for x. Maintainer wonders if this unused initialization has a purpose, maybe it is the result of a bad refactoring? Wastes more time investigating it.
> 
> Huh? Why are there 2 identical situations here?
> 
> I'll also point out that not initializing an integer is sometimes intentionally done (because it's equivalent to initializing to 0). If I see someone didn't assign a value to an int, I don't question if it was an accident, I expect that they meant it.
> 
> Also, you forgot:
> 
> 5. Maintainer expected x to default to 0 (because that's what most types do), and expected bar to be called with 0 or 5. Now, since bar saved the result of its calculation elsewhere, and then far away from this code, the result is used in some computation that finally makes its way to output in some fashion (and possibly not a specific printing of the value), now there's a puzzle to solve, and no way to know it can be traced back to x without hours/days of searching.
> 
>> D chose option 1.
> 
> And there's probably no way that it changes. But in my mind the correct answer is to initialize to 0.
> 
> -Steve

August 22, 2022

On 8/22/22 9:17 AM, wjoe wrote:
> On Monday, 22 August 2022 at 11:41:33 UTC, Steven Schveighoffer wrote:
>> On 8/21/22 12:51 PM, Walter Bright wrote:
>>> Consider the following pattern, which doesn't appear frequently, but frequently enough:
>>>
>>>      double x;
>>>      if (condition) {
>>>          x = 5;
>>>          ...
>>>      }
>>>      ...               // (1)
>>>      if (condition) {
>>>         foo(x);
>>>      }
>>>
>>> Imagine there's a lot more code omitted which obscures the pattern. This code is correct.
>>>
>>> Now, maintainer adds `bar(x)` at (1).
>>>
>>> The scenarios:
>>>
>>> 1. x is default initialized to NaN. bar(x) produces a NaN result on everything dependent on x. User knows there's a problem.
>>
>> How? No exception is thrown, no error occurs. Unless bar somehow checks for NaN, nothing happens. Just like it is for 0. Perhaps it saves the result of bar computation to something for later use. Then that now gets propagated to some other use, and then, deep somewhere, NaN appears in something that appears completely unrelated to bar or x. How do you trace it back?
>
> Assuming this isn't a rhetorical question...

It's not.

> It's not as convenient as a segfault but at some point, the error becomes obvious.

Does it? And "at some point" it becomes obvious no matter what the error you made.

> I would start there to inspect variables, identify the NaNs.

What if you can't? What if it only happens randomly, and you don't just happen to have a debugger attached?

I'm not saying it's easier with 0, but just not any different.

> Then I would trace them in a debugger and go up the call chain until I find the location where it became NaN. Then I would identify the source which introduced the NaN and trace that back until I found its origin.

If you have a chance to use a debugger and it happens at that time.

> The advantage I see in NaN is that it's always (instead of only almost always) immediately obvious that it's wrong, whereas 0.0 can be valid or invalid, so you have to work out which it is, and that takes an extra step.

It might be noticed that it's NaN. It also might not. It depends on how it's used.

Either way, you need to find the source, and NaN doesn't help unless you want to start either instrumenting all code (possibly including code you don't control), or use a debugger (which isn't always possible).

Can we have some kind of linting system that identifies NaNs that are used?

-Steve
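
This is not the linting system asked for above, only a runtime stand-in: a minimal sketch of a guard that reports the call site the first time a NaN is actually used. The name `notNaN` is made up for illustration, and every use you care about has to be wrapped by hand, which is exactly the instrumentation cost mentioned in the post.

```d
import std.exception : assertThrown, enforce;
import std.math : isNaN;

/// Returns `value` unchanged, or throws an Exception that carries the
/// caller's file and line if `value` is NaN. (Hypothetical helper.)
double notNaN(double value, string file = __FILE__, size_t line = __LINE__)
{
    enforce(!isNaN(value), "NaN used here", file, line);
    return value;
}

unittest
{
    double x;                   // forgotten initializer: NaN
    assertThrown(notNaN(x));    // reported at the first guarded use
    assert(notNaN(5.0) == 5.0); // ordinary values pass through untouched
}
```

Because `file` and `line` default to `__FILE__` and `__LINE__`, the exception points at the place where the guard was called, not at the guard itself.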

August 22, 2022

On 8/22/22 9:33 AM, drug007 wrote:
> On 8/22/22 14:41, Steven Schveighoffer wrote:
>> On 8/21/22 12:51 PM, Walter Bright wrote:
>>> Consider the following pattern, which doesn't appear frequently, but frequently enough:
>>>
>>>      double x;
>>>      if (condition) {
>>>          x = 5;
>>>          ...
>>>      }
>>>      ...               // (1)
>>>      if (condition) {
>>>         foo(x);
>>>      }
>>>
>>> Imagine there's a lot more code omitted which obscures the pattern. This code is correct.
>>>
>>> Now, maintainer adds `bar(x)` at (1).
>>>
>>> The scenarios:
>>>
>>> 1. x is default initialized to NaN. bar(x) produces a NaN result on everything dependent on x. User knows there's a problem.
>>
>> How? No exception is thrown, no error occurs. Unless bar somehow checks for NaN, nothing happens. Just like it is for 0. Perhaps it saves the result of bar computation to something for later use. Then that now gets propagated to some other use, and then, deep somewhere, NaN appears in something that appears completely unrelated to bar or x. How do you trace it back?
>
> For example, suppose the user has a call stack where the called functions have these return values:
>
> float.init == 0                float.init == NaN
>
>      13 <==  wrong result detected here  ==> NaN
>      0                     NaN
>      39                    NaN
>      56                    NaN
>      9 <==   wrong initialization here   ==> NaN
>      12                    12
>      0                     0
>      0                     0
>
> In which case do you find the cause faster?

Callstack printouts don't look like that. Plus, what if you don't have the call stack available for inspection? And even if you do, that 9 might be just as obviously wrong as the NaN, negating any real benefit.

One thing that everyone seems to be ignoring is that 99% of the time, when I find out I didn't initialize a float and it's NaN, it's because I didn't correctly initialize it to 0.

So yes, when a lack of initialization somewhere in some code has happened and is a mistake, a NaN starting value can make things slightly easier, as long as you have everything instrumented, and can use a debugger. But when it happens is reduced to near zero if the default value is the expected 0.
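
The common case described here, a forgotten `= 0` on an accumulator, can be sketched as follows. Both function names are made up for illustration:

```d
import std.math : isNaN;

/// Sums the slice with an explicitly zeroed accumulator.
double total(const(double)[] values)
{
    double sum = 0.0;
    foreach (v; values)
        sum += v;
    return sum;
}

/// Same loop with the initializer forgotten.
double totalForgotten(const(double)[] values)
{
    double sum;            // double.init, i.e. NaN
    foreach (v; values)
        sum += v;          // NaN + v stays NaN
    return sum;
}

unittest
{
    assert(total([1.0, 2.0, 3.0]) == 6.0);
    assert(isNaN(totalForgotten([1.0, 2.0, 3.0])));
}
```

With a zero default, `totalForgotten` would return the intended sum and there would be nothing to debug; with the NaN default, the omission shows up in the result.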

>>> 2. x is default initialized to 0. bar(0) may exhibit problems, but these problems won't necessarily be noticed.
>>
>> Just like NaN.
>
> No. NaN is not a number, but zero is. Zero may be either a wrong or a right value; NaN is never a right result. To check whether zero is the right result you have to calculate it manually. With NaN, all you need to do is look at it.

As I've mentioned, you don't always see the NaN, just like you don't always see the 0.

Imagine you are making a 3-d model, and one vertex out of 100k is NaN. How will you notice it? A single missing triangle somewhere?

But make that vertex 0, and all of a sudden your model has this weird triangle sticking out extending to the origin, and is completely obvious.

-Steve
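
For the mesh example, a minimal sketch of a programmatic check, assuming a hypothetical `Vertex` layout of three doubles (the post does not specify one). It finds the first vertex with a NaN coordinate without relying on anyone spotting a missing triangle:

```d
import std.algorithm.searching : countUntil;
import std.math : isNaN;

/// Hypothetical vertex layout; fields default to NaN like any double.
struct Vertex
{
    double x, y, z;
}

/// Index of the first vertex with a NaN coordinate, or -1 if there is none.
ptrdiff_t firstBadVertex(const(Vertex)[] mesh)
{
    return mesh.countUntil!(v => isNaN(v.x) || isNaN(v.y) || isNaN(v.z));
}

unittest
{
    auto mesh = [Vertex(0.0, 0.0, 0.0), Vertex(1.0, double.nan, 2.0), Vertex(3.0, 4.0, 5.0)];
    assert(firstBadVertex(mesh) == 1);
    assert(firstBadVertex(mesh[0 .. 1]) == -1);
}
```

A scan like this has no equivalent for a wrongly zeroed vertex, since (0, 0, 0) can also be a legitimate position; in that case the visual check described above is what remains.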

August 22, 2022
On 8/22/22 17:16, Steven Schveighoffer wrote:
> On 8/22/22 9:33 AM, drug007 wrote:
>> For example, suppose the user has a call stack where the called functions have these return values:
>>
>> float.init == 0                float.init == NaN
>>
>>      13 <==  wrong result detected here  ==> NaN
>>      0                     NaN
>>      39                    NaN
>>      56                    NaN
>>      9 <==   wrong initialization here   ==> NaN
>>      12                    12
>>      0                     0
>>      0                     0
>>
>> In which case do you find the cause faster?
> 
> Callstack printouts don't look like that. Plus, what if you don't have 

Of course it is not a real call stack.

> the call stack available for inspection? And even if you do, that 9 

You can use printf debugging.

> might be just as obviously wrong as the NaN, negating any real benefit.
>
Yes, that's the point! 9 *might* be obviously wrong. But NaN *would* be obviously wrong.

> One thing that everyone seems to be ignoring is that 99% of the time, when I find out I didn't initialize a float and it's NaN, it's because I didn't correctly initialize it to 0.
> 

I don't agree. There are cases where default initialization is non-zero.

> So yes, when a lack of initialization somewhere in some code has happened *and is a mistake*, a NaN starting value can make things slightly easier, as long as you have everything instrumented, and can use a debugger. But *when* it happens is reduced to near zero if the default value is the expected 0.
> 

You are exaggerating a little. Instrumentation is not required, and printf debugging is always available.

>>>> 2. x is default initialized to 0. bar(0) may exhibit problems, but these problems won't necessarily be noticed.
>>>
>>> Just like NaN.
>>
>> No. NaN is not a number, but zero is. Zero may be either a wrong or a right value; NaN is never a right result. To check whether zero is the right result you have to calculate it manually. With NaN, all you need to do is look at it.
> 
> As I've mentioned, you don't always see the NaN, just like you don't always see the 0.
> 
> Imagine you are making a 3-d model, and one vertex out of 100k is NaN. How will you notice it? A single missing triangle somewhere?
> 
> But make that vertex 0, and all of a sudden your model has this weird triangle sticking out extending to the origin, and is completely obvious.
> 

Yes, in this specific case you are right. But what if some of your valid vertices might be zero too?

> -Steve
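
The printf debugging mentioned above could look roughly like this. It is only a sketch, and `trace` is a made-up helper, not a Phobos function. It logs a labelled value to stderr and hands it back unchanged, so it can be wrapped around any intermediate result in a call chain:

```d
import std.stdio : stderr;

/// Logs `value` under `label` and returns it unchanged, so the call can be
/// dropped around any expression without altering the computation.
/// (Hypothetical helper for illustration.)
double trace(string label, double value)
{
    stderr.writefln("%s = %s", label, value);
    return value;
}

unittest
{
    double x;                            // forgotten initializer
    double y = trace("x * 2", x * 2.0);  // logged, then used as before
    assert(y != y);                      // still NaN: trace did not alter it
    assert(trace("five", 5.0) == 5.0);
}
```

Reading the log from the bottom up, the first line where the value stops being NaN marks where the bad value entered.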

August 22, 2022
On Sunday, 21 August 2022 at 17:56:58 UTC, drug007 wrote:
> On 8/21/22 20:28, claptrap wrote:
>> On Sunday, 21 August 2022 at 16:51:51 UTC, Walter Bright wrote:
>>> Consider the following pattern, which doesn't appear
>>>
>>>
>>> 1. x is default initialized to NaN. bar(x) produces a NaN result on everything dependent on x. User knows there's a problem.
>>>
>>> 2. x is default initialized to 0. bar(0) may exhibit problems, but these problems won't necessarily be noticed.
>> 
>> This is the problem, you suggest that if a variable is zero initialised in error the problems it causes "wont necessarily" be noticed.
>> 
>> I'm saying that's not true, I'm saying it will almost always be noticed.
>> 
>> 
>> 
>
> It will be noticed, but at what price? You've initialized all vars to 0, so how do you know that this particular initialization to zero is wrong?

You don't initialise all variables to zero. I've just looked at some of my code, and in 4000 lines I found two default-init ints and maybe 50+ explicitly initialised. You're just inventing nonsense scenarios.

And seriously, if you're looking at a variable and don't know what value it should be initialised to, you literally *don't know what you're doing*.


> To detect it you have to track it down by manually checking the intermediate results, that is, manually calculating the results and comparing them to what you get. It takes much more time than checking if the value is NaN.

Occasionally you might have to do a bit of mental arithmetic, but not often, I'm seriously wondering why you think it's so hard?



August 22, 2022
On 8/22/22 18:04, claptrap wrote:
> On Sunday, 21 August 2022 at 17:56:58 UTC, drug007 wrote:
>> On 8/21/22 20:28, claptrap wrote:
>>> On Sunday, 21 August 2022 at 16:51:51 UTC, Walter Bright wrote:
>>>> Consider the following pattern, which doesn't appear
>>>>
>>>>
>>>> 1. x is default initialized to NaN. bar(x) produces a NaN result on everything dependent on x. User knows there's a problem.
>>>>
>>>> 2. x is default initialized to 0. bar(0) may exhibit problems, but these problems won't necessarily be noticed.
>>>
>>> This is the problem, you suggest that if a variable is zero initialised in error the problems it causes "wont necessarily" be noticed.
>>>
>>> I'm saying that's not true, I'm saying it will almost always be noticed.
>>>
>>>
>>>
>>
>> It will be noticed, but at what price? You've initialized all vars to 0, so how do you know that this particular initialization to zero is wrong?
> 
> You don't initialise all variables to zero. I've just looked at some of my code, and in 4000 lines I found two default-init ints and maybe 50+ explicitly initialised. You're just inventing nonsense scenarios.

But that is my point: not all variables are initialized to zero. My statement was that this is a nonsense scenario. Reread the post carefully.

> 
> And seriously, if you're looking at a variable and don't know what value it should be initialised to, you literally *don't know what you're doing*.
> 
> 
>> To detect it you have to track it down by manually checking the intermediate results, that is, manually calculating the results and comparing them to what you get. It takes much more time than checking if the value is NaN.
> 
> Occasionally you might have to do a bit of mental arithmetic, but not often, I'm seriously wondering why you think it's so hard?
> 

Just because I've done math calculations before? And no, I didn't mean mental arithmetic. I meant numerical matrix operations from inputs to outputs, just to track down where the wrong zero initialization was. In some cases zero initialization is invalid, for example for the covariance of random variables. But NaN is always invalid. That is its advantage.

August 22, 2022
On Monday, 22 August 2022 at 15:04:25 UTC, claptrap wrote:

> You don't initialise all variables to zero. I've just looked at some of my code, and in 4000 lines I found two default-init ints and maybe 50+ explicitly initialised.

So, if you rarely ever use default initialization, why do you care to which value it might be initialized?

For the analysis of which default value is the better one, only the cases where it is used matter.
You need only count how often a variable is not initialized, and of those cases how many are (a) a bug or (b) intended.

Now imagine you review code from someone else (e.g. at an assessment). In this case you have to carefully check every uninitialized variable. If the code always explicitly says =void or =0, there are no such cases, so you might save a LOT of time.

So better get used to ALWAYS initializing explicitly and consider EVERY occurrence of an uninitialized variable to be a bug. And bogus code should NEVER result in a value that could also occur as a valid result, so the compiler should help as much as it can by using an invalid value if one is available.
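
As a rough illustration of that convention in D (variable names are made up), each declaration below states its intent explicitly, and the last one shows what the compiler falls back to when nothing is stated:

```d
import std.math : isNaN;

unittest
{
    double zero = 0.0;          // intended zero, stated explicitly
    double unset = double.nan;  // explicit "no value yet" marker
    double scratch = void;      // deliberately uninitialized: must be written before any read
    scratch = 1.5;

    double implicit;            // no initializer: D uses double.init, which is NaN

    assert(zero == 0.0);
    assert(isNaN(unset));
    assert(scratch == 1.5);
    assert(isNaN(implicit));
}
```

Under this convention a reviewer only has to question the last form.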

August 22, 2022
On Saturday, 20 August 2022 at 03:12:43 UTC, Steven Schveighoffer wrote:
> In other words, NaN is silent. You can't even assert(x != double.init). You have to use an esoteric function isNaN for that.

I had some fun with isNaN the other day. We used it to check for initialisation in an access function to cache an expensive computation. This worked brilliantly until we noticed a malfunction in the release version. It took a while until I realised that I had given the LDC fastmath option to the release build, which assumes NaN does not occur, which makes isNaN misbehave.

What I learned from this is to not use this flag globally, and add select attributes to select functions instead. And instead of using NaN I now use std.typecons.Nullable to signal a dirty cache.

— Bastiaan.
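
A minimal sketch of the `Nullable`-based cache described above. The struct, its members, and the constant returned by `compute` are all made up for illustration; the real accessor is not shown in the post. The point is that "not computed yet" is tracked by `Nullable`'s own flag rather than by a NaN test:

```d
import std.typecons : Nullable;

/// Hypothetical cache around an expensive computation.
struct Cached
{
    private Nullable!double cache;   // null means "dirty / not computed yet"
    int computations;                // call counter, for demonstration only

    /// Stand-in for the expensive computation.
    private double compute()
    {
        ++computations;
        return 42.0;
    }

    /// Computes on first access, then serves the stored value.
    double value()
    {
        if (cache.isNull)
            cache = compute();
        return cache.get;
    }

    /// Marks the cache dirty so the next access recomputes.
    void invalidate()
    {
        cache.nullify();
    }
}

unittest
{
    Cached c;
    assert(c.value() == 42.0);
    assert(c.value() == 42.0);
    assert(c.computations == 1);   // second access was served from the cache
    c.invalidate();
    assert(c.value() == 42.0);
    assert(c.computations == 2);
}
```

Since the dirty state never goes through a floating-point comparison, this check should not depend on how a build treats NaN.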

August 22, 2022

On Monday, 22 August 2022 at 20:29:57 UTC, Bastiaan Veelo wrote:
> On Saturday, 20 August 2022 at 03:12:43 UTC, Steven Schveighoffer wrote:
>> In other words, NaN is silent. You can't even assert(x != double.init). You have to use an esoteric function isNaN for that.
>
> I had some fun with isNaN the other day. We used it to check for initialisation in an access function to cache an expensive computation. This worked brilliantly until we noticed a malfunction in the release version. It took a while until I realised that I had given the LDC fastmath option to the release build, which assumes NaN does not occur, which makes isNaN misbehave.

Is there a simple example of this behavior?