August 22, 2022
On Monday, 22 August 2022 at 15:56:16 UTC, drug007 wrote:
> On 8/22/22 18:04, claptrap wrote:
>> On Sunday, 21 August 2022 at 17:56:58 UTC, drug007 wrote:
>>> On 8/21/22 20:28, claptrap wrote:
>>>> On Sunday, 21 August 2022 at 16:51:51 UTC, Walter Bright
>>>
>>> It will be noticed, but at what price? You've initialized all vars to 0, so how do you know that this particular initialization to zero is wrong?
>> 
>> You don't initialise all variables to zero. I've just looked at some of my code, and in 4000 lines I found two default-initialised ints and maybe 50+ explicitly initialised. You're just inventing nonsense scenarios.
>
> But that is my point - not all variables are initialized to zero. That is the statement you call a nonsense scenario. Reread the post carefully.

"You've initialized all vars to 0"

I can only respond to what you write (which was a nonsense scenario).


>> Occasionally you might have to do a bit of mental arithmetic, but not often, I'm seriously wondering why you think it's so hard?
>> 
>
> Just because I've done math calculations before? And no, I didn't mean mental arithmetic. I meant working through numerical matrix operations from inputs to outputs just to track down where the wrong zero initialization was. In some cases zero initialization is invalid, for example the covariance of random variables. But NaN is always invalid. That is its advantage.

I've been programming for over 30 years, mostly DSP and some numerical stuff, statistical analysis. My experience is that it's not a big deal.

My point is people are massively overselling how much of a problem bad inits are to track down.
August 22, 2022
On Monday, 22 August 2022 at 19:49:02 UTC, Dom Disc wrote:
> On Monday, 22 August 2022 at 15:04:25 UTC, claptrap wrote:
>
>> You don't initialise all variables to zero. I've just looked at some of my code, and in 4000 lines I found two default-initialised ints and maybe 50+ explicitly initialised.
>
> So, if you rarely ever use default initialization, why do you care to which value it might be initialized?

NaN or zero, I don't really care. I just got sucked in; there's a fair amount of BS floating around.
August 22, 2022
On 8/22/2022 7:06 AM, Steven Schveighoffer wrote:
>> It's not as convenient as a segfault but at some point, the error becomes obvious.
> Does it? And "at some point" it becomes obvious no matter what the error you made.

The point is, since 0.0 is a common value for a floating point value to be, just when does it become obvious that it is wrong? Are you really going to notice if your computation is 5% off? Isn't it a *lot* more obvious that it is wrong if it is NaN?


>> I would start there to inspect variables, identify the NaNs.
> What if you can't? What if it only happens randomly, and you don't just happen to have a debugger attached?
> I'm not saying it's easier with 0, but just not any different.

0.0 is hardly a rare value, even in correct calculations. NaN is always wrong.


>> Then I would trace them in a debugger and go up the call chain until I find the location where it became NaN. Then I would identify the source which introduced the NaN and trace that back until I found its origin.
> If you have a chance to use a debugger and it happens at that time.

0 initialization wouldn't make it better.


>> The advantage I see in NaN is that it's always (instead of only almost always) immediately obvious that it's wrong whereas 0.0 can be valid or invalid so you need to figure out which one it is which requires an extra step.
> It might be noticed that it's NaN. It also might not. It depends on how it's used.

NaN propagates. 0.0 does not.


> Either way, you need to find the source, and NaN doesn't help unless you want to start either instrumenting *all* code (possibly including code you don't control), or use a debugger (which isn't always possible).

Such a situation is objectively worse with 0.0. Is instrumenting all the code to detect 0.0 going to work? Nope, too many false positives, as 0.0 is a common value for floating point numbers.


> Can we have some kind of linting system that identifies NaNs that are used?

I have no objection to a linter that flags default initializations of floating point values. It shouldn't be part of the D compiler, though.

August 22, 2022
On 8/20/2022 3:17 PM, Dukc wrote:
> it goes against what C, C++ and C# (and probably many other languages) do so one is easily surprised.

C and C++ initialize them to garbage unless it is a global/static variable. That's probably the worst option, as it results in Heisenbugs that are very hard to track down.
August 22, 2022

On 8/22/22 8:46 PM, Walter Bright wrote:
> On 8/22/2022 7:06 AM, Steven Schveighoffer wrote:
>>> It's not as convenient as a segfault but at some point, the error becomes obvious.
>> Does it? And "at some point" it becomes obvious no matter what the error you made.
>
> The point is, since 0.0 is a common value for a floating point value to be, just when does it become obvious that it is wrong? Are you really going to notice if your computation is 5% off? Isn't it a *lot* more obvious that it is wrong if it is NaN?

This is a highly dependent situation. It could be 0, which is 100% off. It could be 5%. It could be 0.0001% off, which might actually not be a problem that is noticed.

So I have an actual true story. One of the calculation spreadsheets we use had a fudge factor that someone inserted. Essentially, they added a value of 0.35 to a cost field (which is in the tens-of-thousands-of-dollars range). Given this is Excel, we have no way of knowing who did it or when (probably to make it match some utility-provided tool value). But we didn't catch it for months, until we had a job where the cost was 100% covered by the utility and the cost came out to $0.35.

This happened because it added 0.35 to 0 (the default value of an empty cell). If instead it printed NaN I would have ignored that price, and just put 0 in at a later calculation to prevent errors showing up in the final proposal. Then I would have missed the fudge factor someone sneaked in.

Finding a problem, diagnosing it, and fixing it all depend entirely on the situation. It's impossible to predict how people will behave or how they will write code to cope with the situation they have.

I think it's a wash in using either 0 or NaN for a default value when that value is incorrect. But I think in terms of frequency, a default value of 0 for a float that isn't explicitly initialized is 99% of the time correct, which means you will have less of these problems to find.

>>> I would start there to inspect variables, identify the NaNs.
>> What if you can't? What if it only happens randomly, and you don't just happen to have a debugger attached?
>> I'm not saying it's easier with 0, but just not any different.
>
> 0.0 is hardly a rare value, even in correct calculations. NaN is always wrong.

It's not rare because it's a very very common initial value.

>>> Then I would trace them in a debugger and go up the call chain until I find the location where it became NaN. Then I would identify the source which introduced the NaN and trace that back until I found its origin.
>> If you have a chance to use a debugger and it happens at that time.
>
> 0 initialization wouldn't make it better.

I will concede that if you have a debugger attached and can watch the things change in real time, seeing NaN show up can give you a better clue as to where the problem came from.

>>> The advantage I see in NaN is that it's always (instead of only almost always) immediately obvious that it's wrong whereas 0.0 can be valid or invalid so you need to figure out which one it is which requires an extra step.
>> It might be noticed that it's NaN. It also might not. It depends on how it's used.
>
> NaN propagates. 0.0 does not.

Someone has to look at it, to "obviously" see that it's wrong.

>> Either way, you need to find the source, and NaN doesn't help unless you want to start either instrumenting *all* code (possibly including code you don't control), or use a debugger (which isn't always possible).
>
> Such a situation is objectively worse with 0.0. Is instrumenting all the code to detect 0.0 going to work? Nope, too many false positives, as 0.0 is a common value for floating point numbers.

Either way, it's a mess. Better to just logically trace it based on where it's assigned from, instead of instrumenting, and trying to find NaNs in random places.

>> Can we have some kind of linting system that identifies NaNs that are used?
>
> I have no objection to a linter that flags default initializations of floating point values. It shouldn't be part of the D compiler, though.

Something with semantic capabilities has to be used to prove it's not set before being used. Is there anything besides the compiler front end that can do this?

-Steve

August 22, 2022

On 8/22/22 8:50 PM, Walter Bright wrote:
> On 8/20/2022 3:17 PM, Dukc wrote:
>> it goes against what C, C++ and C# (and probably many other languages) do so one is easily surprised.
>
> C and C++ initialize them to garbage unless it is a global/static variable. That's probably the worst option, as it results in Heisenbugs that are very hard to track down.

Agreed that C/C++ initializing with garbage is the worst option.

C# (as also mentioned) uses 0.

I'm curious what all the languages that actually use an initial value do?

-Steve

August 23, 2022

Did NaN replace any float/double values in memory?

August 23, 2022
On 8/23/22 02:08, claptrap wrote:
> On Monday, 22 August 2022 at 15:56:16 UTC, drug007 wrote:
>> On 8/22/22 18:04, claptrap wrote:
>>> On Sunday, 21 August 2022 at 17:56:58 UTC, drug007 wrote:
>>>> On 8/21/22 20:28, claptrap wrote:
>>>>> On Sunday, 21 August 2022 at 16:51:51 UTC, Walter Bright
>>>>
>>>> It will be noticed, but at what price? You've initialized all vars to 0, so how do you know that this particular initialization to zero is wrong?
>>>
>>> You don't initialise all variables to zero. I've just looked at some of my code, and in 4000 lines I found two default-initialised ints and maybe 50+ explicitly initialised. You're just inventing nonsense scenarios.
>>
>> But that is my point - not all variables are initialized to zero. That is the statement you call a nonsense scenario. Reread the post carefully.
> 
> "You've initialized all vars to 0"
> 
> I can only respond to what you write, (which was a nonsense scenario.)
> 

I'm sorry, but you have failed to reread the post above carefully, again. My point is that not all data is initialized to zero. That is the reason why floating point numbers should default to NaN: you can clearly see which data hasn't been initialized at all.

> 
>>> Occasionally you might have to do a bit of mental arithmetic, but not often, I'm seriously wondering why you think it's so hard?
>>>
>>
>> Just because I've done math calculations before? And no, I didn't mean mental arithmetic. I meant numerical matrix operations from inputs to outputs just to track down where was wrong zero initialization. In some cases zero initialization is invalid, for example covariance of random variables. But NaN is invalid always. That is its advantage.
> 
> I've been programming for over 30 years, mostly DSP and some numerical stuff, statistical analysis. My experience is that it's not a big deal.
> 

My experience is that it is a big enough deal.

> My point is people are massively overselling how much of a problem bad inits are to track down.
August 23, 2022

I have seen how polarized people's viewpoints are on this issue. This is why I have created a library to solve it. Now people who want floating points to init to 0 can have it!

https://code.dlang.org/packages/j_init
