I think Associative Array should throw Exception
September 01, 2020
This is going to be a hard one for me to argue but I'm going to give it a try.

Today if you attempt to access a key from an associative array (AA) that does not exist inside the array, a RangeError is thrown. This is similar to when an array is accessed outside the bounds.

```
string[string] dict;
auto s = dict["Hello"]; // RangeError: key is not in the AA

string[] arr;
auto t = arr[6]; // RangeError: index is out of bounds
```

There are many things written[1][2] on the difference between Exception and Error.

* Errors are for programming bugs and Exceptions are environmental
* Exceptions can be caught, Errors shouldn't be (stack may not unwind)
* Exceptions are recoverable, Errors aren't recoverable
* Errors can live in nothrow, Exceptions can't
* Always verify input

I don't have an issue with the normal array RangeError; there is a clear basis for claiming that an out-of-bounds access is a programming bug. However, associative arrays tend to have both the key and the value as "input."

Trying to create a contract for any given method to validate the AA as input becomes cumbersome. I would say it is analogous to `find` throwing an error if it didn't find the value requested.

Using RangeError is nice as it allows code to use array indexing inside `nothrow`. But again, can we really review code and say this will be the case? We'd have to enforce that all access to associative arrays is done using `in` and checked for null. Then what use is the [] syntax?

Is it recoverable? I would say yes. We aren't actually trying to access memory outside the application's ownership, and we haven't put the system into a critical state (such as running out of memory). A higher-level portion of the code could easily decide to take a different path when its call fails.

"if exceptions are thrown for errors instead, the programmer has to deliberately add code if he wishes to ignore the error."

1. https://stackoverflow.com/questions/5813614/what-is-difference-between-errors-and-exceptions
2. https://forum.dlang.org/thread/m8tkfm$ret$1@digitalmars.com
September 01, 2020
On 9/1/20 2:20 PM, Jesse Phillips wrote:

> Using RangeError is nice as it allows code to use array index inside `nothrow.`

This is the big sticking point -- code that is nothrow would no longer be able to use AAs. It makes the idea, unfortunately, a non-starter.

What is wrong with using `in`? I use this mostly:

if(auto v = key in aa) { /* use v */ }

Note that in certain cases, I want to turn *normal* array access errors into exceptions, because I always want bounds checking on, but I don't want a (caught) programming error to bring down my whole vibe.d server.

So I created a simple wrapper around arrays which throws exceptions on out-of-bounds access. You could do a similar thing with AAs. It's just that the declaration syntax isn't as nice.
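
A minimal sketch of what such a wrapper might look like for AAs. This is not the code referred to above; `CheckedAA` and its members are hypothetical names chosen for illustration, and only lookup and insertion are covered:

```d
import std.exception : assertThrown, enforce;

/// Hypothetical wrapper (an assumption, not code from this thread):
/// an AA whose failed lookups throw a catchable Exception
/// instead of a RangeError.
struct CheckedAA(K, V)
{
    V[K] data;

    /// Lookup that throws Exception when the key is missing.
    ref V opIndex(K key)
    {
        auto p = key in data;
        enforce(p !is null, "key not found");
        return *p;
    }

    /// Insertion simply forwards to the built-in AA.
    void opIndexAssign(V value, K key)
    {
        data[key] = value;
    }
}

void main()
{
    CheckedAA!(string, int) ages;
    ages["alice"] = 30;
    assert(ages["alice"] == 30);

    // A missing key is now an ordinary Exception the caller may handle.
    assertThrown!Exception(ages["bob"]);
}
```

The declaration is wordier than `int[string]`, which is the trade-off mentioned above.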

-Steve
September 01, 2020
On Tuesday, 1 September 2020 at 18:55:20 UTC, Steven Schveighoffer wrote:
> This is the big sticking point -- code that is nothrow would no longer be able to use AAs. It makes the idea, unfortunately, a non-starter.

You could always catch it though.

But I kinda like things the way they are exactly because you can check with the in operator ahead of time. Or the .get helper method is pretty convenient too.
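
For reference, a short sketch of both approaches, the `in` operator and the `.get` helper, using a made-up `counts` table:

```d
void main()
{
    // Hypothetical data for illustration only.
    int[string] counts = ["apples": 3];

    // `in` yields a pointer to the value, or null when the key is absent.
    if (auto p = "apples" in counts)
        assert(*p == 3);
    assert(("pears" in counts) is null);

    // `.get` returns the supplied default instead of throwing.
    assert(counts.get("apples", 0) == 3);
    assert(counts.get("pears", 0) == 0);
}
```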
September 01, 2020
On 9/1/20 2:55 PM, Steven Schveighoffer wrote:
> On 9/1/20 2:20 PM, Jesse Phillips wrote:
> 
>> Using RangeError is nice as it allows code to use array index inside `nothrow.`
> 
> This is the big sticking point -- code that is nothrow would no longer be able to use AAs. It makes the idea, unfortunately, a non-starter.
...
> -Steve

Steve, are there not several (probably better, faster) alternatives to the built-in AA that are nothrow? I think a nice way to look at the built-in AA is as an easy default for quick scripts, new users, etc., much like how a function or code block may throw by default unless it is marked `nothrow`.

Advanced users (i.e., those using the `nothrow` annotation) could select a more efficient AA implementation anyway.
September 02, 2020
On 9/1/20 10:46 PM, James Blachly wrote:
> On 9/1/20 2:55 PM, Steven Schveighoffer wrote:
>> On 9/1/20 2:20 PM, Jesse Phillips wrote:
>>
>>> Using RangeError is nice as it allows code to use array index inside `nothrow.`
>>
>> This is the big sticking point -- code that is nothrow would no longer be able to use AAs. It makes the idea, unfortunately, a non-starter.
> ....
> 
> Steve, are there not several (probably better, faster) alternatives to the built-in AA that are nothrow? I think a nice way to look at the built-in AA is an easy default for quick scripts, new users, etc., much like the default of `throw` status of a function or code block.
> 
> Advanced users, (i.e. those using nothrow annotation) could select a more efficient AA implementation anyway.

The problem is not the requirement but the resulting code breakage if you change it now.

-Steve
September 03, 2020
On Tuesday, 1 September 2020 at 18:55:20 UTC, Steven Schveighoffer wrote:
> On 9/1/20 2:20 PM, Jesse Phillips wrote:
>
>> Using RangeError is nice as it allows code to use array index inside `nothrow.`
>
> This is the big sticking point -- code that is nothrow would no longer be able to use AAs. It makes the idea, unfortunately, a non-starter.
>
> What is wrong with using `in`? I use this mostly:
>
> if(auto v = key in aa) { /* use v */ }
>

I think that actually might be my point. If you need nothrow then this is what you need to do.

For breaking nothrow code using the [] syntax, I'd say it is already broken because the behavior is to throw and the above is how you would check that it won't.

The issue is that associative arrays throw an "uncatchable" error, which means code gets written to catch the error anyway (because doing so works). And correctly written `nothrow` code needs to use `in` to be properly nothrow.
September 03, 2020
On 9/3/20 10:43 AM, Jesse Phillips wrote:
> On Tuesday, 1 September 2020 at 18:55:20 UTC, Steven Schveighoffer wrote:
>> On 9/1/20 2:20 PM, Jesse Phillips wrote:
>>
>>> Using RangeError is nice as it allows code to use array index inside `nothrow.`
>>
>> This is the big sticking point -- code that is nothrow would no longer be able to use AAs. It makes the idea, unfortunately, a non-starter.
>>
>> What is wrong with using `in`? I use this mostly:
>>
>> if(auto v = key in aa) { /* use v */ }
>>
> 
> I think that actually might be my point. If you need nothrow then this is what you need to do.
> 
> For breaking nothrow code using the [] syntax, I'd say it is already broken because the behavior is to throw and the above is how you would check that it won't.

int[int] aa;
aa[4] = 5;
auto b = aa[4];

How is this code broken? It's valid, will never throw, and there's no reason that we should break it by adding an exception into the mix.

> The issue is, associative arrays throw an "uncatchable" error. Meaning code is written to catch the error (because it works). And correctly written `nothrow` code needs to use `in` to be properly nothrow.

The big issue is -- is accessing an invalid index a programming error or an environmental error? The answer is -- it depends. D has declared that if you use the indexing syntax, it's a programming error. If you don't want it to be a programming error, you use the `key in aa` syntax and handle the missing key yourself.

The other thing you can do is use a different type, if you don't want to deal with the verbose syntax, but still want to catch environmental errors. A wrapper type is possible.

-Steve
September 04, 2020
On Thursday, 3 September 2020 at 15:12:14 UTC, Steven Schveighoffer wrote:
> int[int] aa;
> aa[4] = 5;
> auto b = aa[4];
>
> How is this code broken? It's valid, will never throw, and there's no reason that we should break it by adding an exception into the mix.
>

import std.conv : to;

int foo() nothrow {
    return "1".to!int;
}

The code above is valid and will never throw, so why does the compiler prevent it?
September 04, 2020
On 9/4/20 1:48 AM, Jesse Phillips wrote:
> On Thursday, 3 September 2020 at 15:12:14 UTC, Steven Schveighoffer wrote:
>> int[int] aa;
>> aa[4] = 5;
>> auto b = aa[4];
>>
>> How is this code broken? It's valid, will never throw, and there's no reason that we should break it by adding an exception into the mix.
>>
> 
> int foo() nothrow {
>      return "1".to!int;
> }
> 
> The following code is valid, will never throw, why does the compiler prevent it?

You are still missing the point ;)

Your example doesn't compile today. Mine does. It's not a question of which way is better, but that we already have code that depends on the chosen solution, and changing now means breaking all such existing code.

My point of bringing up the example is that your assertion that "it is already broken" isn't true.

To put it another way, if the above to!int call compiled, and we switched to exceptions, it would be the same problem, even if the right choice is to use Exceptions.

-Steve
September 04, 2020
On Tuesday, 1 September 2020 at 18:20:17 UTC, Jesse Phillips wrote:
> This is going to be a hard one for me to argue but I'm going to give it a try.
>
> Today if you attempt to access a key from an associative array (AA) that does not exist inside the array, a RangeError is thrown. This is similar to when an array is accessed outside the bounds.
>
> [...]
>
> I don't have an issue with the normal array RangeError, there is a clear means for claiming your access is a programming bug. However associative arrays tend to have both the key and value as "input."
>
> [...]
>
> Is it recoverable? I would say yes. We aren't actually trying to access memory outside the application ownership, we haven't put the system state into a critical situation (out of memory). And a higher portion of the code could easily decide to take a different path due to the failure of its call.

Any time you have an operation that can only succeed if some precondition is met, there are two possible ways you can implement it:

1. Make it the caller's responsibility to check the precondition.
2. Make it the function's responsibility to check the precondition.

If you have version #1, you can always use it to implement version #2, but the converse is not true. So, while you would ideally provide both versions and let the user choose the one they prefer, you should always *at least* provide version #1.

In this case, for D's associative arrays, the [] operator is version #1. You could make a reasonable case that the [] operator should have been reserved for version #2, and version #1 should have been named something else, but at this point, it's not worth breaking backwards compatibility to change it.
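
As a rough sketch of the first half of that claim, here is version #2 built on top of version #1. The helper name `at` is hypothetical and not part of druntime or Phobos:

```d
import std.exception : assertThrown;

/// Hypothetical helper (an assumption for illustration): "version #2".
/// It checks the precondition itself and reports failure with an
/// Exception, then defers to the built-in [] lookup ("version #1").
V at(K, V)(V[K] aa, K key)
{
    if (key !in aa)
        throw new Exception("key not found");
    return aa[key]; // the precondition is known to hold here
}

void main()
{
    int[string] aa = ["a": 1];
    assert(aa.at("a") == 1);

    // The failure is now a recoverable Exception for the caller.
    assertThrown!Exception(aa.at("b"));
}
```

Going the other way is not possible in the same sense: a caller who has already verified the key cannot remove the check that a version #2 function performs internally.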