bearophile can say "i told you so" (re uint->int implicit conv) (page 5) - D Programming Language Discussion Forum

April 02, 2013

Re: bearophile can say "i told you so" (re uint->int implicit conv)

Posted by Walter Bright
in reply to Andrei Alexandrescu

On 4/2/2013 12:47 PM, Andrei Alexandrescu wrote:
> I used to lean a lot more toward this opinion until I got to work on a C++
> codebase using signed integers as array sizes and indices. It's a pain all over
> the code - two tests instead of one or casts all over, more cases to worry
> about... changing the code to use unsigned throughout ended up being an
> improvement.

For example, with a signed array index, a bounds check is two comparisons rather than one.
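
A minimal sketch of the difference being described, assuming two hypothetical helper functions (neither comes from the thread or from druntime). With signed operands both ends of the range need a test; with unsigned operands the lower end holds by construction.

```d
// Hypothetical helpers for illustration only.

// Signed index and length: both ends of the range must be tested.
bool inBoundsSigned(int idx, int length)
{
    return idx >= 0 && idx < length;
}

// Unsigned index and length: idx cannot be negative, so one test suffices.
bool inBoundsUnsigned(uint idx, uint length)
{
    return idx < length;
}
```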

April 03, 2013

Re: bearophile can say "i told you so" (re uint->int implicit conv)

Posted by Steven Schveighoffer
in reply to Walter Bright

On Tue, 02 Apr 2013 16:32:21 -0400, Walter Bright <newshound2@digitalmars.com> wrote:

> On 4/2/2013 12:47 PM, Andrei Alexandrescu wrote:
>> I used to lean a lot more toward this opinion until I got to work on a C++
>> codebase using signed integers as array sizes and indices. It's a pain all over
>> the code - two tests instead of one or casts all over, more cases to worry
>> about... changing the code to use unsigned throughout ended up being an
>> improvement.
>
> For example, with a signed array index, a bounds check is two comparisons rather than one.

Why?

struct myArr
{
   int length;
   int opIndex(int idx) { if(cast(uint)idx >= cast(uint)length) throw new RangeError(); ...}
}

-Steve
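
The cast in opIndex above can be checked in isolation. The sketch below assumes two hypothetical free functions and a non-negative length; it only illustrates why a single unsigned comparison rejects the same indices as the two signed tests.

```d
// Hypothetical helpers for illustration only.

// Two explicit tests on signed values.
bool checkTwoTests(int idx, int length)
{
    return idx >= 0 && idx < length;
}

// One test after reinterpreting both values as unsigned.
// Assumes length >= 0. A negative idx wraps to a value above int.max,
// which is never below a non-negative length, so it is rejected.
bool checkOneCast(int idx, int length)
{
    return cast(uint) idx < cast(uint) length;
}
```

The casts are still there, but they sit in one function instead of at every call site.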

April 03, 2013

Re: bearophile can say "i told you so" (re uint->int implicit conv)

Posted by Andrei Alexandrescu
in reply to Steven Schveighoffer

On 4/2/13 11:10 PM, Steven Schveighoffer wrote:
> On Tue, 02 Apr 2013 16:32:21 -0400, Walter Bright
> <newshound2@digitalmars.com> wrote:
>
>> On 4/2/2013 12:47 PM, Andrei Alexandrescu wrote:
>>> I used to lean a lot more toward this opinion until I got to work on
>>> a C++
>>> codebase using signed integers as array sizes and indices. It's a
>>> pain all over
>>> the code - two tests instead of one or casts all over, more cases to
>>> worry
>>> about... changing the code to use unsigned throughout ended up being an
>>> improvement.
>>
>> For example, with a signed array index, a bounds check is two
>> comparisons rather than one.
>
> Why?
>
> struct myArr
> {
> int length;
> int opIndex(int idx) { if(cast(uint)idx >= cast(uint)length) throw new
> RangeError(); ...}
> }
>
> -Steve

As I said - either two tests or casts all over.

Andrei

April 03, 2013

Re: bearophile can say "i told you so" (re uint->int implicit conv)

Posted by Don
in reply to Andrei Alexandrescu

On Wednesday, 3 April 2013 at 03:26:54 UTC, Andrei Alexandrescu wrote:
> On 4/2/13 11:10 PM, Steven Schveighoffer wrote:
>> On Tue, 02 Apr 2013 16:32:21 -0400, Walter Bright
>> <newshound2@digitalmars.com> wrote:
>>
>>> On 4/2/2013 12:47 PM, Andrei Alexandrescu wrote:
>>>> I used to lean a lot more toward this opinion until I got to work on
>>>> a C++
>>>> codebase using signed integers as array sizes and indices. It's a
>>>> pain all over
>>>> the code - two tests instead of one or casts all over, more cases to
>>>> worry
>>>> about... changing the code to use unsigned throughout ended up being an
>>>> improvement.
>>>
>>> For example, with a signed array index, a bounds check is two
>>> comparisons rather than one.
>>
>> Why?
>>
>> struct myArr
>> {
>> int length;
>> int opIndex(int idx) { if(cast(uint)idx >= cast(uint)length) throw new
>> RangeError(); ...}
>> }
>>
>> -Steve
>
> As I said - either two tests or casts all over.
>
> Andrei

Yeah, but I think what this demonstrates is how useful a concept a positive integer type is. There's huge value in statically knowing that the sign bit is never set. Unfortunately, using uint for this purpose gives the wrong semantics, and introduces these signed/unsigned issues, which are basically silly.

Personally I suspect there aren't many uses for unsigned types of sizes other than the full machine word. In all the other sizes, a positive integer would be more useful.

April 03, 2013

Re: bearophile can say "i told you so" (re uint->int implicit conv)

Posted by Steven Schveighoffer
in reply to Andrei Alexandrescu

On Tue, 02 Apr 2013 23:26:54 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 4/2/13 11:10 PM, Steven Schveighoffer wrote:
>> On Tue, 02 Apr 2013 16:32:21 -0400, Walter Bright
>> <newshound2@digitalmars.com> wrote:
>>
>>> On 4/2/2013 12:47 PM, Andrei Alexandrescu wrote:
>>>> I used to lean a lot more toward this opinion until I got to work on
>>>> a C++
>>>> codebase using signed integers as array sizes and indices. It's a
>>>> pain all over
>>>> the code - two tests instead of one or casts all over, more cases to
>>>> worry
>>>> about... changing the code to use unsigned throughout ended up being an
>>>> improvement.
>>>
>>> For example, with a signed array index, a bounds check is two
>>> comparisons rather than one.
>>
>> Why?
>>
>> struct myArr
>> {
>> int length;
>> int opIndex(int idx) { if(cast(uint)idx >= cast(uint)length) throw new
>> RangeError(); ...}
>> }
>>
>> -Steve
>
> As I said - either two tests or casts all over.

But this is not "all over", it's in one place, for bounds checking.

I find that using unsigned int doesn't really hurt much, but it can make things awkward.

For example, it's better to do addition than subtraction:

for(int i = 0; i < arr.length - 1; ++i)
{
   if(arr[i] >= arr[i+1])
      throw new Exception("Not sorted!");
}

This has a bug, and is better written as:

for(int i = 0; i + 1 < arr.length; ++i)

These are the kinds of things that can get you into trouble.  With a signed length, both loops are equivalent, and we don't have that error.

I'm not sure which is better.  It feels to me that if you CAN achieve the correct performance (even if this means casting), but the default errs on the side of safety, that might be a better option.

-Steve
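
To make the bug concrete: arr.length is unsigned, so for an empty array arr.length - 1 wraps around to size_t.max and the first loop runs off the end. A minimal sketch of the addition form, assuming hypothetical names checkSorted and lengthMinusOne and returning a bool instead of throwing:

```d
// Hypothetical names, for illustration only.

// Sortedness check from the post, written with addition.
bool checkSorted(const int[] arr)
{
    // i + 1 < arr.length is simply false for lengths 0 and 1,
    // so the loop body never runs on arrays that short.
    for (size_t i = 0; i + 1 < arr.length; ++i)
    {
        if (arr[i] >= arr[i + 1])
            return false;
    }
    return true;
}

// What the subtraction form evaluates to for a given unsigned length.
size_t lengthMinusOne(size_t length)
{
    return length - 1;
}
```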

April 03, 2013

Re: bearophile can say "i told you so" (re uint->int implicit conv)

Posted by Steven Schveighoffer
in reply to Don

On Wed, 03 Apr 2013 07:33:05 -0400, Don <turnyourkidsintocash@nospam.com> wrote:

> Yeah, but I think what this demonstrates is how useful a concept a positive integer type is. There's huge value in statically knowing that the sign bit is never set. Unfortunately, using uint for this purpose gives the wrong semantics, and introduces these signed/unsigned issues, which are basically silly.
>
> Personally I suspect there aren't many uses for unsigned types of sizes other than the full machine word. In all the other sizes, a positive integer would be more useful.

Hm.. would it be useful to have a "guaranteed non-negative" integer type?  Like array length.  Then the compiler could make that assumption, and do something like what I did as an optimization?

Subtracting from that type would result in a plain-old int.

-Steve
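
One way to picture such a type is as a library wrapper. The sketch below is purely hypothetical (D has no such built-in, and NonNegative is an invented name): addition stays within the type, while subtraction hands back a plain int because the result may be negative.

```d
// Hypothetical wrapper type, for illustration only.
struct NonNegative
{
    private int value;

    this(int v)
    {
        assert(v >= 0, "NonNegative requires v >= 0");
        value = v;
    }

    int get() const { return value; }

    // The sum of two non-negative values is non-negative.
    // Overflow handling is left out of this sketch.
    NonNegative opBinary(string op : "+")(NonNegative rhs) const
    {
        return NonNegative(value + rhs.value);
    }

    // The difference may go below zero, so the result is a plain int.
    int opBinary(string op : "-")(NonNegative rhs) const
    {
        return value - rhs.value;
    }
}
```

A built-in version could go further and let the compiler use the non-negative guarantee when it emits bounds checks, which a library type cannot do.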

April 04, 2013

Re: bearophile can say "i told you so" (re uint->int implicit conv)

Posted by Don
in reply to Steven Schveighoffer

On Wednesday, 3 April 2013 at 14:54:03 UTC, Steven Schveighoffer wrote:
> On Wed, 03 Apr 2013 07:33:05 -0400, Don <turnyourkidsintocash@nospam.com> wrote:
>
>> Yeah, but I think what this demonstrates is how useful a concept a positive integer type is. There's huge value in statically knowing that the sign bit is never set. Unfortunately, using uint for this purpose gives the wrong semantics, and introduces these signed/unsigned issues, which are basically silly.
>>
>> Personally I suspect there aren't many uses for unsigned types of sizes other than the full machine word. In all the other sizes, a positive integer would be more useful.
>
> Hm.. would it be useful to have a "guaranteed non-negative" integer type?  Like array length.  Then the compiler could make that assumption, and do something like what I did as an optimization?
>
> Subtracting from that type would result in a plain-old int.
>
> -Steve

I think it would be extremely useful. I think "always positive" is a fundamental mathematical property that isn't captured by the type system. But I fear the heritage from C just has too much momentum.

One thing we could do immediately, without changing anything in the language definition at all, is add range propagation for array length.

So that, for any array A, A.length is in the range 0 .. (size_t.max/A[0].sizeof)
which would mean that unless A's element type is byte, ubyte, void, or char, the length is known to be a positive integer. And of course for a static array, the exact length is known.

Although that has very limited applicability (only works within a single expression), I think it might help quite a lot.
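
The bound above can be written down as compile-time constants. This is only a sketch of the arithmetic, not of how the compiler's range propagation is implemented; maxLength and lengthFitsSigned are hypothetical names.

```d
// Hypothetical names, for illustration only.

// Upper bound on the length of a T[] implied by the address space.
enum size_t maxLength(T) = size_t.max / T.sizeof;

// True when every possible length of a T[] is representable as a
// signed machine word (ptrdiff_t).
enum bool lengthFitsSigned(T) = maxLength!T <= cast(size_t) ptrdiff_t.max;
```

For any element type of two bytes or more the quotient is at most ptrdiff_t.max, so only the one-byte element types fall outside the signed range.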

April 04, 2013

Re: bearophile can say "i told you so" (re uint->int implicit conv)

Posted by Kagamin
in reply to Jonathan M Davis

On Tuesday, 2 April 2013 at 09:43:37 UTC, Jonathan M Davis wrote:
> Naturally, the biggest reason to have size_t be unsigned is so that you can
> access the whole address space

Length exists to limit access to memory. If you want unlimited access, use just a pointer.

> For some people though, it _is_ a big deal on 32-bit machines. For
> instance, IIRC, David Simcha needed 64-bit support for some of the stuff he was
> doing (biology stuff I think), because he couldn't address enough memory on a
> 32-bit machine to do what he was doing. And I know that one of the products
> where I work is going to have to move to 64-bit OS, because they're failing at
> keeping its main process' memory footprint low enough to work on a 32-bit box.
> Having a signed size_t would make it even worse. Granted, they're using C++,
> not D, but the issue is the same.

I'm afraid those applications are not tied to 32-bit ints. They just want a lot of memory because they have a lot of data. It means they want more than 4 gigs, so uint won't help in the slightest: it can't address more than 4 gigs, and applications will keep failing. There's a technology to use more than 4 gigs on a 32-bit system:
http://en.wikipedia.org/wiki/Address_Windowing_Extensions
but uint still has no advantage over int, as it still can't address all the needed memory (which is more than 4 gigs).

April 04, 2013

Re: bearophile can say "i told you so" (re uint->int implicit conv)

Posted by Jonathan M Davis
in reply to Kagamin

On Thursday, April 04, 2013 15:20:26 Kagamin wrote:
> I'm afraid those applications are not tied to 32-bit ints. They just want a lot of memory because they have a lot of data. It means they want more than 4 gigs, so uint won't help in the slightest: it can't address more than 4 gigs, and applications will keep failing.

It's a difference of a factor of 2. You can access twice as much memory with a uint as with an int. It's quite possible to need enough memory that an int wouldn't be enough and a uint would be. Of course, going 64-bit pretty much solves the problem, because you're not going to have enough memory to need anywhere near 64 bits of address space any time soon (and probably not ever), but uint _can_ make a difference on 32-bit machines, because it gives you twice as much memory to play around with.

- Jonathan M Davis
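
For concreteness, the factor of 2 works out as follows on a 32-bit target. The constant names are hypothetical; the values are just the number of distinct byte offsets each type can express.

```d
// Hypothetical constant names, for illustration only.
enum ulong GiB = 1UL << 30;

// Offsets 0 .. int.max inclusive: 2^31 bytes.
enum ulong signedLimit = cast(ulong) int.max + 1;

// Offsets 0 .. uint.max inclusive: 2^32 bytes.
enum ulong unsignedLimit = cast(ulong) uint.max + 1;
```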

April 04, 2013

Re: bearophile can say "i told you so" (re uint->int implicit conv)

Posted by Walter Bright
in reply to Steven Schveighoffer

On 4/2/2013 8:10 PM, Steven Schveighoffer wrote:
> On Tue, 02 Apr 2013 16:32:21 -0400, Walter Bright <newshound2@digitalmars.com>
> wrote:
>> For example, with a signed array index, a bounds check is two comparisons
>> rather than one.
>
> Why?
>
> struct myArr
> {
>     int length;
>     int opIndex(int idx) { if(cast(uint)idx >= cast(uint)length) throw new
> RangeError(); ...}
> }

Being able to cast to unsigned implies that the unsigned types exist. So no improvement.

Copyright © 1999-2021 by the D Language Foundation