March 31, 2013
Kagamin:

> I vaguely remember Walter said those diagnostics are mostly false positives. Though I don't remember whether it was about implicit conversions.

I agree several of them seem innocuous.

Bye,
bearophile
April 01, 2013
On Friday, 29 March 2013 at 05:34:07 UTC, Kagamin wrote:
> On Friday, 29 March 2013 at 01:18:03 UTC, Jonathan M Davis wrote:
>> On Thursday, March 28, 2013 15:11:02 H. S. Teoh wrote:
>>> Maybe it's time to introduce cast(signed) or cast(unsigned) to the
>>> language, as bearophile suggests?
>>
>> It's not terribly pretty, but you can always do this
>>
>> auto foo = cast(Unsigned!(typeof(var)))var;
>>
>> or
>>
>> auto bar = to!(Unsigned!(typeof(var)))(var);
>>
>> - Jonathan M Davis
>
> short signed(ushort n){ return cast(short)n; }
> int signed(uint n){ return cast(int)n; }
> long signed(ulong n){ return cast(long)n; }
>
> int n = va_arg!uint(_argptr).signed;

BTW phobos already has the function:
http://dlang.org/phobos/std_traits.html#.unsigned

I'm not sure if it's enough without a `signed` counterpart.
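
For reference, a generic version of the overloads quoted above can be sketched with `std.traits.Signed` and `std.traits.Unsigned`. The names `asSigned` and `asUnsigned` are hypothetical, chosen here only for illustration; this is a minimal sketch, not an existing Phobos function.

```d
import std.traits : Signed, Unsigned, isIntegral;

/// Reinterprets an integral value as the signed type of the same width.
/// `asSigned` is a hypothetical name used only for this sketch.
Signed!T asSigned(T)(T n) if (isIntegral!T)
{
    return cast(Signed!T) n;
}

/// Reinterprets an integral value as the unsigned type of the same width.
/// `asUnsigned` is likewise a hypothetical name.
Unsigned!T asUnsigned(T)(T n) if (isIntegral!T)
{
    return cast(Unsigned!T) n;
}

void main()
{
    uint big = 3_000_000_000u;
    auto s = big.asSigned();          // same bits, now typed as int
    static assert(is(typeof(s) == int));
    assert(s < 0);
    assert(s.asUnsigned() == big);    // the round trip preserves the bits

    ushort small = 65_535;
    auto t = small.asSigned();
    static assert(is(typeof(t) == short));
    assert(t == -1);
}
```

Because the target type is derived from the argument, the cast cannot accidentally change the width (for example long to uint), which is the risk of a hand-written cast.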
April 02, 2013
On Thursday, 28 March 2013 at 20:03:08 UTC, Adam D. Ruppe wrote:
> I was working on a project earlier today that stores IP addresses in a database as a uint. For some reason though, some addresses were coming out as 0.0.0.0, despite the fact that if(ip == 0) return; in the only place it actually saves them (which was my first attempted quick fix for the bug).
>
> Turns out the problem was this:
>
> if (arg == typeid(uint)) {
> 	int e = va_arg!uint(_argptr);
> 	a = to!string(e);
> }
>
>
> See, I copy/pasted it from the int check, but didn't update the type on the left hand side. So it correctly pulled a uint out of the varargs, but then assigned it to an int, which the compiler accepted silently, so to!string() printed -blah instead of bigblah... which then got truncated by the database, resulting in zero being stored.
>
> I've since changed it to be "auto e = ..." and it all works correctly now.
>
>
>
> Anyway I thought I'd share this just because one of the many times bearophile has talked about this as a potentially buggy situation, I was like "bah humbug"... and now I've actually been there!
>
> I still don't think I'm for changing the language though just because of potential annoyances in other places unsigned works (such as array.length) but at least I've actually felt the other side of the argument in real world code now.

IMHO, array.length is *the* place where unsigned does *not* work. size_t should be an integer. We're not supporting 16 bit systems, and the few cases where a size_t value can potentially exceed int.max could be disallowed.

The problem with unsigned is that it gets used as "positive integer", which it is not. I think it was a big mistake that D turned C's "unsigned long" into "ulong", thereby making it look more attractive. Nobody should be using unsigned types unless they have a really good reason. Unfortunately, size_t forces you to use them.
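
A small sketch of the kind of surprise an unsigned `array.length` can produce. The helper names `lastIndexUnsigned` and `lastIndexSigned` are hypothetical and exist only for this illustration.

```d
/// Computes length - 1 in the native (unsigned) type of array.length.
/// Hypothetical helper for illustration.
size_t lastIndexUnsigned(const int[] a)
{
    return a.length - 1;              // wraps to size_t.max for an empty array
}

/// Computes length - 1 after converting the length to a signed type.
/// Hypothetical helper for illustration.
ptrdiff_t lastIndexSigned(const int[] a)
{
    return cast(ptrdiff_t) a.length - 1;
}

void main()
{
    int[] empty;

    // The unsigned subtraction wraps instead of producing -1.
    assert(lastIndexUnsigned(empty) == size_t.max);

    // With a signed length the result is the -1 most people expect.
    assert(lastIndexSigned(empty) == -1);

    // Mixing a signed value with the unsigned length converts the signed
    // operand to unsigned first, so -1 compares as a huge number.
    int offset = -1;
    assert(offset > empty.length);
}
```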


April 02, 2013
On Tuesday, 2 April 2013 at 07:49:04 UTC, Don wrote:
[cut]
> IMHO, array.length is *the* place where unsigned does *not* work. size_t should be an integer. We're not supporting 16 bit systems, and the few cases where a size_t value can potentially exceed int.max could be disallowed.
>
> The problem with unsigned is that it gets used as "positive integer", which it is not. I think it was a big mistake that D turned C's "unsigned long" into "ulong", thereby making it look more attractive. Nobody should be using unsigned types unless they have a really good reason. Unfortunately, size_t forces you to use them.

You forgot something: an explanation why you feel that way..
I do consider unsigned int as "positive integer", why do you think that isn't the case?
IMHO the issues with unsigned are
1) implicit conversion: a C mistake, and an even worse mistake to copy it from C knowing that this will lead to many errors!
2) lack of overflow checks by default.
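
To make point 1 concrete, here is a minimal sketch of the silent uint to int conversion, using an IP address packed into a uint as in the bug that started this thread. The specific address is an assumption picked for illustration. `std.conv.to` is shown as one way to get a range-checked conversion instead.

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

void main()
{
    uint address = 3_232_235_777u;   // 192.168.1.1 packed into 32 bits (example value)

    int silent = address;            // accepted without a cast or a diagnostic
    assert(silent < 0);
    assert(to!string(silent) == "-1062731519");

    auto kept = address;             // `auto` keeps the uint type
    assert(to!string(kept) == "3232235777");

    // std.conv.to checks the value range and throws instead of wrapping.
    assertThrown!ConvOverflowException(to!int(address));
}
```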
April 02, 2013
On Tuesday, 2 April 2013 at 08:29:41 UTC, renoX wrote:
> On Tuesday, 2 April 2013 at 07:49:04 UTC, Don wrote:
> [cut]
>> IMHO, array.length is *the* place where unsigned does *not* work. size_t should be an integer. We're not supporting 16 bit systems, and the few cases where a size_t value can potentially exceed int.max could be disallowed.
>>
>> The problem with unsigned is that it gets used as "positive integer", which it is not. I think it was a big mistake that D turned C's "unsigned long" into "ulong", thereby making it look more attractive. Nobody should be using unsigned types unless they have a really good reason. Unfortunately, size_t forces you to use them.
>
> You forgot something: an explanation why you feel that way..
> I do consider unsigned int as "positive integer", why do you think that isn't the case?

You can actually see it from the name. An unsigned number is exactly that -- it's a value with *no sign*. That's quite different from a positive integer, which is a number where the sign is known to be positive.

If it has no sign, that means that the interpretation of the sign requires further information. For example, it may be the low digits of a multi-byte number. (In fact, in the Intel docs, multi-word operations are the primary reason for the existence of unsigned operations). It might also be a bag of bits.

Mathematically, a positive integer is Z+, just with a limited range. If an operation exceeds the range, it's really an overflow error, the representation has broken down.

A uint, however, is a value mod 2^^32, and follows completely normal modular arithmetic rules. It's the responsibility of the surrounding code to add meaning to it.

But very often, people use 'uint' when they really want an int whose sign bit is zero.

> IMHO the issues with unsigned are
> 1) implicit conversion: a C mistake, and an even worse mistake to copy it from C knowing that this will lead to many errors!
> 2) lack of overflow checks by default.

I'm not sure how (2) is relevant.
Note that overflow of unsigned operations is impossible. Only signed numbers can overflow. Unsigned numbers wrap instead, and this is not an error, it's the central feature of their semantics.
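
The modular behaviour, and the multi-word use case mentioned above, can be sketched as follows. `Wide` and `add` are hypothetical names for this illustration: a 64-bit value is held as two uint words, and the wrap of the low word is exactly what signals the carry.

```d
/// A 64-bit value stored as two 32-bit words (hypothetical type for illustration).
struct Wide
{
    uint lo;
    uint hi;
}

/// Adds two Wide values using only 32-bit unsigned arithmetic.
Wide add(Wide x, Wide y)
{
    Wide r;
    r.lo = x.lo + y.lo;                          // wraps modulo 2^^32 by design
    immutable uint carry = r.lo < x.lo ? 1 : 0;  // a wrapped sum is smaller than its operand
    r.hi = x.hi + y.hi + carry;
    return r;
}

void main()
{
    uint top = uint.max;
    assert(top + 1 == 0);             // (2^^32 - 1) + 1 is 0 mod 2^^32

    uint zero = 0;
    assert(zero - 1 == uint.max);     // 0 - 1 is 2^^32 - 1 mod 2^^32

    // The wrap in the low word produces the carry into the high word.
    assert(add(Wide(uint.max, 0), Wide(1, 0)) == Wide(0, 1));
}
```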
April 02, 2013
On Tuesday, April 02, 2013 09:49:03 Don wrote:
> On Thursday, 28 March 2013 at 20:03:08 UTC, Adam D. Ruppe wrote:
> > I was working on a project earlier today that stores IP addresses in a database as a uint. For some reason though, some addresses were coming out as 0.0.0.0, despite the fact that if(ip == 0) return; in the only place it actually saves them (which was my first attempted quick fix for the bug).
> > 
> > Turns out the problem was this:
> > 
> > if (arg == typeid(uint)) {
> > 	int e = va_arg!uint(_argptr);
> > 	a = to!string(e);
> > }
> > 
> > 
> > See, I copy/pasted it from the int check, but didn't update the type on the left hand side. So it correctly pulled a uint out of the varargs, but then assigned it to an int, which the compiler accepted silently, so to!string() printed -blah instead of bigblah... which then got truncated by the database, resulting in zero being stored.
> > 
> > I've since changed it to be "auto e = ..." and it all works correctly now.
> > 
> > 
> > 
> > Anyway I thought I'd share this just because one of the many times bearophile has talked about this as a potentially buggy situation, I was like "bah humbug"... and now I've actually been there!
> > 
> > I still don't think I'm for changing the language though just because of potential annoyances in other places unsigned works (such as array.length) but at least I've actually felt the other side of the argument in real world code now.
> 
> IMHO, array.length is *the* place where unsigned does *not* work. size_t should be an integer. We're not supporting 16 bit systems, and the few cases where a size_t value can potentially exceed int.max could be disallowed.
> 
> The problem with unsigned is that it gets used as "positive integer", which it is not. I think it was a big mistake that D turned C's "unsigned long" into "ulong", thereby making it look more attractive. Nobody should be using unsigned types unless they have a really good reason. Unfortunately, size_t forces you to use them.

Naturally, the biggest reason to have size_t be unsigned is so that you can access the whole address space, though on 64-bit machines, that's not particularly relevant, since you're obviously not going to have a machine with that much RAM (you're extremely unlikely to even have a machine with that much hard drive space, though I think that I've heard of some machines existing which have run into that problem on 64-bit machines as crazy as that would be). For some people though, it _is_ a big deal on 32-bit machines. For instance, IIRC, David Simcha needed 64-bit support for some of the stuff he was doing (biology stuff I think), because he couldn't address enough memory on a 32-bit machine to do what he was doing. And I know that one of the products where I work is going to have to move to a 64-bit OS, because they're failing at keeping its main process' memory footprint low enough to work on a 32-bit box. Having a signed size_t would make it even worse. Granted, they're using C++, not D, but the issue is the same.

So, it's arguably important on 32-bit machines that size_t be unsigned, but 64-bit doesn't really have that excuse. However, making size_t unsigned on 32-bit machines and signed on 64-bit machines would create its own set of problems, and I suspect that would be an even worse idea than making size_t signed on 64-bit machines.

I do agree though that in general, unsigned types should be used with discretion, and they tend to be overused IMHO. I'm not convinced that that's the case with size_t though, since 32-bit machines do make it a necessity sometimes.

- Jonathan M Davis
April 02, 2013
Don:

> But very often, people use 'uint' when they really want an int whose sign bit is zero.

Sometimes you need the modular nature of unsigned values, and some other times you just need an integer that according to the logic of the program never gets negative and you want the full range of a word, not throwing away one bit, but you don't want it to wrap-around. In programs I'd like to use:

1) integers of various sizes (with error if you try to go outside their range);
2) subranges of 1 (with error if you try to go outside their range);
3) unsigned integers of various sizes (with error if you try to go outside their range);
4) subranges of 3 (with error if you try to go outside their range);
5) unsigned integers with wrap-around;
6) multi-precision integers.

Bye,
bearophile
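
Item 3 in the list above (unsigned integers that report an error instead of wrapping) can be approximated in library code. The sketch below assumes a druntime that ships `core.checkedint`, which provides `addu` and `subu` with an overflow flag; `checkedAdd` and `checkedSub` are hypothetical wrapper names. Item 5 is what the built-in uint already does.

```d
import core.checkedint : addu, subu;
import std.exception : assertThrown;

/// Adds two uints and throws instead of wrapping.
/// `checkedAdd` is a hypothetical helper name.
uint checkedAdd(uint x, uint y)
{
    bool overflow = false;
    immutable r = addu(x, y, overflow);   // sets the flag when the sum wraps
    if (overflow)
        throw new Exception("uint addition out of range");
    return r;
}

/// Subtracts two uints and throws instead of wrapping.
/// `checkedSub` is a hypothetical helper name.
uint checkedSub(uint x, uint y)
{
    bool overflow = false;
    immutable r = subu(x, y, overflow);   // sets the flag when x < y
    if (overflow)
        throw new Exception("uint subtraction out of range");
    return r;
}

void main()
{
    assert(checkedAdd(1, 2) == 3);
    assert(checkedSub(5, 5) == 0);
    assertThrown(checkedAdd(uint.max, 1));
    assertThrown(checkedSub(0, 1));
}
```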
April 02, 2013
On Friday, 29 March 2013 at 19:29:21 UTC, Jonathan M Davis wrote:
> No. -w makes it so that warnings are errors, so you generally can't make
> anything a warning unless you're willing for it to be treated as an error at
> least some of the time (and a lot of people compile with -w), and this sort of
> thing is _supposed_ to work without a warning - primarily because if it
> doesn't, you're forced to cast all over the place when you're dealing with
> both signed and unsigned types, and the casts actually make your code more
> error-prone, because you could end up casting something other than uint to int
> or int to uint by accident (e.g. long to uint) and end up with bugs due to
> that.
This reason alone ain't good enough to justify the implicit cast
from unsigned to signed and vice-versa. When I sum 2 short values
I am forced to manually cast the result to short if I want to
assign it to a short variable. Isn't that prone to errors, too?
Yet the compiler forces me to cast. I really think we should
eliminate this discrepancy.
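
The discrepancy can be shown directly with `__traits(compiles, ...)`. This is a minimal sketch of the two cases side by side; the variable names are arbitrary.

```d
void main()
{
    short a = 1, b = 2;

    // The sum of two shorts has type int, so narrowing it back needs a cast.
    static assert(is(typeof(a + b) == int));
    static assert(!__traits(compiles, { short t = a + b; }));
    short c = cast(short)(a + b);
    assert(c == 3);

    // Converting between uint and int needs no cast, although it can
    // change the value just as drastically.
    uint u = uint.max;
    static assert(__traits(compiles, { int j = u; }));
    int i = u;
    assert(i == -1);
}
```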
April 02, 2013
On Tuesday, 2 April 2013 at 09:43:37 UTC, Jonathan M Davis wrote:
> On Tuesday, April 02, 2013 09:49:03 Don wrote:
>> On Thursday, 28 March 2013 at 20:03:08 UTC, Adam D. Ruppe wrote:
>> > I was working on a project earlier today that stores IP
>> > addresses in a database as a uint. For some reason though, some
>> > addresses were coming out as 0.0.0.0, despite the fact that
>> > if(ip == 0) return; in the only place it actually saves them
>> > (which was my first attempted quick fix for the bug).
>> > 
>> > Turns out the problem was this:
>> > 
>> > if (arg == typeid(uint)) {
>> > 	int e = va_arg!uint(_argptr);
>> > 	a = to!string(e);
>> > }
>> > 
>> > 
>> > See, I copy/pasted it from the int check, but didn't update the
>> > type on the left hand side. So it correctly pulled a uint out
>> > of the varargs, but then assigned it to an int, which the
>> > compiler accepted silently, so to!string() printed -blah
>> > instead of bigblah... which then got truncated by the database,
>> > resulting in zero being stored.
>> > 
>> > I've since changed it to be "auto e = ..." and it all works
>> > correctly now.
>> > 
>> > 
>> > 
>> > Anyway I thought I'd share this just because one of the many
>> > times bearophile has talked about this as a potentially buggy
>> > situation, I was like "bah humbug"... and now I've actually
>> > been there!
>> > 
>> > I still don't think I'm for changing the language though just
>> > because of potential annoyances in other places unsigned works
>> > (such as array.length) but at least I've actually felt the
>> > other side of the argument in real world code now.
>> 
>> IMHO, array.length is *the* place where unsigned does *not* work.
>> size_t should be an integer. We're not supporting 16 bit systems,
>> and the few cases where a size_t value can potentially exceed
>> int.max could be disallowed.
>> 
>> The problem with unsigned is that it gets used as "positive
>> integer", which it is not. I think it was a big mistake that D
>> turned C's "unsigned long" into "ulong", thereby making it look
>> more attractive. Nobody should be using unsigned types unless
>> they have a really good reason. Unfortunately, size_t forces you
>> to use them.
>
> Naturally, the biggest reason to have size_t be unsigned is so that you can
> access the whole address space, though on 64-bit machines, that's not
> particularly relevant, since you're obviously not going to have a machine with
> that much RAM (you're extremely unlikely to even have a machine with that much
> hard drive space, though I think that I've heard of some machines existing
> which have run into that problem on 64-bit machines as crazy as that would
> be). For some people though, it _is_ a big deal on 32-bit machines. For
> instance, IIRC, David Simcha needed 64-bit support for some of the stuff he was
> doing (biology stuff I think), because he couldn't address enough memory on a
> 32-bit machine to do what he was doing. And I know that one of the products
> where I work is going to have to move to a 64-bit OS, because they're failing at
> keeping its main process' memory footprint low enough to work on a 32-bit box.
> Having a signed size_t would make it even worse. Granted, they're using C++,
> not D, but the issue is the same.

My feeling is that since the 16 bit days, using more than half of the address space is such an unusual activity that it deserves special treatment in the code.
I don't think it's unreasonable to require a cast for every use of those super-sized sizes.
Even if you have an array which doesn't fit into an int, you can only have one such array in your program!

This really, really obscure corner case doesn't deserve to be polluting the language.
All those signed/unsigned issues basically come from it. It's a helluva price to pay.

It's looking like an even worse deal now, because anybody with large memory requirements will be on 64 bits. We've made this sacrifice for the sake of a situation that is no longer relevant.

April 02, 2013
On 4/2/13 3:49 AM, Don wrote:
> IMHO, array.length is *the* place where unsigned does *not* work. size_t
> should be an integer. We're not supporting 16 bit systems, and the few
> cases where a size_t value can potentially exceed int.max could be
> disallowed.
>
> The problem with unsigned is that it gets used as "positive integer",
> which it is not. I think it was a big mistake that D turned C's
> "unsigned long" into "ulong", thereby making it look more attractive.
> Nobody should be using unsigned types unless they have a really good
> reason. Unfortunately, size_t forces you to use them.

I used to lean a lot more toward this opinion until I got to work on a C++ codebase using signed integers as array sizes and indices. It's a pain all over the code - two tests instead of one or casts all over, more cases to worry about... changing the code to use unsigned throughout ended up being an improvement.

Andrei
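
The "two tests instead of one" point can be sketched as a pair of bounds checks. The function names `inBoundsSigned` and `inBoundsUnsigned` are hypothetical, and the code is written in D rather than the C++ of the codebase mentioned above.

```d
/// Bounds check with a signed index: both ends must be tested.
/// Hypothetical helper for illustration.
bool inBoundsSigned(ptrdiff_t index, ptrdiff_t length)
{
    return index >= 0 && index < length;
}

/// Bounds check with an unsigned index: one comparison suffices, because
/// a "negative" index wraps to a huge value that fails the test.
/// Hypothetical helper for illustration.
bool inBoundsUnsigned(size_t index, size_t length)
{
    return index < length;
}

void main()
{
    ptrdiff_t i = -1;
    assert(!inBoundsSigned(i, 10));

    // Converted to unsigned, -1 becomes size_t.max and fails the single test.
    assert(!inBoundsUnsigned(cast(size_t) i, 10));

    assert(inBoundsSigned(9, 10) && inBoundsUnsigned(9, 10));
}
```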