Thread overview
[Issue 3850] New: Signed/unsigned bytes type name
Oct 22, 2012
Andrej Mitrovic
Oct 22, 2012
Don
Oct 22, 2012
Andrej Mitrovic
Oct 23, 2012
Don
Oct 23, 2012
Daniel Kozak
Oct 23, 2012
Daniel Kozak
Oct 23, 2012
Daniel Kozak
February 24, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=3850

           Summary: Signed/unsigned bytes type name
           Product: D
           Version: 2.040
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody@puremagic.com
        ReportedBy: bearophile_hugs@eml.cc


--- Comment #0 from bearophile_hugs@eml.cc 2010-02-24 02:33:14 PST ---
While programming in D I have noticed that it is easy to forget that "byte" is signed, because I normally think of bytes as unsigned entities (and other people share this view). The situation is similar, but not identical, to that of signed and unsigned chars in C.

There are several ways to solve this small problem. One of the simplest I can think of is to deprecate the "byte" type name and introduce an "sbyte" type name that replaces it. With the name sbyte it is probably much easier to remember that the value is signed.

This introduces an inconsistency in the naming scheme of D integral types (they are currently symmetric: byte/ubyte, int/uint, etc.), but it can help avoid some bugs, especially for D newcomers.
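
A minimal sketch of the pitfall described above, assuming a raw octet with the high bit set ends up in a `byte`. The helper names `widenAsByte` and `widenAsUbyte` are hypothetical and exist only for this illustration:

```d
import std.stdio : writeln;

// Hypothetical helper: stores a raw octet in a `byte`, then widens it to int.
int widenAsByte(ubyte raw)
{
    byte b = cast(byte) raw;   // same bits, now interpreted as signed
    return b;                  // integer promotion sign-extends
}

// Same widening, but through the unsigned type.
int widenAsUbyte(ubyte raw)
{
    return raw;                // zero-extends
}

void main()
{
    assert(widenAsByte(0xC8) == -56);   // 200 does not fit in -128 .. 127
    assert(widenAsUbyte(0xC8) == 200);
    assert(widenAsByte(0x7F) == 127);   // values below 128 are unaffected
    writeln(widenAsByte(0xC8), " vs ", widenAsUbyte(0xC8));
}
```

The two helpers differ only in the intermediate type, which is why the mistake is easy to overlook in real code.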

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
March 15, 2010
http://d.puremagic.com/issues/show_bug.cgi?id=3850



--- Comment #1 from bearophile_hugs@eml.cc 2010-03-14 18:19:24 PDT ---
The signed/unsigned bytes in C# are:
- The sbyte type represents signed 8-bit integers with values between -128 and 127.
- The byte type represents unsigned 8-bit integers with values between 0 and 255.

Choosing ubyte/sbyte is acceptable too.
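
For comparison, the same two ranges under D's current names can be checked at compile time. This is only a sketch of today's naming, not of the proposed change:

```d
// D's current names: `byte` is the signed type, `ubyte` the unsigned one.
static assert(byte.min == -128 && byte.max == 127);
static assert(ubyte.min == 0 && ubyte.max == 255);
static assert(byte.sizeof == 1 && ubyte.sizeof == 1);

void main() {}
```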

October 22, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=3850


Andrej Mitrovic <andrej.mitrovich@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrej.mitrovich@gmail.com


--- Comment #2 from Andrej Mitrovic <andrej.mitrovich@gmail.com> 2012-10-21 19:52:53 PDT ---
Although I agree with you, I think it's way too late to fix this without breaking tons of code. You can always use an alias in your own code. Adding it to Phobos would probably be unwise too (people would ask what the difference between byte and sbyte is).
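
A sketch of the "alias in your own code" approach mentioned above. `sbyte` is not a built-in D name; here it is a user-level alias, so it documents intent without creating a distinct type:

```d
// User-level alias; `sbyte` is not a built-in D type name.
alias sbyte = byte;

static assert(is(sbyte == byte));   // an alias, not a distinct type
static assert(sbyte.min == -128 && sbyte.max == 127);

void main()
{
    sbyte delta = -1;
    byte same = delta;   // no cast needed: both names denote one type
    assert(same == -1);
}
```

Because the alias and `byte` are the same type, the compiler cannot flag a place where `byte` was typed by mistake for `ubyte`; the alias only helps readers.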

October 22, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=3850


Don <clugdbug@yahoo.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |clugdbug@yahoo.com.au


--- Comment #3 from Don <clugdbug@yahoo.com.au> 2012-10-22 02:02:18 PDT ---
This is not a newbie issue. I make this mistake myself, fairly often. *Walter* made this mistake once, in the header generation tool! My experience is that 90% of uses of "byte" should instead be "ubyte". It is really, really unusual to be using signed bytes.

I wish we could change this. (I would do it by changing the type to "sbyte" and then adding "alias byte = sbyte;" to object.d).

October 22, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=3850



--- Comment #4 from Andrej Mitrovic <andrej.mitrovich@gmail.com> 2012-10-22 08:58:02 PDT ---
(In reply to comment #3)
> This is not a newbie issue. I make this mistake myself, fairly often.

Absolutely, it happens to me all the time as well.

> I wish we could change this. (I would do it by changing the type to "sbyte" and then adding "alias byte = sbyte;" to object.d).

That still won't prevent you from making the mistake of typing 'byte' instead of 'ubyte' though. :)

October 22, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=3850



--- Comment #5 from bearophile_hugs@eml.cc 2012-10-22 09:52:06 PDT ---
(In reply to comment #4)

> That still won't prevent you from making the mistake of typing 'byte' instead of 'ubyte' though. :)

If you have sbyte and ubyte, and you keep using them consistently, I think this alone helps reduce mistakes a little.

And once a few years have passed, using "byte" is considered a bad idiom, and D programs in the wild use "byte" less and less, we can even consider deprecating it.

There is a ton of C++ code that represents null as "0", yet C++11 has nullptr, and G++ from version 4.7 has a warning (-Wzero-as-null-pointer-constant) that finds uses of "0" as a null pointer.

The most important thing is the desire to improve the situation; once that is there, slow deprecation paths exist.

October 23, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=3850



--- Comment #6 from Don <clugdbug@yahoo.com.au> 2012-10-23 03:15:33 PDT ---
>> I wish we could change this. (I would do it by changing the type to "sbyte" and then adding "alias byte = sbyte;" to object.d).

> That still won't prevent you from making the mistake of typing 'byte' instead of 'ubyte' though. :)

By itself, no, but anybody can modify their local copy of object.d to remove
the alias...
A very slow deprecation path is possible.

October 23, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=3850


Daniel Kozak <kozzi11@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kozzi11@gmail.com


--- Comment #7 from Daniel Kozak <kozzi11@gmail.com> 2012-10-23 07:02:09 PDT ---
(In reply to comment #0)
> While programming in D I have seen that you can forget that the "byte" is signed. (Because normally I think of bytes as unsigned entities. Other people share the same idea). (It's similar but not equal to the situation of signed and unsigned chars in C).
> 
> There are several ways to solve this small problem. One of the simpler ways I can think of is to deprecate the "byte" type name and introduce a "sbyte" type name (that replaces the "byte" type name). Using a sbyte it's probably quite more easy to not forget that it's a signed value.
> 
> This introduces an inconstancy in the naming scheme of D integral values (they are now symmetric, ubyte, byte, int, uint, etc), but it can help avoid some bugs, especially from D newbies.

I think byte should be unsigned by default. So I am for sbyte (signed byte; does anyone really need it?) and byte (unsigned byte).

October 23, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=3850



--- Comment #8 from bearophile_hugs@eml.cc 2012-10-23 09:32:37 PDT ---
(In reply to comment #7)

> I think byte should be unsigned by default. So I am for sbyte(signed byte - Is
> there really anyone who need it?) and byte (unsigned byte)

Ideally I agree with you. In practice D built-in types are prefixed by "u" when unsigned, so a more practical solution is one close to the C# one, that is, using the "ubyte" and "sbyte" pair of names.

Regarding the usefulness of signed bytes: small data types like ubyte, sbyte, short, ushort and even float are mostly useful in aggregates, like arrays and arrays of structs. They are not so useful if you need only one of them.

Recently I have used an array of sbyte values to represent indexes into a short array (statically known to be shorter than 127 items). Using 1 byte instead of an int/uint/size_t saves space if you have many such indexes, and saving space means reducing cache misses. To represent those indexes I used an sbyte instead of a ubyte because I have used -1 to represent "missing value".
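
The technique described above might look roughly like the following sketch. The alias `sbyte`, the sentinel name `missing`, and the helper `lookup` are assumptions made for illustration; they are not taken from the original code:

```d
alias sbyte = byte;          // local alias; not a built-in D name

enum sbyte missing = -1;     // hypothetical sentinel for "missing value"

// Hypothetical helper: returns the item a slot refers to, or a fallback
// when the slot holds the sentinel.
int lookup(const int[] items, const sbyte[] slots, size_t slot, int fallback)
{
    immutable sbyte idx = slots[slot];
    return idx == missing ? fallback : items[cast(size_t) idx];
}

void main()
{
    int[] items = [10, 20, 30];
    sbyte[] slots = [2, -1, 0];   // -1 marks a missing entry

    assert(lookup(items, slots, 0, -999) == 30);
    assert(lookup(items, slots, 1, -999) == -999);
    assert(lookup(items, slots, 2, -999) == 10);

    // One byte per stored index instead of size_t.sizeof bytes.
    static assert(sbyte.sizeof == 1);
}
```

The signed type leaves 0 .. 127 for valid indexes, which is why the indexed array has to be statically known to be short.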

sbyte values are not used often, but it's right to have them too in a systems language.

October 23, 2012
On Tuesday, 23 October 2012 at 16:32:38 UTC, bearophile_hugs@eml.cc wrote:
> And to represent those indexes I used a
> sbyte instead of a ubyte because I have used -1 to represent "missing value").

You still have to use 0xFF :-). But yes, I understand that sbyte and ubyte are a better way to solve this issue.
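
One possible reading of the 0xFF remark is that a signed -1 and an unsigned 0xFF share the same bit pattern, so either type spends one value on the sentinel. A small sketch under that assumption (`asOctet` is a hypothetical helper):

```d
// Hypothetical helper: reinterprets a signed byte as its raw octet.
ubyte asOctet(byte value)
{
    return cast(ubyte) value;
}

static assert(cast(ubyte) cast(byte) -1 == 0xFF);
static assert(ubyte.max == 0xFF);

void main()
{
    assert(asOctet(-1) == 0xFF);   // sentinel -1 is stored as 0xFF
    assert(asOctet(0) == 0x00);
    assert(asOctet(127) == 0x7F);
}
```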
