[suggestion] std type aliases (updated)
February 11, 2005
Revised,
Here is the full list of my suggested type aliases for D:

   TYPE        ALIAS   // RANGE

   void                // void

Integer: (std.stdint)
   byte        int8_t  // 8-bit signed
  ubyte       uint8_t  // 8-bit unsigned (0x00-0xFF)

  short       int16_t  // 16-bit signed
 ushort      uint16_t  // 16-bit unsigned (0x0000-0xFFFF)

    int       int32_t  // 32-bit signed
   uint      uint32_t  // 32-bit unsigned (0x00000000-0xFFFFFFFF)

   long       int64_t  // 64-bit signed (could be two int registers)
  ulong      uint64_t  // 64-bit unsigned (could be two uint registers)

   cent      int128_t  // 128-bit signed (reserved for future use)
  ucent     uint128_t  // 128-bit unsigned (reserved for future use)

Floating Point: (std.stdfloat)
     float  float32_t  // 32-bit single precision (about 6 digits)
    double  float64_t  // 64-bit double precision (about 15 digits)
  extended  float80_t  // 64/80/128-bit extended precision (platform)

    ifloat   imag32_t  // \
   idouble   imag64_t  // imaginary versions of the above real ones
 iextended   imag80_t  // /

    cfloat   comp32_t  // \
   cdouble   comp64_t  // complex (with both real and imaginary parts)
 cextended   comp80_t  // /

Character: (std.stdutf)
   char        utf8_t  // \x00-\x7F (ASCII)
  wchar       utf16_t  // \u0000-\uD7FF, \uE000-\uFFFF
  dchar       utf32_t  // \U00000000-\U0010FFFF (Unicode)

Boolean: (std.stdbool)
    bit          bool  // false (0) | true (1)
   byte         wbool  // false (zero) | true (non-zero)
    int         dbool  // false (zero) | true (non-zero)

String: (std.stdstr)
  char[]          str    // UTF-8, optimized for US-ASCII
 wchar[]         wstr    // UTF-16, optimized for Unicode
 dchar[]         dstr    // UTF-32, easy codepoint access


This is the updated version, after discussions...

It requires renaming "real" back to "extended".
(and adding the keywords "cent" and "ucent" too)

--anders

Implementation: (Public Domain)

> module std.stdint;
> 
> /* Exact sizes */
> 
> alias  byte    int8_t;
> alias ubyte   uint8_t;
> alias  short  int16_t;
> alias ushort uint16_t;
> alias   int   int32_t;
> alias  uint  uint32_t;
> alias  long   int64_t;
> alias ulong  uint64_t;
> alias  cent  int128_t;
> alias ucent uint128_t;


> module std.stdfloat;
> 
> /* floating point types */
> 
> alias     float  float32_t; // 32-bit single precision
> alias    double  float64_t; // 64-bit double precision
> alias  extended  float80_t; // 64|80|128-bit extended
> 
> alias     float   real32_t; // \
> alias    double   real64_t; // Real
> alias  extended   real80_t; // /
> 
> alias    ifloat   imag32_t; // \
> alias   idouble   imag64_t; // Imaginary
> alias iextended   imag80_t; // /
> 
> alias    cfloat   comp32_t; // \
> alias   cdouble   comp64_t; // Complex (Real + Imaginary)
> alias cextended   comp80_t; // /


> module std.stdutf;
> 
> /* UTF code units */
> 
> alias  char  utf8_t; // UTF-8 code unit
> alias wchar utf16_t; // UTF-16 code unit
> alias dchar utf32_t; // UTF-32 code point


> module std.stdbool;
> 
> /* boolean types */
> 
> alias   bit   bool;  // boolean (true/false)
> alias  byte  wbool;  // wide boolean (like wchar)
> alias   int  dbool;  // double boolean (like dchar)


> module std.stdstr;
> 
> /* string types */
> 
> alias  char[]   str; // ASCII-optimized
> alias wchar[]  wstr; // Unicode-optimized
> alias dchar[]  dstr; // codepoint-optimized
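
For reference, a minimal sketch of how the integer aliases behave in a present-day D compiler. It assumes the Phobos module std.stdint, which mirrors C99's <stdint.h>; cent and ucent are left out because they never became usable built-in types. The function name sum is made up for illustration.

```d
import std.stdint; // Phobos module providing the C99-style aliases

// Each alias names the same type as the built-in, so these hold at compile time.
static assert(is(int8_t == byte)   && is(uint8_t == ubyte));
static assert(is(int16_t == short) && is(uint16_t == ushort));
static assert(is(int32_t == int)   && is(uint32_t == uint));
static assert(is(int64_t == long)  && is(uint64_t == ulong));

// Hypothetical helper: add up a buffer of exact-width bytes.
int32_t sum(const(uint8_t)[] data)
{
    int32_t total = 0;
    foreach (b; data)
        total += b;
    return total;
}

void main()
{
    ubyte[] raw = [0x01, 0x02, 0xFF];
    assert(sum(raw) == 258); // 1 + 2 + 255
}
```

Because an alias is only another name for the same type, a ubyte[] can be passed where uint8_t[] is expected without any conversion.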
February 11, 2005
Anders F Björklund wrote:

> String: (std.stdstr)
>   char[]          str    // UTF-8, optimized for US-ASCII
>  wchar[]         wstr    // UTF-16, optimized for Unicode
>  dchar[]         dstr    // UTF-32, easy codepoint access

Are you sure that you'll always use Multi-Byte-Character strings? Maybe you can use UCS32 for dchar[]?

Regards,
Mark
February 11, 2005
Anders F Björklund wrote:

> Revised,
> Here is the full list of my suggested type aliases for D:
> 
> Integer: (std.stdint)
>     byte        int8_t  // 8-bit signed
>    ubyte       uint8_t  // 8-bit unsigned (0x00-0xFF)
[...]

Are those _t suffixes really needed?


>      ifloat   imag32_t  // \
>     idouble   imag64_t  // imaginary versions of the above real ones
>   iextended   imag80_t  // /
>
>      cfloat   comp32_t  // \
>     cdouble   comp64_t  // complex (with both real and imaginary parts)
>   cextended   comp80_t  // /

I'm not really happy with these. I would suggest to

* either change 'imag'->'imaginary' and 'comp'->'complex' (or 'cpx' if you
 really want to stay short)

*  or even better: Stick with

      float   float32  // \
     double   float64  // real floating point types
   extended   float80  // /

     ifloat   ifloat32  // \
    idouble   ifloat64  // imaginary versions of the above real ones
  iextended   ifloat80  // /

     cfloat   cfloat32  // \
    cdouble   cfloat64  // complex (with both real and imaginary parts)
  cextended   cfloat80  // /

which avoids any questionable mixing of numerical and mathematical names and is very structured.
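
A minimal sketch of the real-valued part of this naming scheme, written in the newer `alias Name = Type;` syntax and using `real` where the thread says "extended". The imaginary and complex rows are left out on the assumption that a current compiler no longer encourages the built-in ifloat/cfloat family.

```d
alias float32 = float;   // 32-bit single precision
alias float64 = double;  // 64-bit double precision
alias float80 = real;    // "extended" in the proposal; actual width is platform-dependent

static assert(float32.sizeof == 4);
static assert(float64.sizeof == 8);

void main()
{
    float32 a = 1.5f;
    float64 b = a;   // implicit widening, no cast needed
    float80 c = b;
    assert(c == 1.5L);
}
```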


February 11, 2005
Norbert Nemec wrote:

>>Here is the full list of my suggested type aliases for D:
>>
>>Integer: (std.stdint)
>>    byte        int8_t  // 8-bit signed
>>   ubyte       uint8_t  // 8-bit unsigned (0x00-0xFF)
> 
> [...]
> 
> Are those _t suffixes really needed?

There are two reasons:

1) to avoid mixing them up with integers.

   int8_t x = 8;
   int16_t y = 16;
   int32_t z = 32;

2) They are already in use, in ISO C99...

   #include <stdint.h>

So it's just an extension into "float"/"utf" ?

> I'm not really happy with these. I would suggest to
> 
> * either change 'imag'->'imaginary' and 'comp'->'complex' (or 'cpx' if you
>  really want to stay short)
> 
> *  or even better: Stick with
> 
>       float   float32  // \
>      double   float64  // real floating point types
>    extended   float80  // /
> 
>      ifloat   ifloat32  // \
>     idouble   ifloat64  // imaginary versions of the above real ones
>   iextended   ifloat80  // /
> 
>      cfloat   cfloat32  // \
>     cdouble   cfloat64  // complex (with both real and imaginary parts)
>   cextended   cfloat80  // /
> 
> which avoids any questionable mixing of numerical and mathematical names and
> is very structured.

Sure, that works much better! (but with the _t suffix)

imaginary and complex are too much like "unsigned",
and we are already using the "u" prefix for those...


Dropping the "real" names from DMD is really simple.
Just a matter of will and decisions, in the end...

> /lexer.c:1924:    {	"real",		TOKfloat80	},
> ./lexer.c:1933:    {	"ireal",	TOKimaginary80	},
> ./lexer.c:1937:    {	"creal",	TOKcomplex80	},

> ./mtype.c:666:			c = "real";
> ./mtype.c:681:			c = "ireal";
> ./mtype.c:696:			c = "creal";

I guess the old names will have to be provided as aliases,
in order to not break existing code... (in object.d)

"real" is OK, but maybe deprecate "ireal" and "creal" ?
("imaginary" and "complex" could be added too, if wanted,
but they are a little like "unsigned" or even "integer")


Put a page up at:
http://www.prowiki.org/wiki4d/wiki.cgi?StdTypeAliases

Will change to the above, and drop "imag" and "comp"
And maybe start a petition to get extended back ? :-)

--anders
February 11, 2005
Hi..

And the discussion continues....

I think it should rather be the other way around:

int8 should be the internal type, with "byte" as an alias.

If there's a change imminent, we should take this opportunity to get rid of those strange C names like: short, long, double. They're adjectives, for crying out loud.

The only types that actually mean something are bit, "int" for integers and "float" for floating point numbers. The others are basically based on these.

Weren't "short" and "long" originally meant as type modifiers? "short int" for a 16-bit number, "long int" for a "longer than normal" (multi-register) 64-bit number. "double float" for a float with double precision? What about "short real" for float and "long real" for a double?

The current types are too much based on C, their origin long since forgotten.

Come to think of it, "complex" can be both a noun and an adjective, and since "complex int" will never be popular (and useful), I guess it'll be one of the bases: bit, int, float, complex ?

Ah, finally a coherent naming system: "short bool" for a bit/bool, "bool" for a byte (C++), "long bool" for dbool. Char's tricky: there's nothing shorter and two longer variants.. I'll have to think about this.

"This would break all existing programs" - make aliases;
"Too late to make such a change" - with aliases nothing breaks;
"We're only fixing bugs, not changing the language" - press delete (no wait,
it's a newsgroup).

Lionello.


February 11, 2005
Mark Junker wrote:

>> String: (std.stdstr)
>>   char[]          str    // UTF-8, optimized for US-ASCII
>>  wchar[]         wstr    // UTF-16, optimized for Unicode
>>  dchar[]         dstr    // UTF-32, easy codepoint access
> 
> Are you sure that you'll always use Multi-Byte-Character strings? Maybe you can use UCS32 for dchar[]?

What is UCS32 ? (I've only heard of UCS-4, which is obsolete)

And the decision that D should "only" support Unicode is not
mine at all but Walter's, and something that I agree with...

It's still possible to use e.g. Latin-1 strings by using the
ubyte[] data type, but you can only use C functions then...

All D functions take Unicode strings, like: str, wstr, dstr
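
To make the difference between the three encodings concrete, here is a small sketch. It assumes a present-day compiler, where the string types are spelled string, wstring and dstring; the str/wstr/dstr aliases only mirror the names proposed in this thread.

```d
alias str  = string;   // UTF-8 code units
alias wstr = wstring;  // UTF-16 code units
alias dstr = dstring;  // UTF-32 code units

void main()
{
    // U+1D11E (musical G clef) lies outside the Basic Multilingual Plane.
    str  a = "\U0001D11E";
    wstr b = "\U0001D11E"w;
    dstr c = "\U0001D11E"d;

    assert(a.length == 4); // four UTF-8 code units
    assert(b.length == 2); // one UTF-16 surrogate pair
    assert(c.length == 1); // one UTF-32 code unit per code point

    // Latin-1 text is not valid UTF-8, so it is kept as raw bytes.
    ubyte[] latin1 = [0x42, 0x6A, 0xF6, 0x72, 0x6B]; // "Björk" in ISO 8859-1
    assert(latin1.length == 5);
}
```

Note that .length counts code units, not characters, which is why only the dstr version gives one element per code point.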

--anders
February 11, 2005
Anders F Björklund wrote:
> Revised,
> Here is the full list of my suggested type aliases for D:
> /* snipped out loads'o'stuff */
> 
> 
> 
>> module std.stdfloat;
>>
>> /* floating point types */
>>
>> alias     float  float32_t; // 32-bit single precision
>> alias    double  float64_t; // 64-bit double precision
>> alias  extended  float80_t; // 64|80|128-bit extended
>>
>> alias     float   real32_t; // \
>> alias    double   real64_t; // Real
>> alias  extended   real80_t; // /
>>
>> alias    ifloat   imag32_t; // \
>> alias   idouble   imag64_t; // Imaginary
>> alias iextended   imag80_t; // /
>>
>> alias    cfloat   comp32_t; // \
>> alias   cdouble   comp64_t; // Complex (Real + Imaginary)
>> alias cextended   comp80_t; // /

Is it a good idea to use the bit length in the real/extended aliases (float80_t) when the length of these types is platform defined?

It would be misleading to call a type float80_t on platforms where it's implemented as a 64 or 128-bit type.
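
One way around the misleading name, sketched here as an assumption rather than as part of the proposal, is to define the alias only where the extended type really has the 80-bit x87 format (64 mantissa bits). The flag name hasFloat80 is hypothetical, and the sketch uses `real`, the current keyword for the extended type.

```d
// Only define the alias when real has a 64-bit mantissa (the x87 80-bit format);
// elsewhere the name float80_t simply does not exist.
static if (real.mant_dig == 64)
{
    alias float80_t = real;
    enum hasFloat80 = true;
}
else
{
    enum hasFloat80 = false;
}

void main()
{
    static if (hasFloat80)
    {
        float80_t x = 1.0L;
        assert(float80_t.mant_dig == 64);
        assert(x == 1.0L);
    }
    else
    {
        // Fall back to double where no 80-bit format is available.
        double x = 1.0;
        assert(double.mant_dig == 53);
        assert(x == 1.0);
    }
}
```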
February 11, 2005
Alex Stevenson wrote:

> Is it a good idea to use the bit length in the real/extended aliases (float80_t) when the length of these types is platform defined?

No, but in reality the type is *always* 80 bits (when implemented)

It's just that on CPU platforms that *do not have* the type, it falls
back to using one or two doubles (which is better than simply failing)

On the official platforms, it's always 80 bits. (Win32 and Linux X86)

--anders
February 11, 2005
Lionello Lunesu wrote:

> And the discussion continues....
> 
> I think it should rather be the other way around:
> int8 should be the internal type, with "byte" as an alias.

I think that ship has sailed, like a few years ago...
Besides, it would work the same in the end wouldn't it?

> If there's a change imminent, we should take this opportunity to get rid of those strange C names like: short, long, double. They're adjectives, for crying out loud.

I don't think that D aims to get rid of C, just improve it...
(and not too much either, Java and C# have changed much more)

How about changing { for BEGIN and } for END,
or using := for assignment and = for equality ?

Me, I think I will pass ;-)

> The only types that actually mean something are bit, "int" for integers and "float" for floating point numbers. The others are basically based on these.

Yup. And "char" should probably be renamed as "codeunit", too.
(Since there are no characters in technical Unicode lingo...)

Only problem is that the "float" precision is too poor,
just like the "int" was - back when it was a 16-bit type...

So today, int and double are the "standard" types.
short and float can be used when space/time matters
more than precision, long and extended can both be used
if the hardware allows (i.e. on a 64-bit / X86 machine),
and bit and byte are provided for other special uses.

> Weren't "short" and "long" originally meant as type modifiers? "short int" for a 16-bit number, "long int" for a "longer than normal" (multi-register) 64-bit number. "double float" for a float with double precision? What about "short real" for float and "long real" for a double?

That was then, this is now. And two-word keywords are *not* in D.
(and a "long int" was a 32-bit type, the "long long" was 64-bits)

Shouldn't that include "regular int" and "single float" too then ?
Then you can have "unsigned regular int" and "imaginary single float"

That isn't really practical, IMHO.

> The current types are too much based on C, their origin long since forgotten.
> 
> Come to think of it, "complex" can be both a noun and an adjective, and since "complex int" will never be popular (and useful), I guess it'll be one of the bases: bit, int, float, complex ?

But complex is now a type modifier: "complex float" (cfloat)
A lot like "unsigned integer" (uint) or "wide character" (wchar).

signed is the default for the integers, so it has no prefix.
(could have been "sint")
real is the default for the floats, so it has no prefix either.
(could have been "rfloat")

> Ah, finally a coherent naming system: "short bool" for a bit/bool, "bool" for a byte (C++), "long bool" for dbool. Char's tricky: there's nothing shorter and two longer variants.. I'll have to think about this.

Please do, but I don't think it will change the D types...

Which reminds me, there is no boolean type. Live with it :-)

> "This would break all existing programs" - make aliases;
> "Too late to make such a change" - with aliases nothing breaks;
> "We're only fixing bugs, not changing the language" - press delete (no wait, it's a newsgroup).

How about "There are more important things to fix before release" ?

And it would be nice if cent and ucent could be hacked in, so that
I can get a type with 16-byte alignment to use for the vector types...
(for linking to SIMD extensions outside of D, such as AltiVec or SSE)
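
Until then, the alignment itself can be sketched with the align attribute on a struct. This is a stand-in, not the cent type, and the name Vec4 is made up:

```d
// Hypothetical stand-in for a 128-bit vector register; not the cent type.
align(16) struct Vec4
{
    float[4] lanes;
}

// Size and alignment both match a 16-byte SIMD register.
static assert(Vec4.sizeof == 16);
static assert(Vec4.alignof == 16);

void main()
{
    Vec4 v;
    v.lanes = [1.0f, 2.0f, 3.0f, 4.0f];

    float total = 0;
    foreach (lane; v.lanes)
        total += lane;
    assert(total == 10.0f);
}
```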

--anders
February 11, 2005
Anders F Björklund wrote:
> Alex Stevenson wrote:
> 
>> Is it a good idea to use the bit length in the real/extended aliases (float80_t) when the length of these types is platform defined?
> 
> 
> No, but in reality the type is *always* 80 bits (when implemented)
> 
> It's just that on CPU platforms that *do not have* the type, it falls
> back to using one or two doubles (which is better than simply failing)
> 
> On the official platforms, it's always 80 bits. (Win32 and Linux X86)
> 
> --anders

Hmm, all I can find in the D spec is this note in http://www.digitalmars.com/d/type.html:

"real : largest hardware implemented floating point size (Implementation Note: 80 bits for Intel CPU's)"

This suggests to me that while it should be 80 bits on x86 platforms, it may differ on other platforms.

This is fine while D only officially supports x86 platforms, but this situation isn't guaranteed to hold forever.

Plus should the language spec really limit the hardware platform? The situation as I understood it was:

D Language Spec: Does not limit to specific platforms (but provides implementation guidance)
DMD compiler: Supports x86 Linux/Win32 only
GDC : Supports x86/Mac/Solaris

If my understanding is correct, the 'always 80-bit' assumption is based on the DMD compiler rather than the D language spec.
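
A quick way to see what a given compiler and platform actually provide is to print the type properties. This is only a sketch; the output is deliberately not listed here because it differs between targets.

```d
import std.stdio : writefln;

void main()
{
    // Properties of the built-in floating point types on this platform.
    writefln("float : %s bytes, %s mantissa bits", float.sizeof, float.mant_dig);
    writefln("double: %s bytes, %s mantissa bits", double.sizeof, double.mant_dig);
    writefln("real  : %s bytes, %s mantissa bits", real.sizeof, real.mant_dig);

    // real should never be less precise than double, whatever its width.
    static assert(real.mant_dig >= double.mant_dig);
}
```

Note that real.sizeof includes padding, so on x86 it is typically larger than the 10 bytes the 80-bit format needs; mant_dig is the more reliable indicator of the underlying format.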