March 25, 2010
Ellery Newcomer wrote:
> What do you think of Erlang's bit syntax if you've looked at it, or could you if you haven't?

I know nothing about it.
March 25, 2010
On 03/25/2010 05:26 PM, Nick Sabalausky wrote:
> "Walter Bright"<newshound1@digitalmars.com>  wrote in message
> news:hogmgm$oco$1@digitalmars.com...
>> Ellery Newcomer wrote:
>>> I guess what I'm trying to say is it doesn't make sense that I can
>>> implicitly import FooA, an external symbol, but not bar(FooA), an
>>> external symbol defined on an external symbol which cannot be implicitly
>>> converted to a local symbol.
>>
>> And I believe it makes perfect sense! Everywhere else in the language,
>> when you define a local name it *overrides* names in other scopes, it
>> doesn't overload them.
>
> Well, the result of that is that I'm forced to make my "genEnum" library
> utility generate "enum{name of enum}ToString({name of enum} e)" instead of
> "enumToString({name of enum} e)" or else users won't be able to use it
> without a bunch of odd alias contortions that I'm not sure I can wave away
> by including them in the original mixin. (I would have just called it
> "toString", but at the time, that had been giving me some strange troubles
> so I changed it to "enumToString" instead. In retrospect, it was probably
> giving me trouble because of this very same issue.)

Suggestion: forget about enums. Implement something based on structs or classes a la Java, which gives you

e.toString

instead of

toString(e)

From this conversation, I'm getting the idea that enums are more an antifeature than anything. And you're using mixins already anyways.
March 26, 2010
On 03/25/2010 06:21 PM, Walter Bright wrote:
> Ellery Newcomer wrote:
>> What do you think of Erlang's bit syntax if you've looked at it, or
>> could you if you haven't?
>
> I know nothing about it.

I suppose you could think of it as sort of a regex-for-generic-data, but you can use it as a pattern matcher or a data constructor.

The example Armstrong gives in Programming Erlang is

<<2#11111111111:11,B:2,C:2,_D:1,E:4,F:2,G:1,Bits:9>> = X

where X is presumably a byte array or something similar.

The first 11 bits are asserted to be 1, and the remaining bits are bound to the variables that follow.

Yeah, that example isn't much more impressive than bitfields.
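For readers who don't know Erlang, here is a rough Python sketch of what that match does. It is only an illustration under my own assumptions: the function name match_header is made up, and the field names are simply the variable names from the pattern above.

```python
def match_header(x: bytes) -> dict:
    """Rough equivalent of
    <<2#11111111111:11,B:2,C:2,_D:1,E:4,F:2,G:1,Bits:9>> = X
    (hypothetical helper, not part of any library)."""
    # The pattern covers exactly 11+2+2+1+4+2+1+9 = 32 bits.
    if len(x) != 4:
        raise ValueError("no match: expected exactly 32 bits")
    word = int.from_bytes(x, "big")  # segments are big-endian by default
    # The first segment is a literal: eleven 1 bits must be present.
    if word >> 21 != 0b11111111111:
        raise ValueError("no match: first 11 bits are not all 1")
    fields = {}
    shift = 21  # number of bits still unread
    for name, width in [("B", 2), ("C", 2), ("_D", 1), ("E", 4),
                        ("F", 2), ("G", 1), ("Bits", 9)]:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


header = match_header(bytes.fromhex("fffb9064"))
```

The shifting and masking is exactly the bookkeeping the bit syntax hides.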

You could do something like

<<N:32, Data:N/binary, _/binary>> = X

which would interpret the first four bytes as a length field, bind the next N bytes to Data, and ignore the rest.
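A minimal Python sketch of the same length-prefixed match, assuming the Erlang defaults (unsigned, big-endian) for the N:32 segment. The function name is hypothetical.

```python
def match_length_prefixed(x: bytes) -> bytes:
    """Rough equivalent of <<N:32, Data:N/binary, _/binary>> = X
    (hypothetical helper)."""
    if len(x) < 4:
        raise ValueError("no match: need 4 bytes for the length field")
    n = int.from_bytes(x[:4], "big")  # N:32, unsigned big-endian
    if len(x) < 4 + n:
        raise ValueError("no match: fewer than N bytes follow the length")
    return x[4:4 + n]  # Data:N/binary; the tail is ignored like _/binary


data = match_length_prefixed(b"\x00\x00\x00\x03abcXYZ")
```

Note that a failed match is an error here, just as a failed pattern match is in Erlang.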

The general syntax for them should look something like

Bit:   << >>
       << E >>

E:     E , E1
       E1

E1:    Value
       Value : Size
       Value / TypeSpecList
       Value : Size / TypeSpecList

//integral, fp, or binary (byte arrays, strings, etc) type
Value: Expression

// integral type
Size:  Expression

TypeSpecList: TypeSpecList - TypeSpec
              TypeSpec

TypeSpec: Endianness
          Sign
          Type
          Unit

// default is big
Endianness:  big
             little
             native

Sign:  signed
       unsigned

Type:  integer
       float
       binary

// Size*Unit is the number of bits that get matched
// Unit defaults to 1 for Type = integer or float,
//                  8 for Type = binary
Unit:  IntegerLiteral
March 26, 2010
Walter Bright:

>Yes, we can endlessly rename keywords, but in the end, what does that accomplish that would compensate for upending every D program in existence?<

I can list a few pros and cons, but the decision is not mine.
I'll close my "bug" report on the basis of the answers, because this is only one of about 15 small breaking changes I have proposed. (Disallowing the octal syntax was another of them, but after your last answer I consider it a closed problem; http://d.puremagic.com/issues/show_bug.cgi?id=2656 can be closed.)

The type names in D have a nice symmetry, and they were not invented from scratch: they are widespread in other languages. I have appreciated them.

I think of a "byte" as an unsigned value. This has produced a small bug in a D program of mine. I think changing the names of the signed/unsigned types can solve this.

In C# the type names have the same symmetry as in D, with a "u" prefix for the unsigned ones. Later I found that the C# designers seem to agree with me in thinking of bytes as unsigned, because they break the symmetry for the 8-bit types:

- The sbyte type represents signed 8-bit integers with values between -128 and 127.
- The byte type represents unsigned 8-bit integers with values between 0 and 255.
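The kind of bug I mean can be sketched in a few lines of Python (used here only for illustration; the variable names are mine). The same 8 bits give two different numbers depending on whether they are read as a signed or an unsigned byte:

```python
raw = bytes([200])  # one byte with the bit pattern 0b11001000

# Read as an unsigned 8-bit value (D's ubyte, C#'s byte).
as_unsigned = int.from_bytes(raw, "big", signed=False)

# Read as a signed 8-bit value (D's byte, C#'s sbyte).
as_signed = int.from_bytes(raw, "big", signed=True)

print(as_unsigned, as_signed)  # 200 -56
```

If you expect 0..255 and the type silently gives you -128..127, every value above 127 comes out negative.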

C# is usually a carefully designed language, much better designed than C++, so its example is not negligible. Yet I don't fully like the C# solution: even if I think of bytes as unsigned, all the other C#/D types that don't start with "u" are signed (well, the chars are an exception too). So C#/D newbies can follow the symmetry and write wrong code again.

So I have suggested to keep "ubyte", deprecate "byte", and introduce "sbyte" (signed byte). Then it's easy to tell the signed type from the unsigned one. "byte" can later be removed.

-----------------

The wchar/dchar names are short and easy to write, but for me, and for a person I have taught D to, their size in bytes is not easy to remember. "w" stands for wide and "d" for double; this part is easy to remember. But how wide is wide? That's why I have suggested adopting more descriptive names for them.

One way to invent descriptive names is to use names similar to the byte/short/int/long integers. Another is to put numbers after "char". I guess it may be too late now to change type names...

Bye,
bearophile
March 26, 2010
Walter Bright wrote:
> bearophile wrote:
>> Regarding base type names I have proposed :
>> byte => sbyte
>> wchar => char16 (or shortchar)
>> dchar => char32 (or intchar)
> 
> 
> Yes, we can endlessly rename keywords, but in the end, what does that accomplish that would compensate for upending every D program in existence?

Removing a frequent bug.
IMHO, wchar and dchar are fine as is. But byte --> sbyte I support. 'byte' is a really, really awful name.

EVERYONE makes the mistake of thinking 'byte' is unsigned. I still do it, really frequently. I believe that almost every existing use of 'byte' is a bug!

BTW I don't think that "everyone" is much of an exaggeration. For example, YOU have done it! (The first version of the htod utility used 'byte' where it should have been 'ubyte').
And if even you find it unintuitive, I think the entire planet finds it unintuitive.

March 26, 2010
"bearophile" <bearophileHUGS@lycos.com> wrote in message news:hoh07b$16km$1@digitalmars.com...
> Walter Bright:
>
>>Yes, we can endlessly rename keywords, but in the end, what does that accomplish that would compensate for upending every D program in existence?<
>
> I can list few pro/cons, but then the decision is not mine.
>
> The wchar/dchar names are short and easy to write, but for me, and for a person I have taught D to, their size in bytes is not easy to remember. "w" stands for wide and "d" for double; this part is easy to remember. But how wide is wide? That's why I have suggested adopting more descriptive names for them.
>
> One way to invent descriptive names is to use names similar to the byte/short/int/long integers. Another is to put numbers after "char". I guess it may be too late now to change type names...
>

As long as we're bikeshedding on type names, I do find it misleading that "char" represents a code-unit while still calling itself a "character". Don't get me wrong, I don't mind that at the language level D operates on code-units instead of code-points (Tango and Phobos2 have pretty darned good handling of code-points anyway). It's just that ever since learning how Unicode works, calling a code-unit "char" seems like a misnomer. I can live with it, of course, now that I know, but I don't envy the newbies who may come across it.
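To make the code-unit/code-point distinction concrete, here is a small sketch in Python (picked only because it is easy to run; the sample string is my own). The UTF-8, UTF-16 and UTF-32 counts correspond to what D's char, wchar and dchar arrays would hold:

```python
# 'a', 'é' (U+00E9), and an emoji outside the Basic Multilingual Plane
text = "a\u00e9\U0001F600"

code_points = len(text)                            # Python counts code points
utf8_units = len(text.encode("utf-8"))             # 1 + 2 + 4 code units
utf16_units = len(text.encode("utf-16-le")) // 2   # 1 + 1 + 2 code units
utf32_units = len(text.encode("utf-32-le")) // 4   # one unit per code point

print(code_points, utf8_units, utf16_units, utf32_units)  # 3 7 4 3
```

Three "characters", but seven char-sized units in UTF-8 and four wchar-sized units in UTF-16.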


March 26, 2010
bearophile wrote:
> The wchar/dchar names are short and easy to write, but for me, and for a
> person I have taught D to, their size in bytes is not easy to remember. "w"
> stands for wide and "d" for double; this part is easy to remember. But how
> wide is wide? That's why I have suggested adopting more descriptive names
> for them.

The wchar and dchar stem from the popular WORD and DWORD sizes on the x86 platforms. wchar_t is of course "wide character" for C, and is often used for UTF-16 at least on Windows platforms.

Confusingly, wchar_t on Linux is 32 bits.
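A quick way to check the platform's wchar_t width is Python's ctypes module, which exposes the C type as c_wchar. This is only a sketch; the result depends on the platform the interpreter was built for.

```python
import ctypes

# Size in bytes of the platform's C wchar_t as seen by ctypes.
wchar_size = ctypes.sizeof(ctypes.c_wchar)

print("wchar_t is", wchar_size * 8, "bits on this platform")
```

On typical Windows builds this reports 16 bits, and on typical Linux builds 32 bits.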

March 26, 2010
Nick Sabalausky:
> As long as we're bikeshedding on type names,

"There are only two hard things in Computer Science: cache invalidation and naming things" ^_^


>I do find it misleading that "char" represents a code-unit while still calling itself a "character".<

Development of the BitC language has restarted. It's a systems language that looks like Scheme, and it's efficient. This is the start of a small thread about Unicode in BitC:

http://www.coyotos.org/pipermail/bitc-dev/2010-March/001812.html
http://www.coyotos.org/pipermail/bitc-dev/2010-March/thread.html#1812

They agree with me that 32-bit chars are not always true characters, so I think that essentially even UTF-32 is a bidirectional Range.
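A small sketch of why a 32-bit code point is not always a whole character, using Python's standard unicodedata module (the example string is my own choice): an "e" followed by a combining acute accent is two code points, yet it displays as one character.

```python
import unicodedata

decomposed = "e\u0301"  # 'e' + COMBINING ACUTE ACCENT: two code points
composed = unicodedata.normalize("NFC", decomposed)  # the single code point U+00E9

# A non-zero combining class marks a code point that attaches to the previous one.
is_combining = unicodedata.combining("\u0301") != 0

print(len(decomposed), len(composed), is_combining)  # 2 1 True
```

So even with UTF-32 you cannot index to the n-th user-visible character in constant time.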

Bye,
bearophile
March 26, 2010
"Walter Bright" <newshound1@digitalmars.com> wrote in message news:hoh85i$1kbj$1@digitalmars.com...
> bearophile wrote:
>> The wchar/dchar names are short and easy to write, but for me, and for a
>> person I have taught D to, their size in bytes is not easy to remember.
>> "w" stands for wide and "d" for double; this part is easy to remember.
>> But how wide is wide? That's why I have suggested adopting more
>> descriptive names for them.
>
> The wchar and dchar stem from the popular WORD and DWORD sizes on the x86 platforms. wchar_t is of course "wide character" for C, and is often used for UTF-16 at least on Windows platforms.
>

I think that's why I never had a problem with wchar/dchar: I've dealt with x86 WORD/DWORD. Though I guess that also indicates why other people may find it unintuitive. Not everyone has gone low-level like that.


March 26, 2010
"bearophile" <bearophileHUGS@lycos.com> wrote:
> Regarding base type names I have proposed :
> byte => sbyte
> wchar => char16 (or shortchar)
> dchar => char32 (or intchar)
>
> http://d.puremagic.com/issues/show_bug.cgi?id=3850 http://d.puremagic.com/issues/show_bug.cgi?id=3936
>
> Bye,
> bearophile
>

In my embedded C projects I always use
u8, i8
u16, i16
u32, i32
f32 for floats
and they are all defined in a user header file. The IDE even highlights them
as keywords :), so there is no need for language changes =).