March 26, 2010
KennyTM~ Wrote:

> On Mar 26, 10 05:46, yigal chripun wrote:
> >
> > while it's true that '?' has one unicode value for it, it's not true for all sorts of diacritics and combining code-points. So your approach is to pass the responsibility for that to the end user, who in 99.9999% of cases will not handle this correctly.
> >
> 
> Non-issue. Since when can a character literal store > 1 code-point?

character != code-point

D chars are really, as you say, code-points, and not always complete characters.

Here's a use case for you:
you want to write a fully Unicode-aware search engine.
If you just try to match the given sequence of code-points in the search term, you will miss valid matches, since, for instance, you do not take into account permutations of the order of combining marks.
You can't just assume that the code-point value identifies the character.
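
To make the combining-mark point concrete, here is a minimal sketch. It assumes a current Phobos, where std.uni provides normalize and the NFC form (this is an assumption about today's library, not something available when this was posted); sameText is a hypothetical helper name.

```d
import std.uni;

// Hypothetical helper: compares two strings after canonical (NFC) normalization.
bool sameText(string a, string b)
{
    return normalize!NFC(a) == normalize!NFC(b);
}

void main()
{
    string precomposed = "\u00E9";   // e-acute as a single code point
    string decomposed  = "e\u0301";  // 'e' followed by COMBINING ACUTE ACCENT

    // Same character to a reader, different code-point sequences.
    assert(precomposed != decomposed);
    assert(sameText(precomposed, decomposed));

    // Two combining marks of different combining classes, in either order.
    string belowThenAbove = "a\u0323\u0301"; // dot below, then acute
    string aboveThenBelow = "a\u0301\u0323"; // acute, then dot below
    assert(belowThenAbove != aboveThenBelow);
    assert(sameText(belowThenAbove, aboveThenBelow));
}
```

A search that compares raw code-point sequences would miss both matches; normalizing the query and the text first is one possible way around that.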
March 26, 2010
Walter Bright:

>The wchar and dchar stem from the popular WORD and DWORD sizes on the x86 platforms. wchar_t is of course "wide character" for C, and is often used for UTF-16 at least on Windows platforms. Confusingly, wchar_t on Linux is 32 bits.<

From such comments of yours, and from the lack of interest from Don, Nick and others, I guess the idea of changing the char names is closed, even if I don't like those names. I will update the bugzilla with this.

The proposal to rename byte => sbyte is not closed/resolved yet. It's a little more important, because the current situation is bug-prone.

Bye,
bearophile
March 26, 2010
Aelxx:
> In my embedded C projects I always use
> u8, i8
> u16, i16
> u32, i32
> f32 for floats
> and they are all defined in user's header file, surely IDE highlights them
> as keywords :), so no need for language changes =).

D tries to give sensible defaults, good for most programmers, also to increase code uniformity across different codebases. Python shows that having a common style and common idioms is very useful for building a better community of shared modules.

Bye,
bearophile
March 26, 2010
Walter Bright wrote:
> I'm not sure why this error is happening, it definitely has something to do with the mixin. Let me look into it some more.

Bug report and fix: http://d.puremagic.com/issues/show_bug.cgi?id=4011
March 26, 2010
On Mar 26, 10 18:52, yigal chripun wrote:
> KennyTM~ Wrote:
>
>> On Mar 26, 10 05:46, yigal chripun wrote:
>>>
>>> while it's true that '?' has one unicode value for it, it's not true for all sorts of diacritics and combining code-points. So your approach is to pass the responsibility for that to the end user, who in 99.9999% of cases will not handle this correctly.
>>>
>>
>> Non-issue. Since when can a character literal store > 1 code-point?
>
> character != code-point
>
> D chars are really as you say code-points and not always complete characters.
>
> here's a use case for you:
> you want to write a fully unicode aware search engine.
> If you just try to match the given sequence of code-points in the search term, you will miss valid matches, since, for instance, you do not take into account permutations of the order of combining marks.
> you can't just assume that the code-point value identifies the character.

Stop being off-topic. '?' is of type char, not string. A char always holds one octet of a UTF-8 encoded sequence. The numerical content is unique and well-defined*. Therefore adding 4 to '?' also has a meaning.

* If you're paranoid you may request the spec to ensure the character is in NFC form.
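
A minimal sketch of what arithmetic on a char literal means. It uses plain ASCII letters as a stand-in for the character under discussion (an assumption made for illustration), and shift is a hypothetical helper name.

```d
// Hypothetical helper: shifts a character by a small offset.
char shift(char c, int offset)
{
    // c is promoted to int for the addition, so the result is cast back.
    return cast(char)(c + offset);
}

void main()
{
    // An ASCII character literal has type char: one UTF-8 code unit.
    static assert(is(typeof('a') == char));
    // Adding an integer promotes the operand to int.
    static assert(is(typeof('a' + 3) == int));

    assert(shift('a', 3) == 'd');
    assert(shift('0', 7) == '7');
}
```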
March 27, 2010
Nick Sabalausky wrote:
> Anyway, this is what I'm doing, and it's giving me a conflict error on the call to 'bar' in 'main' with DMD 1.056 (fortunately, however, it seems to work fine in 2.042, so I guess the situation's improved in D2):

The mixins obscure what is going on. The same error is reproduced with the simpler:

--- a.d ---
enum FooA { foo }
void bar(FooA e) { }

--- b.d ---
enum FooB { foo }
void bar(FooB e) { }

--- test.d ---
import a;
import b;

void main()
{
    bar(FooB.foo); // Error! 'bar' conflict
}
--------------

D1 uses the earlier anti-hijacking system which disallows overloading of functions from different imports, unless they are specifically combined using an alias declaration.

D2 improves this using Andrei's suggestion that such overloading be allowed if and only if functions from exactly one of those imports are potential matches. This is why the example code works on D2.

If we change the declaration of bar in a.d to:

--- a.d ---
void bar(int e) { }
-----------

then the compilation under D2 fails, because both a.bar and b.bar are now candidates, due to implicit conversion rules.
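
A minimal sketch of the alias declaration mentioned above for D1, assuming a.d and b.d exactly as in the first listing of this post. Only test.d changes; it is not a single-file program and needs those two modules to compile.

```d
// test.d: assumes a.d and b.d as in the first listing of this post
import a;
import b;

// Explicitly merge both imports' overloads of bar into one overload set
// that belongs to this module.
alias a.bar bar;
alias b.bar bar;

void main()
{
    bar(FooB.foo); // ordinary overload resolution picks b.bar
    bar(FooA.foo); // and a.bar here
}
```

In the failing D2 variant, where a.bar takes an int, spelling out the module name in the call, as in b.bar(FooB.foo), should sidestep the ambiguity, since the fully qualified name only looks at one module.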
March 27, 2010
On 03/26/2010 02:01 PM, Walter Bright wrote:
> Walter Bright wrote:
>> I'm not sure why this error is happening, it definitely has something
>> to do with the mixin. Let me look into it some more.
>
> Bug report and fix: http://d.puremagic.com/issues/show_bug.cgi?id=4011

Awesome!
March 27, 2010
I can add something to this long thread.

I don't like C octal literals for aesthetic reasons, because in mathematics leading zeros before the decimal point are not significant. Programming languages are not forced to follow math notation conventions, but experience clearly shows me that, when possible, it's convenient to follow math notation, because it's widely known, even by newbie programmers, and expert programmers know it well, sometimes from primary school, so it's well internalized. That's why, for example, the missing operator precedence rules of Smalltalk are bad. So I think C octal literals are bad-looking small traps that a modern language is better off without. If octal literals are seen as useful, then a better, more explicit and safer octal literal can be invented. This is what Python3 has done, and this is what I think D should do. This is what I have asked for in bug report 3837.
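
As an illustration of what a more explicit octal notation can look like, here is a minimal sketch that assumes a current Phobos, where std.conv provides an octal template (an assumption about today's library, not part of the proposal as posted). defaultFileMode is a hypothetical helper name.

```d
import std.conv : octal;

// Hypothetical helper: typical Unix permission bits, written as explicit octal.
int defaultFileMode()
{
    return octal!644;
}

void main()
{
    assert(defaultFileMode() == 420); // 6*64 + 4*8 + 4
    assert(octal!"755" == 493);       // 7*64 + 5*8 + 5
    // In C, the literal 010 silently means 8; here the base is spelled out.
    assert(octal!10 == 8);
}
```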

On the other hand no octal number has ever caused a bug in my Python2.x or D programs, and I think it has caused no bugs even in my C code. So despite being little traps for me they are not so bug-prone.

On the other hand, the name of signed bytes has caused one or more hard-to-find bugs in my D code. I have never put such a bug in Delphi/C programs (but in C the signedness of chars has caused me a few troubles in the past. Thanks to Walter, D chars don't come in signed and unsigned versions, which avoids those char bugs, but D introduces bugs on bytes instead...). It's not just a matter of bug count: D forces me to keep some of my attention on the signedness of bytes when I program. This slows down programming a bit and distracts a small part of my attention away from the things that truly matter, that is, problem solving, the things that the program has to do, etc. So I think that on this point D is worse than Delphi/C, and deserves to be fixed.
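
A minimal sketch of the kind of trap meant here, where the same bit pattern behaves differently depending on whether it is held in a byte or a ubyte. highNibble is a hypothetical helper written only for this illustration.

```d
// Hypothetical helper: extracts the high nibble of a byte-sized value.
int highNibble(T)(T value)
{
    // value is promoted to int before the shift, so a negative byte
    // is sign-extended first.
    return value >> 4;
}

void main()
{
    ubyte raw = 0xF0;
    byte asSigned = cast(byte) raw; // same bit pattern, read as signed

    assert(raw == 240);
    assert(asSigned == -16);

    assert(highNibble(raw) == 15);
    assert(highNibble(asSigned) == -1); // sign extension changes the result
}
```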

My D1 dlibs are something like 80-90 thousand lines of code, and then there is all the code that uses those dlibs, so I have probably written 250_000 or more lines of D1 code, this can be more than the D1 code written by Walter. So I am able to see what things in D1 have caused me bugs or require some undeserved attention to write bug-free code.

Bye,
bearophile
March 27, 2010
KennyTM~ <kennytm@gmail.com> wrote:

> On Mar 26, 10 05:46, yigal chripun wrote:
>>
>> while it's true that '?' has one unicode value for it, it's not true for all sorts of diacritics and combining code-points. So your approach is to pass the responsibility for that to the end user, who in 99.9999% of cases will not handle this correctly.
>>
>
> Non-issue. Since when can a character literal store > 1 code-point?

Not only that, but one does not routinely go about adding random
numbers to randomly chosen code points. When '?' + 3 is used, it's
because it was the best (fastest/easiest/most readable) way to do it.
Or somebody was just showing off, but that can be done in more
horrible ways.

-- 
Simen
March 28, 2010
bearophile wrote:

> To invent a software you can first find the best syntax. This seems a nice syntax, very similar to the enum one (that ubyte is optional):
> 
> flags ubyte Todo {
>     do_nothing,
>     walk_dog,
>     cook_breakfast,
>     deliver_newspaper,
>     visit_miss_kerbopple,
>     wash_covers
> }
> 
> Todo todo = Todo.walk_dog | Todo.deliver_newspaper | Todo.wash_covers;
> if (todo == (Todo.walk_dog | Todo.deliver_newspaper)) { ...
> if ((Todo.walk_dog | Todo.deliver_newspaper) in todo) { ...
> if ((Todo.walk_dog | Todo.deliver_newspaper) & todo) { ...
> assert((Todo.walk_dog | Todo.walk_dog) == Todo.walk_dog); // OK
> 
> 
> A way to implement it with current D2 syntax:
> 
> 
> alias Flags!(ubyte, "do_nothing",
>                     "walk_dog",
>                     "cook_breakfast",
>                     "deliver_newspaper",
>                     "visit_miss_kerbopple",
>                     "wash_covers") Todo;
> 
> 
> Where Flags defines a struct, "do_nothing" are compile-time constants. It
> can overload 8 operators:
> =   ==   |    |=    in   &   &=  opBool
> 
> The operator ! too can be defined, but I think it looks too much like the | so it can be omitted (other operators like ^ and ~ are possible).
> 

I like this idea of implementing a flag type and tried to work something out. Instead of implementing the overloads, it is also possible to generate an enum via CTFE inside a struct and forward with alias this, what do you think? I have tried this syntax, seems to work ok:

alias Flags!q{ do_nothing,
               walk_dog,
               cook_breakfast,
               deliver_newspaper,
               visit_miss_kerbopple,
               wash_covers } Todo;

It does allow this, though, but perhaps that can be fixed:

Todo todo = Todo.walk_dog;
todo |= 4;

With such a type, it is easy to add some small convenience features, such as
an enumToString, a .max property, and a range that iterates over all flags
set or over all possible values.
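
One possible way to reject the todo |= 4 case is to drop alias this and overload only the operators that take the enum type. The following is a minimal sketch under that assumption, not the CTFE-based implementation described above; the Flags struct, its has method, the TodoFlags alias, and the explicit bit values (with do_nothing as zero) are all hypothetical choices made for illustration.

```d
// Assumed bit layout: do_nothing is the empty set, the rest are single bits.
enum Todo : ubyte
{
    do_nothing           = 0,
    walk_dog             = 1 << 0,
    cook_breakfast       = 1 << 1,
    deliver_newspaper    = 1 << 2,
    visit_miss_kerbopple = 1 << 3,
    wash_covers          = 1 << 4
}

// Hypothetical wrapper: only values of the enum type E can be combined.
struct Flags(E) if (is(E == enum))
{
    private E bits;

    this(E e) { bits = e; }

    Flags opBinary(string op)(Flags rhs) const
        if (op == "|" || op == "&")
    {
        Flags result;
        result.bits = cast(E)(mixin("bits " ~ op ~ " rhs.bits"));
        return result;
    }

    Flags opBinary(string op)(E rhs) const
        if (op == "|" || op == "&")
    {
        return opBinary!op(Flags(rhs));
    }

    ref Flags opOpAssign(string op)(E rhs)
        if (op == "|" || op == "&")
    {
        bits = cast(E)(mixin("bits " ~ op ~ " rhs"));
        return this;
    }

    // True if every bit of e is set.
    bool has(E e) const { return (bits & e) == e; }
}

alias TodoFlags = Flags!Todo;

void main()
{
    auto todo = TodoFlags(Todo.walk_dog) | Todo.deliver_newspaper;
    todo |= Todo.wash_covers;

    assert(todo.has(Todo.walk_dog));
    assert(todo.has(Todo.wash_covers));
    assert(!todo.has(Todo.cook_breakfast));

    // A bare integer is not accepted by the overloads above.
    static assert(!__traits(compiles, todo |= 4));
}
```

The trade-off is more boilerplate than the alias this approach, in exchange for keeping raw integers out of the flag set.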