May 19, 2022
On 5/19/2022 1:01 PM, Steven Schveighoffer wrote:
> And let the compiler come up with whatever funky stuff it wants to in order to make it fast.

You never write things like:

   a += (b < c);

? I do. And, as I remarked before, GPUs favor this style of coding, as does SIMD code, as does crypto code.

Hoping the compiler will transform the code into this style, if it is not specified to, is just that, hope :-/

Sometimes this style is not necessarily faster, either, even though the user may desire it for crypto reasons.
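
For readers unfamiliar with the idiom, here is a minimal sketch of the style under discussion. The helper name `countBelow` is hypothetical and not from this thread; whether the generated code is actually branch-free still depends on the compiler and target.

```d
// Hypothetical helper: counts the elements below `limit` without
// writing an explicit branch in the source.
size_t countBelow(const(int)[] values, int limit)
{
    size_t n = 0;
    foreach (v; values)
        n += (v < limit);   // the comparison result is added as 0 or 1
    return n;
}
```

For example, `countBelow([1, 5, 3, 9], 4)` counts the elements 1 and 3.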

May 19, 2022
On 5/19/2022 1:24 PM, H. S. Teoh wrote:
> IME, gcc and ldc2 are well able to convert the above ?: expression into
> the latter, without uglifying the code.  Why are we promoting (or even
> allowing) this kind of ugly code just because dmd's optimizer is so
> lackluster you have to manually spell things out this way?

See my reply to Steven.

BTW, consider auto-vectorizing compilers. A common characteristic of them is that sometimes a loop looks like it should be vectorized, but the compiler didn't, for reasons that are opaque to users. The compiler then substitutes a slow emulation to give the *appearance* of being vectorized.

The only way to tell what is happening is to dump the generated assembler. This is especially troublesome when you're attempting to write vector code that is portable among various SIMD instruction sets. It doesn't scale, at all.

This is based on many conversations about this with Manu Evans, whose career was based on writing vector code. Manu has been very influential in the design of D's vector semantics.

Hence D's approach is different. You can write vector code in D. If it won't compile to the target instruction set, it doesn't replace it with emulation. It signals an error. Thus, the user knows if he writes vector code, he gets vector code. It makes it easy for him to use versioning to adjust the shape of the expressions to line up with the vector capabilities of each target.
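
A rough sketch of what that can look like, assuming the target supports the `float4` type from `core.simd`. The function name `addLanes` and the scalar fallback are illustrative assumptions, not code from this thread. The point is that the fallback is something the user writes explicitly, not something the compiler substitutes silently.

```d
import core.simd;

// Hypothetical example: adds two 4-lane float vectors.
float[4] addLanes()
{
    static if (is(float4))
    {
        // Vector path: only compiled when the target provides float4.
        float4 a = [1.0f, 2.0f, 3.0f, 4.0f];
        float4 b = [10.0f, 20.0f, 30.0f, 40.0f];
        float4 c = a + b;
        return c.array;
    }
    else
    {
        // Explicit, user-chosen scalar fallback for targets without float4.
        float[4] a = [1.0f, 2.0f, 3.0f, 4.0f];
        float[4] b = [10.0f, 20.0f, 30.0f, 40.0f];
        float[4] c;
        c[] = a[] + b[];
        return c;
    }
}
```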

To sum up, if you want a particular instruction mix in the output stream, a systems programming language must enable expression of that desired mix. It must not rely on undocumented and inconsistent compiler transformations.
May 19, 2022
On Thursday, 19 May 2022 at 04:10:04 UTC, Walter Bright wrote:
> There's nothing wrong with:
>
>     if ('A' <= c && c <= 'Z')
>         c = c | 0x20;
>

Tell me you are American without telling me you are American.

May 19, 2022

On Thursday, 19 May 2022 at 14:33:14 UTC, Steven Schveighoffer wrote:
> I hope we are not depending on the type system to the degree where a bool must be an integer in order to have this kind of optimization.
>
> -Steve

It is routine to use different types in the frontend and the backend. Types in the frontend are there for semantics and correctness, and generally to help the developer. Types in the backend are there to help the optimizer and the code generator.

bool is going to be an integer in the backend, for sure. This doesn't mean it has to be one in the frontend.

May 19, 2022
On Thursday, 19 May 2022 at 16:42:58 UTC, matheus wrote:
> On Thursday, 19 May 2022 at 04:35:45 UTC, Walter Bright wrote:
>> On 5/18/2022 5:47 PM, Steven Schveighoffer wrote:
>>> But I have little hope for it, as Walter treats a boolean as an integer.
>>
>> They *are* integers.
>> 
>
> I always thought of them as integers. Yesterday I was adding some new features to addam_d_ruppes' IRC client and I did:
>
>    auto pgdir = (ev.key == Keyboard.Key.PageDown)-(ev.key == KeyboardEvent.Key.PageUp);
>
> So to get: -1, 0 or 1, and do the next action according to the input given from the user.
>
> Matheus.

This doesn't imply they are integers, but that they are convertible to integers.

You could do the same operation with the key being a short, and pgdir would be an int. It doesn't mean that shorts are ints.
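
A small sketch of that distinction, as it behaves in current D. The function `pageDirection` is a hypothetical stand-in for the `pgdir` expression quoted above.

```d
// Arithmetic on short operands yields int, yet short is not int.
static assert(is(typeof(short.init + short.init) == int));
static assert(!is(short == int));

// The same holds for bool operands.
static assert(is(typeof(true - false) == int));
static assert(!is(bool == int));

// Hypothetical stand-in for the quoted expression: yields -1, 0 or 1.
int pageDirection(bool pageDown, bool pageUp)
{
    return pageDown - pageUp;
}
```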
May 19, 2022
On Thursday, 19 May 2022 at 18:20:26 UTC, Walter Bright wrote:
> On 5/19/2022 7:33 AM, Steven Schveighoffer wrote:
>> I hope we are not depending on the type system to the degree where a bool must be an integer in order to have this kind of optimization.
>
>
> Does that mean you prefer:
>
>     a = 3 + cast(int)(b < c) * 5;
>
> ? If so, I don't see what is gained by that.

No. The `*` implies a promotion of its arguments. It is very easy to define bool as promoting to int without making bool an int.
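
As a sketch of what is at stake: in current D both spellings below compile and agree. The function names are made up for illustration. The disagreement in this thread is only about which rule should make the cast-free form legal.

```d
// Hypothetical: the cast-free form, relying on promotion of the comparison result.
int withoutCast(int b, int c)
{
    return 3 + (b < c) * 5;
}

// Hypothetical: the explicit form from the quoted post.
int withCast(int b, int c)
{
    return 3 + cast(int)(b < c) * 5;
}
```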
May 19, 2022
On Thursday, 19 May 2022 at 20:48:47 UTC, Ali Çehreli wrote:
[...]
> > "However, the assumption that setting bit 5 of the representation will convert uppercase letters to lowercase is not valid for EBCDIC." [1]
> >
> > [1] Does C and C++ guarantee the ASCII of [a-f] and [A-F] characters?
> >      https://ogeek.cn/qa/?qa=669486/
>
> In D, char is UTF-8 and ASCII is a subset of UTF-8.

The latter part, that ASCII is a subset of UTF-8, is 1†. I disagree with the wording of the former part, that in D a char "is" UTF-8.

> Walter's code above is valid without making any ASCII assumption.

Walter made the 0 claim "It does not assume it, it tests for if it would be valid [ascii and not unicode]" [2] ‡

Okay. Let's do UTF-8:

```
import std.stdio;
import std.string;
import std.utf;

char char_tolower_bright (char c)
{
   if ('A' <= c && c <= 'Z')
      c = c | 0x20;
   return c;
}

string tolower_bright (string s)
{
   string t;
   foreach (c; s.byCodeUnit)
      t ~= c.char_tolower_bright;
   return t;
}

void process_strings (string s)
{
   writefln!"input            : %s" (s);
   auto t = s.tolower_bright;
   writefln!"bright           : %s" (t);
   auto u = s.toLower;
   writefln!"toLower (std.utf): %s" (u);
}

void main ()
{
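   process_strings ("A \u00C4");   // precomposed: single code point U+00C4
   process_strings ("A A\u0308");  // decomposed: 'A' + combining diaeresis U+0308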
}
```

Free of charge I compiled and ran this for you:

   $ dmd lcb
   $ ./lcb
   input            : A Ä
   bright           : a Ä
   toLower (std.utf): a ä
   input            : A Ä
   bright           : a ä
   toLower (std.utf): a ä

See the problem?

  † Hint for interpretation: booleans "are" integers.
[2] http://forum.dlang.org/post/t662ll$tnm$1@digitalmars.com
  ‡ There is probably no consensus about what "it" means.
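
For anyone who does not see it: the output above is consistent with the two inputs differing only in Unicode normalization form, which is an assumption inferred from the results, since the two literals render identically. A minimal sketch of that difference, using `std.uni.normalize` (the names `precomposed`, `decomposed` and `canonicallyEqual` are illustrative):

```d
import std.uni : normalize, NFC, NFD;

enum precomposed = "\u00C4";   // one code point, two UTF-8 code units
enum decomposed  = "A\u0308";  // 'A' + COMBINING DIAERESIS, three code units

static assert(precomposed.length == 2);
static assert(decomposed.length == 3);

// Hypothetical helper: compares two strings after normalizing both to NFC.
bool canonicallyEqual(string a, string b)
{
    return normalize!NFC(a) == normalize!NFC(b);
}
```

In the decomposed form the leading code unit is a plain ASCII 'A', so a per-code-unit ASCII test lowercases it; in the precomposed form no code unit falls in the 'A'..'Z' range.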
May 19, 2022
On Thursday, 19 May 2022 at 20:48:47 UTC, Ali Çehreli wrote:
> In D, char is UTF-8 and ASCII is a subset of UTF-8. Walter's code above is valid without making any ASCII assumption.
>

Sure, but it also doesn't perform any useful operation other than "uncapitalize English, do nothing for non-Latin languages, and create a mess with any non-English Latin-script language". While that is certainly a valid program, it doesn't look like something anyone would actually want to do, for any reason other than that it's easy to write and good enough.
May 19, 2022
On Thursday, 19 May 2022 at 21:44:55 UTC, Walter Bright wrote:
> You never write things like:
>
>    a += (b < c);
>
> ? I do. And, as I remarked before, GPUs favor this style of coding, as does SIMD code, as does crypto code.
>
> Hoping the compiler will transform the code into this style, if it is not specified to, is just that, hope :-/
>
> Sometimes this style is not necessarily faster, either, even though the user may desire it for crypto reasons.

That doesn't strike me as very convincing, because the compiler will sometimes do the opposite too, so either way, at least for crypto, you have to look at the disassembly.

In our case, we even instrumented valgrind to cause a CI failure when such a branch occurs, and we run it on every patch.
May 19, 2022
On Thursday, 19 May 2022 at 22:14:31 UTC, kdevel wrote:
> Free of charge I compiled and ran this for you:
>
>    $ dmd lcb
>    $ ./lcb
>    input            : A Ä
>    bright           : a Ä
>    toLower (std.utf): a ä
>    input            : A Ä
>    bright           : a ä
>    toLower (std.utf): a ä
>
> See the problem?
>

You could have used "Ali Çehreli" as a test case :)