May 19, 2022
On 5/19/2022 5:50 PM, kdevel wrote:
> Not "or" but "and". The first input contains UTF-8 of the (normalized) codepoint, which is left unchanged by Walters lowercase function. The second input contains UTF-8 of the same codepoint in canonically decomposed form (NFD).

Should stick with normalized forms for good reason. Having two different sequences supposedly compare equal is an abomination.
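Concretely, the two forms under discussion look like this (a minimal sketch using Phobos' std.uni.normalize): the raw char sequences differ, and only compare equal once both are brought to the same normalization form.

```d
import std.uni;

void main()
{
    string precomposed = "\u00E9";  // U+00E9, "é" as one code point
    string decomposed  = "e\u0301"; // U+0065 followed by U+0301 (the NFD form)

    assert(precomposed != decomposed);               // the raw sequences differ
    assert(normalize!NFC(precomposed) ==
           normalize!NFC(decomposed));               // equal once both are NFC
}
```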

Though none of this supports the notion that arithmetic should not be done on chars. Heck, UTF-8 cannot be decoded without such arithmetic.
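For instance, even the simplest multi-byte case needs masks, shifts, and bitwise ops on the char values (a bare-bones sketch, not the Phobos decoder):

```d
// Decode a two-byte UTF-8 sequence (U+0080 .. U+07FF) into a code point.
// Illustrative only; a real decoder validates and handles 1 to 4 byte forms.
dchar decodeTwoByte(char c0, char c1)
{
    assert((c0 & 0xE0) == 0xC0); // lead byte: 110xxxxx
    assert((c1 & 0xC0) == 0x80); // continuation byte: 10xxxxxx
    return cast(dchar)(((c0 & 0x1F) << 6) | (c1 & 0x3F));
}

unittest
{
    assert(decodeTwoByte(0xC3, 0xA9) == 0x00E9); // "é"
}
```
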
May 19, 2022
On 5/19/2022 3:20 PM, deadalnix wrote:
> That doesn't strike me as very convincing, because the compiler will sometimes do the opposite too,

I know this is technically possible, but have you ever seen this?
May 19, 2022
On 5/18/22 15:11, max haughton wrote:
> For example:
>
> float x = 'a';
>
> Currently compiles.

Going a little off-topic, I recommend Don Clugston's very entertaining DConf 2016 presentation "Using Floating Point Without Losing Your Sanity":

  http://dconf.org/2016/talks/clugston.html

Ali

May 19, 2022
On 5/19/2022 5:06 PM, Steven Schveighoffer wrote:
> I'd happily write that in exchange for not having this happen:
> 
> ```d
> enum A : int { a }
> 
> Json j = A.a;
> writeln(j); // false
> ```

I presume the Json value here ends up holding a bool, and that bool is written out as false. If it's bad that 0 implicitly converts to bool, then it should also be bad that 0 implicitly converts to char, ubyte, byte, int, float, etc. That implies all implicit conversions should be removed. While that is a reasonable point of view, I used a language that did that (Wirth's Pascal) and found it annoying and unpleasant.
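For concreteness, each of these declarations compiles today, because the constant 0 is in range for every one of those types:

```d
void main()
{
    bool  b = 0; // the conversion being objected to
    char  c = 0;
    ubyte u = 0;
    byte  s = 0;
    int   i = 0;
    float f = 0;
}
```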


>> And, as I remarked before, GPUs favor this style of coding, as does SIMD code, as does crypto code.
> 
> If the optimizer can't see through the ternary expression with 2 constants, then maybe it needs updating.

To make the examples understandable, I use trivial cases.


> I want to write the clearest code possible, and let the optimizer wizards do their magic.

I appreciate you want to write clear code. I do, too. The form I wrote is perfectly clear. Maybe it's just me, but I've never had any difficulty with the equivalence of:

  true, 1, +5V, On, Yes, T, +10V, etc.

I doubt that this gives anyone trouble, either:

enum Flags { A = 1, B = 2, C = 4, }

int flags = Flags.A | Flags.C;
if (flags & Flags.C) ...

It's clear to me that there is no set of rules that will please everyone, nor one that is objectively better than the others. At some point it ceases to be useful to keep debating this, since no resolution will satisfy everyone.
May 19, 2022
On 5/19/2022 3:51 PM, max haughton wrote:
> Good compilers can actually print a report of why they didn't vectorize things. 

I guess Manu never used good compilers :-)

Manu asked that a report be given in the form of an error message. Since it's what he did all day, I gave that a lot of weight.

Also, the point was Manu could then adjust the code with version statements to write loops that worked best on each target, rather than suffer unacceptable degradation from the fallback emulations.
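For example, a rough sketch of that pattern using D's predefined version identifiers (D_AVX and D_SIMD here; the loop bodies are just placeholders, not tuned code):

```d
void scaleAll(float[] data, float k)
{
    version (D_AVX)
    {
        // A hand-written AVX loop would go here.
        foreach (ref x; data) x *= k;
    }
    else version (D_SIMD)
    {
        // An SSE-level loop would go here.
        foreach (ref x; data) x *= k;
    }
    else
    {
        // Scalar fallback, chosen explicitly rather than silently emulated.
        foreach (ref x; data) x *= k;
    }
}
```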


> If you're writing SIMD code without dumping the assembler anyway you're not paying enough attention. If you're going to go to all that effort you're going to be profiling the code, and any good profiler will show you the disassembly alongside. Maybe it doesn't scale in some minute sense but in practice I don't think it makes that much difference because you have to either do the work anyway, or it doesn't matter.

Manu did this all day and I gave a lot of weight to what he said would work best for him. If you're writing vector operations, for a vector instruction set, the compiler should give errors if it cannot do it. Emulation code is not acceptable.

I advocate disassembling, too (remember the -vasm switch?), but disassembling and inspecting it manually does not scale at all.


> LDC doesn't do this, GCC does. I don't think it actually matters, whereas if you're consuming a library from someone who didn't do the SIMD parts properly, it will at very least compile with LDC.

At least compiling is not good enough if you're expecting vector speed.
May 19, 2022
On 5/19/2022 3:04 PM, deadalnix wrote:
> Tell me you are American without telling me you are American.

I didn't know the Aussies and Brits used umlauts.
May 20, 2022
On Friday, 20 May 2022 at 03:42:06 UTC, Walter Bright wrote:
> Manu asked that a report be given in the form of an error message. Since it's what he did all day, I gave that a lot of weight.
>
> Also, the point was Manu could then adjust the code with version statements to write loops that worked best on each target, rather than suffer unacceptable degradation from the fallback emulations.
>

I think you're talking about writing SIMD code, not autovectorization. The report is *not* an error message, neither literally in this case nor in spirit; it tells you what the compiler was able to infer from your code. Automatic vectorization is *not* the same as writing code that uses SIMD instructions directly; they're two different beasts.

Typically the direct-SIMD algorithm is much faster, at the expense of being orders of magnitude slower to write: the instruction selection algorithms GCC and LLVM use simply aren't good enough to exploit all 15 billion instructions Intel have in their ISA, but they're almost literally hand-beaten to be good at SPEC benchmarks, so many common patterns are recognized and optimized just fine.
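As a concrete example of such a report: GCC (and therefore GDC) will explain missed vectorizations with -fopt-info-vec-missed, and LDC can surface LLVM's equivalent remarks with something like --pass-remarks-missed=loop-vectorize (flag name from memory, worth double-checking). A loop simple enough to exercise it:

```d
// If the autovectorizer bails out here, the report says why
// (possible aliasing, trip count, unsupported operations, ...).
void saxpy(float[] y, const(float)[] x, float a)
{
    foreach (i; 0 .. y.length)
        y[i] += a * x[i];
}
```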

>> If you're writing SIMD code without dumping the assembler anyway you're not paying enough attention. If you're going to go to all that effort you're going to be profiling the code, and any good profiler will show you the disassembly alongside. Maybe it doesn't scale in some minute sense but in practice I don't think it makes that much difference because you have to either do the work anyway, or it doesn't matter.
>
> Manu did this all day and I gave a lot of weight to what he said would work best for him. If you're writing vector operations, for a vector instruction set, the compiler should give errors if it cannot do it. Emulation code is not acceptable.

It's not an unreasonable thing to do; I just don't think it's that much of a showstopper either way. If I *really* care about getting it right per platform, I'm probably going to be checking CPUID at runtime anyway.
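Roughly like this, using druntime's core.cpuid (the avx2 query is just one example; the two kernels are stand-ins):

```d
import core.cpuid : avx2;

void kernelAVX2(float[] data)   { foreach (ref x; data) x *= 2; } // stand-in for a hand-tuned path
void kernelScalar(float[] data) { foreach (ref x; data) x *= 2; } // portable fallback

void process(float[] data)
{
    // Decide on the machine the binary actually runs on, not at compile time.
    if (avx2)
        kernelAVX2(data);
    else
        kernelScalar(data);
}
```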

LDC is the compiler that people who actually ship performant D code use, and I've never seen anyone complain about this.

> I advocate disassembling, too (remember the -vasm switch?), but disassembling and inspecting it manually does not scale at all.

You *have* to do it or you are lying to yourself, even if the compiler were perfect, which they often aren't. When I use VTune I see a complete breakdown of the disassembly, source code, pipeline state, memory hierarchy behaviour, how much power the CPU used, temperature, etc. (Cat blocking the computer's conveniently warm exhaust?)

This isn't so much about the actual instructions/intrinsics you end up with, which are just a means to an end, but rather that if you aren't keeping an eye on the performance effect of each line you add, and on where the time is actually being spent, then you aren't being a good engineer. For example, you can spend too much time working on the SIMD parts of an algorithm and get distracted from the parts that have become the new bottleneck (often the memory hierarchy; see the aside below).

Despite this, I do think it's a huge failure of programming as an industry that a site like Compiler Explorer, or a flag like -vasm, needs to exist at all. This should be something much more deeply ingrained in our workflows; programming lags behind more serious forms of engineering when it comes to correlating what we think things do with what they actually do.

Aside for anyone reading:
See Sites's classic article "It's the Memory, Stupid!": https://www.ardent-tool.com/CPU/docs/MPR/101006.pdf. DEC died, but he was right.


>> LDC doesn't do this, GCC does. I don't think it actually matters, whereas if you're consuming a library from someone who didn't do the SIMD parts properly, it will at very least compile with LDC.
>
> At least compiling is not good enough if you're expecting vector speed.

You still have "vector speed" in a sense. The emulated SIMD is still good it's just not optimal, as I was saying previously there are targets where even though you *have* (say) 256 bit registers, you actually might want to use 128 bit ones in some places because newer instructions tend to be emulated (in a sense) so might not actually be worth the port pressure inside the processor.

Basically everything has (a lot of) SIMD units these days, so even this emulated computation will still be pretty fast. You see SIMD instruction sets included in basically anything that costs more than a pint of beer (sneaky DConf plug...): e.g. the Allwinner D1, a cheap RISC-V chip from China, comes with a reasonably standard-compliant vector instruction set implementation. Even microcontrollers have them.

For anyone interested, the core inside the D1 is open source: https://github.com/T-head-Semi/openc906
May 20, 2022
On Friday, 20 May 2022 at 03:45:23 UTC, Walter Bright wrote:
> On 5/19/2022 3:04 PM, deadalnix wrote:
>> Tell me you are American without telling me you are American.
>
> I didn't know the Aussies and Brits used umlauts.

Maybe in some parts of Scotland...
May 20, 2022
On Friday, 20 May 2022 at 03:27:12 UTC, Walter Bright wrote:
> If it's bad that 0 implicitly converts to a bool, then it should also be bad that 0 implicitly converts to char, ubyte, byte, int, float, etc. It implies all implicit conversions should be removed.

Why should that be so? Why do you take it for granted that whatever applies to bool should also apply to ubyte, int, etc.?



May 20, 2022
On Friday, 20 May 2022 at 03:45:23 UTC, Walter Bright wrote:
> On 5/19/2022 3:04 PM, deadalnix wrote:
>> Tell me you are American without telling me you are American.
>
> I didn't know the Aussies and Brits used umlauts.

The Brits (Charlotte Brontë, Emily Brontë and other members of the Brontë family, Noël Coward, Zoë Wanamaker, Zoë Ball, Emeli Sandé, John le Carré),
the Australians (Renée Geyer and Zoë Badwi),
and the Americans (Beyoncé Knowles, Chloë Grace Moretz, Chloë Sevigny, Renée Fleming, Renée Zellweger, Zoë Baird, Zoë Kravitz, Donté Stallworth, John C. Frémont, Robert M. Gagné, Roxanne Shanté, Janelle Monáe, Jhené Aiko)

might want to have a word with you ;-)