ARM bare-metal programming in D (cont) - volatile (page 7) - D Programming Language Discussion Forum

On Monday, 28 October 2013 at 08:42:12 UTC, Walter Bright wrote:
> On 10/28/2013 1:13 AM, Russel Winder wrote:
> Ask any two people, even ones in this thread, what "volatile" means, and you'll get two different answers. Note that the issues of reordering, caching, cycles, and memory barriers are separate and distinct issues. Those issues also vary dramatically from one architecture to the next.

"volatile" => "fickle"

> (For example, what really happens with a+=1 ? Should it generate an INC, or an ADD, or a MOV/ADD/MOV triple for MMIO? Where do the barriers go? Do you even need barriers? Should a LOCK prefix be emitted? How is the compiler supposed to know just how the MMIO works on some particular computer board?)

read [address] into register (mov)
register++ (add)
write register to [address] (mov)

You cannot do it otherwise (that is, a shortcut operator). "Shortcut" operators on fickle memory locations should simply be forbidden. The compiler is able to complain about that. Only explicit reads and writes should be possible.

OK, go with peek() and poke() if you feel it's better and easier (this avoids the a+=1 problem). At least as a first step. But put those into the compiler/Phobos; don't force somebody to write ASM or C for that. If D sends people back to a C compiler, it will never displace C.

Templated peek() and poke() are 5 LOCs. Put those in a std.hardware module and, if you prefer, leave it undocumented.

In the time we have been discussing this matter, it could have been solved 10 times.
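For readers who want to see what the proposed templated peek() and poke() might look like, here is a minimal sketch. It is not what was eventually adopted, and it rests on some assumptions: it uses the `volatileLoad`/`volatileStore` intrinsics from `core.volatile`, which needs a reasonably recent druntime (older releases exposed the same intrinsics from `core.bitop`), and the names `isMmioWord` and `increment` are made up for illustration. An ordinary variable stands in for a device register.

```d
import core.volatile : volatileLoad, volatileStore;

/// True for the integer widths that volatileLoad/volatileStore accept.
/// (Hypothetical helper, not part of druntime.)
enum isMmioWord(T) = is(T == ubyte) || is(T == ushort)
                  || is(T == uint)  || is(T == ulong);

/// Explicit volatile read of one word.
T peek(T)(T* address) if (isMmioWord!T)
{
    return volatileLoad(address);
}

/// Explicit volatile write of one word.
void poke(T)(T* address, T value) if (isMmioWord!T)
{
    volatileStore(address, value);
}

/// The read / add / write triple from the post, spelled out
/// instead of hidden behind a shortcut operator.
void increment(uint* reg)
{
    poke(reg, peek(reg) + 1);
}
```

On real hardware the pointer would come from a cast of a board-specific address; that address, and any barriers the board needs, are outside what this sketch covers.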

On 10/28/2013 2:33 AM, eles wrote:
>> (For example, what really happens with a+=1 ? Should it generate an INC, or an
>> ADD, or a MOV/ADD/MOV triple for MMIO? Where do the barriers go? Do you even
>> need barriers? Should a LOCK prefix be emitted? How is the compiler supposed
>> to know just how the MMIO works on some particular computer board?)
>
> read [address] into register (mov)
> register++ (add)
> write register to [address] (mov)
>
> You cannot do it otherwise (that is, a shortcut operator).

That overlooks what happens if another thread changes the memory in between the read and the write. Hence the issues of memory barriers, lock prefixes, etc.

> In the time we have been discussing this matter, it could have been solved 10 times.

Pull requests are welcome!
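The race described here can be illustrated for ordinary shared memory with `core.atomic`. This is only a sketch under assumptions: `hits`, `worker` and `runWorkers` are invented names, and the variable is plain shared memory, not MMIO. Whether an atomic (locked) access is even meaningful for a particular device register is board-specific, which is the point made in the post.

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;

// Stand-in for a memory location shared between threads (not real MMIO).
shared uint hits;

enum uint incrementsPerThread = 10_000;

void worker()
{
    foreach (i; 0 .. incrementsPerThread)
        atomicOp!"+="(hits, 1);   // one indivisible read-modify-write
}

// Starts threadCount workers, waits for them, returns the final count.
uint runWorkers(size_t threadCount)
{
    auto threads = new Thread[](threadCount);
    foreach (ref t; threads)
    {
        t = new Thread(&worker);
        t.start();
    }
    foreach (t; threads)
        t.join();
    return atomicLoad(hits);
}
```

With a separate load, add and store in place of `atomicOp`, two threads could read the same old value and one increment would be lost.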

October 28, 2013

Re: ARM bare-metal programming in D (cont) - volatile

Posted by Walter Bright
in reply to Russel Winder


On 10/28/2013 12:49 AM, Russel Winder wrote:
> On Sun, 2013-10-27 at 02:12 -0700, Walter Bright wrote:
> […]
>> Bitfield code generation for C compilers has generally been rather crappy. If
>> you wanted performant code, you always had to do the masking yourself.
>
> Endianism and packing have always been the bête noire of bitfields due to
> it not being part of the standard but left as compiler specific – sort
> of essentially in a way due to the vast difference in targets. Given a
> single compiler for a given target I never found the generated code
> poor. With the UNIX compiler in the early 1980s and the AVR compiler suites
> we used in the 2000s, generated code always seemed fine. What's your
> evidence for hand crafted code being better than compiler generated
> code?

Generally the shifting is unnecessary, but the compiler doesn't know that as the spec says the values need to be right-justified. Also, I often set/reset/test many fields at once - doesn't work too well with bitfields.
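A small sketch of the shifting point, under an invented register layout (a 2-bit MODE field in bits 4..5; all names here are hypothetical). The first function does what a bitfield read has to do by specification: extract the field and right-justify it. The second compares the masked bits against a constant that is already in position, so no shift is needed at run time. Whether a given optimizer closes the gap on its own is not something this sketch claims.

```d
// Hypothetical layout: bits 4..5 hold a 2-bit MODE field.
enum uint MODE_SHIFT = 4;
enum uint MODE_MASK  = 0x3 << MODE_SHIFT;   // 0x30
enum uint MODE_FAST  = 0x2 << MODE_SHIFT;   // 0x20, already in position

// Bitfield-style access: mask, right-justify, then compare.
bool isFastShifted(uint reg)
{
    return ((reg & MODE_MASK) >> MODE_SHIFT) == 0x2;
}

// Hand-masked access: compare against the pre-shifted constant.
bool isFastMasked(uint reg)
{
    return (reg & MODE_MASK) == MODE_FAST;
}
```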

Endianism should not be an issue if you're dealing with MMIO, since MMIO is going to be extremely target-specific and hence so is your code to deal with it.


>> I've written device drivers, and have designed, built, and programmed single
>> board computers. I've never found dealing with the oddities of memory mapped I/O
>> and bit flags to be of any difficulty.
>
> But don't you find:
>
> 	*x = (1 << 7) | (1 << 9)
>
> to lead directly to the use of macros:
>
> 	SET_SOMETHING_READY(x)
>
> to hide the lack of immediacy of comprehension of the purpose of the
> expression?

My bit code usually looks like:

      x |= FLAG_X | FLAG_Y;
      x &= ~(FLAG_Y | FLAG_Z);
      if (x & (FLAG_A | FLAG_B)) ...

You'll find stuff like that all through the dmd source code :-)
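A self-contained version of that pattern, as a sketch: the flag names come from the snippet above, but the bit positions and the helper names `update` and `anyAB` are assumptions made for illustration.

```d
// Bit positions are invented; only the names come from the post.
enum : uint
{
    FLAG_A = 1 << 0,
    FLAG_B = 1 << 1,
    FLAG_X = 1 << 2,
    FLAG_Y = 1 << 3,
    FLAG_Z = 1 << 4,
}

// Sets X and Y, then clears Y and Z: several fields per operation.
uint update(uint x)
{
    x |= FLAG_X | FLAG_Y;
    x &= ~(FLAG_Y | FLAG_Z);
    return x;
}

// Tests whether either A or B is set.
bool anyAB(uint x)
{
    return (x & (FLAG_A | FLAG_B)) != 0;
}
```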


>> Do you really find & and | operations to be ugly? I don't find them any uglier
>> than + and *. Maybe that's because of my hardware background.
>
> It's not the operations that are the problem, it is the expressions
> using them that lead to code that is the antithesis of self-documenting.
> Almost all code using <<, >>, & and | invariably ends up being replaced
> with macros in C and C++ so as to avoid using functions.
>
> The core point here is that this sort of code fails as soon as a
> function call is involved, functions cannot be used as a tool of
> abstraction. At least with C and C++.

I thought that with modern inlining, this was no longer an issue.
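As a sketch of what "a function instead of a macro" could look like in D: the names below are hypothetical, and the constant assumes the quoted expression was meant to combine bits 7 and 9 with OR. `pragma(inline, true)` asks the compiler to inline the call; how strictly that request is honoured differs between compilers.

```d
// Assumed meaning of the quoted expression: bits 7 and 9 OR-ed together.
enum uint SOMETHING_READY = (1 << 7) | (1 << 9);

// Named replacement for a SET_SOMETHING_READY(x) macro.
pragma(inline, true)
void setSomethingReady(ref uint reg)
{
    reg |= SOMETHING_READY;
}

// True only when both bits are set.
pragma(inline, true)
bool isSomethingReady(uint reg)
{
    return (reg & SOMETHING_READY) == SOMETHING_READY;
}
```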


> Clearly D has a USP over C and C++ here in that macros can be replaced
> by CTFE. But how can one guarantee that a function is fully evaluated at
> compile time and never generates a function call? Only then can
> functions be used instead of macros to make such code self documenting.

    enum X = foo(args);

guarantees that foo(args) is evaluated at compile time. That is, any context that requires a value at compile time guarantees that the expression will be evaluated at compile time. If the value is not required at compile time, the compiler will not attempt CTFE on it.
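A short sketch of that rule applied to bit masks. `bitMask`, `MODE_MASK` and `maskAtRuntime` are invented names; the mechanism (an `enum` initializer and `static assert` both require compile-time values) is the one described above.

```d
/// Builds a mask with bits lo..hi (inclusive) set. Hypothetical helper.
uint bitMask(uint lo, uint hi)
{
    uint mask = 0;
    foreach (bit; lo .. hi + 1)
        mask |= 1u << bit;
    return mask;
}

// An enum initializer must be a compile-time constant,
// so this call is evaluated under CTFE.
enum uint MODE_MASK = bitMask(4, 5);

// static assert is another context that requires a compile-time value.
static assert(MODE_MASK == 0x30);

// An ordinary call: nothing here requires a compile-time value,
// so CTFE is not attempted.
uint maskAtRuntime(uint lo, uint hi)
{
    return bitMask(lo, hi);
}
```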

On Monday, 28 October 2013 at 16:06:48 UTC, Walter Bright wrote:
> On 10/28/2013 2:33 AM, eles wrote:
> That overlooks what happens if another thread changes the memory in between the read and the write. Hence the issues of memory barriers, lock prefixes, etc.

Synchronizing the access to the resource is the job of the programmer. He will take a mutex for it. You do that inside the kernel space, not in the user space. There is just one kernel, and it is able to synchronize with itself. Put this into perspective.

> Pull requests are welcome!

You pre-approve?
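The "programmer takes a mutex" approach can be sketched as follows. This is an illustration under assumptions, not kernel code: druntime's user-space `Mutex` stands in for whatever locking primitive the kernel provides, and `fakeReg`, `regLock` and `incrementLocked` are invented names. An ordinary variable stands in for the device register.

```d
import core.sync.mutex : Mutex;

__gshared uint fakeReg;     // stand-in for a device register
__gshared Mutex regLock;    // hypothetical lock guarding fakeReg

shared static this()
{
    regLock = new Mutex;
}

// Explicit read, modify, write, all performed while holding the lock.
void incrementLocked()
{
    regLock.lock();
    scope (exit) regLock.unlock();

    uint value = fakeReg;   // explicit read
    value += 1;             // modify in a local
    fakeReg = value;        // explicit write
}
```

The lock only helps if every piece of code that touches the location takes it, which is the discipline the post assigns to the programmer.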
