June 01, 2014
I have not read the DIP yet, but here are some of my thoughts:

In the old days peripherals were simple. A UART might have a control register, a status register and a data register, 8 bits each. It just did not matter how they were accessed. Now a peripheral like USB or Ethernet may have tens of 32-bit registers.

The Basic language did not have pointers or any way to address a certain location of memory. Several extensions were made to get access to system locations. One common extension was the peek and poke functions. They took a 16-bit address and 8-bit data. Basic did not have any data types or structures.
D has pointers that can be used to access memory. It also has several data types. A library function does not know if it should do 8/16/32/64-bit access without templates. That would be too complicated for a low-level operation like register access.

The registers of a peripheral may be defined as a struct and a pointer of this struct type is used to access the registers. There are individual registers but there may also be some sub register sets inside the register set. A peripheral may have common registers and then per channel registers. The register struct may then have substructs or an array of register sets that may be accessed as structs or arrays.
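
As a rough illustration of such a layout, here is a minimal sketch in D. The peripheral, its field names and the channel count are hypothetical, and the accessor assumes a druntime that ships the `volatileLoad` intrinsic in `core.volatile`, which is newer than this discussion.

```d
import core.volatile : volatileLoad;

// Hypothetical per-channel register set (names are illustrative only).
struct Channel
{
    uint control;
    uint count;
    uint address;
}

// Hypothetical peripheral: common registers followed by an array of
// per-channel register sets, mirroring the hardware layout.
struct Peripheral
{
    uint status;
    uint statusclear;
    Channel[4] channels;
}

// The struct layout must match the hardware, so check it at compile time.
static assert(Peripheral.channels.offsetof == 8);
static assert(Peripheral.sizeof == 8 + 4 * Channel.sizeof);

// Reads one register of one channel with a single volatile 32-bit load.
uint readCount(Peripheral* p, size_t ch)
{
    return volatileLoad(&p.channels[ch].count);
}
```

On real hardware the `Peripheral*` would come from casting the fixed base address; in a hosted test an ordinary struct instance can stand in for the device.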

Yes, there are different kinds of registers.
- Normal registers can be read and written. These are used as normal control and status registers. Some bits may be changed by hardware at any time. This may be a problem because it is impossible to have a fully atomic access. The time between read and write should be as short as possible.
- Read-only registers may be used to represent the capabilities of the peripheral or calibration values. They always return the same data. Status registers represent the current state of the hardware and may change at any time when the conditions change. Writes to these registers have no effect.
- Write-only registers are used to send data. The data packet is written byte by byte to the same address. This type of register is also used to clear status. Reading the register may return the last data or zero or anything else, and the value should be ignored.
- Bidirectional registers are used as data registers. A read will return the last received byte and a write will transmit the byte written.

Usually it does not matter if these registers are accessed the wrong way (writing a read-only or reading a write-only register), so there is no need to mark them differently. They can all be volatile.


It is also common that one register has mixed read/write, read-only and write-only bits. Many registers also have undefined/reserved bits, which sometimes should be written with zeros and sometimes left as they are.

One of the most common operations is to wait for some status:
while ((regs.status&0x80)==0)  { /* check timeout here */ }
The way to clear the status may be one of:
- write directly to the status bit
  regs.status &= 0xffffff7f;
- write a 1 to the bit
  regs.status |= 0x80;
- sometimes writing 0 to other bits has no effect and there is no need to read-modify-write
  regs.status = 0x80;
- sometimes status is cleared by writing to another bit
  regs.status |= 0x200;
- sometimes there is a separate clear register
  regs.statusclear = 0x80;
- sometimes accessing the data register clears status automatically
- sometimes reading the status register clears the status. In this case all status bits have to be checked at once.

Many of these have the result that reading the register does not give back the data that was written.
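
Two of the patterns above (polling with a timeout and a separate clear register) can be sketched as follows. This is only a sketch under assumptions: the `Regs` layout and the `READY` name are hypothetical, the 0x80 bit is taken from the snippets above, and `core.volatile` with `volatileLoad`/`volatileStore` is assumed to be available, which requires a druntime newer than this discussion.

```d
import core.volatile : volatileLoad, volatileStore;

// Hypothetical register block; not taken from any data sheet.
struct Regs
{
    uint status;
    uint statusclear;
}

enum uint READY = 0x80;

// Polls the status register until READY is set or the poll budget is
// used up. Every iteration performs a real load from the register.
bool waitReady(Regs* regs, uint maxPolls)
{
    foreach (_; 0 .. maxPolls)
    {
        if (volatileLoad(&regs.status) & READY)
            return true;
    }
    return false; // timed out
}

// Separate clear register: a single store, no read-modify-write.
void clearReady(Regs* regs)
{
    volatileStore(&regs.statusclear, READY);
}
```

The other clearing styles differ only in which register is written and whether a load precedes the store.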

And no, I did not read this on Wikipedia. All these forms of access exist in the processor I use (STM32F407). It seems that several teams have made the peripherals on the chip and every peripheral has its own way to access it.


Another thing is: do I need to mark every member in a struct volatile, or is it enough to mark the struct definition or the struct pointer? Will it go transitively to every member of an array of substructs, or do I need to mark the substructs volatile?

One thing is the struct pointer. The peripherals have a fixed address. If there is only one peripheral, the address can be a compile-time constant. If there are several similar peripherals, the address may be known at compile time or it could be an immutable that is initialized at startup.
Now I have to make the pointer shared to have the struct members shared. This means the pointer is also mutable and volatile. The pointer cannot be cached and has to be fetched again every time the struct variables are accessed. This decreases performance.


D has been marketed as a system language. Accessing registers is an essential part of system programming. Whatever the method is, it has to be in the language, not an external library function.

June 02, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=126

--- Comment #11 from Mike <slavo5150@yahoo.com> ---
Ok, clearly I have not fully understood shared semantics.

> In future, the compiler would memoize the loop and go straight for the assignment.
> 
> mov    $12, _D4test4globi(%rip)

That would be very bad for my memory-mapped I/O needs.

> Regarding peek/poke functions: Don't you think that's too cumbersome?
> I also think shared + peek/poke has the drawback that you can still
> accidentally access it like a normal shared variable, without the peek/poke
> guarantees.

I do find it cumbersome, but I'm ok with it because I'll be wrapping it in a mixin or a template.  But you make a good point about "accidental" access.

> BTW: I finally finished the volatile DIP, see http://wiki.dlang.org/DIP62. It'd be great to get some early feedback from you and Iain, and feel free to edit the DIP :-)

The DIP is extremely well written.  I've read it a couple of times and I'm currently studying some of the references.  I think you've made a very compelling case for adding volatile semantics, and I support it, but I must be honest, peek() and poke() intrinsics would also be fine for me (more about that later).  I'm under the impression, however, that this DIP will be a very tough sell to Walter.

I've created a "design discussion" around this debate on the D wiki:
http://wiki.dlang.org/Language_design_discussions#volatile
If we are to lobby the core design team to accept this DIP it would probably be
wise to review past discussions and prepare offensive and defensive arguments.

Walter said in the past that there is debate about what 'volatile' really means
(http://forum.dlang.org/post/l4afr7$2pj8$1@digitalmars.com) and argues that
peek() and poke() intrinsics are the "correct and guaranteed way" to do
memory-mapped I/O (http://forum.dlang.org/post/l4abnd$2met$1@digitalmars.com).

Daniel Murphy argued that it is a property of the load/store operation and not the variable (http://forum.dlang.org/post/l4b1j4$acl$1@digitalmars.com), and I think this is the core of the debate.

DIP62 makes a compelling case for why 'volatile' should be a property of the type, but I think it would help to justify why 'volatile' is not a property of the load/store operation.  I can actually see it both ways, and am therefore somewhat on the fence.

-- 
You are receiving this mail because:
You are watching all bug changes.


June 02, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=126

--- Comment #12 from Johannes Pfau <johannespfau@gmail.com> ---
> I've created a "design discussion" around this debate on the D wiki: http://wiki.dlang.org/Language_design_discussions#volatile

Thanks, that's very useful. I did not even remember that we already had DIPs for peek/poke functions.

> I'm under the impression, however, that this DIP will be a very tough sell to Walter.

Regarding the old discussions I have to agree. But I think we need to make using volatile memory as simple as, or even simpler than, in C. The Hello World of embedded programming is blinking an LED. If a C programmer sees something like poke(addr, peek(addr) | 0b1); to set a bit in a register they'll probably discard D immediately. I think this is a really important point.

> If we are to lobby the core design team to accept this DIP it would probably be wise to review past discussions and prepare offensive and defensive arguments.

Yes, that's a good point. I'll revisit these old discussions and make some notes or update the DIP.

> Walter said in the past that there is debate about what 'volatile' really means (http://forum.dlang.org/post/l4afr7$2pj8$1@digitalmars.com)

Fortunately with this DIP this should no longer be a valid point. In the end what C does or did does not matter as long as we clearly specify what volatile is supposed to do in D and I think the "Effects of volatile on code generation" should describe all guarantees a compiler needs to provide.

> and argues that peek() and poke() intrinsics is the "correct and guaranteed way" to do memory-mapped I/O (http://forum.dlang.org/post/l4abnd$2met$1@digitalmars.com)

Well, that's not an argument ;-) It's also easy to prove wrong: nobody uses peek/poke on AVR, MSP430 or ARM, so it's probably not as undisputed as Walter claims.

> Daniel Murphy argued that it is property of the load/store operation and not
> the variable (http://forum.dlang.org/post/l4b1j4$acl$1@digitalmars.com) and,
> I think this is the core of the debate.
> DIP62 makes a compelling case for why 'volatile' should be a property of the
> type, but I think it would help to justify why 'volatile' is not a property
> of the load/store operation.  I can actually see it both ways, and am
> therefore somewhat on the fence.

Yes, this is an important point and I'll extend the DIP a little in this
regard.
I think the point that you must _always_ access such memory obeying volatile
rules shows why it is a property of the memory and not the access. There's no
reasonable example where you want one access to a location to be volatile and
another access to the same location not. It's even dangerous sometimes (DIP 4.2.2).

One other way to think about it is that all 'only the access is volatile' arguments apply in exactly the same way to 'shared' or 'immutable' variables. Only the access is ever affected, cause in the end that's the only thing you can do with variables. But we nevertheless have shared and immutable qualifiers, simply because we want this memory to be _always_ accessed in threadsafe/readonly ways, and it's exactly the same for volatile.

Another point is that using peek/poke without a special qualifier relies only on conventions to ask programmers to always use the correct functions to access a pointer, the type system doesn't help. People already admitted this was the biggest mistake of D's volatile statement and I don't see how peek/poke would be different in this regard.

And of course without a type qualifier there can't be transitivity. The programmer always has to be careful to access struct members, array members, and other types 'connected' via indirection with peek/poke.



June 02, 2014
On Sun, 01 Jun 2014 15:37:04 +0000
"Timo Sintonen" <t.sintonen@luukku.com> wrote:

> I have not read the DIP yet, but here are some of my thoughts:
> 

Thanks for joining this discussion, your input is much appreciated.

> In the old days peripherals were simple. A UART might have a control register, a status register and a data register, 8 bits each. It just did not matter how they were accessed. Now a peripheral like USB or Ethernet may have tens of 32-bit registers.
> 
> The Basic language did not have pointers or any way to address a
> certain location of memory. Several extensions were made to get
> access to system locations. One common extension was the peek and
> poke functions. They took a 16-bit address and 8-bit data. Basic
> did not have any data types or structures.
> D has pointers that can be used to access memory. It also has
> several data types. A library function does not know if it should
> do 8/16/32/64-bit access without templates. That would be too
> complicated for a low-level operation like register access.

I'll have to agree. Ironically the simple 'peek/poke' templates discussed before would fail miserably - such short templates are almost always inlined and then all guarantees gained by using normal functions are lost... OTOH it must inline for performance reasons. So in the end library templates can't work, we'd at least need compiler intrinsics.

> The registers of a peripheral may be defined as a struct and a pointer of this struct type is used to access the registers. There are individual registers but there may also be some sub register sets inside the register set. A peripheral may have common registers and then per channel registers. The register struct may then have substructs or an array of register sets that may be accessed as structs or arrays.
> 
> Yes, there are different kinds of registers.
> - Normal registers can be read and written. These are used as
> normal control and status registers. Some bits may be changed by
> hardware at any time. This may be a problem because it is
> impossible to have a fully atomic access. The time between read
> and write should be as short as possible.
> - Read-only registers may be used to represent the capabilities
> of the peripheral or calibration values. They always return the
> same data. Status registers represent the current state of the
> hardware and may change at any time when the conditions change.
> Writes to these registers have no effect.
> - Write-only registers are used to send data. The data packet is
> written byte by byte to the same address. This type of
> register is also used to clear status. Reading the register may
> return the last data or zero or anything else, and the value
> should be ignored.
> - Bidirectional registers are used as data registers. A read will
> return the last received byte and a write will transmit the byte
> written.
> 
> Usually it does not matter if these registers are accessed the wrong way (writing a read-only or reading a write-only register), so there is no need to mark them differently. They can all be volatile.

I guess for such complicated cases the solution presented by Mike is
nice:
https://github.com/JinShil/D_Runtime_ARM_Cortex-M_study/blob/master/1.4-memory_mapped_io/source/mmio.d

However, I'm not sure if it can work for simple cases where you need zero memory/instruction overhead. (8/16bit controllers, GBA like devices)

> 
> It is also common that one register has mixed read/write, read-only and write-only bits. Many registers also have undefined/reserved bits, which sometimes should be written with zeros and sometimes left as they are.

I guess we can't do much about this. Proper bitfields could help, but try to sell these to Walter ;-) However, a struct + property functions + forceinline + volatile could work.
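
A minimal sketch of the struct + property functions idea, for illustration only. The register name and the `ENABLE` bit are hypothetical, and the volatile accesses assume the `volatileLoad`/`volatileStore` intrinsics from `core.volatile`, which only exist in a druntime newer than this discussion. It is not the DIP62 qualifier.

```d
import core.volatile : volatileLoad, volatileStore;

// Hypothetical control register: bit 0 is a read/write enable bit, all
// other bits are treated as reserved and written back unchanged.
struct ControlReg
{
    private uint raw;

    enum uint ENABLE = 0b1;

    // Reads the enable bit with one volatile load.
    @property bool enable()
    {
        return (volatileLoad(&raw) & ENABLE) != 0;
    }

    // Read-modify-write that preserves the reserved bits.
    @property void enable(bool on)
    {
        immutable old = volatileLoad(&raw);
        volatileStore(&raw, on ? (old | ENABLE) : (old & ~ENABLE));
    }
}
```

With this, user code reads like a bitfield access (`reg.enable = true;`), while each property still maps to explicit loads and stores. The read-modify-write in the setter is not atomic.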

> One of the most common operations is to wait for some status:
> while ((regs.status&0x80)==0)  { /* check timeout here */ }
> The way to clear the status may be one of:
> - write directly to the status bit
>    regs.status &= 0xffffff7f;
> - write a 1 to the bit
>    regs.status |= 0x80;
> - sometimes writing 0 to other bits has no effect and there is no
> need to read-modify-write
>    regs.status = 0x80;
> - sometimes status is cleared by writing to another bit
>    regs.status |= 0x200;
> - sometimes there is a separate clear register
>    regs.statusclear = 0x80;
> - sometimes accessing the data register clears status
> automatically
> - sometimes reading the status register clears the status. In
> this case all status bits have to be checked at once.
> 
> Many of these have the result that reading the register does not give back the data that was written.
> 
> And no, I did not read this on Wikipedia. All these forms of access exist in the processor I use (STM32F407). It seems that several teams have made the peripherals on the chip and every peripheral has its own way to access it.
> 
> 
> Another thing is: do I need to mark every member in a struct volatile, or is it enough to mark the struct definition or the struct pointer? Will it go transitively to every member of an array of substructs, or do I need to mark the substructs volatile?
> 
> One thing is the struct pointer. The peripherals have a fixed
> address. If there is only one peripheral, the address can be a
> compile-time constant. If there are several similar peripherals,
> the address may be known at compile time or it could be an
> immutable that is initialized at startup.
> Now I have to make the pointer shared to have the struct members
> shared. This means the pointer is also mutable and volatile. The
> pointer cannot be cached and has to be fetched again every time
> the struct variables are accessed. This decreases performance.

Maybe I misunderstood, but you can always do this:
---------------------------------
struct Timer
{
    uint current;
    uint step;
    uint control;
    bool isActive() volatile
    {
        return control & 0b1 ? true : false;
    }
}

enum volatile(Timer)* TimerA = cast(volatile(Timer)*)0xABCD;

TimerA.isActive();
TimerA.isActive();
---------------------------------

Although we often say the 'this' pointer is qualified, this is sloppy wording. The destination of the 'this' pointer is qualified, but not the pointer itself. So 'this' is volatile(Timer)* in the above example, not volatile(Timer*), and it's the same for shared, const, etc.

It is however true that you cannot mark the pointer immutable and keep the data mutable, that's the usual head-const problem.


I think you should be mostly happy with the DIP ;-) Is there anything you find lacking in C's volatile implementation? This DIP mostly follows the C way, but adds transitivity and some D-specific stuff (necessary as our structs can have methods).

> 
> D has been marketed as a system language. Accessing registers is an essential part of system programming. Whatever the method is, it has to be in the language, not an external library function.

I see we agree on the importance of this ;-)


June 02, 2014
On Monday, 2 June 2014 at 17:27:52 UTC, Johannes Pfau wrote:
> And of course without a type qualifier there can't be transitivity. The
> programmer always has to be careful to access struct members, array members,
> and other types 'connected' via indirection with peek/poke.

I too think that a) volatile is necessary, and b) that it should apply to variables, not operations. However, I'm not convinced of transitivity. It makes sense to treat members of a volatile struct as volatile, too, but I don't see why this needs to be the case for pointers. Are there even cases of volatile pointers at all? Usually, hardware registers don't contain pointers, and when they do (DMA-like things maybe, but those typically use physical addresses, not (virtual) pointers), what they point to would probably be normal memory, wouldn't it?
June 03, 2014
On Mon, 02 Jun 2014 17:51:39 +0000
"Marc Schütz" <schuetzm@gmx.net> wrote:

> On Monday, 2 June 2014 at 17:27:52 UTC, Johannes Pfau wrote:
> > And of course without a type qualifier there can't be transitivity. The
> > programmer always has to be careful to access struct members, array members,
> > and other types 'connected' via indirection with peek/poke.
> 
> I too think that a) volatile is necessary, and b) that it should apply to variables, not operations. However, I'm not convinced of transitivity. It makes sense to treat members of a volatile struct as volatile, too, but I don't see why this needs to be the case for pointers. Are there even cases of volatile pointers at all? Usually, hardware registers don't contain pointers, and when they do (DMA-like things maybe, but those typically use physical addresses, not (virtual) pointers), what they point to would probably be normal memory, wouldn't it?

I'm on the fence about transitivity as well, because it's really rarely used with volatile. For DMA, strictly speaking the memory locations are volatile because the DMA hardware can modify the destination memory at any time, and if you have cached some value there you get problems. But in practice you likely won't access the destination memory until the transfer has completed, so it's usually not a problem.

More important points are:
* consistency with shared/const/immutable. Fewer special cases is always
  good.
* The main argument for shared to be transitive was 'if another thread
  can access ptr, it can also access *ptr so that location is shared as
  well'. This also applies to volatile, it's just very rare that
  indirection occurs. The main case for transitivity is interrupt
  handlers that use 'normal' volatile memory to pass information. Here
  volatile should be transitive.

These two points are not very strong and if you can give some examples where transitivity hurts I'd be glad to change this. But I couldn't think of an example where transitivity actually is a problem, mainly because indirections are uncommon. (The DMA example is one example where transitivity can be slightly annoying, but it's also one example where you can use a cast easily to avoid these problems.)

June 11, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=126

Martin Nowak <code@dawg.eu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |code@dawg.eu

--- Comment #13 from Martin Nowak <code@dawg.eu> ---
My answer to the topic from a mail exchange with Michael.

On 06/11/2014 12:45 PM, Mike Franklin wrote:
> Martin,
>
> If you are interested in this low level programming in D, you might also be interested in DIP62 (http://wiki.dlang.org/DIP62).
>
> If you remember from my presentation, there was a glaring problem with my implementation:  using 'shared' for 'volatile' semantics.  This is fundamentally incorrect.  There is a discussion about it here (http://forum.dlang.org/post/mailman.1081.1400818840.2907.d.gnu@puremagic.com).
>
> There aren't many people in the D community doing such low-level work, so I fear that this issue will be marginalized.  I would like to see more discussion about it, and hopefully some action.  Your input would be quite valuable.
>
> Mike
> 
Didn't know about this proposal, but it's flawed IMO, because read-modify-write
operations might get interrupted. So you do need atomic updates for volatile
data that is accessed from an interrupt handler. Adding a type qualifier for
memory-mapped I/O is overkill IMO.
The simple solution is to use shared+core.atomic for interrupt-sensitive data
and a few functions for volatile.

void volatileSet(T)(ref shared(T) t, HeadUnshared!T val) if (T.sizeof == 4)
{ asm { naked; str r1, [r0]; bx lr; } }
HeadUnshared!T volatileGet(T)(ref shared(T) t) if (T.sizeof == 4)
{ asm { naked; ldr r0, [r0]; bx lr; } }

And because it's very difficult to ensure that certain memory-mapped I/O (which
is essentially shared global state) isn't affected by an interrupt routine,
it's safer to use core.atomic all the time. I don't think that atomic ops
(using MemoryOrder.raw) will be a big performance penalty, because memory-mapped
I/O is rare.
If it is, the volatile workaround is simple enough to be implemented in a
library.
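
For reference, the shared + core.atomic approach for interrupt-sensitive data could look roughly like this. It is a hosted sketch: the `flags` variable and both function names are hypothetical, and an ordinary global stands in for state shared with an interrupt handler.

```d
import core.atomic : atomicLoad, atomicOp, MemoryOrder;

// Hypothetical flag word shared between normal code and an interrupt handler.
shared uint flags;

// Atomic read-modify-write: the update cannot be torn by an interrupt
// that also modifies `flags` through core.atomic.
void setFlag(uint mask)
{
    atomicOp!"|="(flags, mask);
}

// Plain atomic load without additional ordering guarantees.
uint readFlags()
{
    return atomicLoad!(MemoryOrder.raw)(flags);
}
```

Whether the target can implement `atomicOp` without disabling interrupts depends on the architecture.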



June 11, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=126

--- Comment #14 from Martin Nowak <code@dawg.eu> ---
Bit-banging I/O is a counter-example against the "no performance needed" argument, but I still think this can be achieved easily enough with the existing tools, i.e. by inlining the asm code in your hot loop or writing an asm function.

for (uint i = 0; i < 32; ++i)
{
    if (data & 1)
        asm { mov r0, #1; str r0, [r3]; }
    else
        asm { mov r0, #2; str r0, [r3]; }
    data >>= 1;
}

Compiler intrinsics for volatile loading/storing could be implemented with zero overhead, and they would be useful for at least one other case (forced float rounding to fix excess precision). So I favor this solution, but until then core.atomic can be used.



June 12, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=126

--- Comment #15 from Johannes Pfau <johannespfau@gmail.com> ---
> Didn't know about this proposal
It hasn't been announced/discussed yet, though I'll probably start a discussion on the newsgroup this week.

> but it's flawed IMO, because read-modify-write operations might get interrupted. So you do need atomic updates for volatile data that is accessed from an interrupt handler. Adding a type qualifier for memory mapped I/O is overkill IMO.

That's partially true, but in embedded programming you know exactly what the
interrupt handler does, as you write it. And if an interrupt handler gets
invoked on timer updates and you only rearm the timer and probably write a
variable, then there's no need / it's wasteful to use atomic access to other
registers (like GPIO control, ADC) in normal code. And if you modify the same
registers in interrupt handlers and normal code, atomic access usually won't
suffice, you often need critical sections anyway.
Some architectures do not even provide atomic instructions (AVR). Instead you
globally disable all interrupts if you want to do something atomic, then enable
them again. Now think about how wasteful this gets if every register access is
causing this. Obviously you want to leave atomic access to the programmer in
this case.

Also, as this DIP tries to explain, shared does not provide enough guarantees to replace volatile (4.2.4). Adding these guarantees to shared would prevent possible valid optimizations for real shared data.


Whether ASM solutions or peek/poke are acceptable is probably a point of view thing. If the amount of code dealing with volatile/MMIO registers is low you might get away writing ASM. But for small microcontrollers doing only simple tasks you might end up writing ASM or peek/poke all the time and it'd be very hard for D to compete with C then.

The classical hello world for these devices is usually blinking an LED
(http://www.micahcarrick.com/tutorials/avr-microcontroller-tutorial/getting-started.html).
This usually requires accessing two registers. Now if somebody asks us 'What
does Hello World look like in D?' and we present a small D main function + 10
lines of ASM, they'll laugh at us and immediately stop considering D. It's
the same for peek/poke.

For small, embedded devices D hasn't got much to offer. Cleaner syntax, a little bit of CTFE, but lacking a comfortable way to access MMIO is a deal breaker here.

So from this point of view I'd say a type qualifier is well justified. I could also say a 'shared' type qualifier is not justified, because I don't even have multiple threads on embedded devices - as you see it's only a point of view thing.



June 14, 2014
http://bugzilla.gdcproject.org/show_bug.cgi?id=126

--- Comment #16 from Martin Nowak <code@dawg.eu> ---
(In reply to Johannes Pfau from comment #15)
> The classical hello world for these devices is usually blinking a LED (http://www.micahcarrick.com/tutorials/avr-microcontroller-tutorial/getting-started.html). This usually requires accessing two registers. Now if somebody asks us 'How does Hello World look like in D?' and we present a small D main function + 10 lines of ASM they'll laugh at us and will immediately stop considering D. It's the same for peek/poke.
> 
Nope, that would only be in the header/library that defines all those MM I/O
registers. Those "indirections" are used for any embedded programming. For
example in avr-libc you'll find this macro expansion.
GPIOA -> _SFR_MEM8(0x000A) -> _MMIO_BYTE(0x000A) -> (*(volatile uint8_t *)(mem_addr)).
This expands to volatile, because volatile provides the correct semantics.
But you could as well expand it to some template in D which implements the
correct accesses.
This is what Michael did for ARM [1]. The only problem here is how to implement
volatile reads/writes semantically correct and with as little overhead as
possible. Currently he used shared for reading [2] and writing [3] and those
few places could as well be replaced with intrinsics or asm until intrinsics
are available.
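
A minimal sketch of such a template, under assumptions: `Volatile` and `mmio` are hypothetical names (not what the linked mmio.d uses), and the accesses rely on the `volatileLoad`/`volatileStore` intrinsics from `core.volatile`, which only exist in a druntime newer than this discussion.

```d
import core.volatile : volatileLoad, volatileStore;

// Hypothetical register wrapper: every access goes through a volatile
// load or store, so user code cannot bypass them by accident.
struct Volatile(T) if (is(T == ubyte) || is(T == ushort) || is(T == uint))
{
    private T raw;

    T load() { return volatileLoad(&raw); }
    void store(T value) { volatileStore(&raw, value); }

    // reg |= mask, reg &= mask, ...: one volatile load, then one volatile store.
    void opOpAssign(string op)(T rhs)
    {
        store(cast(T) mixin("load() " ~ op ~ " rhs"));
    }
}

// avr-libc style expansion: turns a fixed address into a typed register.
ref Volatile!T mmio(T)(size_t address)
{
    return *cast(Volatile!T*) address;
}
```

Setting a bit then reads `mmio!ubyte(0x000A) |= 0b1;`, close to the C form, while the indirection stays inside the library that defines the registers.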

[1] https://github.com/JinShil/stm32_registers/blob/master/source/stm32/registers/rcc.d
[2] https://github.com/JinShil/memory_mapped_io/blob/d19cefb42000cd06605ddf4d4d6b120670400144/source/mmio.d#L335
[3] https://github.com/JinShil/memory_mapped_io/blob/d19cefb42000cd06605ddf4d4d6b120670400144/source/mmio.d#L402
