October 24, 2013
On Thursday, 24 October 2013 at 06:40:14 UTC, Iain Buclaw wrote:
>>> volatile was never a reliable method for dealing with memory mapped I/O.
>>> The correct and guaranteed way to make this work is to write two "peek" and
>>> "poke" functions to read/write a particular memory address:
>>>
>>>     int peek(int* p);
>>>     void poke(int* p, int value);
>>>
>>> Implement them in the obvious way, and compile them separately so the
>>> optimizer will not try to inline/optimize them.
>>
>>
>> Thanks for the answer, Walter. I think this would be acceptable in many
>> (most?) cases, but not where high performance is needed  I think these
>> functions add too much overhead if they are not inlined and in a critical
>> path (bit-banging IO, for example).  Afterall, a read/write to a volatile
>> address is a single atomic instruction, if done properly.
>>
>
> Operations on volatile are *not* atomic. Nor do they establish a
> proper happens-before relationship for threading.  This is why we have
> core.atomic as a portable synchronisation mechanism in D.
>
>
> Regards

I probably shouldn't have used the word "atomic" so loosely.  What I meant is that reading or writing a volatile, aligned word in memory is an atomic operation.  At least on my target platform it is.  That may not be a correct generalization, however.

The point I'm trying to make is that the peek/poke function proposal adds function-call overhead compared to the "volatile" method in C, and I just want to know if there's a way to eliminate or reduce it.
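
For reference, a minimal sketch of what "implement them in the obvious way" could look like in D. The file name is an assumption, and the sketch only behaves as intended if the file is compiled separately with inlining disabled, as the quoted advice says; nothing in the code itself stops an optimizer that can see both sides of the call.

```d
// mmio.d: hypothetical file name; compile separately, without inlining.

/// Read one word from the given address.
int peek(int* p)
{
    return *p;
}

/// Write one word to the given address.
void poke(int* p, int value)
{
    *p = value;
}

unittest
{
    // An ordinary variable stands in for a hardware register here.
    int fakeRegister = 5;
    assert(peek(&fakeRegister) == 5);
    poke(&fakeRegister, 7);
    assert(fakeRegister == 7);
}
```

Every access then costs a real function call, which is the overhead being discussed.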
October 24, 2013
On Thursday, 24 October 2013 at 06:48:07 UTC, Walter Bright wrote:
> On 10/23/2013 11:19 PM, Mike wrote:
>> Thanks for the answer, Walter. I think this would be acceptable in many (most?)
>> cases, but not where high performance is needed I think these functions add too
>> much overhead if they are not inlined and in a critical path (bit-banging IO,
>> for example). Afterall, a read/write to a volatile address is a single atomic
>> instruction, if done properly.
>>
>> Is there a way to tell D to remove the function overhead, for example, like a
>> "naked" attribute, yet still retain the "volatile" behavior?
>
> You have to give up on volatile. Nobody agrees on what it means. What does "don't optimize" mean? And that's not at all the same thing as "atomic".
>
> I wouldn't worry about peek/poke being too slow unless you actually benchmark it and prove it is. Then, your alternatives are:
>
> 1. Write it in ordinary D, compile it, check the code generated, and if it is what you want, you're golden (at least for that compiler & switches).
>
> 2. Write it in inline asm. That's what it's for.
>
> 3. Write it in an external C function and link it in.

Well, I wasn't rooting for volatile; I just wanted a way to read/write my IO registers as fast as possible with D.

I think the last two methods you've given confirm my suspicions and will work.  But... I had my heart set on doing it all in D :-(

Thanks for the answers.
October 24, 2013
On Thursday, 24 October 2013 at 06:41:54 UTC, Timo Sintonen wrote:
> On Thursday, 24 October 2013 at 05:37:49 UTC, Walter Bright wrote:
>> On 10/23/2013 5:43 PM, Mike wrote:
>>> I'm interested in ARM bare-metal programming with D, and I'm trying to get my
>>> head wrapped around how to approach this.  I'm making progress, but I found
>>> something that was surprising to me: deprecation of the volatile keyword.
>>>
>>> In the bare-metal/hardware/driver world, this keyword is important to ensure the
>>> optimizer doesn't cache reads to memory-mapped IO, as some hardware peripheral
>>> may modify the value without involving the processor.
>>>
>>> I've read a few discussions on the D forums about the volatile keyword debate,
>>> but noone seemed to reconcile the need for volatile in memory-mapped IO.  Was
>>> this an oversight?
>>>
>>> What's D's answer to this?  If one were to use D to read from memory-mapped IO,
>>> how would one ensure the compiler doesn't cache the value?
>>
>> volatile was never a reliable method for dealing with memory mapped I/O. The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address:
>>
>>    int peek(int* p);
>>    void poke(int* p, int value);
>>
>> Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.
>
> Yes, this is the simplest way to do it, and it works with gdc when compiled in a separate file with no optimizations or inlining.
>
> But today's peripherals may have tens of registers, and they are usually represented as a struct. Using a peripheral often requires several register accesses. Doing it this way will not make the code very readable.
>
> As a workaround I have all register access functions in a separate file and compile those files in a separate directory with no optimizations. The amount of code generated is 3-4 times more, and this is a problem because in controllers memory and speed are always in short supply.

+1. This is what I feared.  I don't think D needs a volatile keyword, but it would be nice to have *some* way to avoid this overhead using language features.

I'm beginning to think inline ASM is the only way to avoid this.  That's not a deal breaker for me, but it makes me sad.
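
On the readability point, one possible sketch is a thin wrapper struct that gives named access to a register block on top of peek/poke. The `UartRegs` layout, the `Uart` wrapper and its member names are all hypothetical, invented for illustration; they do not describe any real peripheral.

```d
// Assumed to live in the separately compiled module from the quoted advice.
int peek(int* p) { return *p; }
void poke(int* p, int value) { *p = value; }

/// Hypothetical register block: each field is one 32-bit register.
struct UartRegs
{
    int data;
    int status;
    int control;
}

/// Thin wrapper that forwards every access to peek/poke.
struct Uart
{
    UartRegs* regs;   // base address of the peripheral

    @property int status()            { return peek(&regs.status); }
    @property void control(int value) { poke(&regs.control, value); }
    void send(int value)              { poke(&regs.data, value); }
}

unittest
{
    UartRegs fake;            // ordinary memory standing in for hardware
    auto uart = Uart(&fake);
    uart.control = 0x3;
    uart.send('A');
    assert(fake.control == 0x3);
    assert(fake.data == 'A');
    assert(uart.status == 0);
}
```

This only helps the call sites read better. Each access still goes through a non-inlined function, so the code-size and speed cost described above remains.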
October 24, 2013
On Thursday, 24 October 2013 at 06:37:08 UTC, Iain Buclaw wrote:
> On 24 October 2013 06:37, Walter Bright <newshound2@digitalmars.com> wrote:
>> On 10/23/2013 5:43 PM, Mike wrote:
>>>
>>> I'm interested in ARM bare-metal programming with D, and I'm trying to get
>>> my
>>> head wrapped around how to approach this.  I'm making progress, but I
>>> found
>>> something that was surprising to me: deprecation of the volatile keyword.
>>>
>>> In the bare-metal/hardware/driver world, this keyword is important to
>>> ensure the
>>> optimizer doesn't cache reads to memory-mapped IO, as some hardware
>>> peripheral
>>> may modify the value without involving the processor.
>>>
>>> I've read a few discussions on the D forums about the volatile keyword
>>> debate,
>>> but noone seemed to reconcile the need for volatile in memory-mapped IO.
>>> Was
>>> this an oversight?
>>>
>>> What's D's answer to this?  If one were to use D to read from
>>> memory-mapped IO,
>>> how would one ensure the compiler doesn't cache the value?
>>
>>
>> volatile was never a reliable method for dealing with memory mapped I/O.
>
> Are you talking dmd or in general (it's hard to tell).  In gdc,
> volatile is the same as in gcc/g++ in behaviour.  Although in one
> aspect, when the default storage model was switched to thread-local,
> that made volatile on it's own pointless.
>
> As a side note, 'shared' is considered a volatile type in gdc, which
> differs from the deprecated keyword which set volatile at a
> decl/expression level.  There is a difference in semantics, but it
> escapes this author at 6.30am in the morning.  :o)
>
> In any case, using shared would be my recommended route for you to go down.
>
>
>> The correct and guaranteed way to make this work is to write two "peek" and
>> "poke" functions to read/write a particular memory address:
>>
>>     int peek(int* p);
>>     void poke(int* p, int value);
>>
>> Implement them in the obvious way, and compile them separately so the
>> optimizer will not try to inline/optimize them.
>
> +1.  Using an optimiser along with code that talks to hardware can
> result in bizarre behaviour.

Well, I've done some reading about "shared" but I don't quite grasp it yet.  I still have some learning to do.  That's my problem, but if you feel like explaining how it can be used in place of volatile for hardware register access, that would be awfully nice.
October 24, 2013
On 24 October 2013 08:18, Mike <none@none.com> wrote:
> Well, I've done some reading about "shared" but I don't quite grasp it yet. I still have some learning to do.  That's my problem, but if you feel like explaining how it can be used in place of volatile for hardware register access, that would be awfully nice.

'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissions, as there may be other threads reading/writing to the variable at the same time.


Regards
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
October 24, 2013
On Thursday, 24 October 2013 at 08:20:43 UTC, Iain Buclaw wrote:
> 'shared' guarantees that all reads and writes specified in source code
> happen in the exact order specified with no omissions, as there may be
> other threads reading/writing to the variable at the same time.
>
>
> Regards

Is it actually implemented as such in any D compiler? That's a lot of memory barriers; shared would have to come with a massive SLOW! notice on it. I'm not saying that's necessarily a bad choice, but I was pretty sure this had never been implemented.
October 24, 2013
On 24 October 2013 10:27, John Colvin <john.loughran.colvin@gmail.com> wrote:
> On Thursday, 24 October 2013 at 08:20:43 UTC, Iain Buclaw wrote:
>>
>> 'shared' guarantees that all reads and writes specified in source code happen in the exact order specified with no omissions, as there may be other threads reading/writing to the variable at the same time.
>>
>>
>> Regards
>
>
> Is it actually implemented as such in any D compiler? That's a lot of memory barriers, shared would have to come with a massive SLOW! notice on it. Not saying that's a bad choice necessarily, but I was pretty sure this had never been implemented.

If you require memory barriers to access shared data, that is what 'synchronized' and core.atomic are for.  There are *no* implicit locks occurring when accessing the data.

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
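
A minimal sketch of the explicit core.atomic route mentioned above, assuming a module-level `shared` variable. The name `counter` is hypothetical; `atomicStore`, `atomicOp` and `atomicLoad` are the core.atomic functions.

```d
import core.atomic;

shared int counter;   // hypothetical module-level shared variable

unittest
{
    atomicStore(counter, 40);        // atomic write
    atomicOp!"+="(counter, 2);       // atomic read-modify-write
    assert(atomicLoad(counter) == 42);
}
```

The atomicity is requested explicitly at each access rather than implied by the `shared` qualifier alone.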
October 24, 2013
On Thursday, 24 October 2013 at 09:43:51 UTC, Iain Buclaw wrote:
>>> 'shared' guarantees that all reads and writes specified in source code
>>> happen in the exact order specified with no omissions

> If you require memory barriers to access share data, that is what
> 'synchronized' and core.atomic is for.  There is *no* implicit locks
> occurring when accessing the data.

If there are no memory barriers, then there is no guarantee* of ordering of reads or writes. Sure, the compiler can promise not to rearrange them, but the CPU is a different matter.

*Dependent on CPU architecture, of course; e.g., IIRC the Intel Atom never reorders anything.
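
To make the ordering point concrete, here is a sketch of asking for the ordering explicitly with core.atomic's `MemoryOrder` parameter instead of relying on source order. The names `payload`, `ready`, `publish` and `tryConsume` are hypothetical, and this illustrates thread-to-thread ordering only; it says nothing about how a particular peripheral bus behaves.

```d
import core.atomic;

shared int payload;   // hypothetical data being handed over
shared bool ready;    // hypothetical flag guarding the payload

/// Writer side: store the payload, then publish it with release ordering.
void publish(int value)
{
    atomicStore!(MemoryOrder.raw)(payload, value);
    atomicStore!(MemoryOrder.rel)(ready, true);
}

/// Reader side: the acquire load pairs with the release store above.
bool tryConsume(out int value)
{
    if (!atomicLoad!(MemoryOrder.acq)(ready))
        return false;
    value = atomicLoad!(MemoryOrder.raw)(payload);
    return true;
}

unittest
{
    int v;
    assert(!tryConsume(v));   // nothing published yet
    publish(7);
    assert(tryConsume(v));
    assert(v == 7);
}
```

On targets where the hardware may reorder memory accesses, it is the release/acquire pair that asks for the necessary barriers to be emitted, not the order of the statements.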
October 24, 2013
On Thursday, 24 October 2013 at 05:37:49 UTC, Walter Bright wrote:
> On 10/23/2013 5:43 PM, Mike wrote:
> volatile was never a reliable method for dealing with memory mapped I/O. The correct and guaranteed way to make this work is to write two "peek" and "poke" functions to read/write a particular memory address:
>
>     int peek(int* p);
>     void poke(int* p, int value);
>
> Implement them in the obvious way, and compile them separately so the optimizer will not try to inline/optimize them.

I raised the problem here:

http://forum.dlang.org/thread/selnpobzzvrsuyihnstl@forum.dlang.org

Anyway, pokes and peeks are a bit more cumbersome than volatile variables, since they do not cope so well with, for example, arithmetic expressions.

Anyway, still better than nothing. *If* they existed.

IMHO, embedded programming and hardware interfacing should get more attention.
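
To illustrate the arithmetic-expression point: with a volatile variable one could write `reg |= mask`, whereas with peek/poke the read-modify-write has to be spelled out. A sketch, where `setBits` and `clearBits` are hypothetical helper names:

```d
int peek(int* p) { return *p; }
void poke(int* p, int value) { *p = value; }

/// Hypothetical helper: set the bits given in `mask`.
void setBits(int* reg, int mask)
{
    poke(reg, peek(reg) | mask);
}

/// Hypothetical helper: clear the bits given in `mask`.
void clearBits(int* reg, int mask)
{
    poke(reg, peek(reg) & ~mask);
}

unittest
{
    int fake = 0b0101;        // ordinary memory standing in for a register
    setBits(&fake, 0b0010);   // equivalent of `reg |= 0b0010`
    assert(fake == 0b0111);
    clearBits(&fake, 0b0001); // equivalent of `reg &= ~0b0001`
    assert(fake == 0b0110);
}
```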
October 24, 2013
On Thursday, 24 October 2013 at 06:48:07 UTC, Walter Bright wrote:
> On 10/23/2013 11:19 PM, Mike wrote:
>> Thanks for the answer, Walter. I think this would be acceptable in many (most?)
>> cases, but not where high performance is needed I think these functions add too
>> much overhead if they are not inlined and in a critical path (bit-banging IO,
>> for example). Afterall, a read/write to a volatile address is a single atomic
>> instruction, if done properly.
>>
>> Is there a way to tell D to remove the function overhead, for example, like a
>> "naked" attribute, yet still retain the "volatile" behavior?
>
> You have to give up on volatile. Nobody agrees on what it means. What does "don't optimize" mean? And that's not at all the same thing as "atomic".

It is not about "atomize me"; it is about "really *read* me" or "really *write* me" at that memory location: don't fake it, don't cache me. And do it now, not 10 seconds later.
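
For readers looking for exactly this "really read me / really write me" behaviour: later druntime releases provide `volatileLoad` and `volatileStore` intrinsics (originally in `core.bitop`, in newer releases in `core.volatile`). A minimal sketch, assuming a compiler recent enough to ship `core.volatile`; the variable is ordinary memory standing in for a register.

```d
import core.volatile : volatileLoad, volatileStore;

unittest
{
    uint fakeRegister = 0;   // hypothetical stand-in for a memory-mapped register
    volatileStore(&fakeRegister, 0xA5u);            // the store is not elided
    assert(volatileLoad(&fakeRegister) == 0xA5u);   // the load is not cached
}
```

As discussed in the thread, this concerns what the compiler does with the access. It makes no promise about atomicity or about ordering between threads.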