June 19, 2012
On 19-06-2012 22:40, Manu wrote:
> On 19 June 2012 21:19, Iain Buclaw <ibuclaw@ubuntu.com> wrote:
>
>     1) D Inline Asm and naked function support is raising far too many
>     alarm bells. So would just be easier to remove it and avoid all the
>     other comments on why we need middle-end and backend headers in gdc.
>
>
> Inline assembly has been relatively useless in GCC for years. Inline asm
> interferes with the optimiser's ability to do a good job, which basically
> makes use of inline assembly self-defeating.
> The only time I ever need to use inline-asm is to interface an arch
> feature that has no API. As long as there are intrinsics for all the
> opcodes one might want, then it's better to use them.
>
> There are 2 operations that spring to mind that typically don't have
> intrinsics, or high level APIs, which I always use asm to
> interface; the fine-grain manual manipulation of the flags register on
> PPC (ie, the '.' suite of opcodes), and conditional execution opcodes on
> ARM. Neither of these has high level expressions, and they are both
> relatively important.
> That said, as stated above, if use of this stuff is for performance,
> then using an inline-asm block will ruin the surrounding code anyway, so
> I almost always find I'm required to write the entire function in asm to
> achieve the expected result...
>
> I see no major loss to removing the inline assembler.
> I would like to know what the issue is though? Why are you compelled to
> remove it?
> I thought GCC optionally supported the microsoft asm syntax instead,
> which should make it syntactically consistent with D?

Not "Microsoft", but Intel syntax.

But GDC's inline assembly syntax is very different from DMD's: https://bitbucket.org/goshawk/gdc/wiki/UserDocumentation#!extended-assembler

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
June 19, 2012
On 19-06-2012 23:52, Walter Bright wrote:
> On 6/19/2012 1:36 PM, bearophile wrote:
>>> No, but the idea was to allow D to innovate on calling
>>> conventions without disturbing code that needed to
>>> interface with C.
>>
>> The idea is nice, but ideas aren't enough. Where are the benchmarks
>> that show a
>> performance improvement over the C calling convention? And even if such
>> improvement is present, is it worth it in the face of people that
>> don't want to
>> add it to GCC?
>
> GDC can certainly define its D calling convention to match GCC's. It's
> an "implementation defined" thing, not a language defined one.
>

Then let's please rename it to the DMD ABI instead of calling it the D ABI and making it look like it's part of the language on the website. Further, D mangling rules should be separate from calling convention.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
June 19, 2012
> Please be informed that GCC inline asm supports Intel syntax...

With -masm=intel.
June 19, 2012
On 19-06-2012 23:22, Manu wrote:
> On 19 June 2012 23:59, deadalnix <deadalnix@gmail.com> wrote:
>
>     Le 19/06/2012 22:08, Iain Buclaw a écrit :
>
>           From what I gathered from further discussion, it made sense for
>         embedded platforms, such as ARM, but not x86.
>
>
>     It has proven to be useful to me, not only for performance reasons,
>     but also for low level manipulations.
>
>     I don't see what makes ARM that different with regard to inline
>     assembly capabilities.
>
>
> If you had the register alias feature I described above, would you be
> able to write such low-level manipulations using intrinsics?
> I think I would be able to rewrite all x86 asm blocks I've ever written
> using that feature.
>
> ARM and PPC both have unique features relating to their branch control
> and branch prediction that x86 doesn't have. Sadly, all high level
> languages COMPLETELY overlook such features when designing high level
> expressions, because they are traditionally designed for x86 first.

To be fair, ARM v8/AArch64 has eliminated predicated execution, simply because it turned out that the complexity of writing languages and compilers for it was not worth it, compared to just having good branch prediction.

> A thorough set of intrinsics can allow access to these features though,
> although since they're related to branch control/conditional execution,
> it feels clumsy, since you lose the feeling of structured code; ie, no
> scoped if blocks, loop constructs, etc,  if you have to use intrinsics
> to generate conditions or masks.
>
> ARM is the most common architecture on earth now. It would be nice if D
> were able to take better advantage of the architecture.


-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
June 19, 2012
On 20-06-2012 00:48, Trass3r wrote:
>> Please be informed that GCC inline asm supports Intel syntax...
>
> With -masm=intel.

No, you can tell the inline assembler to use Intel syntax from inside code. Iain showed me how on IRC at some point, but I forget the specifics.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
June 19, 2012
On 20 June 2012 01:07, Walter Bright <newshound2@digitalmars.com> wrote:

> On 6/19/2012 1:58 PM, Manu wrote:
>
>> I find a thorough suite of architecture intrinsics are usually the fastest and
>> cleanest way to the best possible code, although 'naked' may be handy in this
>> circumstance too...
>
> Do a grep for "naked" across the druntime library sources. For example, its use in druntime/src/rt/alloca.d, where it is very much needed, as alloca() is one of those "magic" functions.


I never argued against naked... I agree it's mandatory.


> Do a grep for "asm" across the druntime library sources. Can you justify
> all of that with some other scheme?


I think almost all the blocks I just browsed through could be easily
written with nothing more than the register alias feature I suggested, and
perhaps a couple of opcode intrinsics.
And as a bonus, they would also be readable. I can imagine cases where the
optimiser would have more freedom too.


>> Thinking more about the implications of removing the inline asm, what would
>> REALLY roxors, would be a keyword to insist a variable is represented by a register, and by extension, to associate it with a specific register:
>
> This was a failure in C.


Really? This is the missing link between mandatory asm blocks, and being
able to do it in high level code with intrinsics.
The 'register' keyword was a fail in the same way 'inline' was. __forceinline was
not a fail; it is actually mandatory. I'd argue that __forceregister would be
similarly useful in C as well, but the real power would come from being able
to specify the particular register to alias.



>> This would almost entirely eliminate the usefulness of an inline assembler.
>> Better yet, this could use the 'new' attribute syntax, which most agree will
>> support arguments:
>> @register(rsp) int x;
>
> Some C compilers did have such pseudo-register abilities. It was a failure in practice.

Really? I've never seen that. What about it was fail?
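
For comparison, GCC's C dialect already offers something close to the register alias idea in the form of explicit register variables. The following is only a rough sketch in C (not D) of what that looks like; the names stack_pointer and read_stack_pointer are made up for illustration, only the stack pointer is aliased, and the fallback branch for other targets is an assumption that merely approximates the value with __builtin_frame_address.

```c
#include <stdint.h>

/* GCC/Clang extension: bind a file-scope C variable to a named register.
   The variable name is hypothetical. Only the stack pointer is used here,
   since it is never handed out by the register allocator. */
#if defined(__x86_64__)
register uintptr_t stack_pointer __asm__("rsp");
#define HAVE_SP_ALIAS 1
#elif defined(__i386__)
register uintptr_t stack_pointer __asm__("esp");
#define HAVE_SP_ALIAS 1
#elif defined(__aarch64__)
register uintptr_t stack_pointer __asm__("sp");
#define HAVE_SP_ALIAS 1
#endif

/* Returns the stack pointer as seen inside this function. On targets
   without an alias above, falls back to the frame address, which is only
   an approximation (assumption for this sketch). */
uintptr_t read_stack_pointer(void)
{
#ifdef HAVE_SP_ALIAS
    return stack_pointer;
#else
    return (uintptr_t)__builtin_frame_address(0);
#endif
}
```

Reading the variable compiles to a plain register read with no asm block involved, which is roughly the effect the proposed @register(rsp) attribute is after.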

> I really don't understand preferring all these rather convoluted
> enhancements to avoid something simple and straightforward like the inline assembler. The use of IA in the D runtime library, for example, has been quite successful.

I agree, IA is useful and has been successful, but it has drawbacks too.
  * IA ruins optimisation around the IA block
  * IA doesn't inline well. intrinsics allow much greater opportunity for
efficient integration into the calling context
  * most IA functions are small, and prime candidates for inlining (see
points 1 and 2)
  * IA is difficult for the majority of programmers to follow/understand
  * even to experienced programmers, poorly commented asm takes a lot of
time to mentally parse

It's a shame that there are IA constructs that can't be expressed any other way. I don't think it would take much to address that.


> For example, consider this bit from druntime/src/rt/lifetime.d:
>
> ------------------------------------------------------------------
>    auto isshared = ti.classinfo is TypeInfo_Shared.classinfo;
>    auto bic = !isshared ? __getBlkInfo((*p).ptr) : null;
>    auto info = bic ? *bic : gc_query((*p).ptr);
>    auto size = ti.next.tsize();
>    version (D_InlineAsm_X86)
>    {
>        size_t reqsize = void;
>
>        asm
>        {
>            mov EAX, newcapacity;
>            mul EAX, size;
>            mov reqsize, EAX;
>            jc  Loverflow;
>        }
>    }
>    else
>    {
>        size_t reqsize = size * newcapacity;
>
>        if (newcapacity > 0 && reqsize / newcapacity != size)
>            goto Loverflow;
>    }
>
>    // step 2, get the actual "allocated" size.  If the allocated size does not
>    // match what we expect, then we will need to reallocate anyways.
>
>    // TODO: this probably isn't correct for shared arrays
>    size_t curallocsize = void;
>    size_t curcapacity = void;
>    size_t offset = void;
>    size_t arraypad = void;
> ------------------------------------------------------------------
>

This one seems trivial, you just need one intrinsic:

  size_t reqsize = size * newcapacity;
  __jc(&Loverflow);

Although it depends on a '&codeLabel' mechanism to get the label address (GCC supports this in C, I'd love to see this in D too).
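
The GCC feature referred to here is "labels as values": in C, &&label yields a label's address and goto *ptr jumps to it. A minimal sketch in C follows. Note that __jc above is a proposed intrinsic that does not exist, so the sketch swaps in __builtin_mul_overflow (available in GCC 5 and later, and in Clang) to detect the overflow; the function names checked_request_size and classify_request are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stores size * newcapacity in *reqsize and returns true, or returns
   false if the multiplication overflowed size_t. */
bool checked_request_size(size_t size, size_t newcapacity, size_t *reqsize)
{
    return !__builtin_mul_overflow(size, newcapacity, reqsize);
}

/* Labels as values (GCC extension): pick a label address based on the
   overflow flag, then jump to it. Returns 0 if the size fits, -1 on
   overflow. */
int classify_request(size_t size, size_t newcapacity)
{
    size_t reqsize;
    void *target = __builtin_mul_overflow(size, newcapacity, &reqsize)
                       ? &&Loverflow
                       : &&Lok;
    goto *target;

Lok:
    return 0;

Loverflow:
    return -1;
}
```

On x86 the builtin typically compiles down to a multiply followed by a branch on the overflow flag, which is about what the hand-written asm block does.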


June 19, 2012
On 19-06-2012 22:51, Trass3r wrote:
>> 1) D Inline Asm and naked function support is raising far too many
>> alarm bells. So would just be easier to remove it and avoid all the
>> other comments on why we need middle-end and backend headers in gdc.
>
> And the C++ frontend doesn't need these headers for its inline assembler
> implementation?

No, it passes the assembly on to the assembler. Simple as that.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
June 19, 2012
On 20 June 2012 01:50, Alex Rønne Petersen <alex@lycus.org> wrote:

> On 19-06-2012 23:22, Manu wrote:
>
>> On 19 June 2012 23:59, deadalnix <deadalnix@gmail.com> wrote:
>>
>>    Le 19/06/2012 22:08, Iain Buclaw a écrit :
>>
>>          From what I gathered from further discussion, it made sense for
>>        embedded platforms, such as ARM, but not x86.
>>
>>
>>    It has proven to be useful to me, not only for performance reasons,
>>    but also for low level manipulations.
>>
>>    I don't see what makes ARM that different with regard to inline
>>    assembly capabilities.
>>
>>
>> If you had the register alias feature I described above, would you be
>> able to write such low-level manipulations using intrinsics?
>> I think I would be able to rewrite all x86 asm blocks I've ever written
>> using that feature.
>>
>> ARM and PPC both have unique features relating to their branch control and branch prediction that x86 doesn't have. Sadly, all high level languages COMPLETELY overlook such features when designing high level expressions, because they are traditionally designed for x86 first.
>>
>
> To be fair, ARM v8/AArch64 has eliminated predicated execution, simply because it turned out that the complexity of writing languages and compilers for it was not worth it, compared to just having good branch prediction.


I suspect it may have been because C didn't have expressions to support it,
and D... ;)
Shame though, it's a totally awesome hardware feature.

I don't know of any mass-market arm-v8 devices yet. arm-v7 is still very much alive, and will exist for many years yet.


June 19, 2012
On 20 June 2012 01:37, deadalnix <deadalnix@gmail.com> wrote:
>
> Walter gave you examples. You'll find many others in druntime.
>
> Here is something I wrote recently that uses this again:
> http://www.deadalnix.me/2012/03/24/get-an-exception-from-a-segfault-on-linux-x86-and-x86_64-using-some-black-magic/
>

That code could all be done with the register alias I described, and __push/__pop intrinsics.


June 19, 2012
On 19 June 2012 23:51, Alex Rønne Petersen <alex@lycus.org> wrote:
> On 20-06-2012 00:48, Trass3r wrote:
>>>
>>> Please be informed that GCC inline asm supports Intel syntax...
>>
>>
>> With -masm=intel.
>
>
> No, you can tell the inline assembler to use Intel syntax from inside code. Iain showed me how on IRC at some point, but I forget the specifics.
>
>

iirc, it's:

asm {
    ".intel_syntax noprefix"
    /* Intel syntax here */
    ".att_syntax"
}
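
The same directive trick can be sketched with GCC's C extended asm, which is easier to try out than the GDC form recalled above. This is only an illustration under some assumptions: the function name add_intel is made up, the operands are pinned to EAX and EDX through constraints so that no operand substitution happens inside the Intel-syntax section, and non-x86 targets fall back to plain C.

```c
/* Returns x + y. On x86 with GCC-compatible compilers, the addition is
   done in an inline asm block that switches the assembler to Intel
   syntax and back again. */
int add_intel(int x, int y)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int result = x;
    __asm__(".intel_syntax noprefix\n\t"
            "add eax, edx\n\t"        /* Intel order: destination first */
            ".att_syntax prefix"      /* restore the default dialect */
            : "+a"(result)            /* result lives in EAX */
            : "d"(y)                  /* y lives in EDX */
            : "cc");
    return result;
#else
    return x + y;                     /* fallback for other targets */
#endif
}
```

Switching back at the end matters because the rest of the compiler's output is still in AT&T syntax; the restore line assumes the file is not being built with -masm=intel.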



-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';