Question about GCC / GDC / LDC syntax of inline asm advanced
July 06
In GDC and LDC’s inline asm syntax, the main asm part is separated from the constraints block by the first colon that begins the ‘outputs’ section. My question: how does GDC / LDC / GCC parse the first part, given that there can be umpteen kinds of assembler language. Is the parsing asm dialect-specific so that a full parse finds the first significant colon ?

If not, and the very first colon (outside double-quoted strings and comments) ends the first section, which is how I parse it, then there is a problem, as labels contain colons. So I have a bug in the grammar for my kludge asm section parser; see thread elsewhere.

About labels then, is a label the only place a non section-terminator colon can occur?

A label doesn’t have a colon before the name, but after it; is that correct for the AT&T asm dialect? If so, I could require a newline plus optional whitespace before any colon for it to be recognised as a section terminator and the beginning of the constraints.
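The line-start heuristic proposed above could be sketched roughly as follows. This is only an illustration in Python, not how GDC or LDC parse anything; `split_at_line_start_colon` is a hypothetical helper name. It assumes the constraints always start on a fresh line, which the syntax does not require, and it does not skip string literals or comments that happen to start a line with a colon.

```python
import re

# Hypothetical helper, not part of GDC/LDC: a colon counts as a section
# terminator only when it is the first non-blank character on its line.
SECTION_COLON = re.compile(r"^[ \t]*:", re.MULTILINE)


def split_at_line_start_colon(asm_body):
    """Return (template, rest) split at the first line-leading colon, or None."""
    match = SECTION_COLON.search(asm_body)
    if match is None:
        return None
    colon = match.end() - 1  # index of the colon itself
    return asm_body[:colon].rstrip(), asm_body[colon:]
```

A label such as `1:` inside the instruction string is left alone because it is not the first non-blank character on its line, but a statement written entirely on one line would defeat this check.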
July 06
On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
> In GDC and LDC’s inline asm syntax, the main asm part is separated from the constraints block by the first colon that begins the ‘outputs’ section. My question: how does GDC / LDC / GCC parse the first part, given that there can be umpteen kinds of assembler language. Is the parsing asm dialect-specific so that a full parse finds the first significant colon ?
>
> [...]

I would have put this question in the GDC or LDC sections, but it applies to both.
July 07

On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
> In GDC and LDC’s inline asm syntax, the main asm part is separated from the constraints block by the first colon that begins the ‘outputs’ section. My question: how does GDC / LDC / GCC parse the first part, given that there can be umpteen kinds of assembler language. Is the parsing asm dialect-specific so that a full parse finds the first significant colon ?

The first part is parsed as an AssignExpression, so you could have:

asm {
  (test ? "if-true-insn" : "if-false-insn")
  ~ buildAsmString(foo, bar)
  ~ test2() ? enumInsnTrue : enumInsnFalse // assign-expression finishes here
  : output-constraints
  : ...
}

It's only at semantic-time that a "string-literal" result is enforced using CTFE.

> If not and the very first colon (outside double-quoted strings and comments) ends the first section, which is how I parse it, then there is a problem, as labels contain colons. And so I have a bug in my grammar for my kludge asm section parser, see thread elsewhere.

Labels are statements, so there shouldn't be any conflict between the two.

July 07

On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
> On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
>> [...]
>
> The first part is parsed as an AssignExpression, so you could have:
>
>     asm {
>       (test ? "if-true-insn" : "if-false-insn")
>       ~ buildAsmString(foo, bar)
>       ~ test2() ? enumInsnTrue : enumInsnFalse // assign-expression finishes here
>       : output-constraints
>       : ...
>     }
>
> It's only at semantic-time that a "string-literal" result is enforced using CTFE.

As shown in compiler explorer: https://d.godbolt.org/z/eTqvW8o8E

July 07
On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
> On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
>> In GDC and LDC’s inline asm syntax, the main asm part is separated from the constraints block by the first colon that begins the ‘outputs’ section. My question: how does GDC / LDC / GCC parse the first part, given that there can be umpteen kinds of assembler language. Is the parsing asm dialect-specific so that a full parse finds the first significant colon ?
>>
>
> The first part is parsed as an [AssignExpression](https://dlang.org/spec/expression.html#assign_expressions), so you could have:
>
> ```
> asm {
>   (test ? "if-true-insn" : "if-false-insn")
>   ~ buildAsmString(foo, bar)
>   ~ test2() ? enumInsnTrue : enumInsnFalse // assign-expression finishes here
>   : output-constraints
>   : ...
> }
> ```
> It's only at semantic-time that a "string-literal" result is enforced using CTFE.
>
>> If not and the very first colon (outside double-quoted strings and comments) ends the first section, which is how I parse it, then there is a problem, as labels contain colons. And so I have a bug in my grammar for my kludge asm section parser, see thread elsewhere.
>>
>
> Labels are statements, so there shouldn't be any conflict between the two.

Many thanks Iain. I am not doing a full parse so I shall simply have to warn the user that the ternary operator must be resolved at ctfe before I get the resulting string, and otherwise I have to forbid it, because for me, aside from detecting labels, I am treating colon as a section terminator. My kludge. But I don’t have months to spend.
July 07
On Friday, 7 July 2023 at 12:12:15 UTC, Iain Buclaw wrote:
> On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
>> On Thursday, 6 July 2023 at 23:11:08 UTC, Cecil Ward wrote:
>>>[...]
>>
>> The first part is parsed as an [AssignExpression](https://dlang.org/spec/expression.html#assign_expressions), so you could have:
>>
>> ```
>> asm {
>>   (test ? "if-true-insn" : "if-false-insn")
>>   ~ buildAsmString(foo, bar)
>>   ~ test2() ? enumInsnTrue : enumInsnFalse // assign-expression finishes here
>>   : output-constraints
>>   : ...
>> }
>> ```
>> It's only at semantic-time that a "string-literal" result is enforced using CTFE.
>>
>
> As shown in compiler explorer: https://d.godbolt.org/z/eTqvW8o8E

Many thanks Iain. Much appreciated. I just had a go at checking for the ? : expression with a simple state machine. As I mentioned, I’m not doing a proper parse here, not by a million miles, just the bare minimum to get it to work.
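A simple state machine of the kind described above might look roughly like this. It is a minimal sketch in Python, not the poster's actual code and not what the compilers do; `find_section_colon` is a hypothetical name. It skips double-quoted and backquoted strings, `//` and `/* */` comments and anything inside brackets, and pairs each top-level `?` with the next top-level `:`. Character literals, `/+ +/` comments, `r"..."` strings and token strings are deliberately not handled.

```python
def find_section_colon(src: str) -> int:
    """Index of the first colon that ends the template expression, or -1."""
    i, n = 0, len(src)
    depth = 0    # nesting level of (), [] and {}
    pending = 0  # top-level '?' not yet matched by its ':'
    while i < n:
        ch = src[i]
        if ch == '"':                      # double-quoted string, \ escapes
            i += 1
            while i < n and src[i] != '"':
                i += 2 if src[i] == "\\" else 1
        elif ch == "`":                    # backquoted (WYSIWYG) string
            i = src.find("`", i + 1)
            if i < 0:
                return -1
        elif src.startswith("//", i):      # line comment: skip to newline
            i = src.find("\n", i)
            if i < 0:
                return -1
        elif src.startswith("/*", i):      # block comment: skip to "*/"
            i = src.find("*/", i + 2)
            if i < 0:
                return -1
            i += 1
        elif ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif depth == 0 and ch == "?":
            pending += 1
        elif depth == 0 and ch == ":":
            if pending == 0:
                return i                   # first unpaired top-level colon
            pending -= 1                   # this colon belongs to a ternary
        i += 1
    return -1
```

Colons inside parentheses are ignored altogether, so a parenthesised ternary needs no special handling; only an unparenthesised `? :` at the top level has to be counted.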


July 07
On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
> On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
>> [...]
>
> Many thanks Iain. I am not doing a full parse so I shall simply have to warn the user that the ternary operator must be resolved at ctfe before I get the resulting string, and otherwise I have to forbid it, because for me, aside from detecting labels, I am treating colon as a section terminator. My kludge. But I don’t have months to spend.

Iain, what’s the lex syntax for a label in the asm? Is it always alphanum+ ‘:’, or something like that?
July 07

On Friday, 7 July 2023 at 15:25:32 UTC, Cecil Ward wrote:
> On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
>> On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
>>> [...]
>>
>> Many thanks Iain. I am not doing a full parse so I shall simply have to warn the user that the ternary operator must be resolved at ctfe before I get the resulting string, and otherwise I have to forbid it, because for me, aside from detecting labels, I am treating colon as a section terminator. My kludge. But I don’t have months to spend.
>
> Iain, what’s the lex syntax for a label in the asm ? is it always alphanum+ ‘:’ - something like that?

Where is the label being referenced/defined from?

Within the asm insn string? GNU As documents it as:

[A-Za-z._][0-9A-Za-z._]+

For a label referenced from an asm statement (goto labels section)? Then it's the same as any other D Identifier - same as above, but also includes unicode alpha characters.
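The two cases above can be told apart with a pair of regular expressions. The following is a minimal Python sketch with hypothetical function names. The GNU as pattern is the ASCII-only one quoted above, except that it uses `*` instead of `+` so that one-character names also match; that change is an assumption based on the GNU as manual's wording. The D pattern is only an approximation of the D Identifier rule built on Python's Unicode-aware `\w`.

```python
import re

# ASCII-only GNU as symbol name; '$' and non-ASCII characters are
# target-dependent and are left out of this sketch.
GAS_SYMBOL = re.compile(r"[A-Za-z._][0-9A-Za-z._]*")

# Approximation of a D Identifier: a letter or underscore (including
# Unicode letters), then letters, digits or underscores. No '.' allowed.
D_IDENTIFIER = re.compile(r"[^\W\d]\w*")


def is_gas_symbol(name: str) -> bool:
    return GAS_SYMBOL.fullmatch(name) is not None


def is_d_identifier_like(name: str) -> bool:
    return D_IDENTIFIER.fullmatch(name) is not None
```

A name such as `.L1` passes the first check but not the second, which is one way a scanner could tell an assembler-level label from one listed in the goto labels section.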

July 07

On Friday, 7 July 2023 at 19:45:19 UTC, Iain Buclaw wrote:
> On Friday, 7 July 2023 at 15:25:32 UTC, Cecil Ward wrote:
>> On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
>>> On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
>>>> [...]
>>>
>>> Many thanks Iain. I am not doing a full parse so I shall simply have to warn the user that the ternary operator must be resolved at ctfe before I get the resulting string, and otherwise I have to forbid it, because for me, aside from detecting labels, I am treating colon as a section terminator. My kludge. But I don’t have months to spend.
>>
>> Iain, what’s the lex syntax for a label in the asm ? is it always alphanum+ ‘:’ - something like that?
>
> Where is the label being referenced/defined from?
>
> Within the asm insn string? GNU As documents it as:
>
>     [A-Za-z._][0-9A-Za-z._]+

Saying that, I'm pretty sure GNU As accepts Unicode in symbol names, as I have encountered reports of testsuite failures on Solaris 10/11 that involved Oracle's assembler and Unicode in D function names (IIRC, gcc and gccgo instead encode the Unicode characters in a symbol name).

July 07
On Friday, 7 July 2023 at 19:45:19 UTC, Iain Buclaw wrote:
> On Friday, 7 July 2023 at 15:25:32 UTC, Cecil Ward wrote:
>> On Friday, 7 July 2023 at 12:18:47 UTC, Cecil Ward wrote:
>>> On Friday, 7 July 2023 at 11:39:44 UTC, Iain Buclaw wrote:
>>>> [...]
>>>
>>> Many thanks Iain. I am not doing a full parse so I shall simply have to warn the user that the ternary operator must be resolved at ctfe before I get the resulting string, and otherwise I have to forbid it, because for me, aside from detecting labels, I am treating colon as a section terminator. My kludge. But I don’t have months to spend.
>>
>> Iain, what’s the lex syntax for a label in the asm ? is it always alphanum+ ‘:’ - something like that?
>
> Where is the label being referenced/defined from?
>
> Within the asm insn string? [GNU As documents it as](https://sourceware.org/binutils/docs/as/Symbol-Names.html):
>
>     [A-Za-z._][0-9A-Za-z._]+
>
> For a label referenced from an asm statement (goto labels section)? Then it's the same as any other D [Identifier](https://dlang.org/spec/lex.html#Identifier) - same as above, but also includes unicode alpha characters.

I’m thinking of a label within the asm string, not one mentioned in the labels section. I’m trying to remember the syntax of local labels from the days of my youth when I was a full-time pro asm programmer, on various different machines back then.
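For labels inside the instruction string, GNU as also has numeric local labels: a definition is written `N:` and is referenced as `Nb` (the most recent previous definition) or `Nf` (the next one). A rough Python sketch for picking both kinds of label out of an instruction string follows. The function names are hypothetical, the string is assumed to be already unescaped (real newlines, not `\n` escapes), and a segment override written at the very start of a line would be misread as a label.

```python
import re

# A label definition at the start of a line: an ASCII symbol name or a
# numeric local label, immediately followed by ':'.
LABEL_DEF = re.compile(
    r"^[ \t]*([A-Za-z._][0-9A-Za-z._]*|[0-9]+):", re.MULTILINE
)

# A reference to a numeric local label: digits followed by 'b' or 'f'.
LOCAL_REF = re.compile(r"\b([0-9]+)([bf])\b")


def label_definitions(insn_text: str) -> list:
    """Names of all labels defined at the start of a line."""
    return LABEL_DEF.findall(insn_text)


def local_references(insn_text: str) -> list:
    """(number, direction) pairs for references such as 1b or 2f."""
    return LOCAL_REF.findall(insn_text)
```

Since these colons all sit inside the string literal, a scanner that skips string literals never sees them as section terminators; this sketch only matters if the labels themselves need to be listed.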