DMD 1.021 and 2.004 releases (page 8)

Jascha Wetzel wrote:
> Kirk McDonald wrote:
>
>> Jascha Wetzel wrote:
>>
>>> therefore you can match q"[^"]*" and check the delimiters during (context sensitive) semantic analysis.
>>
>> Is the following a valid string?
>>
>> q"/foo " bar/"
>
> oh, you're right of course...
>
>> Walter also said, in another branch of the thread, that this is not valid:
>>
>> q"/foo/bar/"
>>
>> Since it isn't all /that/ hard to match these examples, I wonder why they are disallowed. Just to simplify the lexer that much more?
>
> what string would that represent?
> foo/bar
> foobar
> foo

I would expect it to represent foo/bar, in the same way that q"(foo(bar))" represents foo(bar).

--
Kirk McDonald
http://kirkmcdonald.blogspot.com
Pyd: Connecting D and Python
http://pyd.dsource.org

Kirk McDonald wrote:
> I would expect it to represent foo/bar, in the same way that q"(foo(bar))" represents foo(bar).
>

'/' is not a nesting delimiter. I think q"/foo/bar/" should be scanned as:
q"/foo/ // Error: expected '"' after closing delimiter. "foo" would be the actual value of the literal.
bar // Identifier token
/ // Division token
" // Start of a new, normal string literal
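
The scanning rule described in this post can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `scan_delimited` and `NESTING` are hypothetical names, not taken from any D lexer, and identifier (heredoc) delimiters are not handled.

```python
# Hypothetical sketch; names are illustrative only.
# Bracket-like openers nest; every other character closes itself and does not nest.
NESTING = {"(": ")", "[": "]", "{": "}", "<": ">"}


def scan_delimited(src):
    """Scan a q"..." literal at the start of src.

    Returns (value, chars_consumed). Raises ValueError if the literal is
    unterminated or the closing delimiter is not followed by a double quote.
    """
    if not src.startswith('q"') or len(src) < 3:
        raise ValueError("not a delimited string")
    opener = src[2]
    closer = NESTING.get(opener, opener)
    depth = 1
    i = 3
    while i < len(src):
        ch = src[i]
        if opener != closer and ch == opener:
            depth += 1  # only bracket-like delimiters nest
        elif ch == closer:
            depth -= 1
            if depth == 0:
                break
        i += 1
    else:
        raise ValueError("unterminated delimited string")
    value = src[3:i]
    if src[i + 1:i + 2] != '"':
        raise ValueError("expected '\"' after closing delimiter")
    return value, i + 2
```

Under this reading, q"(foo(bar))" scans to the value foo(bar), while q"/foo/bar/" stops at the second '/' with the value foo and then reports the missing '"'.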

Aziz K. wrote:
> Kirk McDonald wrote:
>
>> I would expect it to represent foo/bar, in the same way that q"(foo(bar))" represents foo(bar).
>>
>
> '/' is not a nesting delimiter. I think q"/foo/bar/" should be scanned as:
> q"/foo/ // Error: expected '"' after closing delimiter. "foo" would be the actual value of the literal.
> bar // Identifier token
> / // Division token
> " // Start of a new, normal string literal

When I updated the Pygments lexer, I interpreted it like this: It sees q"/, and matches a string until it sees /". As Pygments is merely a syntax highlighter, it is not really that important for it to correctly flag invalid code as erroneous. Obviously, it /should/ do so in the optimum case, and I may get around to fixing this at some point, but it would be nice for the lexical docs to be a little more clear on this subject.

Primarily, I see no reason why q"/foo/bar/" shouldn't be scanned as the string foo/bar. (Though I hasten to add that I recognize we are speaking of edge-cases, probably of interest only to people writing D lexers.)

--
Kirk McDonald
http://kirkmcdonald.blogspot.com
Pyd: Connecting D and Python
http://pyd.dsource.org
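
The "match until the delimiter followed by a quote" interpretation can be sketched as follows. This is not the actual Pygments code; `scan_until_delim_quote` is a hypothetical helper that only models the behaviour described in the post, for single-character, non-nesting delimiters.

```python
def scan_until_delim_quote(src):
    """Return the text between q"X and the first X" that follows it.

    Hypothetical sketch: X is the single character right after q".
    """
    if not src.startswith('q"') or len(src) < 3:
        raise ValueError("not a delimited string")
    delim = src[2]
    end = src.find(delim + '"', 3)  # first delimiter directly followed by a quote
    if end == -1:
        raise ValueError("unterminated delimited string")
    return src[3:end]
```

With this rule both disputed examples are accepted: q"/foo/bar/" gives foo/bar and q"/foo " bar/" gives foo " bar. That is more permissive than the stricter scanning quoted above.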

September 13, 2007

Re: DMD 1.021 and 2.004 releases

Posted by Walter Bright
in reply to Aziz K.

Aziz K. wrote:
> Thanks for clarifying. While implementing the methods in my lexer for scanning the new string literals I found a few other ambiguities:
> 
> q"∆abcdef∆" // Might be superfluous to ask, but are (non-alpha) Unicode character delimiters allowed?

Yes.

> q" abcdef " // "abcdef". Allowed?

Yes.

> q"
> äöüß
> " // "äöüß". Should leading newlines be skipped or are they allowed as delimiters?

Skipped.

> q"EOF
> abcdefEOF" // Valid?

No.

> Or is \nEOF a requirement?

Yes.

> If so, how would you write such a string excluding the last newline?

Can't.

> Because you say in the specs that the last newline is part of the string. Maybe it shouldn't be?
> q"EOF
> abcdef
>   EOF" // Provided the previous example is an error. Is indenting the matching delimiter allowed (with " \t\v\f")?

No.
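
The heredoc answers above (newline required after the identifier, terminator only at the very start of a line, final newline kept in the value) can be sketched as a small Python scanner. `scan_heredoc` and the identifier pattern are assumptions made for illustration, not code from any real D lexer.

```python
import re

# Hypothetical sketch covering identifier-delimited (heredoc) strings only.
# The identifier pattern is a simplifying assumption.
_HEREDOC_OPEN = re.compile(r'q"([A-Za-z_]\w*)\n')


def scan_heredoc(src):
    """Return the value of a q"IDENT ... IDENT" literal at the start of src."""
    m = _HEREDOC_OPEN.match(src)
    if m is None:
        raise ValueError("expected identifier followed by a newline")
    terminator = m.group(1) + '"'
    body_start = m.end()
    pos = body_start  # pos always points at the start of a line
    while pos <= len(src):
        if src.startswith(terminator, pos):
            # The newline before the terminator stays in the value.
            return src[body_start:pos]
        nl = src.find("\n", pos)
        if nl == -1:
            break
        pos = nl + 1
    raise ValueError("unterminated heredoc string")
```

Because the terminator is only checked at the start of a line, both `abcdefEOF"` and an indented `  EOF"` fail to close the string, matching the "No." answers.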

> Walter Bright wrote:
>> Aziz K. wrote:
>>> q{666, this is super __EOF__} // Should __EOF__ be evaluated here causing the token string to be unterminated?
>>
>> Yes (__EOF__ is not a token, it's an end of file)
> Are you sure you want __EOF__ to really mean end of file like '\0' and 0x1A (^Z)? Every time one encounters '_', one would have to look ahead for "_EOF__" and one would have to make sure it's not followed by a valid identifier character. I have twelve instances where I check for \0 and ^Z. It wouldn't be that hard to adapt the code but I'm sure in general it would impact the speed of a D lexer adversely.
> 
> Regards,
> Aziz
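
The extra lookahead Aziz is worried about can be sketched like this. `at_end_of_file` is a hypothetical helper, and it assumes the position `i` is at the start of a token, so only the character after `__EOF__` needs checking.

```python
def at_end_of_file(src, i):
    """True if position i is past the end, at NUL or ^Z, or at a standalone __EOF__."""
    if i >= len(src) or src[i] in "\0\x1a":
        return True  # the cheap single-character checks
    if src.startswith("__EOF__", i):
        nxt = src[i + 7:i + 8]  # empty string if __EOF__ ends the input
        # __EOF__ only counts if it is not the prefix of a longer identifier.
        return not (nxt == "_" or nxt.isalnum())
    return False
```

The first test is a comparison against one character, while the second compares seven characters and then inspects one more. That is the additional work per '_' that the post refers to.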

BCS wrote:
> Reply to Aziz K.,
>
>> q"EOF
>> abcdefEOF" // Valid?
>>
>> Or is \nEOF a requirement? If so, how would you write
>> such a string excluding the last newline?
>
> q"EOF
> abcdef
> EOF"[0..$-1]
>

Or peel off the last line:

q"EOF
abcdef
ghijkl
mnop
qrstuv
EOF" "wxyz."

Still... Why the draconian limitation that heredocs MUST always have a newline? Seems like allowing escaped newlines would make life easier. Like

q"EOF
abcdef
ghijkl
mnop
qrstuv
wxyz.\
EOF"

Or make only the last newline escapable with something prefixing the terminator, like \:

q"EOF
abcdef
ghijkl
mnop
qrstuv
wxyz.
\EOF"

--bb
