September 11, 2007
Jascha Wetzel wrote:
> Kirk McDonald wrote:
> 
>> Jascha Wetzel wrote:
>>
>>> therefore you can match q"[^"]*" and check the delimiters during (context sensitive) semantic analysis.
>>
>>
>> Is the following a valid string?
>>
>> q"/foo " bar/"
> 
> 
> oh, you're right of course...
> 
>> Walter also said, in another branch of the thread, that this is not valid:
>>
>> q"/foo/bar/"
>>
>> Since it isn't all /that/ hard to match these examples, I wonder why they are disallowed. Just to simplify the lexer that much more?
> 
> 
> what string would that represent?
> foo/bar
> foobar
> foo

I would expect it to represent foo/bar, in the same way that q"(foo(bar))" represents foo(bar).

-- 
Kirk McDonald
http://kirkmcdonald.blogspot.com
Pyd: Connecting D and Python
http://pyd.dsource.org
September 12, 2007
Kirk McDonald wrote:
> I would expect it to represent foo/bar, in the same way that q"(foo(bar))" represents foo(bar).
>

'/' is not a nesting delimiter. I think q"/foo/bar/" should be scanned as:
q"/foo/ // Error: expected '"' after closing delimiter. "foo" would be the actual value of the literal.
bar // Identifier token
/ // Division token
" // Start of a new, normal string literal
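
Editorial sketch of the reading described above, in Python rather than D. It assumes that only the bracket pairs (), [], {} and <> nest, that any other delimiter closes at its first reoccurrence, and that the closing delimiter must be followed directly by a double quote. The name scan_delimited is hypothetical and is not taken from any real D lexer.

```python
# Bracket pairs assumed to nest; any other delimiter closes at its first repeat.
NESTING = {"(": ")", "[": "]", "{": "}", "<": ">"}


def scan_delimited(src):
    """Scan one q"..." literal at the start of src.

    Returns (value, end_index). Raises ValueError where the reading
    above would report an error.
    """
    if not src.startswith('q"') or len(src) < 3:
        raise ValueError("not a delimited string literal")
    opener = src[2]
    closer = NESTING.get(opener, opener)
    depth = 1
    i = 3
    while i < len(src):
        ch = src[i]
        if ch == opener and opener in NESTING:
            depth += 1          # nested opening bracket
        elif ch == closer:
            depth -= 1
            if depth == 0:
                break           # found the matching closing delimiter
        i += 1
    else:
        raise ValueError("unterminated literal")
    value = src[3:i]
    if src[i + 1:i + 2] != '"':
        raise ValueError("expected closing quote after delimiter, value so far: %r" % value)
    return value, i + 2
```

Under these assumptions, q"(foo(bar))" yields foo(bar), while q"/foo/bar/" stops after foo and raises the error quoted above.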
September 12, 2007
Aziz K. wrote:
> Kirk McDonald wrote:
> 
>> I would expect it to represent foo/bar, in the same way that q"(foo(bar))" represents foo(bar).
>>
> 
> '/' is not a nesting delimiter. I think q"/foo/bar/" should be scanned as:
> q"/foo/ // Error: expected '"' after closing delimiter. "foo" would be the actual value of the literal.
> bar // Identifier token
> / // Division token
> " // Start of a new, normal string literal

When I updated the Pygments lexer, I interpreted it like this: It sees q"/, and matches a string until it sees /".

As Pygments is merely a syntax highlighter, it is not really that important for it to correctly flag invalid code as erroneous. Ideally it /should/ do so, and I may get around to fixing this at some point, but it would be nice for the lexical docs to be a little clearer on this subject. Primarily, I see no reason why q"/foo/bar/" shouldn't be scanned as the string foo/bar. (Though I hasten to add that I recognize we are speaking of edge cases, probably of interest only to people writing D lexers.)
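
Editorial sketch of the lenient interpretation described above (match from q"/ until the first /"), written with Python's re module. The pattern and the names LENIENT and lenient_value are hypothetical; this is not the actual Pygments rule, and it assumes a single non-nesting punctuation character as the delimiter.

```python
import re

# One delimiter character, then the shortest body that is followed by the
# same character and a double quote. Word characters, whitespace, quotes
# and opening brackets are excluded from the delimiter (an assumption).
LENIENT = re.compile(
    r'q"(?P<delim>[^\s\w(\[{<"])(?P<body>.*?)(?P=delim)"',
    re.DOTALL,
)


def lenient_value(src):
    """Return the string value under the lenient reading, or None."""
    m = LENIENT.match(src)
    return m.group("body") if m else None


print(lenient_value('q"/foo/bar/"'))    # foo/bar
print(lenient_value('q"/foo " bar/"'))  # foo " bar
```

A highlighter built this way accepts both disputed literals, which is why it cannot flag q"/foo/bar/" as an error.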

-- 
Kirk McDonald
http://kirkmcdonald.blogspot.com
Pyd: Connecting D and Python
http://pyd.dsource.org
September 13, 2007
Aziz K. wrote:
> Thanks for clarifying. While implementing the methods in my lexer for scanning the new string literals I found a few other ambiguities:
> 
> q"∆abcdef∆" // Might be superfluous to ask, but are (non-alpha) Unicode character delimiters allowed?

Yes.

> q" abcdef " // "abcdef". Allowed?

Yes.

> q"
> äöüß
> " // "äöüß". Should leading newlines be skipped or are they allowed as delimiters?

Skipped.

> q"EOF
> abcdefEOF" // Valid?

No.

> Or is \nEOF a requirement?

Yes.

> If so, how would you write such a string excluding the last newline?

Can't.

> Because you say in the specs that the last newline is part of the string. Maybe it shouldn't be?
> q"EOF
> abcdef
>   EOF" // Provided the previous example is an error. Is indenting the matching delimiter allowed (with " \t\v\f")?

No.
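
Editorial sketch of the heredoc answers above, in Python. It assumes an ASCII identifier as the delimiter, requires a newline right after the opening identifier, accepts the closing identifier only at the very start of a line and directly before the closing quote, and keeps the final newline in the value. The name scan_heredoc is hypothetical, and the skipping of leading newlines mentioned for character delimiters is not modelled.

```python
import re

# Opening part of a heredoc literal: q", an identifier, then a newline.
HEREDOC_OPEN = re.compile(r'q"([A-Za-z_][A-Za-z0-9_]*)\n')


def scan_heredoc(src):
    """Return the value of a heredoc-style literal at the start of src."""
    m = HEREDOC_OPEN.match(src)
    if m is None:
        raise ValueError("not a heredoc-style delimited string")
    ident = m.group(1)
    # The terminator must follow a newline with no indentation in between.
    terminator = "\n" + ident + '"'
    body_start = m.end()
    # Search from the newline that ends the opening line, so that an empty
    # body is accepted (an assumption, the thread does not cover this case).
    end = src.find(terminator, body_start - 1)
    if end == -1:
        raise ValueError("missing %r at the start of a line" % (ident + '"'))
    # The newline before the terminator is part of the value.
    return src[body_start:end + 1]
```

With this sketch, the literal ending in abcdefEOF" and the one with an indented EOF" are both rejected, and the accepted form always ends in a newline.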

> Walter Bright wrote:
>> Aziz K. wrote:
>>> q{666, this is super __EOF__} // Should __EOF__ be evaluated here causing the token string to be unterminated?
>>
>> Yes (__EOF__ is not a token, it's an end of file)
> Are you sure you want __EOF__ to really mean end of file like '\0' and 0x1A (^Z)? Every time one encounters '_', one would have to look ahead for "_EOF__" and one would have to make sure it's not followed by a valid identifier character. I have twelve instances where I check for \0 and ^Z. It wouldn't be that hard to adapt the code but I'm sure in general it would impact the speed of a D lexer adversely.
> 
> Regards,
> Aziz
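
Editorial sketch of the lookahead described in the quoted question, in Python. It treats '\0', 0x1A and a standalone __EOF__ as the end of the source. The helper names are hypothetical, the identifier test is approximated with str.isalnum, and the check on the preceding character is an added assumption for a scanner that looks at raw characters instead of whole identifier tokens.

```python
def is_ident_char(ch):
    """Approximate test for a character that can continue an identifier."""
    return ch == "_" or ch.isalnum()


def at_end_of_source(src, i):
    """True if position i is where the lexer should stop reading src."""
    if i >= len(src) or src[i] in "\0\x1a":
        return True
    if src[i] != "_":
        return False            # cheap rejection for most characters
    if not src.startswith("__EOF__", i):
        return False
    before = src[i - 1] if i > 0 else ""
    after = src[i + 7:i + 8]
    # __EOF__ only counts when it is not part of a longer identifier.
    return not is_ident_char(before) and not is_ident_char(after)
```

The extra work is limited to positions that hold an underscore, which is the cost the question is concerned about.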
September 13, 2007
Reply to Aziz K.,

> q"EOF
> abcdefEOF" // Valid?
>
> Or is \nEOF a requirement? If so, how would you write such a string excluding the last newline?

q"EOF
abcdef
EOF"[0..$-1]


September 13, 2007
BCS wrote:
> Reply to Aziz K.,
> 
>> q"EOF
>> abcdefEOF" // Valid?
>>
>> Or is \nEOF a requirement? If so, how would you write such a string excluding the last newline?
> 
> q"EOF
> abcdef
> EOF"[0..$-1]
> 

Or peel off the last line:

q"EOF
abcdef
ghijkl
mnop
qrstuv
EOF"
"wxyz."

Still...
Why the draconian limitation that heredocs MUST always have a newline?

Seems like allowing escaped newlines would make life easier.  Like

q"EOF
abcdef
ghijkl
mnop
qrstuv
wxyz.\
EOF"

Or make only the last newline escapable with something prefixing the terminator, like \:

q"EOF
abcdef
ghijkl
mnop
qrstuv
wxyz.
\EOF"

--bb