Encoding of eol in multiline wysiwyg strings

Feb 17, 2009

KlausO

Feb 17, 2009

Jarrett Billingsley

Feb 17, 2009

grauzone

Feb 17, 2009

Jarrett Billingsley

Hello,

does the D specification specify how the "end of line" is encoded when you use wysiwyg strings? Currently it seems to be '\n' on Windows (and I guess it will be '\n' on Linux, too). Is this the intended behaviour?

It's not a big issue, but sometimes when you use wysiwyg strings, string concatenation and import expressions to combine some text, the result is a string with mixed EOL encodings.

Thanks for clarifying,
KlausO

On Tue, Feb 17, 2009 at 4:41 AM, KlausO <oberhofer@users.sf.net> wrote:
> Hello,
>
> does the D specification specify how the "end of line" is encoded when you
> use wysiwyg strings? Currently it seems to be '\n' on Windows
> (and I guess it will be '\n' on Linux, too).
> Is this the intended behaviour?

http://www.digitalmars.com/d/1.0/lex.html

"Wysiwyg Strings

Wysiwyg quoted strings are enclosed by r" and ". All characters between the r" and " are part of the string except for EndOfLine which is regarded as a single \n character."

> It's not a big issue, but sometimes when you use wysiwyg strings, string
> concatenation and import expressions to combine some text, the result is a
> string with mixed EOL encodings.
> Thanks for clarifying,

It's the import() expression that's messing things up. It just loads the file verbatim and does no line-ending conversions.
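The rule quoted from the spec can be checked at compile time. The following is a minimal sketch, not part of the original post; the name `s` is only illustrative. It relies on the quoted rule that an EndOfLine inside a wysiwyg string counts as a single \n, whatever line endings the source file itself uses.

```d
// A wysiwyg string that spans two source lines.
enum s = r"line1
line2";

// Per the quoted rule, the line break is stored as one '\n',
// even if this source file is saved with "\r\n" line endings.
static assert(s == "line1\nline2");
static assert(s.length == 11); // 5 + 1 + 5 characters
```

Text loaded with import() gets no such treatment, which is where the mixed line endings come from.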

February 17, 2009

Re: Encoding of eol in multiline wysiwyg strings

Posted by grauzone
in reply to Jarrett Billingsley

Jarrett Billingsley wrote:
> On Tue, Feb 17, 2009 at 4:41 AM, KlausO <oberhofer@users.sf.net> wrote:
>> Hello,
>>
>> does the D specification specify how the "end of line" is encoded when you
>> use wysiwyg strings? Currently it seems to be '\n' on Windows
>> (and I guess it will be '\n' on Linux, too).
>> Is this the intended behaviour?
> 
> http://www.digitalmars.com/d/1.0/lex.html
> 
> "Wysiwyg Strings
> 
> Wysiwyg quoted strings are enclosed by r" and ". All characters
> between the r" and " are part of the string except for EndOfLine which
> is regarded as a single \n character."
> 
>> It's not a big issue, but sometimes when you use wysiwyg strings, string
>> concatenation and import expressions to combine some text, the result is a
>> string with mixed EOL encodings.
>> Thanks for clarifying,
> 
> It's the import() expression that's messing things up.  It just loads
> the file verbatim and does no line-ending conversions.

But many people would like to use import() to read binary data.

I guess one could extend the language specification to solve this:

//load, convert line endings, check for valid UTF-8
char[] import_text(char[] filename);

//return unchanged file contents as byte array
ubyte[] import_binary(char[] filename);

On the other hand, both could be implemented as compile-time functions using the current import().
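As a rough sketch of the compile-time route (not code from the thread), line-ending conversion could look like the function below. The name `normalizeEol` is hypothetical, the code uses D2 syntax (`string`) rather than the D1 `char[]` of the declarations above, and it does not check for valid UTF-8.

```d
// Hypothetical helper, not part of the language or the standard library.
// Converts "\r\n" and lone "\r" to "\n"; usable at run time and in CTFE.
string normalizeEol(string text)
{
    string result;
    for (size_t i = 0; i < text.length; ++i)
    {
        if (text[i] == '\r')
        {
            // Skip the '\n' of a "\r\n" pair so it is not emitted twice.
            if (i + 1 < text.length && text[i + 1] == '\n')
                ++i;
            result ~= '\n';
        }
        else
        {
            result ~= text[i];
        }
    }
    return result;
}

// Initializing an enum forces evaluation at compile time.
enum normalized = normalizeEol("a\r\nb\rc\n");
static assert(normalized == "a\nb\nc\n");

// With a real file (hypothetical name, needs the -J string import path):
// enum text = normalizeEol(import("data.txt"));
```

The same function would wrap import() directly, so only text that is meant to be text gets converted and binary data can still be loaded unchanged.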

On Tue, Feb 17, 2009 at 10:02 AM, grauzone <none@example.net> wrote:
>
> But many people would like to use import() to read binary data.

Oh, I'm not saying import() is in the wrong here :) just that that's where his mixed line endings are coming from.

> I guess one could extend the language specification to solve this:
>
> //load, convert line endings, check for valid UTF-8
> char[] import_text(char[] filename);
>
> //return unchanged file contents as byte array
> ubyte[] import_binary(char[] filename);
>
> On the other hand, both could be implemented as compile-time functions using
> the current import().

I suppose, as long as CTFE were made a bit more efficient. Can you imagine doing line-end conversions on a 20k line text file at compile time? The compiler would probably explode.
