July 31, 2003 Re: Cataclysmic decision re: String literals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Sean L. Palmer | Sean L. Palmer says...
>
>r["string"] conflicts with syntax for associative arrays.
>
>To be honest, I really don't give a damn about raw strings. If you want a string in D, run it through a teeny program that escapes it properly and paste it in.
You are right about the square brackets. They just came off the top of my head.
Raw strings help in specialized niches. Embedded work is one. C offers no way
to declare a readable block of hexadecimal digits larger than one integer word.
You either have lots of commas, or lots of backslashes, or cryptic gibberish,
depending how you go about it. Variable-width spacing of the hex is hard if not
impossible, though often desirable in embedded work, e.g.
x"04EAC AB CD FAF 1234FFFFDDEE".
Not only have I written the translator scripts you suggest (only too many times), but I have programmed dynamic regex construction for communications protocols. That means the program creates and uses regular expressions which are unknowable at compile time. These expressions are heavy with escape characters. Writing and debugging such code is hard without raw strings.
Mark
P.S. Walter, I don't know what you mean about Unicode, but consider
utf8"string" vs. utf16"string" vs. utf32"string"...just thinking out loud.
|
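Mark's point about escape-heavy, dynamically built regular expressions can be illustrated outside D. What follows is only a sketch in Python, picked because it already has an r"..." prefix of the kind proposed in this thread. The function name `build_field_pattern` and the sample protocol frame are hypothetical and do not come from any post here.

```python
import re

# The same pattern spelled twice: with ordinary escapes, then as a raw string.
escaped = "\\d+\\.\\d+"
raw = r"\d+\.\d+"
same = (escaped == raw)  # the two spellings denote the identical string


def build_field_pattern(tag):
    """Build, at run time, a pattern matching '<tag>=<decimal>;'.

    The frame layout is a made-up example of a communications protocol field.
    """
    # re.escape protects regex metacharacters in the caller-supplied tag,
    # so the only hand-written escapes live in the raw string.
    return re.compile(re.escape(tag) + r"=(\d+\.\d+);")


pattern = build_field_pattern("T[1]")
match = pattern.search("HDR;T[1]=23.50;CRC")
value = match.group(1) if match else None
```

Without the raw prefix, every backslash in the pattern would have to be doubled, which is the readability cost being argued about above.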
July 31, 2003 Re: Cataclysmic decision re: String literals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Dario | Dario says...
>
>Mark Evans:
>>Different string types concatenate, too:
>>myVar = x"0123" r"string"; --> myVar = '\0\1\2\3string';
>
>What should x"0123" be? A byte array like [0x01, 0x23] or
>like [0x0, 0x1, 0x2, 0x3]?
>This seems strange to me. It's not that intuitive.
>-Dario
Sorry that my memory for C escape syntax is getting rusty.
x"0123"
would be a 16-bit data chunk with C equivalents
'\x01\x23'
{0x01,0x23}
|
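The two readings Dario asks about, and the one Mark settles on, can be compared with a small sketch. It is written in Python rather than D, and both helper names are hypothetical. It assumes that whitespace inside the literal is ignored and that digits are paired only after the whitespace is removed, which is one plausible reading of the variable-width spacing example given earlier in the thread.

```python
def hex_literal_to_bytes(text):
    """Mark's reading: two hex digits per byte, whitespace ignored."""
    digits = "".join(text.split())  # drop all whitespace between groups
    if len(digits) % 2 != 0:
        raise ValueError("odd number of hex digits")
    return bytes.fromhex(digits)


def hex_literal_to_nibbles(text):
    """The alternative reading: one byte per hex digit."""
    return bytes(int(d, 16) for d in "".join(text.split()))


paired = hex_literal_to_bytes("0123")        # two bytes: 0x01, 0x23
per_digit = hex_literal_to_nibbles("0123")   # four bytes: 0, 1, 2, 3
spaced = hex_literal_to_bytes("04EAC AB CD FAF 1234FFFFDDEE")
```

Under the paired reading the irregularly spaced example packs into 12 bytes, since its 24 digits are grouped only after the spaces are dropped.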
July 31, 2003 Re: Cataclysmic decision re: String literals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mark Evans | "Mark Evans" <Mark_member@pathlink.com> wrote in message news:bgbr92$1j3j$1@digitaldaemon.com...
> P.S. Walter I don't know what you mean about Unicode but consider
> utf8"string" vs. utf16"string" vs. utf32"string"...just thinking out loud.
1) Unicode source text will be accepted.
2) Use of \uXXXX and \UXXXXXXXX will be accepted in strings.
3) String literals can be converted at compile time between UTF-8, UTF-16
and UCS-32 all by doing the appropriate cast.
|
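To make the three points above concrete, here is a rough analogy in Python. It is not D and it does not show a compile-time cast; it only shows one literal written with both escape forms and then re-encoded at run time into the three encodings mentioned (with UTF-32 standing in for what the post calls UCS-32). The variable names are illustrative.

```python
# One literal with a \uXXXX escape and a \UXXXXXXXX escape.
# U+00E9 is "e with acute"; U+1D11E is the musical G clef (outside the BMP).
text = "\u00e9\U0001D11E"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")  # explicit byte order, so no BOM is emitted
utf32 = text.encode("utf-32-le")

# Number of code units each encoding needs for the same two characters.
code_units = {
    "utf-8": len(utf8),         # 8-bit units
    "utf-16": len(utf16) // 2,  # 16-bit units
    "utf-32": len(utf32) // 4,  # 32-bit units
}
```

The same two characters take six, three and two code units respectively, which is why the target type matters when a literal is converted.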
July 31, 2003 Re: Cataclysmic decision re: String literals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter | Walter wrote:
> 1) Unicode source text will be accepted.
> 2) Use of \uXXXX and \UXXXXXXXX will be accepted in strings.
> 3) String literals can be converted at compile time between UTF-8, UTF-16
> and UCS-32 all by doing the appropriate cast.
Then why not convert single-character-strings into character literals using a cast?
-i.
|
July 31, 2003 Re: Cataclysmic decision re: String literals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ilya Minkov | Overloadability
"Ilya Minkov" <midiclub@8ung.at> wrote in message news:bgc8ut$210h$1@digitaldaemon.com...
> Walter wrote:
> > 1) Unicode source text will be accepted.
> > 2) Use of \uXXXX and \UXXXXXXXX will be accepted in strings.
> > 3) String literals can be converted at compile time between UTF-8, UTF-16
> > and UCS-32 all by doing the appropriate cast.
>
> Then why not convert single-character-strings into character literals using a cast?
>
> -i.
>
|
August 01, 2003 Re: Cataclysmic decision re: String literals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ilya Minkov | "Ilya Minkov" <midiclub@8ung.at> wrote in message news:bgc8ut$210h$1@digitaldaemon.com...
> Walter wrote:
> > 1) Unicode source text will be accepted.
> > 2) Use of \uXXXX and \UXXXXXXXX will be accepted in strings.
> > 3) String literals can be converted at compile time between UTF-8, UTF-16
> > and UCS-32 all by doing the appropriate cast.
>
> Then why not convert single-character-strings into character literals using a cast?
Because it's too much typing: cast(char)"a" for a list of them.
|
August 01, 2003 Re: Cataclysmic decision re: String literals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mark Evans | Perhaps I was a bit too hasty. On reflection, it would be nice to be able to declare large binary or hex arrays without so many extra 0x and commas. And I guess for paths and regex it'd be nice to disable escapes.
What if you put the "raw" signifier *into* the string instead? As an escape.
"string\n"
"\:Rraw string"
"\:Xdeadbeef"
"\:B01001110"
"\:" would be the escape trigger for string mode in the above (I chose backslash colon because I believe it to be unused currently, but it could be anything). This escape code doesn't emit any character at all, just changes string mode to raw, hex, or binary (or whatever).
Would that work? Seems more C compatible than the other alternatives.
Sean
"Mark Evans" <Mark_member@pathlink.com> wrote in message news:bgbr92$1j3j$1@digitaldaemon.com...
> Sean L. Palmer says...
> >
> >r["string"] conflicts with syntax for associative arrays.
> >
> >To be honest, I really don't give a damn about raw strings. If you want a
> >string in D, run it through a teeny program that escapes it properly and paste it in.
>
> You are right about the square brackets. They just came off the top of my head.
>
> Raw strings help in specialized niches. Embedded work is one. C offers no way
> to declare a readable block of hexadecimal digits larger than one integer word.
> You either have lots of commas, or lots of backslashes, or cryptic gibberish,
> depending how you go about it. Variable-width spacing of the hex is hard if not
> impossible, though often desirable in embedded work, e.g.
> x"04EAC AB CD FAF 1234FFFFDDEE".
>
> Not only have I written the translator scripts you suggest (only too many times),
> but I have programmed dynamic regex construction for communications
> protocols. That means the program creates and uses regular expressions which
> are unknowable at compile time. These expressions are heavy with escape
> characters. Writing and debugging such code is hard without raw strings.
>
> Mark
> P.S. Walter I don't know what you mean about Unicode but consider
> utf8"string" vs. utf16"string" vs. utf32"string"...just thinking out loud.
>
>
|
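Sean's in-string mode switch is easy to prototype, which helps show where it is underspecified. The sketch below is in Python, not D, and `decode_literal` is a hypothetical name. It makes several assumptions the post does not settle: a switch stays in effect until the next "\:", a "\:" is still recognized inside raw mode, the default mode supports only a handful of escapes, and hex or binary digits are packed into whole bytes with whitespace ignored.

```python
# Escapes honoured in the default (ordinary string) mode of this sketch.
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def decode_literal(body):
    # body is the text between the quotes, exactly as typed in the source.
    out = bytearray()
    mode = "S"   # S: ordinary string (default), R: raw, X: hex, B: binary
    digits = []  # digits collected for the current X or B run

    def flush():
        # Pack the pending digits into bytes when a run ends.
        run = "".join(digits)
        digits.clear()
        if mode == "X":
            out.extend(bytes.fromhex(run))  # ValueError on odd length
        elif mode == "B":
            if len(run) % 8:
                raise ValueError("binary run is not a whole number of bytes")
            out.extend(int(run[k:k + 8], 2) for k in range(0, len(run), 8))

    i = 0
    while i < len(body):
        ch = body[i]
        if body.startswith("\\:", i):
            # Mode switch: emits nothing, only changes how later text is read.
            flush()
            mode = body[i + 2:i + 3]
            if mode not in ("R", "X", "B"):
                raise ValueError("unknown mode switch")
            i += 3
        elif mode == "S" and ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt not in SIMPLE_ESCAPES:
                raise ValueError("unsupported escape")
            out.extend(SIMPLE_ESCAPES[nxt].encode("utf-8"))
            i += 2
        elif mode in ("X", "B"):
            if not ch.isspace():
                digits.append(ch)
            i += 1
        else:
            out.extend(ch.encode("utf-8"))
            i += 1
    flush()
    return bytes(out)


plain = decode_literal(r"string\n")
raw_mode = decode_literal(r"\:Rraw string")
hex_mode = decode_literal(r"\:Xdeadbeef")
bin_mode = decode_literal(r"\:B01001110")
mixed = decode_literal(r"\:Rc:\temp\:X0D0A")
```

One design consequence shows up immediately: because "\:" must stay special inside raw mode for switching to work, a raw segment cannot itself contain a backslash followed by a colon, so the scheme trades one escape problem for a smaller one.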
August 01, 2003 Re: OT was Re: Cataclysmic decision re: String literals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter | In article <bgblpp$1d3a$1@digitaldaemon.com>, Walter says...
>
>
>"Mark T" <Mark_member@pathlink.com> wrote in message news:bg9g0a$2b7v$1@digitaldaemon.com...
>> >...would have had me talking to God on the big white telephone (I'm learning Australian, and that means "puking"),
>>
>> I think that is "calling God on the big white phone"
>> I'm American and we used that expression back in the 1970's.
>
>Hmm. I always heard it as "praying to the porcelain gods."
I also did some of that, it helped to clear my head for doing programming homework in Algol, with all those BEGIN-END pairs
|
August 01, 2003 Re: OT was Re: Cataclysmic decision re: String literals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter | "Walter" <walter@digitalmars.com> wrote in message news:bgblpp$1d3a$1@digitaldaemon.com...
>
> Hmm. I always heard it as "praying to the porcelain gods."
>
My favorite was always "driving the big white bus."
Rich C.
|
August 07, 2003 Re: String literals | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mark Evans | "Mark Evans" <Mark_member@pathlink.com> wrote in message news:bg6rkb$2k4p$1@digitaldaemon.com...
>
> OK, Walter, here is my final answer. Do not use @ even though C# uses it. That
> was a bad choice on Microsoft's part. They really should have known better.
> Too many preprocessors and languages use @ for special purposes. Think SWIG,
> JavaDoc, etc.
>
> Use r"string"r or raw"string"raw. The advantage here is that you can later
> define new types of strings with a new letter (Unicode? a string of bits,
> b"101010101111111000000" or hexadecimal bit groups, x"ABCD12340000FFFF11111"?).
> So in a sense it's extensible and future-proof. This syntax is also reminiscent
> of C's numeric prefix and suffix notations, 0xABCD, 0b1010, 1.234L, etc.
>
> The numeric string concept is convenient for static pre-assignment of memory.
> The alternatives in C are not pretty: arrays of smaller things (comma, comma,
> comma, comma, another comma,...) or an unreadable string ("#$~H*G_#@jdkBG$*&").
> So this notation is extra candy on top for embedded programming work.
>
> The redundant closing letter is optional but recommended. It solves the
> meta-escape problem very cleanly. (Actually it's dumbed-down XML.) The b and x
> variants would not require closing letters, as their contents are intrinsically
> limited. Whitespace should be allowed in them of course, x"ABCD 1234 FF00".
>
> Mark
>
I came to a similar conclusion. The other suggested syntaxes (I haven't read all the replies) would make the transition from C to D much harder. Explaining that a character is put in front of the quote is easier than explaining that you need to use a particular symbol instead of the quote.
There may be some use for this syntax on arrays as well.
u{10,12,16}; //enforce unsigned.
r{10,12,16}; //read only.
o{10,12,16}; //ordered.
fi{c:\data.txt}; //file to import integer array from.
I can't think of anything *good* right now, but it's an option for later down the track.
|
Copyright © 1999-2021 by the D Language Foundation