April 23, 2012 Escaping control in formatting

I had never used the excellent new range formatting syntax by Kenji Hara until now, and I ran into difficulties, because "%(%(%c%), %)" is the most common format for a string array for me, and it is neither obvious nor elegant. It turns out that "%c" disables character escaping. What the hell? Why? Not obvious at all.
So I think it would be good to add an 'Escaping' part after 'Precision' in the format specification:
Escaping:
empty
!-
!+
!'
!"
!?'
!?"
!?!
Escaping affects formatting depending on the specifier as follows.
Escaping Semantics
!- disable escaping, for a range it also disables [,]
!+ enable escaping using single quotes for chars and double quotes for strings
!' enable escaping using single quotes
!" enable escaping using double quotes
!?' like !' but without adding the quotes and [,] for a range
!?" like !" but without adding the quotes and [,] for a range
!?! enable escaping, both single and double quotes will be escaped without adding any quotes and [,] for a range
Escaping is enabled by default only for associative arrays, ranges (not strings), user-defined types, and all of their sub-elements.
I'd like to remove "%c"'s ability to magically disable escaping, which looks possible as long as that behavior remains undocumented.
Look at the example:
---
import std.stdio;
void main() {
writeln(" char");
char c = '\'';
writefln("unescaped: %s." , c );
writefln(`escaped+': %(%).`, [ c ]); // proposal: %!+s or %!'s
writefln(`escaped+": %(%).`, [[c]]); // proposal: %!"s
writeln (` escaped: \t.`); // proposal: %!?'s
writeln();
writeln(" string");
string s = "a\tb";
writefln("unescaped: %s." , s );
writefln(`escaped+": %(%).`, [s]); // proposal: %!+s or %!"s
writeln (` escaped: a\tb.`); // proposal: %!?"s
writeln();
writeln(" strings");
string[] ss = ["a\tb", "cd"];
writefln("unescaped: %(%(%c%)%).", ss); // proposal: %!-s
writefln(`escaped+": %(%).` , ss);
writeln (` escaped: a\tbcd.`); // proposal: %!?"s
}
---
If it is accepted, I volunteer to try to implement it. If not, escaping should at least be documented (and don't forget about "%c"'s magic!).
Any thoughts?
P.S.
If it has already been discussed, please give me a link.
--
Денис В. Шеломовский
Denis V. Shelomovskij

April 23, 2012 Re: Escaping control in formatting
Posted in reply to Denis Shelomovskij

On 23.04.2012 16:36, Denis Shelomovskij wrote:
> I had never used the excellent new range formatting syntax by Kenji Hara
> until now, and I ran into difficulties, because "%(%(%c%), %)" is the
> most common format for a string array for me, and it is neither obvious nor
> elegant. It turns out that "%c" disables character escaping. What the hell?
> Why? Not obvious at all.

Does %(%s, %) not work?

[snip]

--
Dmitry Olshansky
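
For readers following along, the formats mentioned so far can be compared side by side. This is only a minimal sketch, assuming a current Phobos: the helper name `renderings` is made up for illustration, and the `%-(` form in the last entry comes from today's std.format documentation and may not have been available when this thread was written.

```d
import std.format : format;

// Hypothetical helper: returns four renderings of the same string array,
// in order: plain %s, %(%s, %), %(%(%c%), %), and %-(%s, %).
string[] renderings(string[] ss)
{
    return [
        format("%s", ss),            // brackets, quotes, escapes
        format("%(%s, %)", ss),      // no brackets, still quoted and escaped
        format("%(%(%c%), %)", ss),  // the %c workaround: raw characters
        format("%-(%s, %)", ss),     // '-' flag: elements written unescaped
    ];
}

void main()
{
    auto r = renderings(["a\tb", "cd"]);
    assert(r[0] == `["a\tb", "cd"]`); // backslash-t, not a real tab
    assert(r[1] == `"a\tb", "cd"`);
    assert(r[2] == "a\tb, cd");       // real tab character
    assert(r[3] == "a\tb, cd");
}
```

So `%(%s, %)` does work, but it keeps the quotes and escapes, which is the behavior the original post is objecting to.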

April 23, 2012 Re: Escaping control in formatting
Posted in reply to Denis Shelomovskij

On April 23, 2012 at 21:36, Denis Shelomovskij <verylonglogin.reg@gmail.com> wrote:
> [snip]
> If it is accepted, I volunteer to try to implement it. If not, escaping should at least be documented (and don't forget about "%c"'s magic!).
>
> Any thoughts?

Please give us use cases. I cannot imagine why you would want to change or remove the quotation marks but keep the contents escaped.

> P.S.
> If it has already been discussed, please give me a link.

As far as I know, it has not been discussed yet.

Kenji Hara

April 23, 2012 Re: Escaping control in formatting
Posted in reply to kenji hara

On 23.04.2012 18:54, kenji hara wrote:
> Please give us use cases. I cannot imagine why you would want to
> change or remove the quotation marks but keep the contents escaped.

Sorry, I should have mentioned that !' and !" are optional and would not be commonly used, and all the !?* variants are very optional and are here just for completeness (IMHO).

An example is generating a complicated string for C/C++:
---
myCppFile.writefln(`tmp = "%!?"s, and %!?"s, and even %!?"s";`,
    str1, str2, str3);
---

--
Денис В. Шеломовский
Denis V. Shelomovskij
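
The C/C++ use case above can be approximated with the existing specifiers. The following is a rough sketch under stated assumptions, not a proposal from the thread: `escapeNoQuotes` is a hypothetical helper (it is not part of Phobos), and it relies on std.format quoting and escaping a string element inside `%(...%)`. D's escape rules are not identical to C's, so the result is only suitable for simple content.

```d
import std.format : format;

// Hypothetical helper (not part of Phobos): escape `s` the way std.format
// escapes a string element, then drop the surrounding double quotes.
string escapeNoQuotes(string s)
{
    // "%(%s%)" over a one-element array yields the element quoted and escaped.
    string quoted = format("%(%s%)", [s]);
    return quoted[1 .. $ - 1];
}

void main()
{
    string line = format(`tmp = "%s, and %s";`,
        escapeNoQuotes("a\tb"), escapeNoQuotes(`say "hi"`));
    // The tab becomes backslash-t and the inner quotes gain a backslash.
    assert(line == `tmp = "a\tb, and say \"hi\"";`);
}
```

This covers roughly what the proposed `%!?"s` would do for a single string, at the cost of an extra allocation per argument.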

April 23, 2012 Re: Escaping control in formatting
Posted in reply to Denis Shelomovskij

On April 24, 2012 at 1:14, Denis Shelomovskij <verylonglogin.reg@gmail.com> wrote:
> [snip]
> An example is generating a complicated string for C/C++:
> ---
> myCppFile.writefln(`tmp = "%!?"s, and %!?"s, and even %!?"s";`,
>     str1, str2, str3);
> ---

While improving the std.format module, I settled on a design principle: if you format a value with a format specifier, you should be able to unformat the output with the same format specifier.

Example:

    import std.format, std.array;

    void main()
    {
        auto aa = [1:"hello", 2:"world"];
        auto writer = appender!string();
        formattedWrite(writer, "%s", aa);

        aa = null;

        auto output = writer.data;
        formattedRead(output, "%s", &aa); // same format specifier
        assert(aa == [1:"hello", 2:"world"]);
    }

More details:
https://github.com/D-Programming-Language/phobos/blob/master/std/format.d#L3264

I call this "reflective formatting", and it supports simple text-based serialization and deserialization. Automatic quotation and escaping of nested elements is necessary for this feature.

But your proposal would break this design very easily, making it impossible to unformat the output reflectively.

For these reasons, your suggestion is hard to accept.

Kenji Hara
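
To make the reflectivity concern concrete, here is a small sketch of how unquoted output can become ambiguous. It assumes the `%-(` flag of current Phobos as a stand-in for the proposed "disable escaping" behavior; the arrays are made-up sample data.

```d
import std.format : format;

void main()
{
    string[] two = ["a, b", "c"];
    string[] three = ["a", "b", "c"];

    // Without quoting, different arrays can produce identical text,
    // so the output cannot be read back unambiguously...
    assert(format("%-(%s, %)", two) == format("%-(%s, %)", three));

    // ...while the default, escaped form keeps them distinguishable.
    assert(format("%s", two) == `["a, b", "c"]`);
    assert(format("%s", three) == `["a", "b", "c"]`);
}
```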

April 23, 2012 Re: Escaping control in formatting
Posted in reply to kenji hara

On 23.04.2012 21:15, kenji hara wrote:
> [snip]
> I call this "reflective formatting", and it supports simple text-based
> serialization and deserialization. Automatic quotation and escaping of
> nested elements is necessary for this feature.
>
> But your proposal would break this design very easily, making it
> impossible to unformat the output reflectively.
>
> For these reasons, your suggestion is hard to accept.

Is there some misunderstanding?

Reflective formatting is good! But it isn't always what you want. It is needed mostly for debugging. Debugging is only one of the two uses of formatting; the other is simply writing something somewhere.

There are already some non-reflective constructs (like "%(%(%c%), %)" for a range and "X%sY%sZ" for strings), and I am just proposing more convenient ones, because every second time I use formatting, I use it for writing (I mean, not for debugging).

--
Денис В. Шеломовский
Denis V. Shelomovskij

April 23, 2012 Re: Escaping control in formatting
Posted in reply to Denis Shelomovskij

On 23.04.2012 21:49, Denis Shelomovskij wrote:
> [snip]
> There are already some non-reflective constructs (like "%(%(%c%), %)"
> for a range and "X%sY%sZ" for strings), and I am just proposing more
> convenient ones, because every second time I use formatting, I use it
> for writing (I mean, not for debugging).

I completely forgot: %!+s in my proposal is exactly for reflective formatting (e.g. "X%!+sY%!+sZ" is reflective for strings).

--
Денис В. Шеломовский
Denis V. Shelomovskij

April 24, 2012 Re: Escaping control in formatting
Posted in reply to Denis Shelomovskij

On April 24, 2012 at 2:49, Denis Shelomovskij <verylonglogin.reg@gmail.com> wrote:
> [snip]
> There are already some non-reflective constructs (like "%(%(%c%), %)" for a
> range and "X%sY%sZ" for strings), and I am just proposing more convenient
> ones, because every second time I use formatting, I use it for writing
> (I mean, not for debugging).

My concern is that the proposal is too complicated and not useful enough for general use cases.
You can emulate such formatting as follows:

    import std.array, std.format, std.stdio;
    import std.range, std.uni;

    void main()
    {
        auto strs = ["It's", "\"world\""];
        {
            // emulation of !?"
            auto w = appender!string();
            foreach (s; strs)
                formatStrWithEscape(w, s, '"');
            writeln(w.data);
        }
        {
            // emulation of !?'
            auto w = appender!string();
            foreach (s; strs)
                formatStrWithEscape(w, s, '\'');
            writeln(w.data);
        }
    }

    void formatStrWithEscape(W)(W writer, string str, char quote)
    {
        writer.put(quote);
        foreach (dchar c; str)
            formatChar(writer, c, quote);
        writer.put(quote);
    }

    // copy from std.format
    void formatChar(Writer)(Writer w, in dchar c, in char quote)
    {
        if (std.uni.isGraphical(c))
        {
            if (c == quote || c == '\\')
                put(w, '\\'), put(w, c);
            else
                put(w, c);
        }
        else if (c <= 0xFF)
        {
            put(w, '\\');
            switch (c)
            {
            case '\a': put(w, 'a'); break;
            case '\b': put(w, 'b'); break;
            case '\f': put(w, 'f'); break;
            case '\n': put(w, 'n'); break;
            case '\r': put(w, 'r'); break;
            case '\t': put(w, 't'); break;
            case '\v': put(w, 'v'); break;
            default:
                formattedWrite(w, "x%02X", cast(uint)c);
            }
        }
        else if (c <= 0xFFFF)
            formattedWrite(w, "\\u%04X", cast(uint)c);
        else
            formattedWrite(w, "\\U%08X", cast(uint)c);
    }

I can agree to making private functions in std.format, e.g. formatChar, public but undocumented, but I cannot agree to adding such a complicated rule to the supported format specifiers.

Kenji Hara

April 24, 2012 Re: Escaping control in formatting
Posted in reply to kenji hara

On Tuesday, 24 April 2012 at 04:55:34 UTC, kenji hara wrote:
> My concern is that the proposal is too complicated and not useful enough
> for general use cases.
> You can emulate such formatting as follows:
IMHO, adding just %!+s and %!-s and removing %c's magic would only simplify formatting for the user. It was hard (for me) to understand the current escaping rules because they are undocumented and look inconsistent (to me): escaping is part of formatting, yet the user cannot control it except through the magical %c.
I agree that !', !", and the !?* variants would not be commonly used, as I have already written. Personally, I don't need them at all.
But this is a common pattern for me: `xformat("My pets: %(%!-s, %)", petsAsStrings)`. And "My pets: %(%(%c%), %)" is so complicated, inconsistent, and non-general (it will not work if I give it pets as an int[], for example) that I never use it. I use `.joiner(", ")` instead, and every time I do, I think that something is really wrong with array formatting in Phobos.
--
Денис В. Шеломовский
Denis V. Shelomovskij
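
As a closing illustration of the "My pets" pattern, the sketch below compares the options discussed in this post. It is written under assumptions rather than taken from the thread: it uses `std.format.format` instead of the `xformat` call named above, the sample data is made up, and the `%-(` flag shown is the one documented in current Phobos, which behaves much like the proposed `%!-s` for this case.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : joiner;
import std.format : format;

void main()
{
    string[] pets = ["cat", "dog"];
    int[] ids = [1, 2, 3];

    // '-' flag on the compound specifier: elements are written unescaped.
    assert(format("My pets: %-(%s, %)", pets) == "My pets: cat, dog");

    // The same format string also works for non-string elements.
    assert(format("My pets: %-(%s, %)", ids) == "My pets: 1, 2, 3");

    // The %c workaround gives the same text, but only for ranges of strings.
    assert(format("My pets: %(%(%c%), %)", pets) == "My pets: cat, dog");

    // The joiner-based alternative mentioned in the post.
    assert(equal(pets.joiner(", "), "cat, dog"));
}
```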