April 23, 2012
Escaping control in formatting
I had never used the excellent new range formatting syntax by Kenji Hara
until now, and I ran into difficulties: "%(%(%c%), %)" is the format I
need most often for a string array, and it is neither obvious nor
elegant. It turns out that "%c" disables character escaping. What the
hell? Why? Not obvious at all.
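
To make the complaint concrete, here is a minimal sketch of the two formats side by side. It assumes a recent Phobos where `format` lives in std.format, and the array contents are made up for illustration:

```d
import std.format : format;

void main()
{
    string[] names = ["cat", "d\tog"]; // illustrative data

    // Inside %( ... %), %s quotes and escapes every string element.
    assert(format("%(%s, %)", names) == `"cat", "d\tog"`);

    // Treating each string as a range of chars and printing them with %c
    // yields the raw text: no quotes, and the tab stays a real tab.
    assert(format("%(%(%c%), %)", names) == "cat, d\tog");
}
```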

So I think it would be good to add an 'Escaping' part after 'Precision'
in the format specification grammar:

Escaping:
  empty
  !-
  !+
  !'
  !"
  !?'
  !?"
  !?!

Escaping affects formatting depending on the specifier, as follows.

Escaping    Semantics
  !-      disable escaping; for a range it also disables [,]
  !+      enable escaping using single quotes for chars and double
          quotes for strings
  !'      enable escaping using single quotes
  !"      enable escaping using double quotes
  !?'     like !' but without adding the quotes and [,] for a range
  !?"     like !" but without adding the quotes and [,] for a range
  !?!     enable escaping; both single and double quotes will be
          escaped, without adding any quotes and [,] for a range

Escaping is enabled by default only for associative arrays, ranges (not
strings), user-defined types, and all their sub-elements.
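
As far as I can tell, this default matches what std.format already does. A small sketch, assuming a recent Phobos (the values are arbitrary):

```d
import std.format : format;

void main()
{
    // A top-level string is written as-is.
    assert(format("%s", "a\tb") == "a\tb");

    // The same string as an array element is quoted and escaped.
    assert(format("%s", ["a\tb"]) == `["a\tb"]`);

    // Values (and keys) of an associative array are escaped as well.
    assert(format("%s", [1: "a\tb"]) == `[1:"a\tb"]`);
}
```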

I'd like to remove "%c"'s ability to magically disable escaping, and
that still looks possible as long as the behaviour is not documented.

Look at the example:
---
import std.stdio;

void main() {
    writeln("    char");
    char c = '\'';
    writefln("unescaped: %s."  ,   c  );
    writefln(`escaped+': %(%s%).`, [ c ]); // proposal: %!+s or %!'s
    writefln(`escaped+": %(%s%).`, [[c]]); // proposal: %!"s
    writeln (`  escaped: \'.`);            // proposal: %!?'s
    writeln();
    writeln("    string");
    string s = "a\tb";
    writefln("unescaped: %s."  ,  s );
    writefln(`escaped+": %(%s%).`, [s]); // proposal: %!+s or %!"s
    writeln (`  escaped: a\tb.`);        // proposal: %!?"s
    writeln();
    writeln("    strings");
    string[] ss = ["a\tb", "cd"];
    writefln("unescaped: %(%(%c%)%).", ss); // proposal: %!-s
    writefln(`escaped+": %(%s%).`    , ss);
    writeln (`  escaped: a\tbcd.`);         // proposal: %!?"s
}
---

If this is accepted, I volunteer to try implementing it. If not,
escaping should at least be documented (and don't forget about "%c"'s
magic!).

Any thoughts?

P.S.
If it has already been discussed, please give me a link.

-- 
Денис В. Шеломовский
Denis V. Shelomovskij
April 23, 2012
Re: Escaping control in formatting
On 23.04.2012 16:36, Denis Shelomovskij wrote:
> I had never used the excellent new range formatting syntax by Kenji Hara
> until now, and I ran into difficulties: "%(%(%c%), %)" is the format I
> need most often for a string array, and it is neither obvious nor
> elegant. It turns out that "%c" disables character escaping. What the
> hell? Why? Not obvious at all.

Does %(%s, %)  not work?

[snip]

-- 
Dmitry Olshansky
April 23, 2012
Re: Escaping control in formatting
On 23 April 2012 at 21:36, Denis Shelomovskij <verylonglogin.reg@gmail.com> wrote:
> Any thoughts?

Please give us use cases.
I cannot imagine why you want to change/remove quotations but keep
escaped contents.

> P.S.
> If it has already been discussed, please give me a link.

As far as I know, this has not been discussed yet.

Kenji Hara
April 23, 2012
Re: Escaping control in formatting
On 23.04.2012 18:54, kenji hara wrote:
> Please give us use cases. I cannot imagine why you want to
> change/remove quotations but keep escaped contents.

Sorry, I should have mentioned that !' and !" are optional and won't be
commonly used, and all the !?* forms are very optional and are here just
for completeness (IMHO).

An example is generating a complicated string for C/C++:
---
// proposed syntax, not accepted by the current std.format
myCppFile.writefln(`tmp = "%!?"s, and %!?"s, and even %!?"s";`,
                   str1, str2, str3);
---
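
For comparison, here is a rough sketch of how the same effect can be approximated today. `escapeNoQuotes` is a hypothetical helper (not part of Phobos) that reuses the element escaping of "%(%s%)" and strips the surrounding quotes. It assumes a recent Phobos, and that D's escape sequences are acceptable to the C/C++ compiler, which holds for common cases like the ones below but not necessarily for every character:

```d
import std.format : format;

// Hypothetical helper, not part of Phobos: escape `s` the way std.format
// escapes a string element, then strip the surrounding double quotes.
string escapeNoQuotes(string s)
{
    auto quoted = format("%(%s%)", [s]); // the escaped text, quotes included
    return quoted[1 .. $ - 1];
}

void main()
{
    auto line = format(`tmp = "%s, and %s";`,
                       escapeNoQuotes("a\tb"), escapeNoQuotes(`say "hi"`));
    assert(line == `tmp = "a\tb, and say \"hi\"";`);
}
```

This works, but it needs a temporary string per argument, which is exactly the kind of boilerplate !?" is meant to remove.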

-- 
Денис В. Шеломовский
Denis V. Shelomovskij
April 23, 2012
Re: Escaping control in formatting
On 24 April 2012 at 1:14, Denis Shelomovskij <verylonglogin.reg@gmail.com> wrote:
> On 23.04.2012 18:54, kenji hara wrote:
>
>> Please give us use cases. I cannot imagine why you want to
>> change/remove quotations but keep escaped contents.
>
>
> Sorry, I should mention that !' and !" are optional and aren't commonly
> used, and all !?* are very optional and are here just for completeness
> (IMHO).
>
> An example is generating a complicated string for C/C++:
> ---
> myCppFile.writefln(`tmp = "%!?"s, and %!?"s, and even %!?"s";`,
>                   str1, str2, str3)
> ---

While improving the std.format module, I settled on a design principle:
if you format a value with some format specifier, you should be able to
unformat the output with the same format specifier.

Example:
   import std.format, std.array;

   void main()
   {
       auto aa = [1:"hello", 2:"world"];
       auto writer = appender!string();
       formattedWrite(writer, "%s", aa);  // serialize to text

       aa = null;

       auto output = writer.data;
       formattedRead(output, "%s", &aa);  // same format specifier
       assert(aa == [1:"hello", 2:"world"]);
   }

More details:
   https://github.com/D-Programming-Language/phobos/blob/master/std/format.d#L3264

I call this "reflective formatting", and it supports simple text-based
serialization and de-serialization.
Automatic quoting/escaping of nested elements is necessary for this feature.

But your proposal would break this design very easily, and it would be
impossible to unformat the outputs reflectively.

For these reasons, your suggestion is hard to accept.

Kenji Hara
April 23, 2012
Re: Escaping control in formatting
On 23.04.2012 21:15, kenji hara wrote:
> But your proposal would break this design very easily, and it would be
> impossible to unformat the outputs reflectively.
>
> For these reasons, your suggestion is hard to accept.

Is there some misunderstanding?

Reflective formatting is good! But it isn't always what you want. It is
needed mostly for debugging. But debugging is only one of the two uses
of formatting; the other is just writing something somewhere.

There are already some non-reflective constructs (like "%(%(%c%), %)"
for a range and "X%sY%sZ" for strings), and I just propose adding more
convenient ones, because every second time I use formatting I use it for
writing (I mean not for debugging).

-- 
Денис В. Шеломовский
Denis V. Shelomovskij
April 23, 2012
Re: Escaping control in formatting
On 23.04.2012 21:49, Denis Shelomovskij wrote:
> There are already some non-reflective constructs (like "%(%(%c%), %)"
> for a range and "X%sY%sZ" for strings) and I just propose adding more
> comfortable ones because every second time I use formatting I use it for
> writing (I mean not for debugging).
>

I completely forgot: %!+s in my proposal is exactly for reflective
formatting (e.g. "X%!+sY%!+sZ" is reflective for strings).
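
A small sketch of why plain "X%sY%sZ" is not reflective, and of the one-element-array workaround that gets quoted output with the current syntax (assuming a recent Phobos; the strings are arbitrary):

```d
import std.format : format;

void main()
{
    // Plain %s: two different argument lists give the same text,
    // so the output cannot be parsed back unambiguously.
    assert(format("X%sY%sZ", "aYb", "c") == "XaYbYcZ");
    assert(format("X%sY%sZ", "a", "bYc") == "XaYbYcZ");

    // Workaround: wrap each string in a one-element array so that it is
    // formatted as a quoted, escaped element.
    assert(format("X%(%s%)Y%(%s%)Z", ["aYb"], ["c"]) == `X"aYb"Y"c"Z`);
}
```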

-- 
Денис В. Шеломовский
Denis V. Shelomovskij
April 24, 2012
Re: Escaping control in formatting
On 24 April 2012 at 2:49, Denis Shelomovskij <verylonglogin.reg@gmail.com> wrote:
> There are already some non-reflective constructs (like "%(%(%c%), %)" for a
> range and "X%sY%sZ" for strings) and I just propose adding more comfortable
> ones because every second time I use formatting I use it for writing (I mean
> not for debugging).

My concern is that the proposal is too complicated and not very useful
for general use cases.
You can emulate such formatting as follows:

import std.array, std.format, std.stdio;
import std.range, std.uni;
void main()
{
   auto strs = ["It's", "\"world\""];
   {
       // emulation of !?"
       auto w = appender!string();
       foreach (s; strs)
           formatStrWithEscape(w, s, '"');
       writeln(w.data);
   }
   {
       // emulation of !?'
       auto w = appender!string();
       foreach (s; strs)
           formatStrWithEscape(w, s, '\'');
       writeln(w.data);
   }
}
void formatStrWithEscape(W)(W writer, string str, char quote)
{
   writer.put(quote);
   foreach (dchar c; str)
       formatChar(writer, c, quote);
   writer.put(quote);
}
// copy from std.format
void formatChar(Writer)(Writer w, in dchar c, in char quote)
{
   if (std.uni.isGraphical(c))
   {
       if (c == quote || c == '\\')
           put(w, '\\'), put(w, c);
       else
           put(w, c);
   }
   else if (c <= 0xFF)
   {
       put(w, '\\');
       switch (c)
       {
       case '\a':  put(w, 'a');  break;
       case '\b':  put(w, 'b');  break;
       case '\f':  put(w, 'f');  break;
       case '\n':  put(w, 'n');  break;
       case '\r':  put(w, 'r');  break;
       case '\t':  put(w, 't');  break;
       case '\v':  put(w, 'v');  break;
       default:
           formattedWrite(w, "x%02X", cast(uint)c);
       }
   }
   else if (c <= 0xFFFF)
       formattedWrite(w, "\\u%04X", cast(uint)c);
   else
       formattedWrite(w, "\\U%08X", cast(uint)c);
}

I can agree to making private functions in std.format, e.g. formatChar,
public but undocumented, but I cannot agree to adding such a complicated
rule to the supported format specifiers.

Kenji Hara
April 24, 2012
Re: Escaping control in formatting
On Tuesday, 24 April 2012 at 04:55:34 UTC, kenji hara wrote:
> My concern is that the proposal is much complicated and less 
> useful
> for general use cases.
> You can emulate such formatting like follows:

IMHO adding %!+s and %!-s alone and removing %c's magic would
only simplify formatting for the user. It was hard (for me) to
understand the current escaping rules because they are
undocumented, and they look dissonant (to me): escaping is a
part of formatting, yet the user cannot control it except
through the magical %c.

I agree that !', !", and !?* won't be commonly used, as I have
already written. Personally I don't need them at all.

But this is a common pattern for me:
`xformat("My pets: %(%!-s, %)", petsAsStrings)`.
And "My pets: %(%(%c%), %)" is so complicated, dissonant and
non-general (it will not work if I give it pets as an int[], for
example) that I never use it. I use `.joiner(", ")` instead, and
every time I do I think that something is really wrong with
array formatting in Phobos.
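
For anyone finding this thread later: if I read the current std.format documentation correctly, Phobos now accepts a '-' flag on the compound specifier, "%-(", which turns off the automatic quoting and escaping of elements. This is not the %!-s syntax proposed in this thread, but it covers the same pattern. A minimal sketch, assuming a recent Phobos (the data is made up):

```d
import std.format : format;

void main()
{
    string[] pets = ["cat", "dog"]; // illustrative data

    // Default: elements are quoted and escaped.
    assert(format("My pets: %(%s, %)", pets) == `My pets: "cat", "dog"`);

    // With the '-' flag on %( the elements are written as-is.
    assert(format("My pets: %-(%s, %)", pets) == "My pets: cat, dog");

    // The same format string also works for non-string elements.
    assert(format("Ids: %-(%s, %)", [1, 2, 3]) == "Ids: 1, 2, 3");
}
```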

--
Денис В. Шеломовский
Denis V. Shelomovskij