March 06, 2007 Poll on improved format strings. | ||||
---|---|---|---|---|
| ||||
Attachments: | Hello,
I've made templated format strings, attached (this new version is improved thanks to Frits van Bommel), but I'm not sure about the syntax of the format string.
The idea is: printf format strings are interesting because they are powerful, but they suck because the %d, %s, etc. are in one part of the function call and the corresponding variables are in a different part (Tango has the same problem). writef improves this by allowing "... %d",var," ... %s",var2, but it's still not ideal: when gluing the various strings together, it's easy to forget a space or a comma and end up with poor output.
So my idea would be to have embedded expressions like this: "... %08d{var1+var2} ...", but it's not easy to come up with a good syntax and semantics, so I'd like some remarks:
- Should mixing printf formats and new-style format strings be allowed? (It is in the current implementation.)
This has the advantage of nearly keeping compatibility; the problem is the format string "..%d{...": in printf this means a number followed by '{', but with the new format it is an error.
It is possible to escape the '{' to allow this, i.e. to say that '%{' is '{', so '%d{' would need to be written '%d%{'. The drawback is that the embedding format '%{var}', which would be the shortest syntax, is then impossible.
Another possibility would be to say that if you want '... %d{ ...', you need to write it as '.... %d',var,'{ ....'; this would permit the '..%{var}...' embedding syntax.
- What to do with non-const char[]?
They cannot be parsed by the template, so one possibility is to allow only const char[] parameters; another is to allow non-const char[] and leave them alone (they may contain printf-style format strings). The latter is what the current implementation does, but I'm not sure whether the added flexibility is confusing: const char[] can contain both printf-like formats and the 'new embedded format', but non-const char[] can only contain printf-like format strings.
- Another possibility would be to use a different character, '#' (like in Ruby), for these new format strings.
I'd like some input to see if there is a majority in favour of one style or the other.
renoX
|
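The "%08d{var1+var2}" syntax proposed above can be prototyped at run time before committing to a template implementation. The following is only a minimal sketch, assuming a current D2 compiler with Phobos (which postdates this thread); `Piece`, `isSpecChar` and `splitEmbedded` are hypothetical names, not part of the attached code. It splits a format string into literal text and (spec, expression) pairs, treats a bare '%{expr}' as "%s", and reports an unterminated '{' as an error. Nested braces and a space flag in the spec are not handled.

```d
import std.ascii : isAlphaNum;
import std.exception : enforce;
import std.string : indexOf;

/// One piece of a parsed format string (hypothetical type).
struct Piece
{
    bool isExpr;  // true for "%spec{expr}", false for literal text
    string spec;  // e.g. "%08d"; "%s" when the spec is omitted
    string text;  // the literal text, or the source of the expression
}

/// Characters this sketch accepts between '%' and '{'.
bool isSpecChar(char c)
{
    return isAlphaNum(c) || c == '.' || c == '-' || c == '+' || c == '#';
}

/// Split "%spec{expr}" items out of fmt; everything else stays literal.
Piece[] splitEmbedded(string fmt)
{
    Piece[] pieces;
    size_t lit = 0; // start of the literal run not yet emitted
    size_t i = 0;
    while (i < fmt.length)
    {
        if (fmt[i] != '%') { ++i; continue; }
        // "%%" is left inside the literal text.
        if (i + 1 < fmt.length && fmt[i + 1] == '%') { i += 2; continue; }

        // Scan the spec; it only counts as embedded if '{' follows.
        size_t j = i + 1;
        while (j < fmt.length && isSpecChar(fmt[j])) ++j;
        if (j == fmt.length || fmt[j] != '{') { i = j; continue; }

        auto rest = fmt[j + 1 .. $];
        auto close = rest.indexOf('}');
        enforce(close >= 0, "unterminated '{' in format string");

        if (i > lit) pieces ~= Piece(false, "", fmt[lit .. i]);
        string spec = (j == i + 1) ? "%s" : fmt[i .. j];
        pieces ~= Piece(true, spec, rest[0 .. close]);
        i = j + 1 + close + 1; // continue after the closing '}'
        lit = i;
    }
    if (lit < fmt.length) pieces ~= Piece(false, "", fmt[lit .. $]);
    return pieces;
}

void main()
{
    import std.stdio : writeln;
    foreach (p; splitEmbedded("sum=%08d{var1+var2} next=%{i+1}"))
        writeln(p.isExpr ? "expr " : "text ", p.spec, " [", p.text, "]");
}
```

A compile-time version would turn each expression piece into an argument of the generated call; this sketch only shows where the boundaries fall and why "..%d{..." stops being a plain printf string.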
March 06, 2007 Re: Poll on improved format strings. | ||||
---|---|---|---|---|
| ||||
Posted in reply to renoX | renoX wrote:
> Hello,
>
> I've made templated format strings
>
> The idea is: printf format strings are interesting because they are powerful, but they suck because the %d, %s, etc. are in one part of the function call and the corresponding variables are in a different part (Tango has the same problem). writef improves this by allowing "... %d",var," ... %s",var2, but it's still not ideal: when gluing the various strings together, it's easy to forget a space or a comma and end up with poor output.
>
> So my idea would be to have embedded expressions like this: "... %08d{var1+var2} ...", but it's not easy to come up with a good syntax and semantics, so I'd like some remarks:
I like this better than anything else I've ever seen.
>
> - Should mixing printf formats and new-style format strings be allowed? (It is in the current implementation.)
> This has the advantage of nearly keeping compatibility; the problem is the format string "..%d{...": in printf this means a number followed by '{', but with the new format it is an error.
> It is possible to escape the '{' to allow this, i.e. to say that '%{' is '{', so '%d{' would need to be written '%d%{'. The drawback is that the embedding format '%{var}', which would be the shortest syntax, is then impossible.
> Another possibility would be to say that if you want '... %d{ ...', you need to write it as '.... %d',var,'{ ....'; this would permit the '..%{var}...' embedding syntax.
Personally I'd rather get an error if I leave off the {}. (I'm someone who uses {} inside printf debugging strings a lot, so it's far from compatible for me.)
I think it's better to minimise features wherever possible.
> - What to do with non-const char[]?
Aren't they a security risk? E.g.:
char[] a = getFromUser();
writefln(a);
If a is "%d", you get an access violation. Apart from security, it just hides an insidious bug -- the code works fine until someone innocently enters a % followed by one of the allowable letters...
I've always thought that the first argument to printf() should be forced to be a string literal.
I would see it as an *advantage* to only support const char[]!
|
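The hazard described in this post is easy to reproduce. The sketch below assumes a current D2 compiler and Phobos rather than the 2007 library, so the failure mode differs: as far as I know, today's `std.format.format` rejects an orphan specifier with a `FormatException` instead of crashing. `getFromUser` is a hypothetical stand-in for real input. The fix is the same either way: keep the format string a literal and pass the user's text as data.

```d
import std.format : format, FormatException;
import std.stdio : writeln;

/// Hypothetical stand-in for text typed by a user.
string getFromUser()
{
    return "100%d done";
}

void main()
{
    string a = getFromUser();

    // Safe: the format string is a literal, the user text is only data.
    writeln(format("%s", a));

    // Risky pattern from the post: user text used as the format string.
    try
    {
        writeln(format(a));
    }
    catch (FormatException e)
    {
        writeln("format rejected the input: ", e.msg);
    }
}
```

The first call prints the text unchanged, '%d' included; the second only works until the input happens to contain a specifier.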
March 06, 2007 Re: Poll on improved format strings. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Don Clugston | Don Clugston wrote:
> renoX wrote:
>> Hello,
>>
>> I've made templated format strings
>>
>> The idea is: printf format strings are interesting because they are powerful, but they suck because the %d, %s, etc. are in one part of the function call and the corresponding variables are in a different part (Tango has the same problem). writef improves this by allowing "... %d",var," ... %s",var2, but it's still not ideal: when gluing the various strings together, it's easy to forget a space or a comma and end up with poor output.
>>
>> So my idea would be to have embedded expressions like this: "... %08d{var1+var2} ...", but it's not easy to come up with a good syntax and semantics, so I'd like some remarks:
>
> I like this better than anything else I've ever seen.
>
>>
>> - Should mixing printf formats and new-style format strings be allowed? (It is in the current implementation.)
>> This has the advantage of nearly keeping compatibility; the problem is the format string "..%d{...": in printf this means a number followed by '{', but with the new format it is an error.
>> It is possible to escape the '{' to allow this, i.e. to say that '%{' is '{', so '%d{' would need to be written '%d%{'. The drawback is that the embedding format '%{var}', which would be the shortest syntax, is then impossible.
>> Another possibility would be to say that if you want '... %d{ ...', you need to write it as '.... %d',var,'{ ....'; this would permit the '..%{var}...' embedding syntax.
>
> Personally I'd rather get an error if I leave off the {}.
> (I'm someone who uses {} inside printf debugging strings a lot, so it's far from compatible for me.)
> I think it's better to minimise features wherever possible.
And this has the benefit that %{var} is the 'default' embedded format string: no need to write %s{var}.
>
>> - What to do with non-const char[]?
>
> Aren't they a security risk?
> E.g.:
> char[] a = getFromUser();
> writefln(a);
Yes, they might be a security risk if they are not 'sanitized' before use.
>
> If a is "%d", you get an access violation. Apart from security, it just hides an insidious bug -- the code works fine until someone innocently enters a % followed by one of the allowable letters...
> I've always thought that the first argument to printf() should be forced to be a string literal.
> I would see it as an *advantage* to only support const char[]!
Interesting.
Thanks for sharing your opinion; it's true that supporting only const char[] and the 'embedded format string' makes putf/sputf easier for the programmer to use. I think I will follow your ideas.
Regards,
renoX
|
March 06, 2007 Re: Poll on improved format strings. | ||||
---|---|---|---|---|
| ||||
Posted in reply to renoX | On Tue, 06 Mar 2007 09:53:48 -0500, renoX wrote:
> Hello,
>
> I've made templated format strings
...
> I'd like some input to see if there is a majority in favour of one style or the other.
I will not be using this style of string formatting.
I like to have the format strings as something that the user supplies at runtime so that they can control the output of messages to their users, especially when considering multiple language support.
msg = Expand( getMsgLayout( msgno ), vars ...);
Output (msg );
where msgno is a key to a runtime lookup for the layout of the message that is suitable for the language of the current user.
--
Derek Parnell
Melbourne, Australia
"Justice for David Hicks!"
skype: derek.j.parnell
|
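The run-time layout lookup described here can be sketched with Phobos alone. This is an illustration under assumptions, not Derek's code: `getMsgLayout` takes an extra language parameter, the layout table is hard-coded, and `std.format.format` stands in for `Expand`. Positional specifiers such as "%1$s" let a translation reorder the arguments, which is the main reason to keep layouts outside the code.

```d
import std.format : format;
import std.stdio : writeln;

/// Hypothetical lookup: the layout for message msgno in language lang.
/// A real program would load these tables from a file at run time.
string getMsgLayout(string lang, int msgno)
{
    string[int] en = [1: "%1$s scored %2$s points"];
    string[int] fr = [1: "%2$s points pour %1$s"];
    auto table = (lang == "fr") ? fr : en;
    return table[msgno];
}

void main()
{
    // Same arguments, different order in the output per language.
    writeln(format(getMsgLayout("en", 1), "Ann", 12));
    writeln(format(getMsgLayout("fr", 1), "Ann", 12));
}
```

Because the layout is only known at run time, none of it can be checked by a template, which is the trade-off against the compile-time embedded style.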
March 06, 2007 Re: Poll on improved format strings. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Derek Parnell | Derek Parnell wrote:
> On Tue, 06 Mar 2007 09:53:48 -0500, renoX wrote:
>
>> Hello,
>>
>> I've made templated format strings
> ...
>
>> I'd like some input to see if there is a majority in favour of one style or the other.
>
> I will not be using this style of string formatting.
>
> I like to have the format strings as something that the user supplies at
> runtime so that they can control the output of messages to their users,
> especially when considering multiple language support.
I hadn't thought about localisation; I'll have to take a look at how it's currently done in D to see if it's compatible.
> msg = Expand( getMsgLayout( msgno ), vars ...);
> Output (msg );
Well, your scheme is simple to implement, but it isn't very readable. The GNU localisation package uses the text in the default language (usually but not necessarily English) as the key for finding the translations in a localisation file. Of course, the problem is that it requires a parser to do this.
> where msgno is a key to a runtime lookup for the layout of the message
> that is suitable for the language of the current user.
Note that both types of formatting are useful: yours is for user interfaces, mine is for trace logs (which are usually not localised).
They are not necessarily incompatible; I'll have to think about it.
renoX
|
March 07, 2007 Re: Poll on improved format strings. | ||||
---|---|---|---|---|
| ||||
Posted in reply to renoX | renoX wrote:
> Hello,
>
>
> So my idea would be to have embedded expression like this "... %08d{var1+var2} ..."
In any case I like the idea of having these strings embedded in the text.
I've had to write run-time versions of this many times for localization and for designers. In these cases (and I assume most games work this way) there is a separation between design and code, so this would simplify the process. With this format you could just make all the variables visible to the write statement (i.e. player1, player2, enemy1, etc.).
Of course in the cases I've dealt with, the designer/localizer should use the default string conversion for the given variable. Possibly even rounding of floats should be specifiable outside and optionally changed inside.
-Joel
|
March 07, 2007 Re: Poll on improved format strings. | ||||
---|---|---|---|---|
| ||||
Posted in reply to renoX | renoX wrote:
> Hello,
>
> I've made templated format strings, attached (this new version is improved thanks to Frits van Bommel), but I'm not sure about the syntax of the format string.
>
> The idea is: printf format strings are interesting because they are powerful, but they suck because the %d, %s, etc. are in one part of the function call and the corresponding variables are in a different part (Tango has the same problem). writef improves this by allowing "... %d",var," ... %s",var2, but it's still not ideal: when gluing the various strings together, it's easy to forget a space or a comma and end up with poor output.
>
> So my idea would be to have embedded expressions like this: "... %08d{var1+var2} ...", but it's not easy to come up with a good syntax and semantics, so I'd like some remarks:
>
> - Should mixing printf formats and new-style format strings be allowed? (It is in the current implementation.)
> This has the advantage of nearly keeping compatibility; the problem is the format string "..%d{...": in printf this means a number followed by '{', but with the new format it is an error.
> It is possible to escape the '{' to allow this, i.e. to say that '%{' is '{', so '%d{' would need to be written '%d%{'. The drawback is that the embedding format '%{var}', which would be the shortest syntax, is then impossible.
> Another possibility would be to say that if you want '... %d{ ...', you need to write it as '.... %d',var,'{ ....'; this would permit the '..%{var}...' embedding syntax.
>
> - What to do with non-const char[]?
> They cannot be parsed by the template, so one possibility is to allow only const char[] parameters; another is to allow non-const char[] and leave them alone (they may contain printf-style format strings). The latter is what the current implementation does, but I'm not sure whether the added flexibility is confusing: const char[] can contain both printf-like formats and the 'new embedded format', but non-const char[] can only contain printf-like format strings.
>
> - Another possibility would be to use a different character, '#' (like in Ruby), for these new format strings.
>
> I'd like some input to see if there is a majority in favour of one style or the other.
>
> renoX
To be honest, I think the type suffix needs to go. After all, if the type is known at compile time, why should I need to repeat it?
Of course, doing that leaves you with the problem of how to specify formatting options... but then in the majority of cases, you probably don't care; you just want the thing output.
So how about something like this:
Expansion ::= "$" ExpansionSpec
ExpansionSpec ::= FormattingOptions ExpansionExpr
ExpansionSpec ::= ExpansionExpr
FormattingOptions ::= "(" string ")"
ExpansionExpr ::= "{" D_Expression "}"
ExpansionExpr ::= D_Identifier
So the example above becomes "... $(08){var1+var2} ...": one character longer, but it gives you more freedom as to what you can put in the formatting options. Plus, if you don't care how it's formatted, you can use "... ${var1+var2} ...", and if you just want to print a variable out, you can use "... $somevar ...".
Plus, if you discount the formatting stuff at the front, it's roughly comparable to how variable expansions are written in bash and the like. I also think that Nemerle (which has had this sort of compile-time printf stuff for ages) does it the same way.
As for the spec itself: it should be const char[] only, and display a meaningful error if the programmer tries to pass a non-const char[].
That said, I think you should also provide a "run-time" version of the function that has the exact same parser, formatting, etc., but to which the user can pass one or more hash maps. This would allow people to use the same format for both compile time and run time, whilst not making the run-time version a security risk (well, aside from arbitrary expressions, anyway). For example:
> auto author = "renoX";
> auto d_bdfl = "Walter Bright";
> auto life = 42;
>
> mixin(swritefln("Author: $author, BDFL: $(q)d_bdfl, "
> "Meaning of life: $life"));
>
> // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42
>
> char[][char[]] strings;
> int[char[]] ints;
>
> strings["author"] = author;
> strings["d-bdfl"] = d_bdfl;
> ints["life"] = life;
>
> auto formatstr = "Author: $author, BDFL: $(q){d-bdfl}, "
> "Meaning of life: $life";
>
> writefln(formatstr, strings, ints);
>
> // Prints the same thing as above
-- Daniel
P.S. $(q){...} is stolen from Lua's "%q" format specifier: it prints a string out complete with escaping and quotation marks :P
--
Unlike Knuth, I have neither proven or tried the above; it may not even make sense.
v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP http://hackerkey.com/
|
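The run-time half of this proposal, with hash-map lookup instead of arbitrary expressions, is small enough to sketch. The code below is an assumption-laden illustration for a current D2 compiler, not the function proposed in the post: `expand` is a hypothetical name, it takes a single `string[string]` map rather than separate maps per type, and the only formatting option it understands is "q", which here just adds double quotes without the escaping that Lua's %q performs. It accepts "$name", "${key}" and "$(opts){key}", treating the braces as a key rather than a D expression.

```d
import std.ascii : isAlphaNum;
import std.exception : enforce;
import std.string : indexOf;

/// Hypothetical run-time expander for "$name", "${key}" and "$(opts){key}".
string expand(string fmt, string[string] vars)
{
    string result;
    size_t i = 0;
    while (i < fmt.length)
    {
        if (fmt[i] != '$') { result ~= fmt[i++]; continue; }
        ++i; // skip '$'

        // Optional formatting options: "(" string ")"
        string opts;
        if (i < fmt.length && fmt[i] == '(')
        {
            auto end = fmt[i .. $].indexOf(')');
            enforce(end > 0, "unterminated '(' in format string");
            opts = fmt[i + 1 .. i + end];
            i += end + 1;
        }

        // Either "{" key "}" or a bare identifier.
        string key;
        if (i < fmt.length && fmt[i] == '{')
        {
            auto end = fmt[i .. $].indexOf('}');
            enforce(end > 0, "unterminated '{' in format string");
            key = fmt[i + 1 .. i + end];
            i += end + 1;
        }
        else
        {
            size_t j = i;
            while (j < fmt.length && (isAlphaNum(fmt[j]) || fmt[j] == '_')) ++j;
            key = fmt[i .. j];
            i = j;
        }

        auto value = key in vars;
        enforce(value !is null, "unknown key: " ~ key);
        result ~= (opts == "q") ? (`"` ~ *value ~ `"`) : *value;
    }
    return result;
}

void main()
{
    import std.stdio : writeln;
    string[string] vars = ["author": "renoX", "d-bdfl": "Walter Bright", "life": "42"];
    writeln(expand("Author: $author, BDFL: $(q){d-bdfl}, Meaning of life: $life", vars));
    // Author: renoX, BDFL: "Walter Bright", Meaning of life: 42
}
```

Since the braces can only name a key, a user-supplied layout can select and quote values but cannot run code, which keeps the run-time version on the safe side of the concern raised earlier in the thread.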
March 07, 2007 Re: Poll on improved format strings. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Daniel Keep | Daniel Keep wrote:
>
> renoX wrote:
>> Hello,
>>
>> I've made templated format strings, attached (this new version is improved thanks to Frits van Bommel), but I'm not sure about the syntax of the format string.
>>
>> The idea is: printf format strings are interesting because they are powerful, but they suck because the %d, %s, etc. are in one part of the function call and the corresponding variables are in a different part (Tango has the same problem). writef improves this by allowing "... %d",var," ... %s",var2, but it's still not ideal: when gluing the various strings together, it's easy to forget a space or a comma and end up with poor output.
>>
>> So my idea would be to have embedded expressions like this: "... %08d{var1+var2} ...", but it's not easy to come up with a good syntax and semantics, so I'd like some remarks:
>>
>> - Should mixing printf formats and new-style format strings be allowed? (It is in the current implementation.)
>> This has the advantage of nearly keeping compatibility; the problem is the format string "..%d{...": in printf this means a number followed by '{', but with the new format it is an error.
>> It is possible to escape the '{' to allow this, i.e. to say that '%{' is '{', so '%d{' would need to be written '%d%{'. The drawback is that the embedding format '%{var}', which would be the shortest syntax, is then impossible.
>> Another possibility would be to say that if you want '... %d{ ...', you need to write it as '.... %d',var,'{ ....'; this would permit the '..%{var}...' embedding syntax.
>>
>> - What to do with non-const char[]?
>> They cannot be parsed by the template, so one possibility is to allow only const char[] parameters; another is to allow non-const char[] and leave them alone (they may contain printf-style format strings). The latter is what the current implementation does, but I'm not sure whether the added flexibility is confusing: const char[] can contain both printf-like formats and the 'new embedded format', but non-const char[] can only contain printf-like format strings.
>>
>> - Another possibility would be to use a different character, '#' (like in Ruby), for these new format strings.
>>
>> I'd like some input to see if there is a majority in favour of one style or the other.
>>
>> renoX
>
> To be honest, I think the type suffix needs to go. After all, if the
> type is known at compile time, why should I need to repeat it?
> Of course, doing that leaves you with the problem of how to specify
> formatting options... but then in the majority of cases, you probably
> don't care; you just want the thing output.
When you use floating point, you want to specify the formatting options almost every time -- do you want %f, %e, %g, or %a? And it's almost always necessary to specify the number of decimal places to use.
I display integers in hex pretty often, too.
Still, being able to leave all the formatting options out, and write:
"next=%{i+1}" is very appealing.
|
March 07, 2007 Re: Poll on improved format strings. | ||||
---|---|---|---|---|
| ||||
Posted in reply to Don Clugston | Don Clugston wrote:
> Daniel Keep wrote:
>>
>
> When you use floating point, you want to specify the formatting options almost every time -- do you want %f, %e, %g, or %a? And it's almost always necessary to specify the number of decimal places to use.
> I display integers in hex pretty often, too.
Most of the time I want the same number of decimal places. I think being able to provide a default decimal place would be a good idea. Perhaps it could go in the first part of the string. Then the code could just concatenate it on (to hide it from designers/localizers/myself), something like:
"%.2f()" ~ "blar: %(value)".
Or maybe "%.2f=default" ~ "blar: %(value)".
>
> Still, being able to leave all the formatting options out, and write:
> "next=%{i+1}" is very appealing.
Agreed.
|
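The default-precision idea can be illustrated without settling on a syntax for it. In this rough sketch (current D2 and Phobos assumed) `show` is a hypothetical helper: an embedded value with no spec falls back to a default that is set once, while an explicit spec still wins.

```d
import std.format : format;
import std.stdio : writeln;

/// Hypothetical helper: format value with spec, or with defaultSpec
/// when the embedded expression carried no spec of its own.
string show(double value, string spec = "", string defaultSpec = "%.2f")
{
    return format(spec.length ? spec : defaultSpec, value);
}

void main()
{
    writeln("blar: ", show(3.14159));          // uses the default, "%.2f"
    writeln("blar: ", show(3.14159, "%.4f"));  // explicit spec overrides it
}
```

Whether the default lives in a prefix of the format string, as suggested above, or in a separate parameter as here is exactly the open syntax question.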
March 07, 2007 Re: Poll on improved format strings. | ||||
---|---|---|---|---|
| ||||
Posted in reply to janderson | janderson wrote:
> Don Clugston wrote:
>> Daniel Keep wrote:
>>>
>>
>> When you use floating point, you want to specify the formatting options almost every time -- do you want %f, %e, %g, or %a? And it's almost always necessary to specify the number of decimal places to use.
>> I display integers in hex pretty often, too.
>
> Most of the time I want the same number of decimal places. I think being able to provide a default decimal place would be a good idea.
I think you're right.
|
Copyright © 1999-2021 by the D Language Foundation