February 09, 2007
Kevin Bealer wrote:
> 
> Okay -- I'm really sorry if any of this seems to have a negative tone. I hesitate to write this since I have a lot of respect for the Tango design in general, but there are a couple of friction points I've noticed.
> 
> 1. writefln / format replacements
> 
> Concerning standard output and string formatting, in phobos I can do these operations:
> 
>   writefln("%s %s %s", a, b, c);
>   format("%s %s %s", a, b, c);
> 
> How do I do these in Tango?  The change to "{0} {1}" stuff is fine with me, in fact I like it, but this syntax:
> 
>   Stdout.formatln("{0} {1} {2}", a, b, c);
>   Format!(char).convert("{0} {1} {2}", a, b, c);
> 
> Is awkward.  And these statements are used *all the time*.  In a recent toy project I wrote, I used Stdout 15 times, compared to using "foreach" only 8 times.  I also use the "format to string" idiom a lot (oddly enough, not in that project), and it's even more awkward.

The conversion modules seem to have slightly spotty API documentation, but I think this will work for the common case:

Formatter( "{0} {1} {2}", a, b, c );

The Stdout design is the result of a lengthy discussion involving overload rules and expected behavior.  I believe two of the salient points were that the default case should be the simplest to execute, and that the .format method call provided a useful signifier that an explicit format was being supplied.  That said, I believe that the default output format can be called via:

Stdout( a, b, c );

or the "whisper" syntax:

Stdout( a )( b )( c );

> That's why I think phobos really did the "Right Thing" by keeping those down to one token.  Second, the fact that the second one does exactly what the first does but you need to build a template, etc, is annoying.  I kept asking myself if I was doing the right thing because it seemed like I was using too much syntax for this kind of operation (I'm still not sure it's the best way to go -- is it?)

Do you consider the Formatter instance to be sufficient, or would it be more useful to wrap this behavior in a free function?  I'll admit that, being from a C++ background, I'm quite used to customizing the library behavior to suit my particular use style, but I can understand the desire for "out of the box" convenience.

> 2. toString and toUtf8 (collisions)
> 
> The change of the terminology is actually okay with me.
> 
> But phobos has a way of using toString as both a method and a top-level function name, all over the place.  This gets really clumsy because you can never use the top level function names when writing a class unless you fully qualify them.
> 
> For example, std.cpuid.toString() always has to be fully qualified when called from a class, and seems nondescriptive anyway.  All the std.conv.toString() functions are nice, but it's easy to call the in-class toString() by accident.
> 
> For the utf8 <--> utf16 and similar, it's frustrating to have to do this:
> 
> dchar[] x32 = ...;
> char[] x8 = tango.text.convert.Utf.toUtf8(x32);
> 
> But you have to fully qualify if you are writing code in any class or struct.  If these were given another name, like makeUtf8, then these collisions would not happen.

One aspect of the Mango design that has carried forward into Tango is that similar functions are typically intended to live in their own namespace for the sake of clarity.  Previously, most/all of the free functions were declared in structs simply to prevent collisions, but this had code bloat issues so the design was changed.  Now, users are encouraged to use the aliasing import to produce the same effect:

import Utf = tango.text.convert.Utf;

Utf.toUtf8( x32 );

I'll admit it's not as convenient as simply importing and using the functions, but it does make the origin of every function call quite clear.  I personally avoid "using" in C++ for exactly this reason--if I'm using an external routine I want to know what library it's from by inspection.
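
As a rough illustration of the aliasing import, here is a minimal sketch. It assumes present-day D with Phobos (std.utf.toUTF8) rather than the Tango module discussed here, and the Label class is hypothetical. The point is only that a member and a free function can share a name without colliding once the module is imported under an alias:

```d
import utf = std.utf;   // renamed import: the module's names are reachable only as utf.xxx

// Hypothetical class, used only to show the name collision.
class Label
{
    dstring text;

    this(dstring text)
    {
        this.text = text;
    }

    // A member with the same name as the free function in std.utf.
    string toUTF8()
    {
        // An unqualified toUTF8(text) would find this member first;
        // the "utf." prefix selects the module-level function instead.
        return utf.toUTF8(text);
    }
}

void main()
{
    auto label = new Label("x32 text"d);
    string x8 = label.toUTF8();
    assert(x8 == "x32 text");
}
```

The cost is one short prefix per call; the benefit is that the call site names the module the function comes from.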


Sean
February 10, 2007
Sean Kelly wrote:
> That said, I believe that the default output format can be called via:
> 
> Stdout( a, b, c );
> 
> or the "whisper" syntax:
> 
> Stdout( a )( b )( c );
> 

One thing that surprised me when trying this out was that the buffer is never flushed automatically, not even when outputting a '\n' (not for small outputs, anyway).  I'm used to printf's unbuffered output, at least on Windows.  Stdout.formatln() does flush, so it might be safer to stick with that than to risk forgetting to flush when doing some 'printf debugging'.  Just a thought.

(I know about .newline and .flush.)
February 10, 2007
torhu wrote:
> Sean Kelly wrote:
>> That said, I believe that the default output format can be called via:
>>
>> Stdout( a, b, c );
>>
>> or the "whisper" syntax:
>>
>> Stdout( a )( b )( c );
>>
> 
> One thing that surprised me when trying this out was that the buffer is never flushed automatically, not even when outputting a '\n' (not for small outputs, anyway).  I'm used to printf's unbuffered output, at least on Windows.  Stdout.formatln() does flush, so it might be safer to stick with that than to risk forgetting to flush when doing some 'printf debugging'.  Just a thought.
> 
> (I know about .newline and .flush.)

For Cout and Stdout, .opCall with no arguments is equivalent to .flush(). It provides quite a clean syntax for saying "please flush now". Not perfect, but quite usable. I don't think there's a way to determine, when using whisper syntax, when an appropriate time to flush would be, except when it is explicitly requested.
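
To make the mechanism concrete, here is a self-contained sketch of how chained opCall and an argument-less opCall can fit together. It is not Tango's implementation: the Whisper class and its pending/written fields are hypothetical stand-ins, and it assumes present-day D with Phobos for the to!string conversion.

```d
import std.conv : to;

// Hypothetical stand-in for a buffered console object.
final class Whisper
{
    string pending;   // buffered text that has not been "written" yet
    string written;   // stands in for what has reached the console

    // w(a)(b)(c): each call buffers one value and returns the same object,
    // so the next pair of parentheses applies to it again.
    Whisper opCall(T)(T value)
    {
        pending ~= to!string(value);
        return this;
    }

    // w(): a call with no arguments means "flush now".
    Whisper opCall()
    {
        written ~= pending;
        pending = null;
        return this;
    }
}

void main()
{
    auto w = new Whisper;

    w("a = ")(1)(", b = ")(2);
    assert(w.written == "");                // nothing flushed yet
    assert(w.pending == "a = 1, b = 2");

    w();                                    // explicit flush
    assert(w.written == "a = 1, b = 2");
    assert(w.pending == "");
}
```

Each call in the chain sees only one value, so the object has no way of knowing which call is the last one; that is why the flush has to be requested.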


On a related note, one of the things that bothers /me/ is that no flush is performed at the end of the program. That causes some or all of the output to be missing if you don't explicitly flush after the last output.
I'd suggest adding the following to tango.io.Console:
---
static ~this ()
{
        Cout.flush();
        Cerr.flush();
}
---

That would fix it, I think.
(Well, I could probably write some code that maintains indirect references to Cout/Cerr from modules not importing tango.io.Console, but the above addition should fix it for *most* cases.)
February 10, 2007
Sean Kelly wrote:
> Kevin Bealer wrote:
>>
>> Okay -- I'm really sorry if any of this seems to have a negative tone. I hesitate to write this since I have a lot of respect for the Tango design in general, but there are a couple of friction points I've noticed.
>>
>> 1. writefln / format replacements
>>
>> Concerning standard output and string formatting, in phobos I can do these operations:
>>
>>   writefln("%s %s %s", a, b, c);
>>   format("%s %s %s", a, b, c);
>>
>> How do I do these in Tango?  The change to "{0} {1}" stuff is fine with me, in fact I like it, but this syntax:
>>
>>   Stdout.formatln("{0} {1} {2}", a, b, c);
>>   Format!(char).convert("{0} {1} {2}", a, b, c);
>>
>> Is awkward.  And these statements are used *all the time*.  In a recent toy project I wrote, I used Stdout 15 times, compared to using "foreach" only 8 times.  I also use the "format to string" idiom a lot (oddly enough, not in that project), and it's even more awkward.
> 
> The conversion modules seem to have slightly spotty API documentation, but I think this will work for the common case:
> 
> Formatter( "{0} {1} {2}", a, b, c );
> 
> The Stdout design is the result of a lengthy discussion involving overload rules and expected behavior.  I believe two of the salient points were that the default case should be the simplest to execute, and that the .format method call provided a useful signifier that an explicit format was being supplied. 

Is there any reason not to make the format item's index also optional? So that
   Formatter("{} {} {}", a, b, c);
can be used?  I mean making it more like %s?

The meaning would just be "use the index (1+ the last one that appeared)" or 0 if it's the first to appear.

And then if you go there, it might be nice to have a way to say "same as the last item" or "last item +/- some index".  Maybe use +/- numbers.  So
   Formatter("{1} {+0} {-1}",a,b);
would be equal to
   Formatter("{1} {1} {0}",a,b);
I can't really think of when I'd use that though.  The {} I'd use for sure though.
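
The rules above can be sketched as a small index resolver. This is only an illustration of the proposal, not anything in Tango: resolveIndices is a hypothetical name, the code assumes present-day D with Phobos, it ignores any alignment or format suffix inside the braces, and it does no validation of malformed items.

```d
import std.conv : to;

/// Returns the argument index that each {...} item refers to.
/// Hypothetical rules, following the proposal:
///   {}    previous index + 1 (0 for the first item)
///   {N}   absolute index N
///   {+N}  previous index + N
///   {-N}  previous index - N
int[] resolveIndices(string fmt)
{
    int[] result;
    int last = -1;                  // so that a leading {} resolves to 0
    size_t i = 0;

    while (i < fmt.length)
    {
        if (fmt[i] != '{')
        {
            ++i;
            continue;
        }

        // Find the closing brace of this item.
        size_t close = i + 1;
        while (close < fmt.length && fmt[close] != '}')
            ++close;
        if (close == fmt.length)
            break;                  // unterminated item: stop scanning

        string spec = fmt[i + 1 .. close];
        int index;
        if (spec.length == 0)
            index = last + 1;
        else if (spec[0] == '+')
            index = last + to!int(spec[1 .. $]);
        else if (spec[0] == '-')
            index = last - to!int(spec[1 .. $]);
        else
            index = to!int(spec);

        result ~= index;
        last = index;
        i = close + 1;
    }
    return result;
}

void main()
{
    assert(resolveIndices("{} {} {}") == [0, 1, 2]);
    assert(resolveIndices("{1} {+0} {-1}") == [1, 1, 0]);
    assert(resolveIndices("{0} {1} {2}") == [0, 1, 2]);
}
```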

Anyway, the positional references are great, and really a must have for any serious I18N usage, but in the original language the app is written in, things tend to appear in the order of the arguments.

--bb
February 10, 2007
Bill Baxter wrote:
> Is there any reason not to make the format item's index also optional? So that
>    Formatter("{} {} {}", a, b, c);
> can be used?  I mean making it more like %s?
> 
> The meaning would just be "use the index (1+ the last one that appeared)" or 0 if it's the first to appear.
> 
> And then if you go there, it might be nice to have a way to say "same as the last item" or "last item +/- some index".  Maybe use +/- numbers.  So
>    Formatter("{1} {+0} {-1}",a,b);
> would be equal to
>    Formatter("{1} {1} {0}",a,b);
> I can't really think of when I'd use that though.  The {} I'd use for sure though.
> 
> Anyway, the positional references are great, and really a must have for any serious I18N usage, but in the original language the app is written in, things tend to appear in the order of the arguments.

An argument against that would be: Don't you think it'd be easier on the translators if they could just pick the argument number out of the untranslated string without having to keep a running count of which argument they're at?


I'd very much like the "{} {} {}" syntax though, especially for anything quick-and-dirty.
The "relative" argument numbers I don't see much use for either.
February 10, 2007
I like the idea of {}.
February 10, 2007
On Sat, 10 Feb 2007 23:43:43 +0900, Bill Baxter wrote:


> Is there any reason not to make the format item's index also optional?
> So that
>     Formatter("{} {} {}", a, b, c);
> can be used?  I mean making it more like %s?

Seems to be a great idea. With that we have the choice between positional and/or indexed tokens in the format string.


-- 
Derek Parnell
Melbourne, Australia
"Justice for David Hicks!"
skype: derek.j.parnell
February 10, 2007
Frits van Bommel wrote:
> Bill Baxter wrote:
>> Is there any reason not to make the format item's index also optional? So that
>>    Formatter("{} {} {}", a, b, c);
>> can be used?  I mean making it more like %s?
>>
>> The meaning would just be "use the index (1+ the last one that appeared)" or 0 if it's the first to appear.
>>
>> And then if you go there, it might be nice to have a way to say "same as the last item" or "last item +/- some index".  Maybe use +/- numbers.  So
>>    Formatter("{1} {+0} {-1}",a,b);
>> would be equal to
>>    Formatter("{1} {1} {0}",a,b);
>> I can't really think of when I'd use that though.  The {} I'd use for sure though.
>>
>> Anyway, the positional references are great, and really a must have for any serious I18N usage, but in the original language the app is written in, things tend to appear in the order of the arguments.
> 
> An argument against that would be: Don't you think it'd be easier on the translators if they could just pick the argument number out of the untranslated string without having to keep a running count of which argument they're at?

Good point.  But strings for translation are usually extracted by a text processing tool of some sort (like poedit).  So it would be easy for that tool to also fill in the numbers while extracting.
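
Something like the following is what I have in mind for the fill-in step. It is a sketch only: numberPlaceholders is a made-up name, it is not how poedit or any real extraction tool works, it assumes present-day D with Phobos, and it handles only strings whose items are all written as an empty {}.

```d
import std.conv : to;

/// Rewrites each empty "{}" item as "{0}", "{1}", ... in order of appearance.
/// Everything else is copied through unchanged.
string numberPlaceholders(string fmt)
{
    string result;
    int next = 0;
    size_t i = 0;

    while (i < fmt.length)
    {
        if (fmt[i] == '{' && i + 1 < fmt.length && fmt[i + 1] == '}')
        {
            result ~= "{" ~ to!string(next) ~ "}";
            ++next;
            i += 2;                 // skip over the "{}" pair
        }
        else
        {
            result ~= fmt[i];
            ++i;
        }
    }
    return result;
}

void main()
{
    // The translator sees explicit numbers and can reorder them freely.
    assert(numberPlaceholders("Copied {} of {} files")
           == "Copied {0} of {1} files");
}
```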

--bb
February 11, 2007
Bill Baxter wrote:
> Sean Kelly wrote:
>> Kevin Bealer wrote:
>>>
>>> Okay -- I'm really sorry if any of this seems to have a negative tone. I hesitate to write this since I have a lot of respect for the Tango design in general, but there are a couple of friction points I've noticed.
>>>
>>> 1. writefln / format replacements
>>>
>>> Concerning standard output and string formatting, in phobos I can do these operations:
>>>
>>>   writefln("%s %s %s", a, b, c);
>>>   format("%s %s %s", a, b, c);
>>>
>>> How do I do these in Tango?  The change to "{0} {1}" stuff is fine with me, in fact I like it, but this syntax:
>>>
>>>   Stdout.formatln("{0} {1} {2}", a, b, c);
>>>   Format!(char).convert("{0} {1} {2}", a, b, c);
>>>
>>> Is awkward.  And these statements are used *all the time*.  In a recent toy project I wrote, I used Stdout 15 times, compared to using "foreach" only 8 times.  I also use the "format to string" idiom a lot (oddly enough, not in that project), and it's even more awkward.
>>
>> The conversion modules seem to have slightly spotty API documentation, but I think this will work for the common case:
>>
>> Formatter( "{0} {1} {2}", a, b, c );
>>
>> The Stdout design is the result of a lengthy discussion involving overload rules and expected behavior.  I believe two of the salient points were that the default case should be the simplest to execute, and that the .format method call provided a useful signifier that an explicit format was being supplied. 
> 
> Is there any reason not to make the format item's index also optional? So that
>    Formatter("{} {} {}", a, b, c);
> can be used?  I mean making it more like %s?

Nope.  That's a good idea.


Sean
February 11, 2007
Max Samukha wrote:
> On Fri, 09 Feb 2007 01:40:00 -0500, Kevin Bealer
> <kevinbealer@gmail.com> wrote:
> 
>> Okay -- I'm really sorry if any of this seems to have a negative tone. I hesitate to write this since I have a lot of respect for the Tango design in general, but there are a couple of friction points I've noticed.
>>
>> 1. writefln / format replacements
>>
>> Concerning standard output and string formatting, in phobos I can do these operations:
>>
>>   writefln("%s %s %s", a, b, c);
>>   format("%s %s %s", a, b, c);
>>
>> How do I do these in Tango?  The change to "{0} {1}" stuff is fine with me, in fact I like it, but this syntax:
>>
>>   Stdout.formatln("{0} {1} {2}", a, b, c);
>>   Format!(char).convert("{0} {1} {2}", a, b, c);
>>
>> Is awkward.  And these statements are used *all the time*.  In a recent toy project I wrote, I used Stdout 15 times, compared to using "foreach" only 8 times.  I also use the "format to string" idiom a lot (oddly enough, not in that project), and it's even more awkward.
>>
>> That's why I think phobos really did the "Right Thing" by keeping those down to one token.  Second, the fact that the second one does exactly what the first does but you need to build a template, etc, is annoying.  I kept asking myself if I was doing the right thing because it seemed like I was using too much syntax for this kind of operation (I'm still not sure it's the best way to go -- is it?)
>>
>> I know about Cout as a replacement for the first one, but as far as I can tell it doesn't take parameters, and usually I need some.
>>
>> When people ask "why D", I tell them that simpler syntax, better defaults and better garbage collection each gain us a 50% reduction in code, and when all three apply to a problem, D can eat C++'s lunch. Let's not throw away the simpler syntax.
>>
>> (I'm not talking about architecture changes, just wrappers with standardized short names that can become familiar to all D users.)
>>
>>
>> 2. toString and toUtf8 (collisions)
>>
>> The change of the terminology is actually okay with me.
>>
>> But phobos has a way of using toString as both a method and a top-level function name, all over the place.  This gets really clumsy because you can never use the top level function names when writing a class unless you fully qualify them.
>>
>> For example, std.cpuid.toString() always has to be fully qualified when called from a class, and seems nondescriptive anyway.  All the std.conv.toString() functions are nice, but it's easy to call the in-class toString() by accident.
>>
>> For the utf8 <--> utf16 and similar, it's frustrating to have to do this:
>>
>> dchar[] x32 = ...;
>> char[] x8 = tango.text.convert.Utf.toUtf8(x32);
>>
>> But you have to fully qualify if you are writing code in any class or struct.  If these were given another name, like makeUtf8, then these collisions would not happen.
>>
>> Actually, if it wasn't already out there, I would want to go through all of phobos and remove all the common collisions.  They are much less trouble in an "import" system than in an "include" system, but every time there is a collision it requires an additional "edit-compile" cycle, and/or a fully qualified name.
>>
>> And if you forget to import all the right modules, it can impact correctness, because you pick up someone else's "toString" from who knows where.
>>
>> I'm just saying, ideally tango should not be duplicating this with toUtf8 etc.
>>
>> Kevin
> 
> If you really want the 'writefln' and 'format' with Tango, you could
> do the following:
> 
> import tango.io.Stdout;
> import tango.text.convert.Format;
> 
> 
> Format!(char) format;
> typeof(&Stdout.formatln) writefln;
> 
> static this()
> {
> 	writefln = &Stdout.formatln;
> 	format = new Format!(char);
> }
> 
> void main()
> {
> 	auto str = format("Test {0}: {1}", 1, "Passed");
> 	writefln(str);
> 	writefln("Test {0}: {1}", 2, "Passed");
> }

Right - this is good.  But almost everyone will eventually do this, so what I'm also suggesting is that this be done in the module, so that everyone who imports the module doesn't need to cut and paste or invent something like the above in their code.

If it's done in the module it helps readability of all user code because you don't need to see what particular identifier is used by each coder.  Sort of like in C++ where you see this all the time:

Int1, Uint1, Int2, Uint2, Int4, Uint4, Int8, Uint8 // label=#bytes

and in another project they will use

int8_t, uint8_t, ... int64_t, uint64_t // label=#bits

If you use a library from NCBI, another from GTK, another from Qt, etc., you eventually have a dozen types with three or four synonyms for each one.

Kevin