October 13, 2001
Russ Lewis wrote:
> 
> a wrote:
> >
> > You still need to create your own %magic symbol to use it with printf, so it knows how to handle the argument.  That just makes the % problem worse, if not open ended.  I admit that printf does what it does better than most anything else.  It's just not capable of being expanded in a reasonable way.
> 
> How about:
> 
> class object     // base for everything else
> {
> public:
>    char[] toString() { // default value returns a hex print of the 'this' pointer };
> };
> 
> The implementation recognizes a new %magic that is only legal when matched to an argument that is a pointer to object (or
> a child, ofc).  Any classes that want to override how they are printed must override toString().  Thus, you only add 1
> %magic that will work with all classes.
> 
> It doesn't help with structs, ofc...

	Well, at the very least, I'd like to have a toStream method that can
default to toString.  The human readable format is not always the
best/most complete way to store to a file.
	Having only one %magic symbol that can handle all descendants of Object
would work, but you would only have one way of formatting, and how would
you deal with the %modifiers?  (Assuming we really want to kludge up
printf this way.)
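
A minimal C++ sketch of the toStream-defaults-to-toString idea, since the thread has no D compiler to try it on.  The Object, Label and Point classes here are hypothetical stand-ins, not part of any real library; the only point is that a class which overrides nothing but toString() still streams sensibly, while a class that wants a different stored form can override toStream() as well.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical root class; the names mirror the thread, not a real library.
class Object {
public:
    virtual ~Object() = default;

    // Human-readable form. Default: the object's address.
    virtual std::string toString() const {
        std::ostringstream out;
        out << static_cast<const void*>(this);
        return out.str();
    }

    // Storage form. Falls back to toString() unless a class overrides it.
    virtual void toStream(std::ostream& out) const { out << toString(); }
};

// Overrides only toString(), so toStream() writes the same text.
class Label : public Object {
public:
    explicit Label(std::string text) : text_(std::move(text)) {}
    std::string toString() const override { return text_; }
private:
    std::string text_;
};

// Overrides both: readable text for people, a bare record for files.
class Point : public Object {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    std::string toString() const override {
        return "(" + std::to_string(x_) + ", " + std::to_string(y_) + ")";
    }
    void toStream(std::ostream& out) const override { out << x_ << ' ' << y_; }
private:
    int x_;
    int y_;
};
```

With these definitions, Point(1, 2).toString() gives "(1, 2)" while toStream() writes "1 2".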

> >         If it's type safe, then you don't need the formatter in most cases.
> > The %i, %d, %s, %c, %x, etc. formats are redundant if printf could be
> > made type safe.  They would allow you to put an implicit cast of some
> > sort in the format string, but that sounds nasty to me.  And it defies
> > the "Only one way to possibly do it." ideology that D seems to be going
> > for.  (Maybe I do like it.)
> 
> Hold on here...the point of those specifiers is to tell you how to print something.  %d (and %i, which I think is identical) prints an integer in decimal format.  %c prints the ASCII character that has the ASCII code given.  %x and %X are two distinct ways of printing an integer as a hexadecimal value.  All of these specifiers can work on the SAME argument!
> 
> printf("%d %c %x %X",10,10,10,10);
>   prints out
> "10 \n a A"
> 
> The typesafety comes in when you try to pass a char[] and print it as an integer, or an integer and you try to print it as a string (meaning printf expects it to be a pointer value).  In either of those cases, printf() would throw an exception.

	You are right that there is a difference between %d and %x (etc.) that
we'd need to keep.  In the above discussion, we could not have such
differences in representation for objects.  A pity.
	I will argue that you don't need %c or similar %magic that specify a
type, but not a representation.  In your above example, if you want 10
to be a char, cast it.  That is the implicit cast I was talking about.
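
For what it's worth, C++ streams already behave roughly this way, so here is a small sketch of the distinction.  The function name show_three_ways is made up for illustration.  The type (picked by a cast) decides number versus character, and the representation (hex) is a separate choice.

```cpp
#include <sstream>
#include <string>

// Overload resolution picks the output routine from the argument's type,
// so a cast replaces %c, while hex stays a separate representation choice.
std::string show_three_ways(int value) {
    std::ostringstream out;
    out << value << ' '                     // as a decimal number
        << static_cast<char>(value) << ' '  // as a character, chosen by the cast
        << std::hex << value;               // as hex, chosen by a manipulator
    return out.str();
}
```

Calling show_three_ways(65) returns "65 A 41".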

> > > It appears to me that scanf was designed to look like printf...but in doing so, it created a nightmare.  scanf should have been designed very differently, in a way that made sense for reading, just like printf was designed to work well with output.
> >
> > printf allows aggregate printing. scanf allows aggregate reading.  I used to work with folks who loved scanf the way you love printf.  Both are better than the world of having a read and print function for every data type imaginable, each with options for the user to memorize.  It's just that it feels like it is time to get fed up and improve things again.
> 
> I suppose scanf might work to my liking, if it was typesafe, wouldn't overrun buffers, and didn't require you to take the
> address of every little integer you're reading.
> 
> I agree, it's time for an upgrade.  cout might be a step in the right direction, but I think it's more of a step sideways.  Let's brainstorm... :)

The whole OO and infix thing, I think, was in at least a vaguely forward
direction.

> >         Same here.  Recently I began to force myself to use cout even when it
> > hurts so I can learn how to criticize it more meaningfully.  It ain't
> > printf but it's not as bad as most people make it out to be.  It's cool
> > that you can make your own custom manipulators easily, but it would be
> > nicer if you could define a manipulator type and assign it the
> > OR/accumulation/whatever a bunch of manipulations together so you could
> > then apply it to a variable being inserted.
> >
> >         form F = hex + width(5) + center;
> >         cout << x << F + y << width(3) + right + z << endl;
> >
> > It still needs work, but it gives the basic idea.  I just used plus everywhere because I was at a loss for words.  The accumulation operator should be different from the apply operator.  Maybe it could be spelled:
> >
> >         form F = hex & width(5) & center;
> >         cout << x << F|y << width(3)&right|z << endl;
> >
> > It might be nice to be able to apply it to a list of arguments too. But I digress.  This type of syntax won't work for D.
> 
> I actually kind of like your syntax, personally.  It has the potential for the compactness I like of printf with all of the OOP benefits.
> 
> And it's actually not as far from D syntax as you might think.  What if we make some definitions:
> 
> class FormElement;
> typedef FormElement[] form;
> 
> form hex = { hexElement };
> form width(int i) { return ... };
> form center = { centerElement };
> 
> To accumulate types, you do this:
> 
> form F = hex + width(5) + center;
> 
> which D interprets as concatenating arrays.  Now Walter helps us by adding a property function called "format" which takes a form argument and returns a char[].  We also implement it as a function in 'object', so now we can print easily:
> 
> stdout.print(x.format() + "this is a string" + y.format(F) + z.format(width(3)+right) + endl);
> 
> Thoughts?

	Is the array thing really the best way to do this?  It looks to me like
a hack to get around the lack of operator overloads.  It might feel
better like:
	format.apply(hex).apply(right).apply(width(3));
I'm not crazy in love with this either.

	On an aesthetic level, I think I'm more accustomed to the format coming
before the data being formatted.  That's minor though.  If enough people
agree, we could make the format something that takes an argument and
returns something streamable.

	I can say that I don't like the concatenations prior to the print
operation.  If print could safely do variable arguments, it could
print/write them all without having to allocate an intermediate buffer
the size of everything being printed and copying it.  There would be
excess time overhead with the extra copying and memory overhead with the
temp string.  This isn't a problem with print so much as it's the only
way we can pass a variable number of objects to a print routine.
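
To make the no-intermediate-buffer point concrete, here is a hedged sketch in C++ (C++11 or later), where type-safe variable arguments do exist in the form of variadic templates.  The name print is a stand-in; nothing here is a proposed D API.  Each argument goes straight to the stream, and no concatenated temporary is ever built.

```cpp
#include <ostream>
#include <sstream>

// Writes each argument straight to the stream, left to right.
// No concatenated temporary string is built first.
template <typename... Args>
void print(std::ostream& out, const Args&... args) {
    // Expanding the pack inside a braced list runs the insertions in order.
    int expand[] = {0, ((out << args), 0)...};
    (void)expand;  // the array exists only to drive the expansion
}
```

For example, with a std::ostringstream s, the call print(s, 1, " apples and ", 2.5, '!') leaves "1 apples and 2.5!" in s.  The same call works on any std::ostream, so output to a file or console never passes through a combined string.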

	Another aesthetic note.  Say what you will about cout's failings, I do
find the expression based syntax more visually pleasing.  It's a style
thing I think.  I hate doing multi-line method invocations.  Multi-line
expressions don't bother me, for some reason, unless they're an argument to a
function or control structure.  In your example I would probably want to do:
	stdout.print(x.format()
		+ "this is a string"
		+ y.format(F)
		+ z.format(width(3)+right)
		+ endl);

For some reason I like it better when it's not a method call.  Cout's
formatting is still gross in form and function but I find its code
formats better for my eyes.
	I like the example less since those are pluses and not commas.  For
plain function calls, I don't tend to mind splitting lines on commas.
Anything else just feels icky.  I know my reasoning here is dippy, but
style is like that.

	I think it would be worth brainstorming how the I/O might work.
The same is true of other library-like functions.  Java got big before
its library was stable.  I'm not sure I'd feel safe calling it stable
yet.  I'd hate to see D follow in Java's footsteps in this way.  I've
found the differences between versions of Java to bother me more than the
differences between implementations of C on different platforms.  It rivals
C++.  (Granted, I've had the supreme pleasure of not having to code for
Windows.)

Dan
October 13, 2001
a wrote:

> Russ Lewis wrote:
> >
> > a wrote:
> > >
> > > You still need to create your own %magic symbol to use it with printf, so it knows how to handle the argument.  That just makes the % problem worse, if not open ended.  I admit that printf does what it does better than most anything else.  It's just not capable of being expanded in a reasonable way.
> >
> > How about:
> >
> > class object     // base for everything else
> > {
> > public:
> >    char[] toString() { // default value returns a hex print of the 'this' pointer };
> > };
> >
> > The implementation recognizes a new %magic that is only legal when matched to an argument that is a pointer to object (or
> > a child, ofc).  Any classes that want to override how they are printed must override toString().  Thus, you only add 1
> > %magic that will work with all classes.
> >
> > It doesn't help with structs, ofc...
>
>         Well, at the very least, I'd like to have a toStream method that can
> default to toString.  The human readable format is not always the
> best/most complete way to store to a file.
>         Having only one %magic symbol that can handle all descendants of Object
> would work, but you would only have one way of formatting, and how would
> you deal with the %modifiers?  (Assuming we really want to kludge up
> printf this way.)

What if toString had a char[] argument:

stdout.print("%z %'asdf'z", obj1, obj2);

The 'asdf' string would be passed to the toString call on obj2.
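
Roughly what I have in mind, sketched in C++ since that is what I can compile today.  Everything here (Object, Name, print_objects, the "quoted" qualifier) is a hypothetical name for illustration, and the sketch only understands %z; a real routine would handle the other specifiers too.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: toString() receives the text between the quotes.
class Object {
public:
    virtual ~Object() = default;
    virtual std::string toString(const std::string& qualifier) const = 0;
};

// Expands "%z" and "%'qualifier'z"; every other character is copied as-is.
std::string print_objects(const std::string& fmt,
                          const std::vector<const Object*>& args) {
    std::string out;
    std::size_t next = 0;  // index of the next unused argument
    for (std::size_t i = 0; i < fmt.size(); ++i) {
        if (fmt[i] != '%') { out += fmt[i]; continue; }
        std::string qualifier;
        std::size_t j = i + 1;
        if (j < fmt.size() && fmt[j] == '\'') {
            std::size_t close = fmt.find('\'', j + 1);
            if (close == std::string::npos)
                throw std::invalid_argument("unterminated qualifier");
            qualifier = fmt.substr(j + 1, close - j - 1);
            j = close + 1;
        }
        if (j >= fmt.size() || fmt[j] != 'z')
            throw std::invalid_argument("only %z is handled in this sketch");
        if (next >= args.size())
            throw std::invalid_argument("too few arguments");
        out += args[next++]->toString(qualifier);
        i = j;  // resume after the 'z'
    }
    return out;
}

// Example class: accepts an empty qualifier or "quoted", rejects the rest.
class Name : public Object {
public:
    explicit Name(std::string text) : text_(std::move(text)) {}
    std::string toString(const std::string& qualifier) const override {
        if (qualifier.empty()) return text_;
        if (qualifier == "quoted") return "\"" + text_ + "\"";
        throw std::invalid_argument("unknown qualifier: " + qualifier);
    }
private:
    std::string text_;
};
```

Here the class itself decides what a qualifier means, so a Name passed with 'asdf' throws instead of printing garbage.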

>         I will argue that you don't need %c or similar %magic that specify a
> type, but not a representation.  In your above example, if you want 10
> to be a char, cast it.  that is the implicit cast I was talking about.

That's true, kind of.  But there is a substantial difference between "print the decimal representation of this byte", "print
the hexadecimal representation of this byte", and "print the ASCII representation of this byte", all of which are valid for a
char type.

>         Is the array thing really the best way to do this?

I certainly hope not!

> It looks to me like
> a hack to get around the lack of operator overloads.

Very true.  It is very ugly; it was just a brainstorm to get things started. :)

> It might feel
> better like:
>         format.apply(hex).apply(right).apply(width(3));
> I'm not crazy in love with this either.
>
>         On an aesthetic level, I think I'm more accustomed to the format coming
> before the data being formatted.  That's minor though.  If enough people
> agree, we could make the format something that takes an argument and
> returns something streamable.

char[] format(form,int)    ???

--
The Villagers are Online! http://villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]


October 14, 2001
Russ Lewis wrote:
> 
> a wrote:
> 
> > Russ Lewis wrote:
> > >
> > > a wrote:
> > > >
> > > > You still need to create your own %magic symbol to use it with printf, so it knows how to handle the argument.  That just makes the % problem worse, if not open ended.  I admit that printf does what it does better than most anything else.  It's just not capable of being expanded in a reasonable way.
> > >
> > > How about:
> > >
> > > class object     // base for everything else
> > > {
> > > public:
> > >    char[] toString() { // default value returns a hex print of the 'this' pointer };
> > > };
> > >
> > > The implementation recognizes a new %magic that is only legal when matched to an argument that is a pointer to object (or
> > > a child, ofc).  Any classes that want to override how they are printed must override toString().  Thus, you only add 1
> > > %magic that will work with all classes.
> > >
> > > It doesn't help with structs, ofc...
> >
> >         Well, at the very least, I'd like to have a toStream method that can
> > default to toString.  The human readable format is not always the
> > best/most complete way to store to a file.
> >         Having only one %magic symbol that can handle all descendants of Object
> > would work, but you would only have one way of formatting, and how would
> > you deal with the %modifiers?  (Assuming we really want to kludge up
> > printf this way.)
> 
> What if toString had a char[] argument:
> 
> stdout.print("%z %'asdf'z", obj1, obj2);
> 
> The 'asdf' string would be passed to the toString call on obj2.

	I like this idea.  I still don't like %magic, but this improves it a
bit.  It did just strike me though, we were going after typesafety
somewhere back there.  By having one %magic for all objects, haven't we
killed real typesafety?  For instance, suppose you passed the wrong
object type to your print statement above.  It would be valid still.  It
may barf at runtime because it doesn't handle 'asdf'.  Not to derail
you, it's a good idea, but how do we get typesafety back into it?

> >         I will argue that you don't need %c or similar %magic that specify a
> > type, but not a representation.  In your above example, if you want 10
> > to be a char, cast it.  that is the implicit cast I was talking about.
> 
> That's true, kind of.  But there is a substantial difference between "print the decimal representation of this byte", "print
> the hexadecimal representation of this byte", and "print the ASCII representation of this byte", all of which are valid for a
> char type.

There are different representations, like hex, oct, dec, left, right, etc.,
but doesn't %h only work for type char because printf uses varargs
converting chars into ints?  I'm just used to the bare %magic being used
to specify type.  That notion should die and be replaced with type
safety.  Representation descriptors would still be valid, but in many
cases a sane default should be acceptable.  Make character types look
like characters, make numbers look like numbers, etc.

> >         Is the array thing really the best way to do this?
> 
> I certainly hope not!

Sorry.  :-)

> > It looks to me like
> > a hack to get around the lack of operator overloads.
> 
> Very true.  It is very ugly; it was just a brainstorm to get things started. :)

Well, I guess the very first thing we ought to decide is what type of
syntax we want.  If we go all procedural, we are ok.  You said you would
need a format property in your idea.  I think I like that, but I'm not
sure how to best leverage it.  In any case that is something we would
have to beg Walter for.  Likewise, if we want more of an expression-based
syntax, we either have to kludge around with the built-in types in D or
we have to beg Walter to add a couple more primitives, with operators.
At that point I'd say we are working in the language itself though and not
in the library.

> > It might feel
> > better like:
> >         format.apply(hex).apply(right).apply(width(3));
> > I'm not crazy in love with this either.
> >
> >         On an aesthetic level, I think I'm more accustomed to the format coming
> > before the data being formatted.  That's minor though.  If enough people
> > agree, we could make the format something that takes an argument and
> > returns something streamable.
> 
> char[] format(form,int)    ???

I'm not sure really.  I guess I was thinking:

	form F = <manipulator(s)>;        // custom
	stdio.print(v1, hex(v2), F(v3));  // default form, hex form, my custom form

But this makes it difficult, or at least nasty looking, to apply multiple format manipulators to a single variable in the print statement.

	stdio.print(hex(width(3, right(v1))));  // lisp anyone

I don't think I would want to have to declare a format type for every custom manipulation.  It would give us one of the worst drawbacks of COBOL's and perl's record based output without the benefits.
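
For comparison, here is a rough C++ sketch of the "format takes an argument and returns something streamable" shape.  Form, Formatted and the free function hex are hypothetical names, and the form only knows about hex and field width; it is meant to show the calling style, not a finished design.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>

// A value paired with the formatting choices to print it with.
struct Formatted {
    long value;
    bool hex;
    int width;
};

// A named bundle of formatting choices; calling it on a value applies it.
struct Form {
    bool hex;
    int width;
    Form(bool h, int w) : hex(h), width(w) {}
    Formatted operator()(long v) const { return Formatted{v, hex, width}; }
};

// One-off shorthand, so hex(v) works without declaring a Form first.
Formatted hex(long v) { return Form(true, 0)(v); }

// Printing restores the stream's flags, so every form is one-shot.
std::ostream& operator<<(std::ostream& out, const Formatted& f) {
    std::ios_base::fmtflags saved = out.flags();
    if (f.hex) out << std::hex;
    out << std::setw(f.width) << f.value;
    out.flags(saved);
    return out;
}
```

With Form F(true, 6), the expression out << 255 << ' ' << F(255) << ' ' << hex(255) << ' ' << 255 writes "255     ff ff 255": the hex setting does not leak into the last value.  Combining several manipulators on one value still needs either a declared Form or nested calls, which is the drawback noted above.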

	I'm also having trouble being objective coming up with something that looks
like it fits D's ideology.  I'm real partial to C++'s i/o and in D
that's right out.  I like the suggestion you had above for %magic on
object types but I don't think it goes far enough.
	Would it help to work out a formalized set of string ops to build upon?
Most I/O will either be binary records or text strings I would assume.
The binary output wouldn't need much in the way of formatting.  Fixed
record sizes, byte and bit ordering and that sort of thing should do.
It's with the human readable that we worry about justification,
representation, padding, text formatting of complex types.  Maybe we
would have a better feel for what feels right for printing strings in D
once we get a feel for what it would be like to just muck around with
strings in D.

Dan
October 14, 2001
a wrote:

>         I like this idea.  I still don't like %magic, but this improves it a
> bit.  It did just strike me though, we were going after typesafety
> somewhere back there.  By having one %magic for all objects, haven't we
> killed real typesafety?  For instance, suppose you passed the wrong
> object type to your print statement above.  It would be valid still.  It
> may barf at runtime because it doesn't handle 'asdf'.  Not to derail
> you, it's a good idea, but how do we get typesafety back into it?

The key with strong typing is that you know what you know and you don't have to make wild guesses or assumptions.  When the %z (or
whatever) magic is passed, if the typesafe printf sees that the matching argument is anything but a pointer to an object, then it
throws an exception.  If it is an object, then it calls toString with the (possibly zero length) char[] array that is the %z format
qualifier.  This call goes right to the overridden version of this function in the class (remember, the compiler is smart enough to
make all functions virtual).  This class is then responsible for dealing with the qualifier.

If your class doesn't deal with qualifiers, then you throw the exception if it is nonzero length.  If you do, then you throw an
exception if you don't recognize the format.  Etc...

Please clarify if I'm missing something, but this seems 100% typesafe to me - provided that the underlying printf varargs
architecture is also typesafe.
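
A sketch of what I mean by typesafe varargs, written in C++ for lack of a D compiler.  The Arg struct and checked_format are made-up names; the tag inside Arg stands in for whatever type information a typesafe varargs mechanism would carry along with each argument.  Only %d and %s are handled here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Each argument carries a tag naming its type (hypothetical design).
struct Arg {
    enum Kind { Int, Str } kind;
    int i;
    std::string s;
    Arg(int v) : kind(Int), i(v) {}
    Arg(const char* v) : kind(Str), i(0), s(v) {}
};

// Checks every specifier against the tag of its argument before printing.
std::string checked_format(const std::string& fmt, const std::vector<Arg>& args) {
    std::string out;
    std::size_t next = 0;  // index of the next unused argument
    for (std::size_t p = 0; p < fmt.size(); ++p) {
        if (fmt[p] != '%') { out += fmt[p]; continue; }
        if (++p >= fmt.size()) throw std::invalid_argument("dangling %");
        if (next >= args.size()) throw std::invalid_argument("too few arguments");
        const Arg& a = args[next++];
        switch (fmt[p]) {
        case 'd':
            if (a.kind != Arg::Int) throw std::invalid_argument("%d needs an int");
            out += std::to_string(a.i);
            break;
        case 's':
            if (a.kind != Arg::Str) throw std::invalid_argument("%s needs a string");
            out += a.s;
            break;
        default:
            throw std::invalid_argument("unknown specifier");
        }
    }
    return out;
}
```

Passing a string where %d is expected throws instead of reinterpreting a pointer as a number; an object specifier like %z would be one more case in the switch.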

> There are different representations, like hex, oct, dec, left, right, etc., but doesn't %h only work for type char because printf uses varargs converting chars into ints?  I'm just used to the bare %magic being used to specify type.  That notion should die and be replaced with type safety.  Representation descriptors would still be valid, but in many cases a sane default should be acceptable.  Make character types look like characters, make numbers look like numbers, etc.

Ah, yes.  I see what you're saying now.  And I totally agree.  One of my big gripes with %x/%X is that if the high bit is 1, then
we always get an 8-character printf string:

printf("%X",(signed char)-1)    prints "FFFFFFFF"

whereas I wish it would print "FF".  The hack around that that I found was to convert my arg to an UNsigned char...so that when
it's promoted to an int, it is filled with 0's.  Then printf works as I would like.  In D's printf, passing a char should make the
format default to only printing 2 characters (or fewer).
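
The two cases side by side, as a small sketch.  The helper names are made up, the output assumes the usual 32-bit int, and the explicit conversion to unsigned int is only there because %X formally expects an unsigned argument.

```cpp
#include <cstdio>
#include <string>

// signed char -> int keeps the sign, so -1 becomes all one bits.
std::string hex_signed(signed char c) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "%X",
                  static_cast<unsigned int>(static_cast<int>(c)));
    return buf;
}

// unsigned char -> int zero-fills the upper bits.
std::string hex_unsigned(unsigned char c) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "%X", static_cast<unsigned int>(c));
    return buf;
}
```

With a 32-bit int, hex_signed(-1) returns "FFFFFFFF" and hex_unsigned(0xFF) returns "FF".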

> > > I looks to me like
> > > a hack to get around the lack of operator overloads.
> >
> > Very true.  It is very ugly; it was just a brainstorm to get things started. :)
>
> Well, I guess the very first thing we ought to decide is what type of syntax we want.  If we go all procedural, we are ok.  You said you would need a format property in your idea.  I think I like that, but I'm not sure how to best leverage it.  In any case that is something we would have to beg Walter for.  Likewise, if we want more of an expression-based syntax, we either have to kludge around with the built-in types in D or we have to beg Walter to add a couple more primitives, with operators. At that point I'd say we are working in the language itself though and not in the library.

We could go with the Perl syntax here (shudder).  Use the dot operator to concatenate (formatted) strings.  Any time that the
compiler sees something of the form:

char[] . foo

It converts it internally to

char[] + format(foo)

where format(...) is a series of library routines that convert various things into char[].  The format(object) just calls
toString() to do it.  But now we have the problem of how to do format specifiers again....  @#$)(^%$#%&

Ofc, this all requires that Walter implement some special case code for char[]...but one of the stated objectives of D was
"Languages should be good at handling strings."

Another thought.  Instead of only making this work for char[], you could make it work for all array types.  Earlier, in a
discussion about casts, somebody talked about actually passing a type (not a value) as a parameter into a cast function:

int i = cast(int,foo);

We could do a similar thing with the . operator on arrays.

char[] str = "" . foo . bar . baz;

Is expanded to:

char[] str = "" + format(char[],foo) + format(char[],bar) + format(char[],baz);

I thought about using the cast syntax, but that becomes cloudy when the argument you're trying to format is a pointer or another
array.
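
As a sanity check on the expansion, here is how it might look as a library in C++, where overloads and variadic templates can stand in for the compiler rewrite.  The names format and concat are just placeholders for the routines described above, and only a few types are covered.

```cpp
#include <string>

// Hypothetical library routines: one format() overload per printable type.
std::string format(int v) { return std::to_string(v); }
std::string format(char c) { return std::string(1, c); }
std::string format(const std::string& s) { return s; }
std::string format(const char* s) { return s; }

// End of the recursion: nothing left to append.
inline std::string concat() { return std::string(); }

// Stand-in for the proposed operator: "" . foo . bar behaves like
// "" + format(foo) + format(bar).
template <typename First, typename... Rest>
std::string concat(const First& first, const Rest&... rest) {
    return format(first) + concat(rest...);
}
```

So concat("", 42, ' ', std::string("x"), "y") yields "42 xy".  It also makes the remaining problem visible: there is nowhere in the call to say how each piece should be formatted.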

>         I'm also having trouble being objective coming up with something that looks
> like it fits D's ideology.  I'm real partial to C++'s i/o and in D
> that's right out.  I like the suggestion you had above for %magic on
> object types but I don't think it goes far enough.

If Walter doesn't mind the overloaded . operator idea, you could use << and >> instead:

char[] str = "" . foo;
    becomes
char[] str = "" + format(char[],foo);

and

str >> foo;
   becomes
foo = extract(typeof(foo),str);

Or something like that.  Still no (good) solution for format specifiers.

--
The Villagers are Online! http://villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]


October 17, 2001
Are you implying that .toString() is a "magic method" that will be invoked automatically by printf?

That's a plausible approach.  It makes it more important that it be easy to concatenate strings.  Do remember to allow for usages such as:
   printf (x.asHead(1, 10) + "  " + y.asHead(1, 13) + ...);
or some suitable substitute.
N.B.:  here I was assuming that the asHead methods took the arguments format#, width, and that each object might do a bit of self layout formatting.
E.g., if one wanted to produce:
   This is  And This is
       x         y
   =======  ===========
one could do:
printf (x.asHead(1, 10) + "  " + y.asHead(1, 13) + "\n");
printf (x.asHead(2, 10) + "  " + y.asHead(2, 13) + "\n");
printf (x.asHead(3, 10) + "  " + y.asHead(3, 13) + "\n");

presuming that the objects could determine their names.

October 18, 2001
Charles Hixson wrote:

> Are you implying that .toString() is a "magic method" that will
> be invoked automatically by printf?

We have been using the term "magic" to refer to the printf format specifiers.  We were thinking that we could add a new specifier to format objects.  The idea is that when printf sees that specifier, it thinks the argument is a pointer to an object, and so calls the toString() method on it.

It won't work - not safely, anyhow - unless we can get some sort of typesafe printing routine, so we don't mistakenly think that an int is a pointer or vice-versa.

--
The Villagers are Online! http://villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]


October 20, 2001
Russ Lewis wrote:

> The key with strong typing is that you know what you know and you don't have to make wild guesses or assumptions.  When the %z (or
> whatever) magic is passed, if the typesafe printf sees that the matching argument is anything but a pointer to an object, then it
> throws an exception.  If it is an object, then it calls toString with the (possibly zero length) char[] array that is the %z format
> qualifier.  This call goes right to the overloaded version of this function in the class (remember, the compiler is smart enough to
> make all functions virtual).  This class is then responsible for dealing with the qualifier.
> 
> If your class doesn't deal with qualifiers, then you throw the exception if it is nonzero length.  If you do, then you throw an
> exception if you don't recognize the format.  Etc...
> 
> Please clarify if I'm missing something, but this seems 100% typesafe to me - provided that the underlying printf varargs
> architecture is also typesafe.

	Well, we are typesafe in that it will always be a descendant of Object.
But you can't be more specific.  (I don't mean just any Fruit.  I want
an Apple!)  You will always get the streamify routine for the Object you
pass, but I guess I would like to say what type of Object I expect and
get an error otherwise, like you would with builtin types.  Not a show
stopper by any means.
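
The kind of check I mean, sketched in C++ with made-up Object/Fruit/Apple classes and a hypothetical print_as helper.  The virtual toString() still does the printing; the template argument only states which class the caller expects.

```cpp
#include <stdexcept>
#include <string>

class Object {
public:
    virtual ~Object() = default;
    virtual std::string toString() const { return "Object"; }
};

class Fruit : public Object {
public:
    std::string toString() const override { return "Fruit"; }
};

class Apple : public Fruit {
public:
    std::string toString() const override { return "Apple"; }
};

// Prints via the virtual toString(), but only if the argument really is a T.
template <typename T>
std::string print_as(const Object& obj) {
    if (dynamic_cast<const T*>(&obj) == nullptr)
        throw std::invalid_argument("argument is not of the expected class");
    return obj.toString();
}
```

An Apple passes print_as<Apple> and print_as<Fruit>, while a plain Fruit handed to print_as<Apple> throws.  That is still a runtime error rather than a compile-time one, which is the part I would like to improve.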

> > There are different representations, like hex, oct, dec, left, right, etc., but doesn't %h only work for type char because printf uses varargs converting chars into ints?  I'm just used to the bare %magic being used to specify type.  That notion should die and be replaced with type safety.  Representation descriptors would still be valid, but in many cases a sane default should be acceptable.  Make character types look like characters, make numbers look like numbers, etc.
> 
> Ah, yes.  I see what you're saying now.  And I totally agree.  One of my big gripes with %x/%X is that if the high bit is 1, then
> we always get an 8-character printf string:
> 
> printf("%X",(signed char)-1)    prints "FFFFFFFF"
> 
> whereas I wish it would print "FF".  The hack around that that I found was to convert my arg to an UNsigned char...so that when
> it's promoted to an int, it is filled with 0's.  Then printf works as I would like.  In D's printf, passing a char should make the
> format default to only printing 2 characters (or fewer).

  Well, I guess I would rather that D would always print a char as a
char/glyph, unless I cast it to something else (like an unsigned) in
which case I would like it to print whichever way I passed it.  I do agree
that in some cases there is format information that is useful and not
associated with the type (unless we attach a format property; probably
overkill) like hex/oct/dec, left/right/center, and the like, but I don't
think I like the use of the format character as an end run around a
cast.  In C you didn't have a choice.  printf only saw ints.

> We could go with the Perl syntax here (shudder).  Use the dot operator to concatenate (formatted) strings.  Any time that the
> compiler sees something of the form:
> 
> char[] . foo
> 
> It converts it internally to
> 
> char[] + format(foo)
> 
> where format(...) is a series of library routines that convert various things into char[].  The format(object) just calls
> toString() to do it.  But now we have the problem of how to do format specifiers again....  @#$)(^%$#%&

Actually, this isn't how perl does formatting.  Your choices in perl are (s)printf and format lines like:

format FOO
price: $<<<<<   tax: $>>>>
$thePrice, $theTax

to say the first value will be left justified and the second will be right justified.  (I have the syntax wrong, but it's the general idea.)  You then have to tell the interpreter which format you are using for a given filehandle and then you do a write like so:

($thePrice, $theTax) = ($something, $somethingelse);
write myFileHandle;

> Ofc, this all requires that Walter implement some special case code for char[]...but one of the stated objectives of D was
> "Languages should be good at handling strings."

	Oh, for string handling.  I was thinking of something comparable to C's
strtok, sprintf, index, etc except (hopefully) better.  If we can come
up with an easy way to say "make this string 8 chars wide, center the
characters in the string and uppercase the first letter" in a very
concise way, then print formatting is easy.  Kind of like reducing
printf to {return puts(sprintf(ARGS))};
	Now if it's not very abbreviated, it's going to suck.  printf was very
brief.  C++ wasn't as brief, but with the exception that some modifiers
were persistent and some were one shot, it conveyed the idea fairly
elegantly.  It needs some work though in terms of ease of use.  I've
found the single argument, overloaded write/writeln scheme to be a bit
clumsy.  With good string formatting, the printing may be less of an
issue since we can build the string and print it as a single argument,
but I still believe there will be needless memory/cpu overhead with such
a scheme.  If you do a lot of i/o, you'll run the cpu ragged collecting
the temporaries when you're not in i/o wait.
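
To give a feel for the string ops I mean, here is a small C++ sketch of the "8 chars wide, centered, first letter uppercased" case.  The helper names center and capitalize are invented for the example, and only single-byte characters are considered.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Pads s with spaces on both sides to reach width.
// When the padding is odd, the extra space goes on the right.
std::string center(const std::string& s, std::size_t width) {
    if (s.size() >= width) return s;
    std::size_t total = width - s.size();
    std::size_t left = total / 2;
    return std::string(left, ' ') + s + std::string(total - left, ' ');
}

// Uppercases the first character, leaving the rest untouched.
std::string capitalize(std::string s) {
    if (!s.empty())
        s[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[0])));
    return s;
}
```

Then center(capitalize("dan"), 8) gives "  Dan   ".  It is concise enough, but each step returns a fresh temporary string, which is exactly the overhead I was grumbling about.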

> Another thought.  Instead of only making this work for char[], you could make it work for all array types.  Earlier, in a
> discussion about casts, somebody talked about actually passing a type (not a value) as a parameter into a cast function:
> 
> int i = cast(int,foo);

I think that got shot down because the type could not be determined until the semantic pass.

> We could do a similar thing with the . operator on arrays.
> 
> char[] str = "" . foo . bar . baz;

Dot won't work.  It could be a member dereference, or it could be a
concatenation.  Actually, perl 6 is switching . to _ so that they switch
-> to . like all other OO languages.  That's academic though.  We can
find another operator.

> Is expanded to:
> 
> char[] str = "" + format(char[],foo) + format(char[],bar) + format(char[],baz);
> 
> I thought about using the cast syntax, but that becomes cloudy when the argument you're trying to format is a pointer or another
> array.

	Actually, format is looking a lot like a synonym for cast.  How would
you use the scheme to justify, set field width, or print a number in
hex?

> >         I'm also having trouble being objective coming up with something that looks
> > like it fits D's ideology.  I'm real partial to C++'s i/o and in D
> > that's right out.  I like the suggestion you had above for %magic on
> > object types but I don't think it goes far enough.
> 
> If Walter doesn't mind the overloaded . operator idea, you could use << and >> instead:
> 
> char[] str = "" . foo;
>     becomes
> char[] str = "" + format(char[],foo);
> 
> and
> 
> str >> foo;
>    becomes
> foo = extract(typeof(foo),str);
> 
> Or something like that.  Still no (good) solution for format specifiers.

	It's not that I'm fixated on the >>/<< operators.  It's that I am
fixated on expressions as opposed to calls.  I don't see a clean way to
do that in a library w/o overloading operators or variable arguments.  I
also like the fact that user defined types can be printed like any first
class Object.  Treating Objects like first class types has been met with
near hostility here.  I think that might be a good hint that D i/o
probably shouldn't try to blur that line no matter how much I may want it to.

Dan
October 20, 2001
a wrote:

> Russ Lewis wrote:
>
> > The key with strong typing is that you know what you know and you don't have to make wild guesses or assumptions.  When the %z (or
> > whatever) magic is passed, if the typesafe printf sees that the matching argument is anything but a pointer to an object, then it
> > throws an exception.  If it is an object, then it calls toString with the (possibly zero length) char[] array that is the %z format
> > qualifier.  This call goes right to the overloaded version of this function in the class (remember, the compiler is smart enough to
> > make all functions virtual).  This class is then responsible for dealing with the qualifier.
> >
> > If your class doesn't deal with qualifiers, then you throw the exception if it is nonzero length.  If you do, then you throw an
> > exception if you don't recognize the format.  Etc...
> >
> > Please clarify if I'm missing something, but this seems 100% typesafe to me - provided that the underlying printf varargs
> > architecture is also typesafe.
>
>         Well, we are typesafe in that it will always be a descendant of Object.
> But you can't be more specific.  (I don't mean just any Fruit.  I want
> an Apple!)  You will always get the streamify routine for the Object you
> pass, but I guess I would like to say what type of Object I expect and
> get an error otherwise, like you would with builtin types.  Not a show
> stopper by any means.

Remember that all member functions of classes are automatically virtual in D (when necessary).  Thus, if you pass a Fruit* pointer (which
is implicitly converted to an Object*), then when you call toString(), you get Fruit::toString.  You get complete control.  And if the formatter
arguments you pass with it don't make sense for Fruit::toString, then it throws an exception.
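
A minimal C++ sketch of the scheme described above, since D's own syntax is not settled yet.  Everything here is an assumption made for illustration: the names Object, Fruit and formatObject, the "u" qualifier, and the use of references where the post talks about pointers.  It only shows the dispatch: the formatter sees an Object, the virtual call lands in the derived class, and an unrecognized qualifier raises an exception.

```cpp
#include <cctype>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical root class: every printable class derives from it.
class Object {
public:
    virtual ~Object() = default;

    // 'qualifier' stands for the (possibly empty) text attached to the
    // hypothetical %z specifier in the format string.
    virtual std::string toString(const std::string& qualifier) const {
        if (!qualifier.empty())
            throw std::invalid_argument("Object: qualifiers are not supported");
        std::ostringstream out;
        out << static_cast<const void*>(this);  // default: print the address
        return out.str();
    }
};

class Fruit : public Object {
public:
    explicit Fruit(std::string name) : name_(std::move(name)) {}

    // The class itself decides which qualifiers it understands.
    std::string toString(const std::string& qualifier) const override {
        if (qualifier.empty())
            return name_;
        if (qualifier == "u") {  // illustrative qualifier: upper case
            std::string upper = name_;
            for (char& c : upper)
                c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
            return upper;
        }
        throw std::invalid_argument("Fruit: unknown qualifier '" + qualifier + "'");
    }

private:
    std::string name_;
};

// Stand-in for the %z branch of a typesafe printf: it only knows it has an Object.
std::string formatObject(const Object& obj, const std::string& qualifier) {
    return obj.toString(qualifier);  // virtual dispatch picks Fruit::toString
}
```

Calling formatObject(someFruit, "u") goes through the Object interface but still runs Fruit::toString, which is the "complete control" point made above.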

> Well, I guess I would rather that D would always print a char as a char/glyph, unless I cast it to something else (like an unsigned) in which case I would like it to print whichever way I passed it.  I do agree that in some cases there is format information that is useful and not associated with the type (unless we attach a format property; probably overkill) like hex/oct/dec, left/right/center, and the like, but I don't think I like the use of the format character as an end run around a cast.  In C you didn't have a choice.  printf only saw ints.

Makes sense.  We just have to implement the printf() routine...but I agree that this is a good default.

> > We could go with the Perl syntax here (shudder).  Use the dot operator to concatenate (formatted) strings.  Any time that the
> > compiler sees something of the form:
> >
> > char[] . foo
> >
> > It converts it internally to
> >
> > char[] + format(foo)
> >
> > where format(...) is a series of library routines that convert various things into char[].  The format(object) just calls
> > toString() to do it.  But now we have the problem of how to do format specifiers again....  @#$)(^%$#%&
>
> Actually, this isn't how perl does formatting.  Your choices in perl are (s)printf and format lines like:

I meant it not as a formatting routine, but as an easy concatenation routine.  If you used
   char[] . foo
, it would call format to convert foo to a char[], then concatenate the arrays.  If you wanted explicit formatting strings, you would call

   char[] . format(foo, options)

> > Ofc, this all requires that Walter implement some special case code for char[]...but one of the stated objectives of D was
> > "Languages should be good at handling strings."
>
>         Oh, for string handling.  I was thinking of something comparable to C's
> strtok, sprintf, index, etc except (hopefully) better.  If we can come
> up with an easy way to say "make this string 8 chars wide, center the
> characters in the string and uppercase the first letter" in a very
> concise way, then print formatting is easy.

Unfortunately, we can't really describe it easily even in English...not terribly likely we're going to be able to do better in a programming
language without creating some weird new language (like printf did).   :(

> Kind of like reducing printf to {return puts(sprintf(ARGS))};

BTW, as you imply here, I very much think that sprintf should NOT take the buffer as an argument, but instead should return the string it
creates.  Either that or require arrays (no pointers).  I don't want to mess with buffer overflows anymore.  Or with having to calculate
ahead of time how much buffer space I'll need.
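
For comparison, here is a rough C++ sketch of an sprintf that returns its result instead of filling a caller-supplied buffer.  The name formatToString is made up for this example.  It measures the output with a first vsnprintf pass and then writes into storage of exactly that size, so the caller never computes a buffer length.  It still rides on C varargs, so it does nothing for the type safety question discussed earlier.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: printf-style formatting that returns the string it builds.
std::string formatToString(const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);

    // The argument list is consumed by each pass, so keep a copy for the second one.
    va_list argsCopy;
    va_copy(argsCopy, args);

    // Pass 1: a null buffer of size 0 makes vsnprintf report the needed length.
    const int needed = std::vsnprintf(nullptr, 0, fmt, args);
    va_end(args);
    if (needed < 0) {  // encoding or format error
        va_end(argsCopy);
        return std::string();
    }

    // Pass 2: write into storage that is exactly large enough (plus the NUL).
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::vsnprintf(buffer.data(), buffer.size(), fmt, argsCopy);
    va_end(argsCopy);

    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

With something like this, the one-liner quoted above reduces to puts(formatToString(...).c_str()).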

> > Is expanded to:
> >
> > char[] str = "" + format(char[],foo) + format(char[],bar) + format(char[],baz);
> >
> > I thought about using the cast syntax, but that becomes cloudy when the argument you're trying to format is a pointer or another
> > array.
>
>         Actually, format is looking a lot like a synonym for cast.  How would
> you use the scheme to justify, set field width, or print a number in
> hex?

By adding an extra argument that allows the user to pass format specifiers.

The problem with unifying format and cast is:

int i = ...;
char *ptr = format(char*,i);
char[] str = format(char[],i);

The first should cast the value i into a ptr whose address is i.  The latter should create a char[] which contains a string which is the
decimal representation of i.  We need to keep these totally distinct.
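
The same distinction, sketched in C++ with two deliberately separate functions.  The names castToPointer and formatDecimal are invented for this example, and the pointer produced by the cast is only there to be compared, never dereferenced.

```cpp
#include <cstdint>
#include <string>

// "Cast": reinterpret the integer's value as an address.  No characters are produced.
char* castToPointer(std::uintptr_t i) {
    return reinterpret_cast<char*>(i);
}

// "Format": build a new string holding the decimal representation of i.
std::string formatDecimal(int i) {
    return std::to_string(i);
}
```

Keeping them as two names means nobody can get an address when they wanted digits, or the other way around.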

--
The Villagers are Online! http://villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]


November 04, 2001
> > >         Well, at the very least, I'd like to have a toStream method that can
> > > default to toString.  The human readable format is not always the best/most complete way to store to a file.

I hope the hell somebody goes through D and standardizes capitalization before it goes public, or better yet get rid of case sensitivity.  That was one of the few things I really liked about Pascal and that I now miss.

> I'm not sure really.  I guess I was thinking:
>
> form F = <manipulator(s)>;        // custom
> stdio.print(v1, hex(v2), F(v3));  // default form, hex form, my custom form
>
> But this makes it difficult, or at least nasty looking, to apply multiple format manipulators to a single variable in the print statement.
>
> stdio.print(hex(width(3, right(v1))));  // lisp anyone
>
> I don't think I would want to have to declare a format type for every custom manipulation.  It would give us one of the worst drawbacks of COBOL's and perl's record based output without the benefits.
>
> I'm also having trouble being objective coming up with something that looks
> like it fits D's ideology.  I'm real partial to C++'s i/o and in D
> that's right out.  I like the suggestion you had above for %magic on
> object types but I don't think it goes far enough.
> Would it help to work out a formalized set of string ops to build upon?
> Most I/O will either be binary records or text strings I would assume.
> The binary output wouldn't need much in the way of formatting.  Fixed
> record sizes, byte and bit ordering and that sort of thing should do.
> It's with the human readable that we worry about justification,
> representation, padding, text formation of complex types.  Maybe we
> would have a better feel for what feels right for printing strings in D
> once we get a feel for what it would be like to just muck around with
> strings in D.

Binary files could be built on strings (since strings don't rely on that nasty trailing NULL character to determine their length, as in C/C++).

Strings in general should be easy to manipulate.  There are only a few things you'd want to do to any string: field width, justification, alignment, and padding (all can be implemented once as a function and will otherwise work for all types) and, for any printable item, changing the type of conversion applied (these can be simple function calls, probably with overloading, such as hex(int) and hex(char)).
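
As a rough illustration, here is what that split could look like in C++.  The names pad, Align and capitalize, and the choice of lower-case hex digits, are all assumptions for this sketch; only hex(int) and hex(char) come from the paragraph above.  Layout is written once against plain strings, and conversion is a small overload set per type.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

enum class Align { Left, Right, Center };

// Layout, written once for strings: pad 's' out to 'width' characters.
// Input that is already wide enough is returned unchanged (never truncated).
std::string pad(const std::string& s, std::size_t width,
                Align align = Align::Left, char fill = ' ') {
    if (s.size() >= width)
        return s;
    const std::size_t total = width - s.size();
    std::size_t left = 0;
    if (align == Align::Right)
        left = total;
    else if (align == Align::Center)
        left = total / 2;  // any odd leftover goes on the right
    return std::string(left, fill) + s + std::string(total - left, fill);
}

// Conversion, overloaded per type.
std::string hex(int value) {
    std::ostringstream out;
    out << std::hex << value;
    return out.str();
}

std::string hex(char c) {
    // Go through unsigned char so the character code is never negative.
    return hex(static_cast<int>(static_cast<unsigned char>(c)));
}

// Upper-case the first character only.
std::string capitalize(std::string s) {
    if (!s.empty())
        s[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[0])));
    return s;
}
```

The "8 chars wide, centered, first letter upper case" request from earlier in the thread then reads pad(capitalize(word), 8, Align::Center), and a zero-padded hex field is pad(hex(n), 6, Align::Right, '0').  It does nest, so the "lisp anyone" complaint still applies.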

I don't know about anyone else here, but I'm all for building my strings in memory and then dumping them off in a block to the file or stdout or whatever at reasonable intervals (such as each line).  Especially if you can somehow cause the string to reserve enough memory to hold the biggest batch before you start.  The big advantage is you should be able to use these methods on all your strings, not just strings going to stdout/cout/whatever. That one separation enables every kind of stream to have formatting, without having any stream have to implement or even wrap printf.  All you'd need is an equivalent to puts().

Sean



November 04, 2001
"Sean L. Palmer" wrote:
> 
> > > >         Well, at the very least, I'd like to have a toStream method that can
> > > > default to toString.  The human readable format is not always the best/most complete way to store to a file.
> 
> I hope the hell somebody goes through D and standardizes capitalization before it goes public, or better yet get rid of case sensitivity.  That was one of the few things I really liked about Pascal and that I now miss.
> 
> > I'm not sure really.  I guess I was thinking:
> >
> > form F = <manipulator(s)>;        // custom
> > stdio.print(v1, hex(v2), F(v3));  // default form, hex form, my custom form
> >
> > But this makes it difficult, or at least nasty looking, to apply multiple format manipulators to a single variable in the print statement.
> >
> > stdio.print(hex(width(3, right(v1))));  // lisp anyone
> >
> > I don't think I would want to have to declare a format type for every custom manipulation.  It would give us one of the worst drawbacks of COBOL's and perl's record based output without the benefits.
> >
> > I'm also having trouble being objective coming up with something that looks
> > like it fits D's ideology.  I'm real partial to C++'s i/o and in D
> > that's right out.  I like the suggestion you had above for %magic on
> > object types but I don't think it goes far enough.
> > Would it help to work out a formalized set of string ops to build upon?
> > Most I/O will either be binary records or text strings I would assume.
> > The binary output wouldn't need much in the way of formatting.  Fixed
> > record sizes, byte and bit ordering and that sort of thing should do.
> > It's with the human readable that we worry about justification,
> > representation, padding, text formation of complex types.  Maybe we
> > would have a better feel for what feels right for printing strings in D
> > once we get a feel for what it would be like to just muck around with
> > strings in D.
> 
> Binary files could be built on strings (since strings don't rely on that nasty trailing NULL character to determine their length, as in C/C++).

The only problem is you have to make sure no character conversion silliness is going on.  You also have to know what sizeof(char) is.  You have to know that some deranged systems aren't converting '\n' into two characters.  It's better to have a way to tell the code not to do any of that.

> Strings in general should be easy to manipulate.  There are only a few things you'd want to do to any string: field width, justification, alignment, and padding (all can be implemented once as a function and will otherwise work for all types) and, for any printable item, changing the type of conversion applied (these can be simple function calls, probably with overloading, such as hex(int) and hex(char)).

This could be ok if the syntax doesn't get so complex as to take away from what you are trying to format.  printf fans/fiends often like how you get a feel for how the output will look by looking at the format string.  I am also a little concerned about how to do alignment before you send to the stream, though, since that usually depends on stream state.

> I don't know about anyone else here, but I'm all for building my strings in memory and then dumping them off in a block to the file or stdout or whatever at reasonable intervals (such as each line).  Especially if you can somehow cause the string to reserve enough memory to hold the biggest batch before you start.  The big advantage is you should be able to use these methods on all your strings, not just strings going to stdout/cout/whatever. That one separation enables every kind of stream to have formatting, without having any stream have to implement or even wrap printf.  All you'd need is an equivalent to puts().

	I've mentioned this before myself.  The problem is that you have to
know ahead of time what the largest batch is going to be.  Some (many,
me included) would consider that an unnecessary pain.  It also adds yet
another buffer layer between the print and the final destination.  This
uses more clock cycles to build a temp string and it requires more
memory to store our extra buffer.  It will also require the user to
maintain the string to make sure it gets reused, or you have to count on
GC to not go nuts when the program does a lot of io.  printf didn't need
to build such a buffer.  It analyzed the arguments to be printed one at
a time.  Whatever D has will be compared to printf, unless it is
printf.  If we replace printf, it had better be good so we don't get as
many people complaining about it as C++ iostreams has.

Dan