March 08, 2010
Steven Schveighoffer:
> Tell me how you would parse the following text serialization string for a string[]:
> 
> hello world how are you
> 
> What if it was a string[][]?
> 
> Compare that to:
> 
> [hello world, [how are, you]]

You are missing something:

["hello world", ["how are", "you"]]

:-)
(And yes, there are simple solutions if the strings contain " or ').
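
One such solution, sketched in Python rather than D: quote and escape each string when writing, and parse the literal back when reading. The helper names serialize and deserialize are made up for this illustration; repr() and ast.literal_eval() are standard Python.

```python
import ast

def serialize(value):
    # repr() quotes every string and escapes embedded quote characters.
    return repr(value)

def deserialize(text):
    # literal_eval parses Python literals (strings, lists, numbers)
    # without executing arbitrary code.
    return ast.literal_eval(text)

nested = ["hello world", ["how are", "you"]]
assert deserialize(serialize(nested)) == nested

# Strings containing quotes, commas, or brackets still round-trip.
tricky = ['say "hi"', "it's", "a, b", "[x]"]
assert deserialize(serialize(tricky)) == tricky

# The space-separated form loses the element boundaries.
flat = " ".join(["hello world", "how are", "you"])
assert flat.split(" ") == ["hello", "world", "how", "are", "you"]
```

The quoted form costs a few extra characters, but the nesting and the element boundaries survive the trip.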
Bye,
bearophile
March 08, 2010
bearophile wrote:
> Note: this produces the same very large binary, I don't know why:
> 
> double[100_000] arr = void;
> static this() {
>     arr[] = typeof(arr[0]).init;
> }
> void main() {}

obj2asm tells the tale. (obj2asm is an incredibly useful tool, I don't know why nobody uses it.)

   double[100_000] arr = void;

puts arr in the thread local storage segment. Unfortunately, there is no bss for TLS.

   __gshared double[100_000] arr = void;

puts arr in the BSS segment, which takes up space in your executable but not the executable *file*.
March 08, 2010
Don wrote:
> Bug 1914 Array initialisation from const array yields memory trample
> 
> was fixed, in D2 only. Can we get this into D1 as well?

The problem is I don't think it's the right fix, and I haven't spent the time figuring it out yet.
March 08, 2010
Andrei Alexandrescu:
> Your choice of leading/trailing symbols and of separators makes 'to' friendlier for printing e.g. debug strings. My choice makes it a primitive for text serialization. I'm not 100% sure which is the more frequent use and therefore which is the most appropriate as a default, but I'm leaning towards the latter.

Sorry for losing my temper about this in my last posts.
Python handles this problem with two different functions, str() and repr(): the first produces slightly more readable output for normal reading, and the second is geared more toward textual serialization (but both use the []).
The interactive shell displays values using repr(), print uses str(), and the items inside collections are always represented with repr().
Objects can define __repr__ and __str__. Print calls __str__; if __str__ is missing, __repr__ is used instead.
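
A small sketch of those rules, written for Python 3. The classes Point and OnlyRepr are invented for the example:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Unambiguous form, closer to what you would serialize.
        return "Point({!r}, {!r})".format(self.x, self.y)

    def __str__(self):
        # Friendlier form, used by print() and str().
        return "({}, {})".format(self.x, self.y)

class OnlyRepr:
    # No __str__ defined, so str() falls back to __repr__.
    def __repr__(self):
        return "OnlyRepr()"

p = Point(1, 2)
assert str(p) == "(1, 2)"
assert repr(p) == "Point(1, 2)"
# Items inside a collection are shown with repr(), even under str().
assert str([p, "text"]) == "[Point(1, 2), 'text']"
assert str(OnlyRepr()) == "OnlyRepr()"
```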

I like handy defaults; they save time iff they are well chosen.

Bye,
bearophile
March 08, 2010
On Mon, 08 Mar 2010 15:12:24 -0500, bearophile <bearophileHUGS@lycos.com> wrote:

> Steven Schveighoffer:
>> Tell me how you would parse the following text serialization string for a
>> string[]:
>>
>> hello world how are you
>>
>> What if it was a string[][]?
>>
>> Compare that to:
>>
>> [hello world, [how are, you]]
>
> You are missing something:
>
> ["hello world", ["how are", "you"]]

For completely unambiguous, yes.  But still, I find often that quotes are more noise than they are worth when just doing simple printouts.  What we want is the most useful default.

Also, the desired output I would like is:

T[] => "[T1, T2, ..., Tn]"

This would mean that strings have a special case of printing with quotes only when printed inside an array.  This seems like an oddity to me.

-Steve
March 08, 2010
Steven Schveighoffer:
> For completely unambiguous, yes.  But still, I find often that quotes are more noise than they are worth when just doing simple printouts.  What we want is the most useful default.

Quotes add a bit of noise, but I prefer to tell apart the cases of two strings from the case of one string with a comma in the middle.
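
A tiny Python illustration of that collision. The names format_unquoted and format_quoted are hypothetical helpers, not anything from Phobos:

```python
def format_unquoted(items):
    # Brackets and separators only, strings printed bare.
    return "[" + ", ".join(items) + "]"

def format_quoted(items):
    # Same layout, but each string is printed via repr(), so it is quoted.
    return "[" + ", ".join(repr(s) for s in items) + "]"

one_string = ["a, b"]
two_strings = ["a", "b"]

# Without quotes the two arrays print identically.
assert format_unquoted(one_string) == format_unquoted(two_strings) == "[a, b]"

# With quotes they can be told apart.
assert format_quoted(one_string) == "['a, b']"
assert format_quoted(two_strings) == "['a', 'b']"
```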


> This would mean that strings have a special case of printing with quotes only when printed inside an array.  This seems like an oddity to me.

That's how Python works, and it's why objects have __repr__ and __str__: the repr of a string includes the quotes, while its __str__ doesn't.

>>> a = ["hello world", ["that's right!", "you"]]
>>> a
['hello world', ["that's right!", 'you']]
>>> print a
['hello world', ["that's right!", 'you']]
>>> a[0]
'hello world'
>>> print a[0]
hello world

Notes:
- that 'a' contains a string and an array of strings; you usually can't do this in D, so this is not a fully representative example;
- Python is dynamically typed, so it's more important that what you print shows its type. In D you can often tell by looking at the type of the variable you print (unless it's hidden by a labyrinth of 'auto').

Bye,
bearophile
March 08, 2010
Steven Schveighoffer wrote:
> On Mon, 08 Mar 2010 14:49:33 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> 
>> bearophile wrote:
>>> Andrei Alexandrescu:
>>>
>>>> Sorry, this stays.
>>>  Then I'm not going to use the Phobos printing in all my future D2
>>> programs. As I was not using it in D1. I'm not going to change idea
>>> on this.
>>>
>>>> (e.g. the comma may be a decimal point in some languages, so is
>>>> [1,2] in a German locale an array of double with one value or two?<
>>>>
>>>  In German you need no space after the comma, and there's no [] after
>>> and before it. So [1, 2] is not a floating point value in German.
>>>
>>>> Why one space?<
>>>  Because that's they way people print things in natural languages.
>>> It's a convention, you know. And it's a good one. It tells apart the
>>> FP numbers and it's the minimal.
>>>
>>>> It's the most neutral thing I could think of. Why no brackets?
>>>> Because of minimalism. You can very easy add them if you want
>>>> them.<
>>>  The purpose of things like the square brackets is to give a less
>>> ambiguous textual representation of the most common data structures
>>> (array and strings are the most common after numbers). So you put ""
>>> or '' around strings and [] to know what you are printing.
>>
>> Your choice of leading/trailing symbols and of separators makes 'to' friendlier for printing e.g. debug strings. My choice makes it a primitive for text serialization. I'm not 100% sure which is the more frequent use and therefore which is the most appropriate as a default, but I'm leaning towards the latter.
> 
> No it doesn't.
> 
> Tell me how you would parse the following text serialization string for a string[]:
> 
> hello world how are you
> 
> What if it was a string[][]?
> 
> Compare that to:
> 
> [hello world, [how are, you]]
> 
> That is almost completely unambiguous (you still have to account for literal commas or brackets), whereas you have a multitude of choices with the first version.

I said a primitive for serialization, not a serialization infrastructure. So the basic idea is that you use "to" in conjunction with your own routines to serialize things. "to" imposes no policy. Using "[", ", ", and "]" is policy.

> The thing is, friendlier for text-based serialization is almost identical to friendlier for printing.  In fact, friendlier for text-based serialization should have even more annotations to escape delimiters.
> 
> In fact, I find the defaults you defined not useful in most cases.  Printing or serializing, I want to see the delimiters for the arrays, elements, and subarrays.

You can choose them no problem. std.conv gives you mechanism, you choose the policy.
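
To make the mechanism/policy split concrete, here is a rough sketch in Python rather than D. The function join_elements and its parameters are hypothetical and are not std.conv's actual signature; the defaults are assumed to be empty delimiters and a single space.

```python
def join_elements(items, left="", sep=" ", right=""):
    # Mechanism: walk the (possibly nested) sequence and join the pieces.
    # Policy: the caller picks the delimiters and the separator.
    parts = []
    for item in items:
        if isinstance(item, (list, tuple)):
            parts.append(join_elements(item, left, sep, right))
        else:
            parts.append(str(item))
    return left + sep.join(parts) + right

data = ["hello world", ["how are", "you"]]

# Neutral defaults: no brackets, single-space separator.
assert join_elements(data) == "hello world how are you"

# The caller opts into brackets and comma separators.
assert join_elements(data, "[", ", ", "]") == "[hello world, [how are, you]]"
```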


Andrei
March 08, 2010
> obj2asm tells the tale. (obj2asm is an incredibly useful tool, I don't know why nobody uses it.)
> 

Maybe because it's not free (and not much advertised).
objconv also supports disassembling various object file formats and converting between them, and it's open source: http://www.agner.org/optimize/#objconv

obj2asm might provide something fancy that objconv doesn't, but its page doesn't show anything that would justify paying $10.
March 08, 2010
On Mon, Mar 08, 2010 at 04:12:00PM -0500, Trass3r wrote:
> Maybe because it's not free (and not much advertised).

The Linux version comes in the zip right alongside dmd.

-- 
Adam D. Ruppe
http://arsdnet.net
March 08, 2010
I think this bug has been squished as well. Both test cases now compile fine.
http://d.puremagic.com/issues/show_bug.cgi?id=3694

Walter Bright wrote:
> Lots of meat and potatoes here, and a cookie! (spelling checker for error messages)
> 
> http://www.digitalmars.com/d/1.0/changelog.html
> http://ftp.digitalmars.com/dmd.1.057.zip
> 
> 
> http://www.digitalmars.com/d/2.0/changelog.html
> http://ftp.digitalmars.com/dmd.2.041.zip
> 
> Thanks to the many people who contributed to this update!