May 01, 2009
Robert Jacques wrote:
> Do you have a link to your article? 

http://tinyurl.com/dac82a


Andrei
May 02, 2009
On Fri, May 1, 2009 at 4:23 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Rainer Deyke wrote:
>>
>> Robert Jacques wrote:
>>>
>>> Precisely, so it's always passed by reference and doesn't need to support copying.
>>
>> All containers need to support copying.
>
> Speaking of which, I'm thinking of a field like scientific computing (even of the simplest, most basic kind that we all dabble into) or applied math. People who use for example matrices would NOT expect them to have reference semantics. They'd find it very confusing if a = b would not make matrix "a" a copy of matrix "b" and so on. (That + no operator overloading = R.I.P. my excitement for doing numeric work in Java.)
>
> It would work to ask people who actually use Matrix!double to wrap it in a Value!(Matrix!double). But say someone asks, whatever, but if Value!(Matrix!double) is a matrix, then what is Matrix? Well, I'd reply, Matrix is actually a reference to a matrix. Then, they'll ask, why don't you call what you call today "Matrix", RefMatrix or Ref!Matrix or whatever, and call a Matrix a matrix? Um, I don't know. That's what the buzz was on the newsgroup when we were thinking of it. Some said that's what people coming from Java expect.
>
> I guess at that point the would-be D user would be entitled to make me a lavaliere out of my Matrix library and move on.

Python, and therefore NumPy, are reference based.
So far I haven't seen any scientists posting on the list about how
having their arrays and matrices be references is driving them crazy.
It may be surprising to some of them at first, but even non-hard-core
coders seem to be able to handle it.   It helps that dummy assignments
like a = b are rare.   More often you have things like a = b + c, and
that creates a new matrix.  Or  a += b, which pretty clearly mutates
a.

Much more often the discussion on the numpy list takes the form of "how do I make this loop faster" becuase loops are slow in Python so you have to come up with clever transformations to turn your loop into array ops.  This is thankfully a problem that D array libs do not have.  If you think of it as a loop, go ahead and implement it as a loop.  That's why I prefer to do numerical work in D rather than NumPy these days.

--bb
May 02, 2009
Bill Baxter wrote:
> On Fri, May 1, 2009 at 4:23 PM, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>> I guess at that point the would-be D user would be entitled to make me a
>> lavaliere out of my Matrix library and move on.
> 
> Python, and therefore NumPy, are reference based.

In a language that's reference based, things can't be helped. They can be helped in D though. I wouldn't want to later cringe that I missed the opportunity.

> So far I haven't seen any scientists posting on the list about how
> having their arrays and matrices be references is driving them crazy.
> It may be surprising to some of them at first, but even non-hard-core
> coders seem to be able to handle it.

To me this sounds more like an opportunity to do the right thing.

> It helps that dummy assignments
> like a = b are rare.   More often you have things like a = b + c, and
> that creates a new matrix.  Or  a += b, which pretty clearly mutates
> a.

Oh rly. How about this then?

Matrix a, b, c;
...
c = a;
a += b;

Does the last operation mutate c as well?


Andrei
May 02, 2009
Andrei Alexandrescu wrote:
> Also something that wasn't discussed that much is the connection of whatever design we devise, with the GC. I am mightily excited by the possibility to operate without GC or with tight reference counting, and I thought many around here would share that excitement. If we go for non-gc ways of memory management, that will probably affect container design.

Just out of curiosity, why do you like reference counting more than mark/sweep for containers?

--benji
May 02, 2009
On Fri, 01 May 2009 15:37:56 -0400, Rainer Deyke <rainerd@eldwood.com> wrote:
> Robert Jacques wrote:
>> Precisely, so it's always passed by reference and doesn't need to
>> support copying.
>
> All containers need to support copying.

No they don't. To be clear, I'm talking about copying the container, not the values inside them, in a non-destructive manner. My data structures book and wikipedia make no mention of a .dup like method as being part of a container. I do agree .dup is useful and quite practical for serial containers, but it is by no means required. But you haven't answered the question of a value semantic lock-free container.

May 02, 2009
On Fri, 01 May 2009 19:23:45 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Rainer Deyke wrote:
>> Robert Jacques wrote:
>>> Precisely, so it's always passed by reference and doesn't need to
>>> support copying.
>>  All containers need to support copying.
>
> Speaking of which, I'm thinking of a field like scientific computing (even of the simplest, most basic kind that we all dabble into) or applied math. People who use for example matrices would NOT expect them to have reference semantics. They'd find it very confusing if a = b would not make matrix "a" a copy of matrix "b" and so on. (That + no operator overloading = R.I.P. my excitement for doing numeric work in Java.)
>
> It would work to ask people who actually use Matrix!double to wrap it in a Value!(Matrix!double). But say someone asks, whatever, but if Value!(Matrix!double) is a matrix, then what is Matrix? Well, I'd reply, Matrix is actually a reference to a matrix. Then, they'll ask, why don't you call what you call today "Matrix", RefMatrix or Ref!Matrix or whatever, and call a Matrix a matrix? Um, I don't know. That's what the buzz was on the newsgroup when we were thinking of it. Some said that's what people coming from Java expect.
>
> I guess at that point the would-be D user would be entitled to make me a lavaliere out of my Matrix library and move on.
>
>
> Andrei

I do scientific computing. Generally, I find it breaks down into two parts: things under 4x4, for which value types are probably better, and everything else, for which value types are to be avoided like the plague. I'll often work with 100's mb of data with algorithms that take minutes to hours to complete. So an unexpected copy is both hard to find (am I slow/crashing because of my algorithm, or because of a typo?) and rather harmful, because its big. But I've generally worked on making something else fast so more data can be crunched, etc. Actual prototype work (for array/matrix based stuff at least) is often done in Matlab, which I think uses COW under-the-hood to provide value semantics. So I think anyone turning to D to do scientific computing will know reference semantics, since they'd already be familiar with them from C/C++, etc (Fortran?). Although successfully attracting algorithm prototypes from Matlab/python/mathmatica/R/etc is probably bigger issue than just the container types, growing the pie was why the Wii won the last console wars.
May 02, 2009
Robert Jacques wrote:
> On Fri, 01 May 2009 15:37:56 -0400, Rainer Deyke <rainerd@eldwood.com> wrote:
>> Robert Jacques wrote:
>>> Precisely, so it's always passed by reference and doesn't need to
>>> support copying.
>>
>> All containers need to support copying.
> 
> No they don't. To be clear, I'm talking about copying the container, not the values inside them, in a non-destructive manner. My data structures book and wikipedia make no mention of a .dup like method as being part of a container. I do agree .dup is useful and quite practical for serial containers, but it is by no means required. But you haven't answered the question of a value semantic lock-free container.
> 

I have.

Andrei
May 02, 2009
On Fri, 01 May 2009 19:25:35 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Robert Jacques wrote:
>> Do you have a link to your article?
>
> http://tinyurl.com/dac82a
>
>
> Andrei

Should've seen that one coming. :)

Anyways, I'm not sure how you can call the technique lock-free, since you're doing (possibly several) allocations inside the inner CAS loop. (I guess a good, parallel allocator might negate this) It's probably not an issue for the article's use case, where reads vastly dominated updates, but for things like stacks or queues, it wouldn't work. And poor performance leads people to create their own fast, possibly buggy versions, thus defeating the point of a standard library.
May 02, 2009
Robert Jacques wrote:
> On Fri, 01 May 2009 19:25:35 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> 
>> Robert Jacques wrote:
>>> Do you have a link to your article?
>>
>> http://tinyurl.com/dac82a
>>
>>
>> Andrei
> 
> Should've seen that one coming. :)
> 
> Anyways, I'm not sure how you can call the technique lock-free, since you're doing (possibly several) allocations inside the inner CAS loop. 

No, repeated allocations are trivial to eliminate. I didn't even bother to explain that in the article. The loop must only refill the allocated object from the object that needs to be replaced.

> (I guess a good, parallel allocator might negate this) It's probably not an issue for the article's use case, where reads vastly dominated updates, but for things like stacks or queues, it wouldn't work. And poor performance leads people to create their own fast, possibly buggy versions, thus defeating the point of a standard library.

Incidentally Maged Michael himself wrote the first lock-free malloc().


Andrei
May 02, 2009
Benji Smith wrote:
> Andrei Alexandrescu wrote:
>> Also something that wasn't discussed that much is the connection of whatever design we devise, with the GC. I am mightily excited by the possibility to operate without GC or with tight reference counting, and I thought many around here would share that excitement. If we go for non-gc ways of memory management, that will probably affect container design.
> 
> Just out of curiosity, why do you like reference counting more than mark/sweep for containers?

Who told you I like it? All I do is acknowledge that RC and MS have different tradeoffs that may form the basis of a preference.

Andrei