View mode: basic / threaded / horizontal-split · Log in · Help
May 23, 2012
Re: Does D have "structural sharing" of immutable collections?
On Wed, 23 May 2012 17:54:42 -0400, Roman D. Boiko <rb@d-coding.com> wrote:

> On Wednesday, 23 May 2012 at 19:51:02 UTC, Steven Schveighoffer wrote:
>> If you need a sorted tree structure that supports sharing immutable  
>> state, you likely have to find a different algorithm than redblack,  
>> since adding nodes can modify the tree structure via rotates.
>>
>> -Steve
>
> I don't need to invent here, and it is definitely feasible. Okasaki  
> provided efficient algorithm for inserting nodes, and, IIRC, rotations  
> (not for deleting, though). But OCaml doesn't map to D easily (neither  
> does Haskel, I think).

I looked up how it could be done, the one thing I was not sure of was the  
parent node.  If you can have multiple references to the same subtree, how  
do you keep track of the parents.  But if you only are concerned about the  
ancestry of a single node, you can dynamically determine the line in  
O(lgn) time, and use that for the running of the redblack algorithm.

I'm not sure how efficient it would be.

-Steve
May 23, 2012
Re: Does D have "structural sharing" of immutable collections?
On Wednesday, 23 May 2012 at 22:25:35 UTC, Steven Schveighoffer 
wrote:
> I looked up how it could be done, the one thing I was not sure 
> of was the parent node.  If you can have multiple references to 
> the same subtree, how do you keep track of the parents.  But if 
> you only are concerned about the ancestry of a single node, you 
> can dynamically determine the line in O(lgn) time, and use that 
> for the running of the redblack algorithm.
>
> I'm not sure how efficient it would be.
>
> -Steve

My problem is the following:

n elements in some given order, each has an associated value; I 
need to retrieve elements for which the sum of values associated 
with previous elements lies in a given ragne; this should work 
efficiently when elements are added or removed, and history 
should be preserved.

Storing elements as tree leafs and sums of children values in 
other nodes gives me this easily.

Lookup price is 0(log N) number comparisons, which should be very 
fast. Insert / delete / rotate are also logarithmic, but I don't 
know the constant factors.

No need to retrieve parents.

So far it seems like RBT is perfect for me (if cache friendly 
implementation will be possible. Hopefully it will, because nodes 
are small and I can preallocate memory for them).
May 23, 2012
Re: Does D have "structural sharing" of immutable collections?
Roman D. Boiko:

> The end goal is to have something as easy to work with as 
> collections of Scala or F# (I don't know Haskell), and, of 
> course, efficient.

If you plan in creating a start of purely functional collections 
library then I suggest you to also take a look at Clojure and 
various Haskell collections. Clojure is very based on persistent 
purely functional data structures, used in modern ways. And in 
the Haskell tribe there are some very smart persons that have 
worked on those things for fifteen years. So spending one extra 
week reading and looking at those two worlds may help you a lot 
later, and give you many good/better ideas.

Bye,
bearophile
May 24, 2012
Re: Does D have "structural sharing" of immutable collections?
Andrei Alexandrescu:

> Yah, FP doesn't like arrays and immutable containers 
> essentially eschew arrays altogether.

In the Chapel/X10 worlds I have read about various strategies to 
use mutable arrays too in parallel/concurrent situations. 
Probably some of those ideas will be quite useful in D too.


> (The most surprising thing to me is that the FP community
> generally fails to acknowledge that as a problem.)

Again and again I have seen how modern CPUs like linear scans on 
(small) arrays of values. Sometimes the performance is 
surprising, and beats almost everything else, including binary 
searches on the same small array.

This sometimes means that using a single core on such mutable 
arrays from C/C++/D leads to faster code (and far less 
electricity consumed) than using 4-8 cores on less optimal data 
structures (like structures with many pointers).

This is also why your idea of "small string optimization" was 
probably leading to significantly higher performance, and why I 
think the built-in D AAs will need a "small associative array" 
optimization (that even Python dicts have, for a number of items 
less than 9), BigInts will need a small value (no heap) 
optimization, and so on for other Phobos data structures 
(including lists!). Often such optimizations means just storing 
the few values in a small array instead of a fancier data 
structure (like a list), so adding such optimizations is usually 
not hard.

Bye,
bearophile
May 24, 2012
Re: Does D have "structural sharing" of immutable collections?
On Wednesday, 23 May 2012 at 23:51:45 UTC, bearophile wrote:
> Roman D. Boiko:
>
>> The end goal is to have something as easy to work with as 
>> collections of Scala or F# (I don't know Haskell), and, of 
>> course, efficient.
>
> If you plan in creating a start of purely functional 
> collections library then I suggest you to also take a look at 
> Clojure and various Haskell collections. Clojure is very based 
> on persistent purely functional data structures, used in modern 
> ways. And in the Haskell tribe there are some very smart 
> persons that have worked on those things for fifteen years. So 
> spending one extra week reading and looking at those two worlds 
> may help you a lot later, and give you many good/better ideas.
>
> Bye,
> bearophile

Thanks, I will!
May 24, 2012
Re: Does D have "structural sharing" of immutable collections?
On Wednesday, May 23, 2012 23:58:42 Roman D. Boiko wrote:
> On Wednesday, 23 May 2012 at 21:12:04 UTC, Jonathan M Davis wrote:
> > In my personal experience, that's not true at all. I've seen
> > _huge_
> > performance problems in Haskell when using a map. There may be
> > cases where an
> > immutable container may not pose large performance problems,
> > but I would be
> > _very_ wary of using immutable containers, and _very_ careful
> > with them when I
> > did.
> > 
> > - Jonathan M Davis
> 
> I expect problems with eliminating tail calls (especially for
> mutually recursive functions), but cannot find any other reason,
> theoretical or implementation, that would necessarily make it
> execute slowly in D. I think that cache friendliness is possible
> to achieve, at least in my use cases. What else could go wrong?

As part of my thesis work, I wrote a program which was counting possible 
tokens in code (as part of an AI attempting to learn about a programming 
language), which required a map of tokens to the number of times that they'd 
been seen. Because I had previously been doing stuff in Haskell for my thesis, 
I continued to use Haskell for that portion, and it was a _huge_ mistake, 
because the performance was _terrible_ - as in it could only process a few 
files in a day kind of terrible. I ultimately had to switch over to another 
language to get it to work (it ended up being Java, but I could have used a 
variety of other languages).

Maybe if I had simply been adding elements to the map, it wouldn't have been 
anywhere near as bad, but since I had to mutate them (and both the map and its 
elments were technically immutable), that become _insanely_ expensive.

So, it's quite possible that you have a use case where using immutable 
containers works really well and isn't a problem, but in what I've dealt with 
personally, I've found it to be a huge performance problem and so am very 
leery of immutable containers.

- Jonathan M Davis
May 24, 2012
Re: Does D have "structural sharing" of immutable collections?
On Thursday, 24 May 2012 at 01:00:52 UTC, Jonathan M Davis wrote:
> As part of my thesis work, I wrote a program which was counting 
> possible
> tokens in code (as part of an AI attempting to learn about a 
> programming
> language), which required a map of tokens to the number of 
> times that they'd
> been seen. Because I had previously been doing stuff in Haskell 
> for my thesis,
> I continued to use Haskell for that portion, and it was a 
> _huge_ mistake,
> because the performance was _terrible_ - as in it could only 
> process a few
> files in a day kind of terrible. I ultimately had to switch 
> over to another
> language to get it to work (it ended up being Java, but I could 
> have used a
> variety of other languages).
>
> Maybe if I had simply been adding elements to the map, it 
> wouldn't have been
> anywhere near as bad, but since I had to mutate them (and both 
> the map and its
> elments were technically immutable), that become _insanely_ 
> expensive.
>
> So, it's quite possible that you have a use case where using 
> immutable
> containers works really well and isn't a problem, but in what 
> I've dealt with
> personally, I've found it to be a huge performance problem and 
> so am very
> leery of immutable containers.
>
> - Jonathan M Davis

Is source code available?
May 24, 2012
Re: Does D have "structural sharing" of immutable collections?
On Wednesday, 23 May 2012 at 21:12:04 UTC, Jonathan M Davis wrote:
> On Wednesday, May 23, 2012 20:51:31 Roman D. Boiko wrote:
>> On Wednesday, 23 May 2012 at 18:39:02 UTC, Stewart Gordon 
>> wrote:
>> > On 23/05/2012 16:05, Roman D. Boiko wrote:
>> > <snip>
>> > 
>> >> I need some immutable collections for my D Compiler Tools
>> >> project, namely linked list,
>> >> red-black tree, and possibly some others.
>> > 
>> > <snip>
>> > 
>> > What's the point of an immutable red-black tree?
>> > 
>> > The whole point of a red-black tree is to facilitate fast
>> > dynamic addition, removal and lookup of elements.
>> > 
>> > When the tree is immutable, only lookup speed is of any real
>> > relevance, so you might as well create a perfectly balanced
>> > binary tree. Or even better, an array.
>> > 
>> > Stewart.
>> 
>> Immutable collections in most cases have the same performance
>> characteristics as mutable ones. Space consumption may be 
>> higher,
>> but not necessarily.
>> 
>> Instead of mutating a collection, a new one is returned each 
>> time
>> you add or remove an element. It shares most nodes with a
>> previous one.
>
> In my personal experience, that's not true at all. I've seen 
> _huge_
> performance problems in Haskell when using a map. There may be 
> cases where an
> immutable container may not pose large performance problems, 
> but I would be
> _very_ wary of using immutable containers, and _very_ careful 
> with them when I
> did.
>
> - Jonathan M Davis


While I am an Haskell newbie, I think most of those problems are
related to how hard it is for many developers to learn how to 
properly
optimize Haskell code.

The straightforward way to do some things is not the most 
performant, and
some optimizations require fine tuning of strict/lazy semantics 
on the
data structures manipulations.

--
Paulo
May 24, 2012
Re: Does D have "structural sharing" of immutable collections?
On Wednesday, 23 May 2012 at 22:39:33 UTC, Roman D. Boiko wrote:
> On Wednesday, 23 May 2012 at 22:25:35 UTC, Steven Schveighoffer 
> wrote:
>> I looked up how it could be done, the one thing I was not sure 
>> of was the parent node.  If you can have multiple references 
>> to the same subtree, how do you keep track of the parents.  
>> But if you only are concerned about the ancestry of a single 
>> node, you can dynamically determine the line in O(lgn) time, 
>> and use that for the running of the redblack algorithm.
>>
>> I'm not sure how efficient it would be.
>>
>> -Steve
>
> My problem is the following:
>
> n elements in some given order, each has an associated value; I 
> need to retrieve elements for which the sum of values 
> associated with previous elements lies in a given ragne; this 
> should work efficiently when elements are added or removed, and 
> history should be preserved.
>
> Storing elements as tree leafs and sums of children values in 
> other nodes gives me this easily.
>
> Lookup price is 0(log N) number comparisons, which should be 
> very fast. Insert / delete / rotate are also logarithmic, but I 
> don't know the constant factors.
>
> No need to retrieve parents.
>
> So far it seems like RBT is perfect for me (if cache friendly 
> implementation will be possible. Hopefully it will, because 
> nodes are small and I can preallocate memory for them).

I am not sure if I entirely understand your problem, but would a
persistent search tree work? See:

http://www.sciencedirect.com/science/article/pii/0022000089900342

I have a small write up on them from a course I took years ago:

http://cg.scs.carleton.ca/~cdillaba/comp5008/sarnak.html

Regards,
Craig

 -Steve
May 24, 2012
Re: Does D have "structural sharing" of immutable collections?
On Thursday, 24 May 2012 at 13:15:30 UTC, Craig Dillabaugh wrote:
> I am not sure if I entirely understand your problem, but would a
> persistent search tree work? See:
>
> http://www.sciencedirect.com/science/article/pii/0022000089900342
>
> I have a small write up on them from a course I took years ago:
>
> http://cg.scs.carleton.ca/~cdillaba/comp5008/sarnak.html
Yes, it would. Actually, what I described is how to use RBT as
search tree. RBT is selected because it bounds number of levels
to the optimum, and thus gives log N worst case for all
operations.

I'll take a look at the links, thanks!
Next ›   Last »
1 2 3
Top | Discussion index | About this forum | D home