std.container & ranges (page 5) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » Learn » std.container & ranges (page 5)

November 03, 2011

Re: std.container & ranges

Posted by Timon Gehr
in reply to Steven Schveighoffer

Timon Gehr

Posted in reply to Steven Schveighoffer

On 11/03/2011 07:15 PM, Steven Schveighoffer wrote:
> On Thu, 03 Nov 2011 13:47:28 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:
>
>> On 11/03/2011 06:13 PM, Steven Schveighoffer wrote:
>>> On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky
>>> <dmitry.olsh@gmail.com> wrote:
>>>
>>>> On 03.11.2011 19:34, Steven Schveighoffer wrote:
>>>>> On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
>>>>> <tobias@pankrath.net> wrote:
>>>>>
>>>>>> Dmitry Olshansky wrote:
>>>>>>
>>>>>>> And more importantly, it still would be horribly slow O(N^2).
>>>>>>> Personally, because of that I'd prefer hand-rolled intrusive
>>>>>>> singly-linked list any time of day.
>>>>>>>
>>>>>>
>>>>>> To be honest, I don't understand this.
>>>>>> A "remove_if" for non-intrusive single-linked lists should be
>>>>>> doable in
>>>>>> O(N).
>>>>
>>>> Yeah, poor wording going to kill me one day :) And looking at how
>>>> SList works with "checked" remove does O(N^2) turned out to be a
>>>> context for reference for intrusive singly-linked list, just my bad.
>>>>
>>>> As for why I'd rather go with intrusive lists, it's because of it
>>>> usually uses less memory due to node structures being more
>>>> padding-free.
>>>> Anyway the point should have been focused on _hand-rolled_ part,
>>>> mostly because SList is plain not ready for prime time*. And btw
>>>> singly-linked list it's not hard to implement and you can fine tune it
>>>> how you like it e.g. they do play nice with free list allocation
>>>> strategy. Not to say of circular lists and/or using sentinels.
>>>>>
>>>>> SList is a poor singly linked list implementation. It does not support
>>>>> O(1) removal.
>>>>>
>>>>> -Steve
>>>>
>>>> Aye, looking at SList implementation I can say that it sort of tries
>>>> to verify that this is a correct list. Otherwise it would be O(1).
>>>> Indeed passing wrong range/iterator is quite bad in linked lists and
>>>> could lead to all manner of funky bugs, but the cost is horrific.
>>>> Especially since insert doesn't do this extra check.
>>>
>>> No, it's necessarily doing O(n) search for current node because it has
>>> no reference to the previous node.
>>>
>>> The range type for a SList has a single pointer to the currently
>>> iterated node. How do you remove that node without having access to the
>>> head/previous pointer?
>>>
>>
>> The container interface does not expose references to the 'Node'
>> struct, therefore the following approach would be practical:
>
> It does, actually. A range is a reference to a Node struct. But I still
> think the following approach will work.
>
>>
>> 0. Have a sentinel for end of list. (O(1) additional memory for the
>> entire list). It is the only node in the list that has a null 'next'
>> pointer.
>>
>> example, for removing just the current node:
>> 1. if the following node is not the sentinel
>> 1.1. Move value from following node into the current node.
>> 1.2. Remove following node.
>> 2. if the following node is the sentinel
>> 2.1. erase the contents of the current node (iff it has indirections)
>> 2.2. set it's 'next' field to null
>>
>>
>> Removing an entire range is just erasing the contents of the current
>> node and setting the 'next' field of the associated Node to zero.
>> Removing a Take!Range is a simple generalization of the algorithm above.
>
> This would be a good addition to the type. It could not be a
> stableRemove, however, because it would invalidate any ranges pointing
> at the node you removed. Can you put together a pull request?

I can do that, but it will have to wait a few days, as I am quite busy at the moment.

> If not, I
> can see about doing it. One issue, you could not do this if the value is
> immutable/const.

That is true. But how to fix this? Resolving it by simply not exposing the remove method if it is impossible to implement the straightforward way would be an option, but I don't think that is satisfying. Probably it could be made to work by creating some kind of head mutable structure. I will think about it. It should even be possible to generalize std.typecons.Rebindable for structs to help this idea.

>
> I could use this idea, I think, to implement a singly linked list in
> dcollections as well (the prospect of not having O(1) removal is what
> has stopped me). Thanks for the idea!
>

Nice!

November 03, 2011

Re: std.container & ranges

Posted by Dmitry Olshansky
in reply to Steven Schveighoffer

Dmitry Olshansky

Posted in reply to Steven Schveighoffer

On 03.11.2011 22:37, Steven Schveighoffer wrote:
> On Thu, 03 Nov 2011 14:08:31 -0400, Dmitry Olshansky
> <dmitry.olsh@gmail.com> wrote:
>
>> On 03.11.2011 21:13, Steven Schveighoffer wrote:
>>>
>>> dcollections stipulates that all ranges/cursors can be verified in
>>> O(lgn) time or better to belong to a specific container. In some cases,
>>> this adds an extra word to the range/cursor, and in others, it's easy to
>>> determine or the owner-reference was already needed. Since everything is
>>> a class, the fallback is to just stick an owner class instance in the
>>> range.
>>>
>>> This stipulation is necessary to allow safe slicing.
>>>
>>
>> Seems reasonable, I'd expect checks to go away in release, right(?).
>
> For the moment, no. I am not sure whether this is the right decision or
> not, because once you get beyond arrays, when to do bounds checks
> becomes fuzzy.
>
> For example, imagine you have this:
>
> auto ts = new TreeSet!int(1, 3, 5, 7, 9);
>
> What does this mean?
>
> auto r = ts[2..4];
>
> Note that a range type for a treeset has a pointer to a begin and end
> node for the container. For arrays, not doing a bounds check is simply
> less code. For a RBTree, you still have to look for the specific node,
> even if you are in release mode.
>
> Another example:
>
> auto ts2 = ts.dup; // duplicate the treeset
>
> auto c1 = ts2.elemAt(3); // get the node for 3 in ts2
>
> auto r2 = ts[c1..ts.end];

hm-h will that actually work? I mean searching in ts for node from ts2?

>
> Here I can say, we can skip the belongs check (which is an O(lgn) walk
> up the tree).
>

Well, I was mostly talking about this kind of scenario i.e. skip checking that cursor do belong to this collection. While in ts[2..4] example there is no (explicit) cursors so it's a different situation.

> But I'm still doing it in release mode. I'm not sure what's best. Should
> I just do the least work possible, or should I make it consistent with
> ts[2..4]?
>

I'd say release mode should avoid as much extra work as possible while keeping primary functionality intact, but that's just me.


-- 
Dmitry Olshansky

November 03, 2011

Re: std.container & ranges

Posted by Steven Schveighoffer
in reply to Timon Gehr

Steven Schveighoffer

Posted in reply to Timon Gehr

On Thu, 03 Nov 2011 14:50:31 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:

> On 11/03/2011 07:15 PM, Steven Schveighoffer wrote:
>> On Thu, 03 Nov 2011 13:47:28 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:
>>
>>> On 11/03/2011 06:13 PM, Steven Schveighoffer wrote:
>>>> On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky
>>>> <dmitry.olsh@gmail.com> wrote:
>>>>
>>>>> On 03.11.2011 19:34, Steven Schveighoffer wrote:
>>>>>> On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
>>>>>> <tobias@pankrath.net> wrote:
>>>>>>
>>>>>>> Dmitry Olshansky wrote:
>>>>>>>
>>>>>>>> And more importantly, it still would be horribly slow O(N^2).
>>>>>>>> Personally, because of that I'd prefer hand-rolled intrusive
>>>>>>>> singly-linked list any time of day.
>>>>>>>>
>>>>>>>
>>>>>>> To be honest, I don't understand this.
>>>>>>> A "remove_if" for non-intrusive single-linked lists should be
>>>>>>> doable in
>>>>>>> O(N).
>>>>>
>>>>> Yeah, poor wording going to kill me one day :) And looking at how
>>>>> SList works with "checked" remove does O(N^2) turned out to be a
>>>>> context for reference for intrusive singly-linked list, just my bad.
>>>>>
>>>>> As for why I'd rather go with intrusive lists, it's because of it
>>>>> usually uses less memory due to node structures being more
>>>>> padding-free.
>>>>> Anyway the point should have been focused on _hand-rolled_ part,
>>>>> mostly because SList is plain not ready for prime time*. And btw
>>>>> singly-linked list it's not hard to implement and you can fine tune it
>>>>> how you like it e.g. they do play nice with free list allocation
>>>>> strategy. Not to say of circular lists and/or using sentinels.
>>>>>>
>>>>>> SList is a poor singly linked list implementation. It does not support
>>>>>> O(1) removal.
>>>>>>
>>>>>> -Steve
>>>>>
>>>>> Aye, looking at SList implementation I can say that it sort of tries
>>>>> to verify that this is a correct list. Otherwise it would be O(1).
>>>>> Indeed passing wrong range/iterator is quite bad in linked lists and
>>>>> could lead to all manner of funky bugs, but the cost is horrific.
>>>>> Especially since insert doesn't do this extra check.
>>>>
>>>> No, it's necessarily doing O(n) search for current node because it has
>>>> no reference to the previous node.
>>>>
>>>> The range type for a SList has a single pointer to the currently
>>>> iterated node. How do you remove that node without having access to the
>>>> head/previous pointer?
>>>>
>>>
>>> The container interface does not expose references to the 'Node'
>>> struct, therefore the following approach would be practical:
>>
>> It does, actually. A range is a reference to a Node struct. But I still
>> think the following approach will work.
>>
>>>
>>> 0. Have a sentinel for end of list. (O(1) additional memory for the
>>> entire list). It is the only node in the list that has a null 'next'
>>> pointer.
>>>
>>> example, for removing just the current node:
>>> 1. if the following node is not the sentinel
>>> 1.1. Move value from following node into the current node.
>>> 1.2. Remove following node.
>>> 2. if the following node is the sentinel
>>> 2.1. erase the contents of the current node (iff it has indirections)
>>> 2.2. set it's 'next' field to null
>>>
>>>
>>> Removing an entire range is just erasing the contents of the current
>>> node and setting the 'next' field of the associated Node to zero.
>>> Removing a Take!Range is a simple generalization of the algorithm above.
>>
>> This would be a good addition to the type. It could not be a
>> stableRemove, however, because it would invalidate any ranges pointing
>> at the node you removed. Can you put together a pull request?
>
> I can do that, but it will have to wait a few days, as I am quite busy at the moment.

No rush.  I just wanted to know if you would do it, and if not, I would do it.

>
>> If not, I
>> can see about doing it. One issue, you could not do this if the value is
>> immutable/const.
>
> That is true. But how to fix this? Resolving it by simply not exposing the remove method if it is impossible to implement the straightforward way would be an option, but I don't think that is satisfying. Probably it could be made to work by creating some kind of head mutable structure. I will think about it. It should even be possible to generalize std.typecons.Rebindable for structs to help this idea.

Hm... rebindable doesn't help if the item is a struct.  You'd have to store an allocated struct on the heap.

-Steve

November 03, 2011

Re: std.container & ranges

Posted by Steven Schveighoffer
in reply to Dmitry Olshansky

Steven Schveighoffer

Posted in reply to Dmitry Olshansky

On Thu, 03 Nov 2011 16:27:46 -0400, Dmitry Olshansky <dmitry.olsh@gmail.com> wrote:

> On 03.11.2011 22:37, Steven Schveighoffer wrote:
>> On Thu, 03 Nov 2011 14:08:31 -0400, Dmitry Olshansky
>> <dmitry.olsh@gmail.com> wrote:
>>
>>> On 03.11.2011 21:13, Steven Schveighoffer wrote:
>>>>
>>>> dcollections stipulates that all ranges/cursors can be verified in
>>>> O(lgn) time or better to belong to a specific container. In some cases,
>>>> this adds an extra word to the range/cursor, and in others, it's easy to
>>>> determine or the owner-reference was already needed. Since everything is
>>>> a class, the fallback is to just stick an owner class instance in the
>>>> range.
>>>>
>>>> This stipulation is necessary to allow safe slicing.
>>>>
>>>
>>> Seems reasonable, I'd expect checks to go away in release, right(?).
>>
>> For the moment, no. I am not sure whether this is the right decision or
>> not, because once you get beyond arrays, when to do bounds checks
>> becomes fuzzy.
>>
>> For example, imagine you have this:
>>
>> auto ts = new TreeSet!int(1, 3, 5, 7, 9);
>>
>> What does this mean?
>>
>> auto r = ts[2..4];
>>
>> Note that a range type for a treeset has a pointer to a begin and end
>> node for the container. For arrays, not doing a bounds check is simply
>> less code. For a RBTree, you still have to look for the specific node,
>> even if you are in release mode.
>>
>> Another example:
>>
>> auto ts2 = ts.dup; // duplicate the treeset
>>
>> auto c1 = ts2.elemAt(3); // get the node for 3 in ts2
>>
>> auto r2 = ts[c1..ts.end];
>
> hm-h will that actually work? I mean searching in ts for node from ts2?

c1 is a cursor, so there is no need to search for it (the point of a cursor is to keep a reference to an element for later use).  All you have to do is verify it's in the container (and the ordering is valid).

If unchecked, it will result in likely a segfault, because the two endpoints are not connected.

>> Here I can say, we can skip the belongs check (which is an O(lgn) walk
>> up the tree).
>>
>
> Well, I was mostly talking about this kind of scenario i.e. skip checking that cursor do belong to this collection. While in ts[2..4] example there is no (explicit) cursors so it's a different situation.

So what should happen?  Should it throw?

>> But I'm still doing it in release mode. I'm not sure what's best. Should
>> I just do the least work possible, or should I make it consistent with
>> ts[2..4]?
>>
>
> I'd say release mode should avoid as much extra work as possible while keeping primary functionality intact, but that's just me.

Yes, it's a dilemma that I'm not sure what the right answer is.  It does make sense that release mode should just avoid the checks altogether.

-Steve

November 03, 2011

Re: std.container & ranges

Posted by Timon Gehr
in reply to Steven Schveighoffer

Timon Gehr

Posted in reply to Steven Schveighoffer

On 11/03/2011 09:46 PM, Steven Schveighoffer wrote:
> On Thu, 03 Nov 2011 14:50:31 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:
>
>> On 11/03/2011 07:15 PM, Steven Schveighoffer wrote:
>>> On Thu, 03 Nov 2011 13:47:28 -0400, Timon Gehr <timon.gehr@gmx.ch>
>>> wrote:
>>>
>>>> On 11/03/2011 06:13 PM, Steven Schveighoffer wrote:
>>>>> On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky
>>>>> <dmitry.olsh@gmail.com> wrote:
>>>>>
>>>>>> On 03.11.2011 19:34, Steven Schveighoffer wrote:
>>>>>>> On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
>>>>>>> <tobias@pankrath.net> wrote:
>>>>>>>
>>>>>>>> Dmitry Olshansky wrote:
>>>>>>>>
>>>>>>>>> And more importantly, it still would be horribly slow O(N^2).
>>>>>>>>> Personally, because of that I'd prefer hand-rolled intrusive
>>>>>>>>> singly-linked list any time of day.
>>>>>>>>>
>>>>>>>>
>>>>>>>> To be honest, I don't understand this.
>>>>>>>> A "remove_if" for non-intrusive single-linked lists should be
>>>>>>>> doable in
>>>>>>>> O(N).
>>>>>>
>>>>>> Yeah, poor wording going to kill me one day :) And looking at how
>>>>>> SList works with "checked" remove does O(N^2) turned out to be a
>>>>>> context for reference for intrusive singly-linked list, just my bad.
>>>>>>
>>>>>> As for why I'd rather go with intrusive lists, it's because of it
>>>>>> usually uses less memory due to node structures being more
>>>>>> padding-free.
>>>>>> Anyway the point should have been focused on _hand-rolled_ part,
>>>>>> mostly because SList is plain not ready for prime time*. And btw
>>>>>> singly-linked list it's not hard to implement and you can fine
>>>>>> tune it
>>>>>> how you like it e.g. they do play nice with free list allocation
>>>>>> strategy. Not to say of circular lists and/or using sentinels.
>>>>>>>
>>>>>>> SList is a poor singly linked list implementation. It does not
>>>>>>> support
>>>>>>> O(1) removal.
>>>>>>>
>>>>>>> -Steve
>>>>>>
>>>>>> Aye, looking at SList implementation I can say that it sort of tries
>>>>>> to verify that this is a correct list. Otherwise it would be O(1).
>>>>>> Indeed passing wrong range/iterator is quite bad in linked lists and
>>>>>> could lead to all manner of funky bugs, but the cost is horrific.
>>>>>> Especially since insert doesn't do this extra check.
>>>>>
>>>>> No, it's necessarily doing O(n) search for current node because it has
>>>>> no reference to the previous node.
>>>>>
>>>>> The range type for a SList has a single pointer to the currently
>>>>> iterated node. How do you remove that node without having access to
>>>>> the
>>>>> head/previous pointer?
>>>>>
>>>>
>>>> The container interface does not expose references to the 'Node'
>>>> struct, therefore the following approach would be practical:
>>>
>>> It does, actually. A range is a reference to a Node struct. But I still
>>> think the following approach will work.
>>>
>>>>
>>>> 0. Have a sentinel for end of list. (O(1) additional memory for the
>>>> entire list). It is the only node in the list that has a null 'next'
>>>> pointer.
>>>>
>>>> example, for removing just the current node:
>>>> 1. if the following node is not the sentinel
>>>> 1.1. Move value from following node into the current node.
>>>> 1.2. Remove following node.
>>>> 2. if the following node is the sentinel
>>>> 2.1. erase the contents of the current node (iff it has indirections)
>>>> 2.2. set it's 'next' field to null
>>>>
>>>>
>>>> Removing an entire range is just erasing the contents of the current
>>>> node and setting the 'next' field of the associated Node to zero.
>>>> Removing a Take!Range is a simple generalization of the algorithm
>>>> above.
>>>
>>> This would be a good addition to the type. It could not be a
>>> stableRemove, however, because it would invalidate any ranges pointing
>>> at the node you removed. Can you put together a pull request?
>>
>> I can do that, but it will have to wait a few days, as I am quite busy
>> at the moment.
>
> No rush. I just wanted to know if you would do it, and if not, I would
> do it.
>
>>
>>> If not, I
>>> can see about doing it. One issue, you could not do this if the value is
>>> immutable/const.
>>
>> That is true. But how to fix this? Resolving it by simply not exposing
>> the remove method if it is impossible to implement the straightforward
>> way would be an option, but I don't think that is satisfying. Probably
>> it could be made to work by creating some kind of head mutable
>> structure. I will think about it. It should even be possible to
>> generalize std.typecons.Rebindable for structs to help this idea.
>
> Hm... rebindable doesn't help if the item is a struct. You'd have to
> store an allocated struct on the heap.
>
> -Steve

My point is, if you have a struct like this:

struct S{
    const int* x;
    immutable int y;
}

Then that could be stored in the list as

struct _S{
    const(int)* x;
    int y;

    S getS(){return S(x,y);}
}

Without putting the type system under pressure.
But on second thought, std.typecons.Rebindable can probably not meaningfully be generalized to structs because it could not deal with member functions.

November 03, 2011

Re: std.container & ranges

Posted by Steven Schveighoffer
in reply to Timon Gehr

Steven Schveighoffer

Posted in reply to Timon Gehr

On Thu, 03 Nov 2011 17:13:55 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:

> On 11/03/2011 09:46 PM, Steven Schveighoffer wrote:
>> On Thu, 03 Nov 2011 14:50:31 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:
>>
>>> On 11/03/2011 07:15 PM, Steven Schveighoffer wrote:
>>>> On Thu, 03 Nov 2011 13:47:28 -0400, Timon Gehr <timon.gehr@gmx.ch>
>>>> wrote:
>>>>
>>>>> On 11/03/2011 06:13 PM, Steven Schveighoffer wrote:
>>>>>> On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky
>>>>>> <dmitry.olsh@gmail.com> wrote:
>>>>>>
>>>>>>> On 03.11.2011 19:34, Steven Schveighoffer wrote:
>>>>>>>> On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
>>>>>>>> <tobias@pankrath.net> wrote:
>>>>>>>>
>>>>>>>>> Dmitry Olshansky wrote:
>>>>>>>>>
>>>>>>>>>> And more importantly, it still would be horribly slow O(N^2).
>>>>>>>>>> Personally, because of that I'd prefer hand-rolled intrusive
>>>>>>>>>> singly-linked list any time of day.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> To be honest, I don't understand this.
>>>>>>>>> A "remove_if" for non-intrusive single-linked lists should be
>>>>>>>>> doable in
>>>>>>>>> O(N).
>>>>>>>
>>>>>>> Yeah, poor wording going to kill me one day :) And looking at how
>>>>>>> SList works with "checked" remove does O(N^2) turned out to be a
>>>>>>> context for reference for intrusive singly-linked list, just my bad.
>>>>>>>
>>>>>>> As for why I'd rather go with intrusive lists, it's because of it
>>>>>>> usually uses less memory due to node structures being more
>>>>>>> padding-free.
>>>>>>> Anyway the point should have been focused on _hand-rolled_ part,
>>>>>>> mostly because SList is plain not ready for prime time*. And btw
>>>>>>> singly-linked list it's not hard to implement and you can fine
>>>>>>> tune it
>>>>>>> how you like it e.g. they do play nice with free list allocation
>>>>>>> strategy. Not to say of circular lists and/or using sentinels.
>>>>>>>>
>>>>>>>> SList is a poor singly linked list implementation. It does not
>>>>>>>> support
>>>>>>>> O(1) removal.
>>>>>>>>
>>>>>>>> -Steve
>>>>>>>
>>>>>>> Aye, looking at SList implementation I can say that it sort of tries
>>>>>>> to verify that this is a correct list. Otherwise it would be O(1).
>>>>>>> Indeed passing wrong range/iterator is quite bad in linked lists and
>>>>>>> could lead to all manner of funky bugs, but the cost is horrific.
>>>>>>> Especially since insert doesn't do this extra check.
>>>>>>
>>>>>> No, it's necessarily doing O(n) search for current node because it has
>>>>>> no reference to the previous node.
>>>>>>
>>>>>> The range type for a SList has a single pointer to the currently
>>>>>> iterated node. How do you remove that node without having access to
>>>>>> the
>>>>>> head/previous pointer?
>>>>>>
>>>>>
>>>>> The container interface does not expose references to the 'Node'
>>>>> struct, therefore the following approach would be practical:
>>>>
>>>> It does, actually. A range is a reference to a Node struct. But I still
>>>> think the following approach will work.
>>>>
>>>>>
>>>>> 0. Have a sentinel for end of list. (O(1) additional memory for the
>>>>> entire list). It is the only node in the list that has a null 'next'
>>>>> pointer.
>>>>>
>>>>> example, for removing just the current node:
>>>>> 1. if the following node is not the sentinel
>>>>> 1.1. Move value from following node into the current node.
>>>>> 1.2. Remove following node.
>>>>> 2. if the following node is the sentinel
>>>>> 2.1. erase the contents of the current node (iff it has indirections)
>>>>> 2.2. set it's 'next' field to null
>>>>>
>>>>>
>>>>> Removing an entire range is just erasing the contents of the current
>>>>> node and setting the 'next' field of the associated Node to zero.
>>>>> Removing a Take!Range is a simple generalization of the algorithm
>>>>> above.
>>>>
>>>> This would be a good addition to the type. It could not be a
>>>> stableRemove, however, because it would invalidate any ranges pointing
>>>> at the node you removed. Can you put together a pull request?
>>>
>>> I can do that, but it will have to wait a few days, as I am quite busy
>>> at the moment.
>>
>> No rush. I just wanted to know if you would do it, and if not, I would
>> do it.
>>
>>>
>>>> If not, I
>>>> can see about doing it. One issue, you could not do this if the value is
>>>> immutable/const.
>>>
>>> That is true. But how to fix this? Resolving it by simply not exposing
>>> the remove method if it is impossible to implement the straightforward
>>> way would be an option, but I don't think that is satisfying. Probably
>>> it could be made to work by creating some kind of head mutable
>>> structure. I will think about it. It should even be possible to
>>> generalize std.typecons.Rebindable for structs to help this idea.
>>
>> Hm... rebindable doesn't help if the item is a struct. You'd have to
>> store an allocated struct on the heap.
>>
>> -Steve
>
> My point is, if you have a struct like this:
>
> struct S{
>      const int* x;
>      immutable int y;
> }
>
> Then that could be stored in the list as
>
> struct _S{
>      const(int)* x;
>      int y;
>
>      S getS(){return S(x,y);}
> }
>
> Without putting the type system under pressure.
> But on second thought, std.typecons.Rebindable can probably not meaningfully be generalized to structs because it could not deal with member functions.

Yes, this looks like a viable solution.  Not sure how it would be done in a generic way.  I think what is best is to have getS be like this:

S *getS(){ return cast(S*)&this;}

And never provide access to this

As long as S.opCmp, S.opEquals, and S.toHash are all const, this should be no issue (this needs to be a requirement).

I wonder if we really have to go through such trouble...

-Steve

November 03, 2011

Re: std.container & ranges

Posted by Steven Schveighoffer
in reply to Timon Gehr

Steven Schveighoffer

Posted in reply to Timon Gehr

On Thu, 03 Nov 2011 14:50:31 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:

> On 11/03/2011 07:15 PM, Steven Schveighoffer wrote:
>>
>> I could use this idea, I think, to implement a singly linked list in
>> dcollections as well (the prospect of not having O(1) removal is what
>> has stopped me). Thanks for the idea!
>>
>
> Nice!

Looking at dcollections' List interface, the one thing I can't implement is back() (i.e. get the last element in the list).  I can implement everything else.  It might be worth it to make an exception for this (i.e. just throw if someone calls back()) in order to have a singly-linked list implementation.

-Steve

November 04, 2011

Re: std.container & ranges

Posted by Max Wolter
in reply to Steven Schveighoffer

Max Wolter

Posted in reply to Steven Schveighoffer

> As long as you don't need to search for the element to remove using its
> value, removal in a linked list should be O(1). A linked list that does
> not allow O(1) removal and O(1) insertion given a topological reference
> is a failure (yes, that includes the current version of SList).

Well, thank god that's cleared up.

Time to move on.

/Max

November 05, 2011

Re: std.container & ranges

Posted by Marco Leise
in reply to Steven Schveighoffer

Marco Leise

Posted in reply to Steven Schveighoffer

Am 02.11.2011, 12:48 Uhr, schrieb Steven Schveighoffer <schveiguy@yahoo.com>:
> In dcollections, removing a specific element (using the default comparison operator for that element) on a *fast lookup* container is as simple as:
>
> container.remove(container.find(x));
>
> Which removes the element x if it's found.  However, this is not defined for containers which use O(n) time to search (such as linked list), you must use std.algorithm.find for that:
>
> container.remove(find(container[], x).begin);
>
> Should work, and takes O(n) time.
>
> -Steve

While I stumble over this. I found that I sometimes need a certain container (data structure/algorithm), but with an additional fast removal or lookup. So I returned a "handle" (a pointer or array index) for any inserted element that you can store (in the element itself for example) and use for lookup/removal. Templates bake the correct struct for this:

// The element I want to store

struct Element {
  // ... actual data
  size_t handle; // a pointer to a node in disguise, returned from insert(...)
}

// A collection node

struct Node {
  Element e;
  Node* prev;
  Node* next;
}

// Usage

elem.handle = list.insert(elem);
list.remove(elem.handle);

Now you have a double-linked list with O(1) removal, as long as you keep the handle around. The limitation is, that the handle has to remain constant. So a binary heap with the elements embedded in an array and no indirection through a pointer would not work. As soon as elements are reorganized, their handle (array index in this case) becomes invalid.
But the point is, that you can reduce the time complexity for removal in every data structure like this in an implementation agnostic way (unless plain array of elements) and I think it is better than Java's approach that just works, but may result in an O(n) lookup operation.
Iterators with remove functionality overlap with this partially. (They don't need an extra handle stored somewhere, but generally go over the complete list.)

November 05, 2011

Re: std.container & ranges

Posted by Steven Schveighoffer
in reply to Marco Leise

Steven Schveighoffer

Posted in reply to Marco Leise

On Sat, 05 Nov 2011 17:28:34 -0400, Marco Leise <Marco.Leise@gmx.de> wrote:

> Am 02.11.2011, 12:48 Uhr, schrieb Steven Schveighoffer <schveiguy@yahoo.com>:
>> In dcollections, removing a specific element (using the default comparison operator for that element) on a *fast lookup* container is as simple as:
>>
>> container.remove(container.find(x));
>>
>> Which removes the element x if it's found.  However, this is not defined for containers which use O(n) time to search (such as linked list), you must use std.algorithm.find for that:
>>
>> container.remove(find(container[], x).begin);
>>
>> Should work, and takes O(n) time.
>>
>> -Steve
>
> While I stumble over this. I found that I sometimes need a certain container (data structure/algorithm), but with an additional fast removal or lookup. So I returned a "handle" (a pointer or array index) for any inserted element that you can store (in the element itself for example) and use for lookup/removal. Templates bake the correct struct for this:
>
> // The element I want to store
>
> struct Element {
>    // ... actual data
>    size_t handle; // a pointer to a node in disguise, returned from insert(...)
> }
>
> // A collection node
>
> struct Node {
>    Element e;
>    Node* prev;
>    Node* next;
> }
>
> // Usage
>
> elem.handle = list.insert(elem);
> list.remove(elem.handle);

You should be able to do this with dcollections:

struct Element(T) {
   T data;
   List!T.cursor handle;
}

elem.handle = list.insert(list.end, elem.data); // x is the value being inserted
list.remove(elem.handle);  // removes the element you inserted earlier.

Note, you have to specify an insertion point (this is similar to C++).  But note that you can't exactly create a handle to a list of the Element type, because you need that cursor type to define Element.

> Now you have a double-linked list with O(1) removal, as long as you keep the handle around. The limitation is, that the handle has to remain constant. So a binary heap with the elements embedded in an array and no indirection through a pointer would not work. As soon as elements are reorganized, their handle (array index in this case) becomes invalid.
> But the point is, that you can reduce the time complexity for removal in every data structure like this in an implementation agnostic way (unless plain array of elements) and I think it is better than Java's approach that just works, but may result in an O(n) lookup operation.
> Iterators with remove functionality overlap with this partially. (They don't need an extra handle stored somewhere, but generally go over the complete list.)

Definitely.  This is the benefit of iterators/cursors.  The ability to refer to exactly one element makes things much more straightforward.  It's why dcollections has cursors.

-Steve

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation