November 07, 2006
Sean Kelly wrote:
> Is there really any reason to support pointers as iterators though?  C++ libraries even seem to be moving away from that towards more robust and less error-prone iterator objects.

Can you be more specific about what the problems and solutions are?
November 07, 2006
Bill Baxter wrote:
> Sean Kelly wrote:
>>> The problem is if the iterator is an actual pointer, there is no .value property for a pointer.
> 
> Well, they could.  It's up to Mr. Compiler Writer if a pointer has a value property or not.

Consider the following struct:

	struct Foo
	{
		int value;
	}

	Foo* p;

	p.value;

Is p.value the entire contents of Foo (as it would be for a proposed .value property) or just Foo.value? "value" is a pretty common field name.
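
To make the conflict concrete, here is a minimal sketch of how member access through a pointer resolves, assuming a present-day D compiler. The variable names are illustrative only. Since p.value already means the field, a built-in .value property on pointers would have to compete with it.

```d
struct Foo
{
    int value;
}

void main()
{
    Foo f = Foo(42);
    Foo* p = &f;

    // D auto-dereferences pointers on member access,
    // so p.value already names the int field Foo.value.
    assert(p.value == 42);
    static assert(is(typeof(p.value) == int));

    // The whole Foo is reached with *p or p[0] instead.
    static assert(is(typeof(*p) == Foo));
    assert(p[0].value == 42);
}
```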
November 07, 2006
Walter Bright wrote:
> Knud Sørensen wrote:
>> On Mon, 06 Nov 2006 12:46:01 -0800, Walter Bright wrote:
>>
>>> It's becoming increasingly obvious that D needs iterators. While opApply is far better for some kinds of iteration (such as recursively traversing a directory), iterators are more efficient in some cases, and allow for things that opApply makes difficult.
>>>
>>
>> What are those cases ? Maybe we can find a way to fix the problem with
>> opApply.
> 
> One such case is the usefulness of being able to provide an input iterator to a parsing function, which may itself pass the iterator off to other parsing functions.

Another case is the desire to iterate over multiple containers at the same time, as you would in a sorted list merge operation. opApply could handle this one if D had real coroutines (i.e. multiple opApply calls could be running simultaneously).
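
A rough sketch of the merge case, assuming a present-day D compiler. Cursor and mergeSorted are hypothetical names made up for illustration, not a proposed library API. The point is only that two independent cursors can be advanced in lockstep, which a single opApply callback cannot do.

```d
// Hypothetical cursor: a position in an int[] that knows where the data ends.
struct Cursor
{
    int[] rest;                        // elements not yet visited

    bool done() const { return rest.length == 0; }
    int value() const { return rest[0]; }
    void next() { rest = rest[1 .. $]; }
}

// Merge two sorted arrays by advancing two cursors side by side.
int[] mergeSorted(int[] a, int[] b)
{
    auto ca = Cursor(a);
    auto cb = Cursor(b);
    int[] result;

    while (!ca.done && !cb.done)
    {
        if (cb.value < ca.value) { result ~= cb.value; cb.next(); }
        else                     { result ~= ca.value; ca.next(); }
    }
    // Copy whatever is left in either input.
    for (; !ca.done; ca.next()) result ~= ca.value;
    for (; !cb.done; cb.next()) result ~= cb.value;
    return result;
}

void main()
{
    assert(mergeSorted([1, 4, 9], [2, 3, 10]) == [1, 2, 3, 4, 9, 10]);
}
```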

Another case is the basic desire to hold onto a pointer into a container for later use.  For instance, with an STL-style linked list container, the implementation (the actual nodes) is hidden from the user.  You still want the user to be able to keep something that functions like a pointer to a particular insertion point in the list (and to be able to use that in generic sequence manipulation algorithms).

--bb

November 07, 2006
Walter Bright wrote:
> Kirk McDonald wrote:
>
> 
>>> I think it does provide enough, in fact, it will be less wacky than in C++ (look at the wretched iterator/const_iterator dichotomy). It'll work with core arrays, and will not need those typedefs.
>> Without the ability to overload the dereference operator, any attempt to write a class that behaves like a pointer in D will be unwieldy, or at least ugly. Admittedly, this isn't much of an issue when just using foreach, but might be an issue with any kind of STL-like algorithms library. (Which is what you're inviting when using C++'s iterator semantics.)
> 
> I still think it isn't necessary, but you might be thinking of a use case I haven't. Can you provide an example?

The only thing that comes to mind is if you get multiple indirections. For instance, in C++ I frequently end up creating things like vectors of lists or lists of vectors.  Or vectors of pointers.  Or sets of list iterators. Then to actually use the things I need two dereferences.

But with built-in GC, you don't really need smart pointers so often. And with "dot-is-all-you-need" that also eliminates many cases for dereferences.

But I'm still sure you could come up with situations where, to get at the thing pointed to, you'd need a string of [0]'s:

     iteriteriter[0][0][0]

Actually, in some ways it's an improvement over the prefix deref operator because it reads consistently from left to right.  In C++, if you had an iterator to a vector of iterators and you wanted to deref it, you'd need:

     *((*iterveciter)[i])

Or maybe the inner parens are unnecessary.  I can't recall, which is another reason why using postfix for everything is better -- no question about precedence rules.
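
For reference, postfix [] binds tighter than prefix * in both C++ and D, so in the expression above the parentheses around *iterveciter are required and the outermost pair is redundant. A small sketch with plain pointers standing in for iterators (an assumption for illustration, built with a present-day D compiler) shows the prefix and postfix spellings side by side:

```d
void main()
{
    int x = 7;
    int* p = &x;
    int** pp = &p;
    int*** ppp = &pp;

    // Postfix form reads left to right; prefix form stacks stars.
    assert(ppp[0][0][0] == 7);
    assert(***ppp == 7);

    // A pointer to an array of pointers, standing in for
    // "iterator to vector of iterator".
    int a = 1, b = 2;
    int*[] vec = [&a, &b];
    int*[]* pv = &vec;

    assert(*(*pv)[1] == 2);      // outer parentheses are not needed
    assert(*((*pv)[1]) == 2);    // fully parenthesized, same meaning
    assert(pv[0][1][0] == 2);    // postfix only, no precedence question
}
```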

So I'm sold on dereferencing being a postfix operation.  But I would still like it better if there were some way to get a compile-time error if I try to do:

    iter[2]  // Error! This iterator isn't random access!

Would introducing a special symbol be of any use?  Like [*]?

    iter[*]

It would just compile to opIndex(0), but it declares the intent "this is dereferencing, not arbitrary indexing".  I guess it's kind of pointless if there's no compiler support for enforcing its use.  I guess it would only be useful if there were an opDeref.
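
One way the compile-time error could fall out is sketched below. It assumes a present-day D compiler in which unary * can be overloaded through opUnary, which is not something the D of this discussion offers. Node and ForwardIter are hypothetical names. Because the forward-only iterator defines no opIndex at all, iter[2] simply fails to compile.

```d
// Hypothetical singly linked list node.
struct Node
{
    int payload;
    Node* next;
}

// Hypothetical forward-only iterator: dereference and step, nothing else.
struct ForwardIter
{
    Node* node;

    // Dereference. No opIndex is defined, on purpose.
    ref int opUnary(string op)() if (op == "*")
    {
        return node.payload;
    }

    // Step to the next node.
    void opUnary(string op)() if (op == "++")
    {
        node = node.next;
    }
}

void main()
{
    auto second = Node(20, null);
    auto first = Node(10, &second);
    auto it = ForwardIter(&first);

    assert(*it == 10);
    ++it;
    assert(*it == 20);

    *it = 21;                        // writes through the ref return
    assert(second.payload == 21);

    // Arbitrary indexing is rejected at compile time.
    static assert(!__traits(compiles, it[2]));
}
```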

Anyway, whatever you do with dereferencing -- I'm convinced you should absolutely not make it a unary prefix operator.  It should be postfix.

--bb
November 07, 2006
Walter Bright wrote:
> Bill Baxter wrote:
>> Sean Kelly wrote:
>>>> The problem is if the iterator is an actual pointer, there is no .value property for a pointer.
>>
>> Well, they could.  It's up to Mr. Compiler Writer if a pointer has a value property or not.
> 
> Consider the following struct:
> 
>     struct Foo
>     {
>         int value;
>     }
> 
> Foo* p;
> 
> p.value;
> 
> Is p.value the entire contents of Foo (as it would be for a proposed .value property) or just Foo.value? "value" is a pretty common field name.

It's a name conflict.  They happen.  I'd have trouble if I wanted to have a sizeof member in my struct too.  If you want to be sure to get the struct's member then use p[0].value.

'Value' is too common, though.  If you were going to go that route it should definitely be something like .iterVal or some other word less likely to conflict.

I'm not saying it's the right way to go, just that it could be made to work if needed.

--bb
November 07, 2006
Walter Bright wrote:
> Sean Kelly wrote:
> 
>> Is there really any reason to support pointers as iterators though?  C++ libraries even seem to be moving away from that towards more robust and less error-prone iterator objects.
> 
> 
> Can you be more specific about what the problems and solutions are?

For one thing, if pointers still are supported, then all algorithms have to be written so they can use them.

If we skip "real pointer compatibility", then we can decide on smarter iterators.

Example:

With pointer-compatible iterators, it is impossible to traverse a tree.

With smarter iterators one could.
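
As a rough illustration of what a "smarter" iterator could look like, here is an in-order walk over a binary tree that keeps its own stack of pending nodes. TreeNode and InOrderIter are hypothetical names, and the code assumes a present-day D compiler. A raw pointer has nowhere to keep this state, since incrementing it only moves to the next address.

```d
// Hypothetical binary tree node.
class TreeNode
{
    int key;
    TreeNode left, right;

    this(int key, TreeNode left = null, TreeNode right = null)
    {
        this.key = key;
        this.left = left;
        this.right = right;
    }
}

// In-order iterator: the stack holds ancestors still waiting to be visited.
struct InOrderIter
{
    private TreeNode[] stack;

    this(TreeNode root) { pushLeftSpine(root); }

    bool done() const { return stack.length == 0; }
    int value() { return stack[$ - 1].key; }

    void next()
    {
        auto node = stack[$ - 1];
        stack = stack[0 .. $ - 1];     // pop the visited node
        pushLeftSpine(node.right);     // then descend into its right subtree
    }

    private void pushLeftSpine(TreeNode n)
    {
        for (; n !is null; n = n.left)
            stack ~= n;
    }
}

void main()
{
    //        4
    //      /   \
    //     2     6
    //    / \   /
    //   1   3 5
    auto root = new TreeNode(4,
        new TreeNode(2, new TreeNode(1), new TreeNode(3)),
        new TreeNode(6, new TreeNode(5)));

    int[] seen;
    for (auto it = InOrderIter(root); !it.done; it.next())
        seen ~= it.value;

    assert(seen == [1, 2, 3, 4, 5, 6]);
}
```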
November 07, 2006
Bill Baxter wrote:
> Sean Kelly wrote:
>>>>
>>>> Would it be nicer to just use a .value property instead?
>>>>
>>>> foreach(i; foo)
>>>> {
>>>>     auto x = i.value;
>>>>     i.value = x + 1;
>>>> } 
>>>
>>> The problem is if the iterator is an actual pointer, there is no .value property for a pointer.
> 
> Well, they could.  It's up to Mr. Compiler Writer if a pointer has a value property or not.
> 
>> Is there really any reason to support pointers as iterators though?  C++ libraries even seem to be moving away from that towards more robust and less error-prone iterator objects.
> 
> Agreed.  I don't think I've ever actually used a raw pointer with a fancy STL algorithm.  I was actually really surprised the first time I saw it in the examples on the SGI STL page.  Other than those examples, I've never seen it in real code.

I've used it a few times, but generally more when I'm throwing together a quick test than writing actual production code.  But since D arrays aren't the same as C++ arrays and we have the option of getting a robust object back (via a.begin() or whatever), why not take it?


Sean
November 07, 2006
Walter Bright wrote:
> Sean Kelly wrote:
>> Is there really any reason to support pointers as iterators though?  C++ libraries even seem to be moving away from that towards more robust and less error-prone iterator objects.
> 
> Can you be more specific about what the problems and solutions are?

A direct application of pointers as iterators couldn't be done exactly the C++ way because C++ uses traits templates to describe the operations an iterator supports.  And D will not overload templates from different modules, so the user could not provide overloads for his own iterators.  This isn't a huge problem however, as we could probably write a set of generic traits templates that apply to classes/structs implemented the 'right' way rather than requiring the user to provide his own.  This would be roughly similar to how a user can derive from std::iterator<> to add the required typedefs to his object in C++.

More generally though, I find the need to use two iterators to identify a range to be somewhat cumbersome and annoying.  And there have been enough others who feel this way that there has been at least an informal proposal for "all in one" iterators for C++.  I haven't bothered to check whether it was ever formalized though.

Basically, I'm wondering whether it might be easier and more D-like to use iterators for stepping and opIndex for random access, instead of iterators for everything.  Since D has slicing, the argument for using iterators to define the boundaries of a range of randomly accessible elements seems kind of small to me, i.e.:

    sort( a.begin, a.begin + 5 ); // sort the first 5 elements
or
    sort( a[0 .. 5] );

I find the second to be cleaner to look at.  But I'm undecided whether we'd lose any power or flexibility by not bothering with random access iterators.
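
For what it's worth, the slice form can be tried as written with the sort from std.algorithm in present-day Phobos (an assumption, since that is not the library under discussion here). A slice is a view into the same memory, so sorting it reorders the first five elements of the original array in place:

```d
import std.algorithm : sort;

void main()
{
    int[] a = [5, 3, 4, 1, 2, 9, 0];

    sort(a[0 .. 5]);   // sort the first 5 elements only

    // The tail past index 5 is untouched.
    assert(a == [1, 2, 3, 4, 5, 9, 0]);
}
```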


Sean
November 07, 2006
When writing custom C++ iterators, I find that end() is never necessary. If end() is not used, it means that a little more smarts have to be added to the iterator itself so that the iterator knows when to stop.  In some cases this means that the iterator needs a pointer to the collection/container object in order to get that information.

-Craig

"Walter Bright" <newshound@digitalmars.com> wrote in message news:eioqvl$26t0$2@digitaldaemon.com...
> Bill Baxter wrote:
> > I think the iterator-as-range idea is at least worth discussing.  99% of the time you use an iterator you also need to know where to stop.  It's pretty similar to how arrays in D keep a length, because 99% of the time when you have an array you need to know how long it is too.  So it makes sense to keep those two bits of information together.  Similarly in C++, when you have an iterator, you almost always need the end() too.  And any time you want to pass around iterators you end up having to pass the two bits of information around separately.
>
> That's a very good point. Got any ideas on that?


November 07, 2006
Craig Black wrote:
> When writing custom C++ iterators, I find that end() is not ever necessary.
> If end() is not used, it means that a little more smarts have to be added to
> the iterator itself so that the iterator knows when to stop.  In some cases
> this means that the iterator needs a pointer to the collection/container
> object in order to get that information.

Also in some cases, 'end()' is really unnatural to write. E.g. with cyclic containers.
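
A minimal sketch of the cyclic case, assuming a present-day D compiler and its empty/front/popFront convention for foreach (which postdates this discussion). RingCursor is a hypothetical name. The cursor carries the container and a count of elements left, so it knows when one lap is complete and no end() position is ever needed:

```d
// Hypothetical cursor over a ring of elements that stops after one full lap.
struct RingCursor
{
    int[] ring;         // the container, kept so the cursor knows its bounds
    size_t pos;         // current index into ring
    size_t remaining;   // elements left to visit in this lap

    this(int[] ring, size_t start)
    {
        this.ring = ring;
        this.pos = start;
        this.remaining = ring.length;
    }

    bool empty() const { return remaining == 0; }
    int front() const { return ring[pos]; }

    void popFront()
    {
        pos = (pos + 1) % ring.length;   // wrap around at the end of storage
        --remaining;
    }
}

void main()
{
    int[] seen;
    foreach (x; RingCursor([10, 20, 30, 40], 2))
        seen ~= x;

    // One lap, starting at index 2 and wrapping around.
    assert(seen == [30, 40, 10, 20]);
}
```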