March 21, 2012
On Thursday, March 22, 2012 10:39:23 Daniel Murphy wrote:
> "Jonathan M Davis" <jmdavisProg@gmx.com> wrote in message news:mailman.985.1332364578.4860.digitalmars-d@puremagic.com...
> 
> > I know that. Much point is that length == 0 is a bad thing to do in
> > general,
> > because it's ineffecient with some containers. The language itself is
> > pretty
> > much irrelevant as far as that goes. As such, I'd argue in pretty much
> > _any_
> > language that using length == 0 instead of empty is _not_ a good habit to
> > be
> > in. Doing it with arrays will make it much more likely that you'll end up
> > doing it with containers without thinking about it. On the other hand, if
> > you're in the habit of _always_ using empty rather than length == 0, then
> > you
> > don't have the problem.
> 
> Yes, .length is inefficient on some containers... but so is indexing. That doesn't mean using indexing on arrays is a bad habit.

Except that containers shouldn't provide indexing if it's not efficient. And from what I've seen, it's far too common for programmers to check length == 0, and they end up doing it on stuff like linked lists where it _is_ inefficient. It's considered good practice in C++ to use empty rather than length for exactly the reasons that I've listed. D is no different in this regard.

> If you're writing general code for ranges, you are going to have to use the
> range interface. But when you're writing code for arrays only, you can take
> advantage of the fact that indexing is O(1), length is O(1), slicing is
> O(1) etc.

True, but save for length, those should all be O(1) (or at maybe O(log n) at the worst). So, save for length, those operations are supposed to be efficient for _any_ container.

And yes, you can do whatever you want with arrays without regard for other containers if you want to. My point is that if you're in the habit of using length == 0, you'll end up using it with stuff _other_ than arrays - including containers where it's very inefficient. It doesn't help that it seems to be many programmers' natural instinct to check whether a container's length is 0 rather than checking whether it's empty.

Using empty rather than length == 0 is all about getting into the habit of doing things the efficient way so that you don't have to think about it, and you're less likely to make mistakes. It's the same as why you should always use the pre-increment operator in C++ if it doesn't matter whether you use pre-increment or post-increment (D's design was smart enough to avoid the problem). It's a habit which doesn't cost you anything and promotes efficient code. But length == 0 is far worse, because it can result in an operation costing O(n) instead of O(1) rather than having a constant cost added to it.

> I'm not even advocating using arr.length == 0 to check for empty. Just use
> `if (!arr)`.

I don't like that either, because of the potential ambiguity for newbies (given the whole null vs empty mess and the fact that if(var) checks for null with all other reference types), but at least that's not going to cost you anything in terms of efficiency, whereas if you're in the habit of always using length == 0, you're likely to use it when it would really cost you. So, if(arr) is more of a style choice and is more debatable, I think, whereas I'd consider the issue of length == 0 vs empty to be more black and white.

- Jonathan M Davis
March 22, 2012
On Wed, 21 Mar 2012 19:56:41 -0400, Jonathan M Davis <jmdavisProg@gmx.com> wrote:


> Except that containers shouldn't provide indexing if it's not efficient. And
> from what I've seen, it's far too common for programmers to check length == 0,
> and they end up doing it on stuff like linked lists where it _is_ inefficient.

Wait, if someone provides inefficient length, what makes you think they won't provide inefficient indexing?

-Steve
March 22, 2012
On Wednesday, March 21, 2012 20:17:06 Steven Schveighoffer wrote:
> On Wed, 21 Mar 2012 19:56:41 -0400, Jonathan M Davis <jmdavisProg@gmx.com>
> 
> wrote:
> > Except that containers shouldn't provide indexing if it's not efficient.
> > And
> > from what I've seen, it's far too common for programmers to check length
> > == 0,
> > and they end up doing it on stuff like linked lists where it _is_
> > inefficient.
> 
> Wait, if someone provides inefficient length, what makes you think they won't provide inefficient indexing?

Both C++'s STL and D's std.container put requirements on the algorithmic complexity of various operations. O(n) indexing would violate them, whereas they allow O(n) length/size. So, as long as you use standard containers, you can rely on the efficiency of indexing but not that of length/size. And I'd argue that 3rd party containers which violate those algorithmic requirements are generally doing something wrong (possibly with some exceptions for specific situations but not in general).

So, as long as you use standard containers, it's a non-issue. And if you're dealing with a 3rd party library that doesn't provide appropriate algorithmic guarantees for its containers, you should probably rethink using it.

It _is_ true that some languages and libraries are completely idiotic about it though (e.g. Java providing a get function to LinkedList which takes an index - which would be the equivalent of providing horribly efficient indexing if they had operator overloading in Java). Those are broken APIs IMHO though, in which case, you must tread carefully. Well-designed libraries _do_ guarantee efficient indexing.

- Jonathan M Davis
March 22, 2012
On Wednesday, March 21, 2012 20:46:05 Jonathan M Davis wrote:
> On Wednesday, March 21, 2012 20:17:06 Steven Schveighoffer wrote:
> > On Wed, 21 Mar 2012 19:56:41 -0400, Jonathan M Davis <jmdavisProg@gmx.com>
> > 
> > wrote:
> > > Except that containers shouldn't provide indexing if it's not efficient.
> > > And
> > > from what I've seen, it's far too common for programmers to check length
> > > == 0,
> > > and they end up doing it on stuff like linked lists where it _is_
> > > inefficient.
> > 
> > Wait, if someone provides inefficient length, what makes you think they won't provide inefficient indexing?
> 
> Both C++'s STL and D's std.container put requirements on the algorithmic
> complexity of various operations. O(n) indexing would violate them, whereas
> they allow O(n) length/size.

Actually, it looks like std.container _does_ guarantee that length is O(1), unlike C++'s STL, in which case it's not the same issue that it is in C++. I'd still tend to argue that checking for empty is better, but I guess that it's not as big a deal. In C++, it's a definite problem when programmers keep checking that size() == 0, since inevitably, it ends up happening with linked lists and the like, making for inefficient code, and simply being in the habit of using empty() rather than size() == 0 avoids such problems.

- Jonathan M Davis
March 22, 2012
On 3/21/12 6:56 PM, Jonathan M Davis wrote:
> Except that containers shouldn't provide indexing if it's not efficient. And
> from what I've seen, it's far too common for programmers to check length == 0,
> and they end up doing it on stuff like linked lists where it _is_ inefficient.
> It's considered good practice in C++ to use empty rather than length for
> exactly the reasons that I've listed. D is no different in this regard.

In D it's poor style to define linear-time length. That's called walkLength.

Andrei
March 22, 2012
On 3/21/12 7:53 PM, Jonathan M Davis wrote:
> Actually, it looks like std.container _does_ guarantee that length is O(1),
> unlike C++'s STL, in which case it's not the same issue that it is in C++.

Here we learned from a small mistake of C++. (BTW it's O(log n).)

Andrei

March 22, 2012
On Wednesday, March 21, 2012 21:12:36 Andrei Alexandrescu wrote:
> On 3/21/12 7:53 PM, Jonathan M Davis wrote:
> > Actually, it looks like std.container _does_ guarantee that length is O(1), unlike C++'s STL, in which case it's not the same issue that it is in C++.
> Here we learned from a small mistake of C++.

Yes. It looks that way. I'd forgotten that length had better guarantees in D than in C++.

> (BTW it's O(log n).)

What's O(log n)? length in D? According to std.container, it's supposed to
be O(1).

- Jonathan M Davis

March 22, 2012
On Wed, 21 Mar 2012 20:53:42 -0400, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Wednesday, March 21, 2012 20:46:05 Jonathan M Davis wrote:
>> On Wednesday, March 21, 2012 20:17:06 Steven Schveighoffer wrote:
>> > On Wed, 21 Mar 2012 19:56:41 -0400, Jonathan M Davis  
>> <jmdavisProg@gmx.com>
>> >
>> > wrote:
>> > > Except that containers shouldn't provide indexing if it's not  
>> efficient.
>> > > And
>> > > from what I've seen, it's far too common for programmers to check  
>> length
>> > > == 0,
>> > > and they end up doing it on stuff like linked lists where it _is_
>> > > inefficient.
>> >
>> > Wait, if someone provides inefficient length, what makes you think  
>> they
>> > won't provide inefficient indexing?
>>
>> Both C++'s STL and D's std.container put requirements on the algorithmic
>> complexity of various operations. O(n) indexing would violate them, whereas
>> they allow O(n) length/size.
>
> Actually, it looks like std.container _does_ guarantee that length is O(1),
> unlike C++'s STL, in which case it's not the same issue that it is in C++. I'd
> still tend to argue that checking for empty is better, but I guess that it's
> not as big a deal. In C++, it's a definite problem when programmers keep
> checking that size() == 0, since inevitably, it ends up happening with linked
> lists and the like, making for inefficient code, and simply being in the habit
> of using empty() rather than size() == 0 avoids such problems.

Yes, that was my point.

Note that the default std::list has O(1) length (as does dcollections' LinkedList).  It's not as inevitable as you think.

-Steve
March 22, 2012
On Thursday, March 22, 2012 07:12:22 Steven Schveighoffer wrote:
> Note that the default std::list has O(1) length (as does dcollections'
> LinkedList). It's not as inevitable as you think.

It depends on bothe container its implementation as to how efficient length/size is (and in the case of linked lists, it's generally a question of whether you want splicing to be efficient or size/length to be efficient). But you don't have to worry about how efficient length/size's implementation is when checking whether a container is empty if you just always use empty rather than length == 0, so it's a good habit to get into. There's generally no advantage to using length == 0 over empty.

There's a slight advantage with D and arrays only in that empty isn't built into arrays, so you have to import std.array to use empty, and empty is just a wrapper around length == 0, which will then be _slightly_ less efficient without -inline, but I'd still argue for using empty, because it avoids the whole issue of length/size's efficiency.

Regardless, fortunately, it's not as big an issue in D as in C++ thanks to the fact that Phobos defines length to be O(1) - and it's good that dcollections does the same.

- Jonathan M Davis
March 25, 2012
On 2012-03-21 23:25, Simen Kjærås wrote:

> I hope you mean canFind!("true")([3, 4, 5]);. canFind!"a" fails for
> arrays where all elements are 0.

Yes. I'm not really familiar with canFind or std.algorithms in general.

-- 
/Jacob Carlborg
1 2 3
Next ›   Last »