View mode: basic / threaded / horizontal-split · Log in · Help
August 28, 2008
Re: Why Strings as Classes? [C++ iterator]
On 2008-08-28 00:24:50 +0200, Dee Girl <deegirl@noreply.com> said:

> Derek Parnell Wrote:
> 
>> On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:
>> 
>>> The way I see it, encapsulation is all about the black box idea. And the
>>> only things you can see from outside the black box are the inputs and
>>> outputs.
>> 
>> Well said.
> 
> I am sorry I will say my opinion. This sounds good but is simplistic. 
> Black box is good in principle. But it assume you found right interface 
> for the black box. If you define bad interface you have a bad black 
> box. Also please remember that iterator is black box also. But it 
> defines right interface.

I agree with the meaning, but I disagree with the example.
I think that iterators are an example of bad interface, as also others 
brought up the iterator as good example I though that I should say 
something.

An iterator should be like a generator, have a method next, and one 
at_end or something similar packaged (and maybe prev() and at_start() 
if it can also go back) in a single struct, furthermore it should work 
seamlessly with a kind of for_each(x;iterator) construct.

Instead C++ choose to have begin & end iterators, simply because with 
that construct it is trivial for the compiler to optimize it for 
arrays, and you can use pointers as iterators without a 
cast/constructor.

This means a worse interface for 99% of the uses, apart form arrays and 
vectors I think one is better off without end iterator, and even when 
this is not the case writing something like for_each(x;FromTo(a,b)), 
with FromTo constructing a generator is (I think) better than 
for(i=t.begin();i!=t.end();++t), and the implementation of an a 
generator itself is easier (no ==,!=,increment, decrement(pre/post),... 
for performance reasons)

As I believe that the optimizations to make the better interface be as 
efficient as the iterator one are perfectly doable (some work, yes, but 
not extremely much so), I see no good reason for C++ design.

Fawzi
August 28, 2008
Re: Why Strings as Classes? [C++ iterator]
On 2008-08-28 11:47:22 +0200, Fawzi Mohamed <fmohamed@mac.com> said:

> On 2008-08-28 00:24:50 +0200, Dee Girl <deegirl@noreply.com> said:
> 
>> Derek Parnell Wrote:
>> 
>>> On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:
>>> 
>>>> The way I see it, encapsulation is all about the black box idea. And the
>>>> only things you can see from outside the black box are the inputs and
>>>> outputs.
>>> 
>>> Well said.
>> 
>> I am sorry I will say my opinion. This sounds good but is simplistic. 
>> Black box is good in principle. But it assume you found right interface 
>> for the black box. If you define bad interface you have a bad black 
>> box. Also please remember that iterator is black box also. But it 
>> defines right interface.
> 
> I agree with the meaning, but I disagree with the example.
> I think that iterators are an example of bad interface, as also others 
> brought up the iterator as good example I though that I should say 
> something.
> 
> An iterator should be like a generator, have a method next, and one 
> at_end or something similar packaged (and maybe prev() and at_start() 
> if it can also go back) in a single struct, furthermore it should work 
> seamlessly with a kind of for_each(x;iterator) construct.
> 
> Instead C++ choose to have begin & end iterators, simply because with 
> that construct it is trivial for the compiler to optimize it for 
> arrays, and you can use pointers as iterators without a 
> cast/constructor.
> 
> This means a worse interface for 99% of the uses, apart form arrays and 
> vectors I think one is better off without end iterator, and even when 
> this is not the case writing something like for_each(x;FromTo(a,b)), 
> with FromTo constructing a generator is (I think) better than 
> for(i=t.begin();i!=t.end();++t), and the implementation of an a 
> generator itself is easier (no ==,!=,increment, decrement(pre/post),... 
> for performance reasons)
> 
> As I believe that the optimizations to make the better interface be as 
> efficient as the iterator one are perfectly doable (some work, yes, but 
> not extremely much so), I see no good reason for C++ design.
> 
> Fawzi

Please note before the discussion gets messy, that I find the hierarchy 
and kinds of iterators in c++ well done, (input,output,...random), but 
the interface of them (especially the most used) flawed.
August 28, 2008
Re: Why Strings as Classes?
Nick Sabalausky wrote:
> "Don" <nospam@nospam.com.au> wrote in message 
> news:g95ks5$2aon$1@digitalmars.com...
>> Nick Sabalausky wrote:
>>> "Dee Girl" <deegirl@noreply.com> wrote in message 
>>> news:g94j7a$2875$1@digitalmars.com...
>>>> I appreciate your view point. Please allow me explain. The view point is 
>>>> in opposition with STL. In STL each algorithm defines what kind of 
>>>> iterator it operates with. And it requires what iterator complexity.
>>>>
>>>> I agree that other design can be made. But STL has that design. In my 
>>>> opinion is much part of what make STL so successful.
>>>>
>>>> I disagree that algorithm that knows complexity of iterator is concrete. 
>>>> I think exactly contrary. Maybe it is good that you read book about STL 
>>>> by Josuttis. STL algorithms are the most generic I ever find in any 
>>>> language. I hope std.algorithm in D will be better. But right now 
>>>> std.algorithm works only with array.
>>>>
>>>>> If an algoritm uses [] and doesn't know the
>>>>> complexity of the []...good! It shouldn't know, and it shouldn't care. 
>>>>> It's
>>>>> the code that sends the collection to the algoritm that knows and 
>>>>> cares.
>>>> I think this is mistake. Algorithm should know. Otherwise "linear find" 
>>>> is not "linear find"! It is "cuadratic find" (spell?). If you want to 
>>>> define something called linear find then you must know iterator 
>>>> complexity.
>>>>
>>> If a generic algorithm describes itself as "linear find" then I know damn 
>>> well that it's referring to the behavior of *just* the function itself, 
>>> and is not a statement that the function *combined* with the behavior of 
>>> the collection and/or a custom comparison is always going to be O(n).
>>>
>>> A question about STL: If I create a collection that, internally, is like 
>>> a linked list, but starts each indexing operation from the position of 
>>> the last indexing operation (so that a "find first" would run in O(n) 
>>> instead of O(n*n)), is it possible to send that collection to STL's 
>>> generic "linear find first"? I would argue that it should somehow be 
>>> possible *even* if the STL's generic "linear find first" guarantees a 
>>> *total* performance of O(n) (Since, in this case, it would still be O(n) 
>>> anyway). Because otherwise, the STL wouldn't be very extendable, which 
>>> would be a bad thing for a library of "generic" algorithms.
>> Yes, it will work.
>>
>>> Another STL question: It is possible to use STL to do a "linear find" 
>>> using a custom comparison? If so, it is possible to make STL's "linear 
>>> find" function use a comparison that just happens to be O(n)? If so, 
>>> doesn't that violate the linear-time guarantee, too? If not, how does it 
>>> know that the custom comparison is O(n) instead of O(1) or O(log n)?
>> This will work too.
>>
>> IF you follow the conventions THEN the STL gives you the guarantees.
> 
> I'm not sure that's really a "guarantee" per se, but that's splitting hairs.
> 
> In any case, it sounds like we're all arguing more or less the same point:
> 
> Setting aside the issue of "should opIndex be used and when?", suppose I 
> have the following collection interface and find function (not guaranteed to 
> compile):
> 
> interface ICollection(T)
> {
>     T getElement(index);
>     int getSize();
> }
> 
> int find(T)(ICollection(T) c, T elem)
> {
>     for(int i=0; i<c.size(); i++)
>     {
>  if(c.getElement(i) == elem)
>             return i;
>     }
> }
> 
> It sounds like STL's approach is to do something roughly like that and say:
> 
> "find()'s parameter 'c' should be an ICollection for which getElement() is 
> O(1), in which case find() is guaranteed to be O(n)"
> 
> What I've been advocating is, again, doing something like the code above and 
> saying:
> 
> "find()'s complexity is dependant on the complexity of the ICollection's 
> getElement(). If getElement()'s complexity is O(m), then find()'s complexity 
> is guaranteed to be O(m * n). Of course, this means that the only way to get 
> ideal complexity from find() is to use an ICollection for which getElement() 
> is O(1)".
> 
> But, you see, those two statements are effectively equivilent.

They are. But...
if you don't adhere to the conventions, your code gets really hard to 
reason about.

"This class has an opIndex which is in O(n). Is that OK?" Well, that 
depends on what it's being used for. So you have to look at all of the 
places where it is used.

It's much simpler to use the convention that opIndex _must_ be fast; 
this way the performance requirements for containers and algorithms are 
completely decoupled from each other. It's about good design.
August 28, 2008
Benchmark Design (was: Why Strings as Classes?)
bearophile wrote:

>> Biggest of all: using the wrong tool. I.e. using a hash map for a
>> maximal populated key range.
> 
> But if the hash machinery is good it must work well in this common
> situation too. 

This statement seems to be as true as the statement:

   But if a house is good it must perform well
   in the common situation of an off shore
   speed boat race.

The problem with your approach is, that you have close to no idea, 
whether your candidates are speed boats or houses. The only thing you 
know seems to be, that in both candidates some humans can live for some 
time.

Your first design seems to have placed both candidates off shore. Now 
you have introduced some randomness in the location.

However, you might only be designing some overly complicated tool for 
computing the percentage of landmass not covered by water on some  
random planet. 

-manfred

--
If life is going to exist in this Universe, then the one thing it 
cannot afford to have is a sense of proportion. (Douglas Adams)
August 28, 2008
Re: Why Strings as Classes? [C++ iterator]
On 2008-08-28 05:47:22 -0400, Fawzi Mohamed <fmohamed@mac.com> said:

> An iterator should be like a generator, have a method next, and one 
> at_end or something similar packaged (and maybe prev() and at_start() 
> if it can also go back) in a single struct, furthermore it should work 
> seamlessly with a kind of for_each(x;iterator) construct.

I perfectly agree with this. This is why I prefer much Objective-C's 
NSEnumerator over C++ iterators: they're much simpler to use.

Some time ago, Walter asked if he could change D arrays to have 
internally an end pointer instead of a length value. He said it was to 
allow arrays, or slices, to be used as iterator efficiently. I hope he 
hasn't given up on that idea since slices are much easier to understand 
and use than C++-style iterators.


> Instead C++ choose to have begin & end iterators, simply because with 
> that construct it is trivial for the compiler to optimize it for 
> arrays, and you can use pointers as iterators without a 
> cast/constructor.

I would add that if you are using many iterators for the same array 
(vector) in a given algorithm or in function parameters, having only 
one "end" pointer save some space.


-- 
Michel Fortin  
michel.fortin@michelf.com  
http://michelf.com/
August 28, 2008
Re: Why Strings as Classes?
bearophile wrote:
> Tomas Lindquist Olsen:
>> Might I ask where that site is?
> 
> I have sent you an email with information and more things, etc.
> 
> Bye,
> bearophile

Got it. Thanx :) I'll give it a go over the weekend :)

Tomas
August 28, 2008
Re: Prototypes (was: Why Strings as Classes?)
On 2008-08-28 13:33:47 +0200, "Manfred_Nowak" <svv1999@hotmail.com> said:

> Walter Bright wrote:
> 
>> I invite you to look at the code in internal/aaA.d and do some
>> testing!
> 
> This invitation is a red herring without the offer to change the
> language, because there exists no implementation for AA covering all
> possible use cases.
> 
> The bare minimum to get anything out of fiddling with the
> implemenations for AA is the posibility to use the results of adaptions
> without considerable overhead especially for the declarations. However,
> currenty I do not see any elegant solution, because types of
> implementation of maps are given implicitely.
> 
> At least something like prototyping seems to be necessary:
> 
> |    int[ char[]] map;
> |    Prototype= typeof( map);
> |    Prototype.implementation= MyAA;
> 
> where MyAA is some class type implementing the interface required for
> AA.
> 
> Are you willing to do something in this direction?

I think that the invitation should be read as the possibility to 
experiment with some changes for AA, see their effect, and if 
worthwhile provide them back, so that they can be applied to the 
"official" version.

making the standard version changeable seems just horrible form the 
portability and maintainability and clarity of the code: if the 
standard version is not ok for your use you should explicitly use 
another one, otherwise mixing codes that use two different standard 
versions becomes a nightmare.
On the other hand if you think that you can improve the standard 
version for everybody, changing internal/aaA.d is what you should do...

Fawzi
> 
> -manfred
August 28, 2008
Re: Why Strings as Classes?
"Nick Sabalausky" wrote
> "Steven Schveighoffer" wrote in message 
> news:g95b89$1jii$1@digitalmars.com...
>>
>> When someone sees an index operator the first thought is that it is a 
>> quick lookup.
>
> This seems to be a big part of the disagreement. Personally, I think it's 
> insane to look at [] and just assume it's a cheap lookup. That's was true 
> in pure C where, as someone else said, it was nothing more than a 
> shorthand for a specific pointer arithmentic operation, but carrying that 
> idea over to more modern languages (especially one that also uses [] for 
> associative arrays) is a big mistake.

Perhaps it is a mistake to assume it, but it is a common mistake.  And the 
expectation is intuitive.  You don't see people making light switches that 
look like outlets, even though it's possible.  You might perhaps make a 
library where opIndex is a linear search in your list, but I would expect 
that people would not use that indexing feature correctly.  Just as if I 
plug my lamp into the light switch that looks like an outlet, I'd expect it 
to get power, and be confused when it doesn't.  Except the opIndex mistake 
is more subtle because I *do* get what I actually want, but I just am not 
realizing the cost of it.

> The '+' operator means "add". Addition is typically O(1). But vectors can 
> be added, and that's an O(n) operation. Should opAdd never be used for 
> vectors?

As long as it's addition, I have no problem with O(n) operation (and it 
depends on your view of n).  Indexing implies speed, look at all other cases 
where indexing is used.  For addition to be proportional to the size of the 
element is natural and expected.

>> You can force yourself to think differently, but the reality is that most 
>> people think that because of the universal usage of square brackets 
>> (except for VB, and I feel pity for anyone who needs to use VB) to mean 
>> 'lookup by key', and usually this is only useful on objects where the 
>> lookup is quick ( < O(n) ).  Although there is no requirement, nor 
>> enforcement, the 'quick' contract is expected by the user, no matter how 
>> much docs you throw at them.
>>
>> Look, for instance, at Tango's now-deprecated LinkMap, which uses a 
>> linked-list of key/value pairs (copied from Doug Lea's implementation). 
>> Nobody in their right mind would use link map because lookups are O(n), 
>> and it's just as easy to use a TreeMap or HashMap.  Would you ever use 
>> it?
>>
>>> If you *really* need that sort of guarantee (and I can imagine it may be 
>>> useful in some cases), then the implementation of the guarantee does 
>>> *not* belong in the realm of "implements vs doesn't-implement a 
>>> particular operator overload". Doing so is an abuse of operator 
>>> overloading, since operator overloading is there for defining syntactic 
>>> sugar, not for acting as a makeshift contract.
>>
>> I don't think anybody is asking for a guarantee from the compiler or any 
>> specific tool.  I think what we are saying is that violating the 'opIndex 
>> is fast' notion is bad design because you end up with users thinking they 
>> are doing something that's quick.  You end up with people posting 
>> benchmarks on your containers saying 'why does python beat the pants off 
>> your list implementation?'.  You can say 'hey, it's not meant to be used 
>> that way', but then why can the user use it that way?  A better design is 
>> to nudge the user into using a correct container for the job by only 
>> supporting operations that make sense on the collections.
>>
>> And as far as operator semantic meaning, D's operators are purposely 
>> named after what they are supposed to do.  Notice that the operator for + 
>> is opAdd, not opPlus.  This is because opAdd is supposed to mean you are 
>> performing an addition operation.  Assigning a different semantic meaning 
>> is not disallowed by the compiler, but is considered bad design.  opIndex 
>> is supposed to be an index function, not a linear search.  It's not 
>> called opSearch for a reason.  Sure you can redefine it however you want 
>> semantically, but it's considered bad design.  That's all we're saying.
>>
>
> Nobody is suggesting using [] to invoke a search (Although we have talked 
> about using [] *in the search function's implementation*). Search means 
> you want to get the position of a given element, or in other words, 
> "element" -> search -> "key/index". What we're talking about is the 
> reverse: getting the element at a given position, ie, "key/index" -> [] -> 
> "element". It doesn't matter if it's an array, a linked list, or a 
> super-duper-collection-from-2092: that's still indexing, not searching.

It is a search, you are searching for the element whose key matches.  When 
the key can be used to lookup the element efficiently, we call that 
indexing.  Indexing is a more constrained form of searching.

-Steve
August 28, 2008
Re: Benchmark Design (was: Why Strings as Classes?)
Manfred_Nowak:
> This statement seems to be as true as the statement:<

No, it's a different statement.
I'll try to post another more real-looking benchmark later, anyway.

In the meantime I'll keep using Python for many of my purposes where I need hash maps and sets (and regular expressions, etc), instead of D, because in tons of real-world tests I have seen Python dicts are quite faster.

Bye,
bearophile
August 28, 2008
Re: Why Strings as Classes? [C++ iterator]
2008/8/28 Fawzi Mohamed <fmohamed@mac.com>:
> On 2008-08-28 00:24:50 +0200, Dee Girl <deegirl@noreply.com> said:
>
>> Derek Parnell Wrote:
>>
>>> On Wed, 27 Aug 2008 17:08:47 -0400, Nick Sabalausky wrote:
>>>
>>>> The way I see it, encapsulation is all about the black box idea. And the
>>>> only things you can see from outside the black box are the inputs and
>>>> outputs.
>>>
>>> Well said.
>>
>> I am sorry I will say my opinion. This sounds good but is simplistic.
>> Black box is good in principle. But it assume you found right interface for
>> the black box. If you define bad interface you have a bad black box. Also
>> please remember that iterator is black box also. But it defines right
>> interface.
>
> I agree with the meaning, but I disagree with the example.
> I think that iterators are an example of bad interface, as also others
> brought up the iterator as good example I though that I should say
> something.
>
> An iterator should be like a generator, have a method next, and one at_end
> or something similar packaged (and maybe prev() and at_start() if it can
> also go back) in a single struct, furthermore it should work seamlessly with
> a kind of for_each(x;iterator) construct.
>
> Instead C++ choose to have begin & end iterators, simply because with that
> construct it is trivial for the compiler to optimize it for arrays, and you
> can use pointers as iterators without a cast/constructor.
>
> This means a worse interface for 99% of the uses, apart form arrays and
> vectors I think one is better off without end iterator, and even when this
> is not the case writing something like for_each(x;FromTo(a,b)), with FromTo
> constructing a generator is (I think) better than
> for(i=t.begin();i!=t.end();++t), and the implementation of an a generator
> itself is easier (no ==,!=,increment, decrement(pre/post),... for
> performance reasons)

It is worse in simple "for each element" uses, but one of the cooler
things about C++ iterators is exactly that they are exchangeable for
any pointer-like object, and that it's trivial to iterate over a
subset of a container. Emulating pointers means that in some ways,
once you grok how they work, they are more trivial than e.g. Python or
D since there is no further specialised syntax beyond "*", "++", and
maybe "--".

I love the concise D and Python syntaxes, just playing devil's advocate here. :)


>
> As I believe that the optimizations to make the better interface be as
> efficient as the iterator one are perfectly doable (some work, yes, but not
> extremely much so), I see no good reason for C++ design.
>
> Fawzi
>
>
14 15 16 17 18 19 20 21
Top | Discussion index | About this forum | D home