October 08, 2010 Re: "in" everywhere | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer Wrote:
> On Thu, 07 Oct 2010 21:18:56 -0400, Juanjo Alvarez <fake@fakeemail.com> wrote:
>
> > On Thu, 7 Oct 2010 15:53:13 -0700, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> >> Except that when you're dealing with generic code which has to deal
> > with
> >> multiple container types (like std.algorithm), you _need_ certain
> > complexity
> >> guarantees about an operation since it could happen on any
> > container that it's
> >
> > Then, don't use it in std.algorithm or any other code that needs guaranteed complexity, just like now. I don't see the problem with a generic "in" operator, nobody would be forced to use it.
>
> That kind of "documentation" is useless, it doesn't prevent use, and it doesn't feel right to the person who accidentally uses it. When I call
>
> sort(x);
>
> and it performs horribly, am I going to blame x or sort? Certainly, I'll never think it's my own fault :)
>
> -Steve
True! And that's the only drawback I see on generalizing "in", but there are many things in programming languages that doesn't feel right when you don't
know the language well. That doesn't mean that D should be the "programming for dummies on rails with a constant automated tutor included" language; if I
read well the site, it is mean to be a practical language with the ability to shot yourself in the foot.
Still, I don't understand how generalizing "in" could affect std.algorithm et al if they only use "in" for AAs, just like now.
| |||
October 08, 2010 Re: "in" everywhere | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Juanjo Alvarez | Juanjo Alvarez <juanjux@gmail.com> wrote: > True! And that's the only drawback I see on generalizing "in", but there are many things in programming languages that doesn't feel right when you don't > know the language well. That doesn't mean that D should be the "programming for dummies on rails with a constant automated tutor included" language; if I > read well the site, it is mean to be a practical language with the ability to shot yourself in the foot. Absolutely. And one of its trademarks is being as fast as C. Now, C clearly does not have the 'in' operator, but it is a goal in D that the obvious way to do something should be fast and correct. > Still, I don't understand how generalizing "in" could affect std.algorithm et al if they only use "in" for AAs, just like now. Because if 'in' is available for other uses for other containers, one would be tempted to use it also there. The alternative is to put it in the coding standards: 43. Thou shalt not use magic numbers. 44. Thou shalt not use 'in', as it may be slow as heck. 45. Thou shalt not write spaghetti code. Nor macaroni. This would make the feature useless. -- Simen | |||
October 08, 2010 Re: "in" everywhere | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Juanjo Alvarez | On Fri, 08 Oct 2010 09:46:54 -0400, Juanjo Alvarez <juanjux@gmail.com> wrote: > Steven Schveighoffer Wrote: >> That kind of "documentation" is useless, it doesn't prevent use, and it >> doesn't feel right to the person who accidentally uses it. When I call >> >> sort(x); >> >> and it performs horribly, am I going to blame x or sort? Certainly, I'll >> never think it's my own fault :) >> >> -Steve > > True! And that's the only drawback I see on generalizing "in", but there are many things in programming languages that doesn't feel right when you don't > know the language well. That doesn't mean that D should be the "programming for dummies on rails with a constant automated tutor included" language; if I > read well the site, it is mean to be a practical language with the ability to shot yourself in the foot. > > Still, I don't understand how generalizing "in" could affect std.algorithm et al if they only use "in" for AAs, just like now. Let's move off of in for a minute, so I can illustrate what I mean. in is kind of a non-universal operator (other languages define it differently). Let's try indexing. If you see the following code: for(int i = 0; i < x.length; i++) x[i]++; If you see this code, you might think it's O(n). But let's say the author of x didn't care about complexity, and just felt like [] should be for indexing, no matter what the cost. Then x could possibly be a linked list, and each index operation is O(n), then this block of code becomes O(n^2), and your performance suffers. You may not notice it, it might be "acceptable", but then somewhere down the road you start calling this function more, and your program all of a sudden gets really slow. What gives? Let's say you spend 1-2 hours looking for this and find out that the problem is that indexing x is linear, you can change the loop to this: foreach(ref i; x) i++; and all of a sudden your performance comes back. Maybe the author of x puts right in his docs that indexing is an O(n) operation. You might grumble about it, and move on. But if this happens all the time, you are just going to start blaming the author of x more than your own incompetence. It's one of those things where user interface designers get it, and most engineers don't -- people have certain flaws, and a big one is not reading the manual. Making the interface as intuitive as possible is very important for the success of a product. But what if we made the language such that this *couldn't possibly happen* because you don't allow indexing on linked lists? This has the two very good properties: 1. It makes the user more aware of the limitations, even though the syntax is harder to use (hm.. it doesn't let me use indexing, there must be a good reason). 2. It makes *even experienced users* avoid this bug because they can't possibly compile it. #2 is what I care about most. As an experienced developer, I still might make the above mistake (look at Walter's mistake on the compiler that I mentioned earlier). We don't want to set minefields for experienced developers if we can help it. Yes they can shoot themselves in the foot, but we don't want to remove the safety on the gun so they are *more likely* to shoot themselves in the foot. -Steve | |||
October 08, 2010 Re: "in" everywhere | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Juanjo Alvarez | Juanjo Alvarez wrote:
> Steven Schveighoffer Wrote:
>
>> On Thu, 07 Oct 2010 21:18:56 -0400, Juanjo Alvarez <fake@fakeemail.com> wrote:
>>
>> > On Thu, 7 Oct 2010 15:53:13 -0700, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
>> >> Except that when you're dealing with generic code which has to deal
>> > with
>> >> multiple container types (like std.algorithm), you _need_ certain
>> > complexity
>> >> guarantees about an operation since it could happen on any
>> > container that it's
>> >
>> > Then, don't use it in std.algorithm or any other code that needs guaranteed complexity, just like now. I don't see the problem with a generic "in" operator, nobody would be forced to use it.
>>
>> That kind of "documentation" is useless, it doesn't prevent use, and it doesn't feel right to the person who accidentally uses it. When I call
>>
>> sort(x);
>>
>> and it performs horribly, am I going to blame x or sort? Certainly, I'll never think it's my own fault :)
>>
>> -Steve
>
> True! And that's the only drawback I see on generalizing "in", but there are many things in programming languages that doesn't feel right when you don't know the language well. That doesn't mean that D should be the "programming for dummies on rails with a constant automated tutor included" language; if I read well the site, it is mean to be a practical language with the ability to shot yourself in the foot.
>
> Still, I don't understand how generalizing "in" could affect std.algorithm et al if they only use "in" for AAs, just like now.
Suppose I would like to use a faster AA (for some cases) and define opIn with good lookup behavior. Then the algorithms in phobos would think my opIn is crap, what now? All code using regular AA's could have just been replaced by my super- duper user-defined AA if it were not for this generalizing of OpIn. So we need another operation that replaces the previously fast opIn and make phobos use that instead. But why bother if we have a perfectly good one to begin with?
| |||
October 08, 2010 Re: "in" everywhere | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Juanjo Alvarez | On 10/08/2010 01:45 PM, Juanjo Alvarez wrote: > Pelle Wrote: > >> On 10/08/2010 03:18 AM, Juanjo Alvarez wrote: >>> On Thu, 7 Oct 2010 15:53:13 -0700, Jonathan M Davis >>> <jmdavisProg@gmx.com> wrote: >>>> Except that when you're dealing with generic code which has to deal >>> with >>>> multiple container types (like std.algorithm), you _need_ certain >>> complexity >>>> guarantees about an operation since it could happen on any >>> container that it's >>> >>> Then, don't use it in std.algorithm or any other code that needs >>> guaranteed complexity, just like now. I don't see the problem with a >>> generic "in" operator, nobody would be forced to use it. >> >> What do you suggest for fast lookup in a container? > > What is being used now? How can having "in" and not using it (just like now) in functions requiring guaranteed complexity can be worse than not having it? If I write a generic algorithm, I can use opIn and assume it is fast. If I don't need the speed, I can use canFind over the container's range instead. If we say opIn can be slow, the fast version goes away. > > The only drawback I can see to having an "in" operator with all containers is that some programmers would not read the documentation and use it expecting it to be fast. But then that also happens with many other language constructs and some programmers will write crappy algoritms anyway. | |||
October 08, 2010 Re: "in" everywhere | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer Attachments:
| Steven Schveighoffer wrote: > On Thu, 07 Oct 2010 16:23:47 -0400, Rainer Deyke <rainerd@eldwood.com> wrote: > >> On 10/7/2010 13:57, Andrei Alexandrescu wrote: >>> On 10/7/10 14:40 CDT, bearophile wrote: >>>> Another solution is just to accept O(n) as the worst complexity for the "in" operator. I don't understand what's the problem in this. >>> >>> That means we'd have to define another operation, i.e. "quickIn" that >>> has O(log n) bound. >> >> Why? >> >> I can't say I've ever cared about the big-O complexity of an operation. > > Then you don't understand how important it is. If big O complexity is so important, then why does everyone use quicksort (which is O(n**2)) and not heap sort or merge sort (which are O(n*log(n)))? Jerome -- mailto:jeberger@free.fr http://jeberger.free.fr Jabber: jeberger@jabber.fr | |||
October 08, 2010 Re: "in" everywhere | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Jérôme M. Berger Attachments:
| 2010/10/8 "Jérôme M. Berger" <jeberger@free.fr>
> Steven Schveighoffer wrote:
> > On Thu, 07 Oct 2010 16:23:47 -0400, Rainer Deyke <rainerd@eldwood.com> wrote:
> >
> >> On 10/7/2010 13:57, Andrei Alexandrescu wrote:
> >>> On 10/7/10 14:40 CDT, bearophile wrote:
> >>>> Another solution is just to accept O(n) as the worst complexity for the "in" operator. I don't understand what's the problem in this.
> >>>
> >>> That means we'd have to define another operation, i.e. "quickIn" that
> >>> has O(log n) bound.
> >>
> >> Why?
> >>
> >> I can't say I've ever cared about the big-O complexity of an operation.
> >
> > Then you don't understand how important it is.
>
> If big O complexity is so important, then why does everyone use
> quicksort (which is O(n**2)) and not heap sort or merge sort (which
> are O(n*log(n)))?
>
> Jerome
> --
> mailto:jeberger@free.fr
> http://jeberger.free.fr
> Jabber: jeberger@jabber.fr
>
>
Because on average quicksort is faster than heap sort, and uses much less space than merge sort. Also, trivial guards can be put in place to avoid running quicksort in a worst case (pre-sorted data) scenario.
| |||
October 08, 2010 Re: "in" everywhere | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | Jonathan M Davis wrote: > On Friday 08 October 2010 03:06:01 Stanislav Blinov wrote: >> [...] >> What I mean is you'll always have algorithms that will perform >> differently for different containers, and you'll always have to choose >> containers that best fit your needs [...] > > All true. However, the point is that operations need to have a know complexity. If in is known to be O(1), the algorithms can safely use it, knowing that it's O(1). Any container that can't do it in O(1) doesn't implement in. You could, on the other hand, have in be O(n) (which doesn't stop containers from implementing it as O(1) since that's better than O(n)), and then any algorithm that uses it has to assume that it could be O(n) and therfore may need to avoid it. The real question is what complexity we want to define it as. At the moment, the only container to use it are the built-in associative arrays, and for them, it's O(1). Since find() is already going to be O(n), it seems to me that in should be better than O(n) - be it O(1) or O(log n) - and Andrei appears to agree, but others don't, hence this long discussion... > Ahh, I think I get the perspective now, though I had to reread the whole thread two times. Thank you. | |||
October 08, 2010 Re: "in" everywhere | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Jérôme M. Berger | "Jérôme M. Berger" wrote:
> Steven Schveighoffer wrote:
>> On Thu, 07 Oct 2010 16:23:47 -0400, Rainer Deyke <rainerd@eldwood.com> wrote:
>>
>>> On 10/7/2010 13:57, Andrei Alexandrescu wrote:
>>>> On 10/7/10 14:40 CDT, bearophile wrote:
>>>>> Another solution is just to accept O(n) as the worst complexity for the "in" operator. I don't understand what's the problem in this.
>>>>
>>>> That means we'd have to define another operation, i.e. "quickIn" that
>>>> has O(log n) bound.
>>>
>>> Why?
>>>
>>> I can't say I've ever cared about the big-O complexity of an operation.
>>
>> Then you don't understand how important it is.
>
> If big O complexity is so important, then why does everyone use
> quicksort (which is O(n**2)) and not heap sort or merge sort (which
> are O(n*log(n)))?
>
> Jerome
For average case big O and dwarfing heap/merge sort in the constant factor. But in fact its not totally true that everyone uses quicksort pur sang: most sort implementations deal with the worst case of quicksort, by switching to heapsort for example. It is exactly because of this behavior.
| |||
October 09, 2010 Re: "in" everywhere | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Juanjo Alvarez | On Fri, 08 Oct 2010 14:45:43 +0300, Juanjo Alvarez <juanjux@gmail.com> wrote: > Pelle Wrote: > > The only drawback I can see to having an "in" operator with all containers is that some programmers would not read the documentation and use it expecting it to be fast. But then that also happens with many other language constructs and some programmers will write crappy algoritms anyway. This is pretty much it. I would never use a language which is designed for "some programmers". -- Using Opera's revolutionary e-mail client: http://www.opera.com/mail/ | |||
Copyright © 1999-2021 by the D Language Foundation
Permalink
Reply