[Issue 1323] Implement opIn_r for arrays (page 2)

July 11, 2007

Posted by d-bugmail

http://d.puremagic.com/issues/show_bug.cgi?id=1323





------- Comment #10 from wbaxter@gmail.com  2007-07-10 20:55 -------
(In reply to comment #9)
> (In reply to comment #4)
> <snip>
> >> This inconsistency might be undesirable from a generic programming POV.
> > 
> > I can't see how.  Arrays and associative arrays are different beasts.  You can't really generically interchange the two, regardless of what the behavior of 'in' is, because of other semantic differences.
> 
> Other than in how you can add/remove elements, are there many differences that actually compromise their interchangeability?

Well the biggies to me are
1) "keys" in a linear array must be of type size_t.
2) "keys" in a linear array must be contiguous.
   (You can say you'll put a special value in the slots you're not using, but
the hypothetical built-in 'in' operator for arrays will have no knowledge of
your convention.)

We can represent any set!(T) using an array T[].
We cannot represent all AAs T[K] using an array, because we need two arbitrary
types and arrays only have one.

I can see that it might be nice to be able to write generically:
   if (x in y)
      return y[x];

But if you really need to write an algorithm that works generically on both arrays and AAs, you're going to need a whole lot more than that one syntactic similarity to make it work.  So you might as well include something like the 'contains' functions that Regan posted.
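A minimal sketch of what such 'contains' helpers might look like. Regan's actual functions are not shown in this part of the thread, so the names `containsKey` and `containsValue` and their signatures are assumptions made for illustration only.

```d
// Hypothetical helpers in the spirit of the 'contains' functions mentioned
// above; names and signatures are assumptions, not Regan's code.

// Associative array: membership means "is this a key?".
bool containsKey(V, K)(V[K] aa, K key)
{
    // `key in aa` yields a pointer to the value, or null if the key is absent.
    return (key in aa) !is null;
}

// Linear array: membership means "does this value occur?" (O(n) scan).
bool containsValue(T)(T[] arr, T value)
{
    foreach (elem; arr)
        if (elem == value)
            return true;
    return false;
}

void main()
{
    int[string] ages = ["Bob": 42];
    assert(ages.containsKey("Bob"));
    assert(!ages.containsKey("Jill"));

    string[] names = ["Jill", "Jim", "Bob"];
    assert(names.containsValue("Bob"));
    assert(!names.containsValue("Hi"));
}
```

Using two distinct names keeps the key/value distinction explicit at the call site, which is the ambiguity the rest of this comment is about.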

Ok, I can see that pretty much any argument here can be posed either way. Here's the way I see it:

LA's as a special case of AA's.
* Special case, only very particular AA's can be implemented using an LA
* Syntactically/theoretically simple -- x in y  means you can say y[x]
* Use case is kind of specialized
    (Only applies if you're treating contiguous size_t's as keys)
* No real need:
    - Checking if an index is in range is already easy (i<A.length).
    - If you need an AA just use an AA.  It's built-in!
      (I believe it's pretty uncommon not to know up front if you'll need to
support discontiguous indexes or not).
* Leads to unintuitive code like
     assert(2 in ["Hi", "There", "Bob"]);
     (Ask five random beginning programmers if this is true or not...)
* Lacks precedent?  Is there some other language where this is used?

LA's as sets:
* General case -- a set can always be implemented as an LA
* Syntactically non-uniform "x in y"==>"x[y] ok" for AA, but not for LA.
  But then again lots of other syntax is different between AA and LA too:
     ~= doesn't work on AA
     assigning length doesn't work on AA
     .remove, .rehash, .keys, .values don't work on LA
     foo["hi"] doesn't work on any type of LA
* Plenty of use cases from everyday coding in the trenches:
    char[][] what_we_seek = ["hi", "there", "bob"];
    foreach(w; words) { if(w in what_we_seek) { return true; } }
* Useful
    - Checking if a value is in an array is not nearly as easy as checking if
an index is valid for the array.
    - D lacks any sort of built in set or bag, so makes arrays more usable as
"poor man's set/bag"
* Intuitive:  assert("Bob" in ["Jill","Jim","Bob"])
     (Again, ask five random beginning programmers if this is true or not...)
* Has precedent in Python's behavior.


--

http://d.puremagic.com/issues/show_bug.cgi?id=1323

------- Comment #11 from wbaxter@gmail.com  2007-07-10 20:59 -------
> * Syntactically non-uniform "x in y"==>"x[y] ok" for AA, but not for LA.

Whoops, 'course that should be "x in y" ==> "y[x] ok".

--

http://d.puremagic.com/issues/show_bug.cgi?id=1323

------- Comment #12 from wbaxter@gmail.com  2007-11-04 15:39 -------
Walter weighs in on the NG:

Walter Bright wrote:
> Denton Cockburn wrote:
>> I just figure it's done often enough that adding it wouldn't be hard, especially considering the keyword is already there, and with the same general meaning.
>
> With AAs, the 'in' expression looks for the key, not the value. It would be inconsistent to overload it to search for values in regular arrays.

digitalmars.com digitalmars.D:61103
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=61103

--

http://d.puremagic.com/issues/show_bug.cgi?id=1323

bearophile_hugs@eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bearophile_hugs@eml.cc

--- Comment #13 from bearophile_hugs@eml.cc 2010-08-12 16:32:15 PDT ---
Associative arrays are a set (where each item has one associated value) in which item order doesn't matter, while arrays are an ordered sequence in which each item has at most one predecessor and at most one successor. In both cases I often want to know if an item is present in this set or this ordered sequence. Walter is wrong on this.

--

http://d.puremagic.com/issues/show_bug.cgi?id=1323

Andrei Alexandrescu <andrei@metalanguage.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |andrei@metalanguage.com
         Resolution|                            |WONTFIX

--

http://d.puremagic.com/issues/show_bug.cgi?id=1323

--- Comment #14 from bearophile_hugs@eml.cc 2011-01-08 15:10:38 PST ---
The reverse "in" operator for built-in arrays is a really useful feature, so useful that, as the Zen of Python says, "practicality beats purity."

I am not reopening this yet, but closing this needs some discussion; it's not an uncontroversial issue.

--

http://d.puremagic.com/issues/show_bug.cgi?id=1323

--- Comment #15 from Roy Crihfield <rscrihf@gmail.com> 2012-10-30 21:32:47 PDT ---
I think this absolutely needs to be reconsidered. The success of the same semantics in Python should be enough proof of the convenience and low risk of this feature.

On top of that, the `in' operator is overloadable anyway, meaning it already has semantic ambiguity for different types. Should `~' be considered an issue since it can accept both T and T[]?

Practicality seems preferable here. I think this should be reopened.

--

http://d.puremagic.com/issues/show_bug.cgi?id=1323

Jonathan M Davis <jmdavisProg@gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg@gmx.com

--- Comment #16 from Jonathan M Davis <jmdavisProg@gmx.com> 2012-10-30 22:00:03 PDT ---
> I think this absolutely needs to be reconsidered. The success of the same semantics in Python should be enough proof of the convenience and low risk of this feature.
> On top of that, the `in' operator is overloadable anyway, meaning it already has semantic ambiguity for different types. Should `~' be considered an issue since it can accept both T and T[]?
> Practicality seems preferable here. I think this should be reopened.

We care a lot more about efficiency than Python does, and we hold to one of the key concepts that C++ does: that certain operators and operations need to have a maximum algorithmic complexity. A prime example of this is std.container. All of the common operations have a maximum algorithmic complexity, and any container which can't implement them with that complexity doesn't have them. For instance, SList and DList don't have a length property, because they can't implement it in O(1). The same goes for range-based operations. front, popFront, length, slicing, etc. have to be O(1). If they can't be, then they're not implemented. That's why narrow strings aren't sliceable, have no length, and don't provide random access. They can't do it in O(1).

In the case of 'in', it's required to have at most O(log n), which is what it takes to search for an element in a balanced binary tree. AAs can do it in O(1), which is even better, so they get 'in'. Dynamic arrays require O(n), which is worse than O(log n), so they don't have it and never will. If we allowed them to have 'in', then functions could not rely on its algorithmic complexity, which would make it useless for generic code.

Of course, it's true that anyone can overload an operator or define a function which has worse algorithmic complexity than it's required to have (e.g. they define length on a range which has to count all of its elements to get its length, making it O(n) instead of the required O(1)), but in that case, the programmer screwed up, and it's their fault when algorithms using their type have horrible performance. But as long as they implement those functions with the correct algorithmic complexity, then algorithms using those functions can guarantee a certain level of performance. They can't do that if they can't rely on the maximum algorithmic complexity of the functions and operations that they use.

--
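A rough sketch of the complexity argument, assuming a hypothetical user-defined type `SortedSet` (not a Phobos type) that overloads 'in' via `opBinaryRight`. Because it keeps its elements sorted, membership can be answered by binary search in O(log n), within the bound described above, whereas an unsorted dynamic array would need an O(n) scan.

```d
import std.algorithm : sort;
import std.range : assumeSorted;

// Hypothetical wrapper: stores a sorted copy of its input so that `in`
// can use binary search rather than a linear scan.
struct SortedSet(T)
{
    private T[] items;

    this(T[] values)
    {
        items = values.dup; // copy so the caller's array is left untouched
        sort(items);
    }

    // Invoked for `value in set`. SortedRange.contains does a binary
    // search, so the lookup is O(log n).
    bool opBinaryRight(string op)(T value)
        if (op == "in")
    {
        return assumeSorted(items).contains(value);
    }
}

void main()
{
    auto set = SortedSet!int([5, 1, 4, 2]);
    assert(2 in set);
    assert(!(3 in set));
}
```

The cost is paid up front by sorting in the constructor; nothing in the language enforces the bound, so this only shows how a type can choose to respect it.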
