[Issue 4625] New: "in" operator for AAs in SafeD code

August 11, 2010
Posted by bearophile_hugs@eml.cc
http://d.puremagic.com/issues/show_bug.cgi?id=4625

           Summary: "in" operator for AAs in SafeD code
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: druntime
        AssignedTo: sean@invisibleduck.org
        ReportedBy: bearophile_hugs@eml.cc


--- Comment #0 from bearophile_hugs@eml.cc 2010-08-11 15:19:01 PDT ---
This follows a short discussion in D.learn, where simendsjo showed a few examples.

I presume "in" on an associative array currently returns a pointer in order to avoid a double AA lookup in this common situation:

auto ptr = x in aa;
if (ptr) {
    // do something with *ptr
} else {
    // do something else
}



But this code shows why the current design of the "in" operator for associative arrays is hard to accept in SafeD:

void main() {
    auto aa = [1 : 2];
    auto p1 = 1 in aa;
    aa.rehash;
    // p1 invalidated by rehashing

    auto p2 = 1 in aa;
    aa.remove(1);
    // p2 invalidated by removal
}


On the other hand "x in AA" is a basic operation that I need to perform in SafeD code too.

I can see two possible solutions, but I like only the second one:

----------------

1)

This first solution needs two changes at the same time:
- "in" on associative arrays always returns a bool, which is memory safe.
- Improve the optimizer part of the compiler so it is able to remove most cases
of dual lookups in AAs.

If the compiler is naive then code like this:

if (x in aa) {
    auto value = aa[x];
    // ...
}

requires two searches of the hash table: the first to tell whether the key is present, and the second to find it again and fetch its value.

A better compiler (LDC1 is already able to do this) can recognize that the code is performing two nearby key searches with the same key, and it can remove the second one, essentially replacing that code with this one:

auto __tmp = x in aa;
if (__tmp) {
    auto value = *__tmp;
    // ...
}

If the key is removed or a rehash is performed between the two lookups, the compiler doesn't perform that optimization. This is good in theory, but in practice there is sometimes some distance between the first and second lookup, so I think the compiler will not always be able to optimize away the second lookup.
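
For comparison, here is a minimal sketch of a single lookup that never hands out a pointer and does not depend on the optimizer. It assumes the built-in AA `get` property with a default value is acceptable for the use case. The function name `lookupOrDefault` and the -1 sentinel are hypothetical, chosen only for illustration:

```d
// Hypothetical helper for illustration; not part of druntime.
int lookupOrDefault(int[string] aa, string key)
{
    // `get` searches the table once and returns a copy of the value,
    // or the supplied default when the key is absent.
    // No pointer into the AA is exposed to the caller.
    return aa.get(key, -1);
}

unittest
{
    auto aa = ["one" : 1];
    assert(lookupOrDefault(aa, "one") == 1);
    assert(lookupOrDefault(aa, "two") == -1);
}
```

This only fits when a sentinel value can be told apart from real values, so it does not replace a plain membership test.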

----------------

2) So I prefer a second solution that places fewer demands on the optimizer:

- Add to AAs a contains() method that always returns a boolean.
- In SafeD code, disallow the "in" operator for AAs. So in SafeD code you can
use aa.contains().
- Improve the optimizer a bit so it's able to remove some cases of dual lookups
in AAs (both in SafeD and non-SafeD code).
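
The contains() idea can be sketched at library level as below. This is only an illustration under assumptions, not the proposed druntime change: `contains` here is a hypothetical free function called with method syntax (which assumes a compiler that supports uniform function call syntax for AAs), and it is built on top of the existing "in" operator.

```d
// Hypothetical free function for illustration; not part of druntime.
bool contains(K, V)(V[K] aa, K key)
{
    // The pointer returned by `in` is only compared against null here
    // and never leaves this function, so callers cannot keep it
    // across a later rehash or remove.
    return (key in aa) !is null;
}

unittest
{
    auto aa = [1 : 2];
    assert(aa.contains(1));
    assert(!aa.contains(3));

    aa.remove(1);
    assert(!aa.contains(1));
}
```

Code written this way still pays for a second lookup when it goes on to read aa[x], which is why the optimizer improvement in the last point remains useful.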
