On 12/18/21 3:46 PM, Dennis wrote:
> On Saturday, 18 December 2021 at 18:47:33 UTC, Adam Ruppe wrote:
> The in
operator on AAs has a speed of O(log n) or O(1). in
on slices would exceed that being O(n).
Who do people keep repeating that?
- Worst case time complexity of the
in
operator in AA's is actually O(n). This can easily happen in practice when your toHash
has a poor implementation, which I have had before
Well, yeah. But that doesn't mean it happens often or even regularly. You need to have a bad toHash
function.
We shouldn't lower the bar for in
just because it's possible to make it perform poorly.
>
- In custom types you can overload the
in
operator to something with arbitrarily bad time complexity.
There's your answer -- make a custom type that allows in
with an array-like structure.
Or make a UFCS function.
>
- Despite scaling with O(n), a linear scan is pretty fast. You can probably scan your entire RAM in a matter of seconds.
Complexity is not some old-fashioned idea that has no relevance on today's hardware. Linear searching is a real bottleneck, and identifying in
to be sub-linear is a useful design decision.
If you are searching your entire memory by using an opCmp
, I bet it will be more than a few seconds.
> The whole "it ruins performance guarantees" so far seems completely hypothetical and is not something that came from experience with actual code.
The design is sound. You use in
, you get sub-linear (i.e. fast) performance. That kind of expectation is useful from a generic point of view. It's similar to opIndex for a linked list -- it's possible, it makes it so you can used linked lists in places that might expect arrays, but generic code that uses it might perform really badly as the code expects opIndex to be fast.
> The problem with in
for slices to me is that it would be inconsistent with how it works for AAs, where in
searches for a key and returns a pointer to a value. The same for slices would look like this:
[snip]
> But people suggest in
to search a value in a slice.
Yes, this is a good point against. But the thing is, we also have find
, which works fine, I'm not sure why we need in
support.
> I've once overloaded the in
operator for a custom 2D array type used to represent a maze in which path finding was done. There it was useful for bounds checking since writing comparisons with width
and height
all the time is cumbersome. I wouldn't mind giving in
this meaning so you can do easy bounds checks in slices as well, but I think it will be to confusing for newcomers expecting in
to work like in Python.
Yes, that's not a terrible use case for it. But you are right that in
will be confusing to newcomers, especially when you learn that an array is a collection of data. if(x in arr)
doesn't seem to suggest that it will be looking at indexes.
-Steve