February 14, 2009
On Sat, Feb 14, 2009 at 1:17 PM, Bill Baxter <wbaxter@gmail.com> wrote:

> On Sat, Feb 14, 2009 at 1:03 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>
>> bearophile wrote:
>>
>>> Andrei Alexandrescu:
>>>
>>>> Say at some point there are k available (not taken) slots out of "n". There is a k/n chance that a random selection finds an unoccupied slot. The average number of random trials needed to find an unoccupied slot is proportional to 1/(k/n) = n/k. So the total number of random trials to span the entire array is quadratic. Multiplying that by 0.9 leaves it quadratic.
>>>>
>>>
>>> It's like in hashing: if you want to fill 99% of the available space in a hash, then you take ages to find empty slots. But if you fill it only at 75-90%, then on average you need only one or two tries to find an empty slot. So your time is linear, with a small multiplicative constant. When the slots start to get mostly full, you change algorithm, copying the empty slots elsewhere.
>>>
>>
>> Well I don't buy it. If you make a point, you need to be more precise than such hand-waving. It's not like in hashing. It's like in the algorithm we discuss. If you make a clear point that your performance is better than O(n*n) by stopping at 90% then make it. I didn't go through much formalism, but my napkin says you're firmly in quadratic territory.
>
>
> Well he has a point that the number of trials required to find an empty depends not on the absolute number of empty items, but only the ratio of empties to fulls.   Even your own claim about average number of trials was n/k -- not sure how you got that though.  If he stops when that reaches a maximum of 9 then the algo can't be quadratic up to that point.  It's O(n) expected, with a possibly fairly high hidden constant.  It also has a fairly long tail, as in there's a small probability of it taking a very long time to complete.
>

... the real gotcha is that he's using O(n) extra memory.   10% of n is
still O(n) memory for shuffling the last 10% of the items.
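
To make that concrete, here's a rough Python sketch of the scheme as I
read it from bearophile's description; the 0.9 cutoff and all the names
in it are illustrative only, not anybody's actual code:

import random

def random_cover(a, cutoff=0.9):
    # Phase 1: dart-throwing with a visited flag per slot, until `cutoff`
    # of the slots have been taken.
    n = len(a)
    visited = [False] * n          # one flag per element (the bitmap)
    taken = 0
    while taken < cutoff * n:
        i = random.randrange(n)    # throw a dart
        if not visited[i]:
            visited[i] = True
            taken += 1
            yield a[i]
    # Phase 2: collect the untaken indices into a side array and shuffle it.
    # This side array is the extra memory in question.
    rest = [i for i in range(n) if not visited[i]]
    random.shuffle(rest)
    for i in rest:
        yield a[i]

Even with the cutoff at 0.9, rest ends up holding about a tenth of n
indices, which is still O(n).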


--bb


February 14, 2009
Bill Baxter wrote:
> On Sat, Feb 14, 2009 at 1:03 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> 
>     bearophile wrote:
> 
>         Andrei Alexandrescu:
> 
>             Say at some point there are k available (not taken) slots out of
>             "n". There is a k/n chance that a random selection finds an
>             unoccupied slot. The average number of random trials needed
>             to find
>             an unoccupied slot is proportional to 1/(k/n) = n/k. So the
>             total
>             number of random trials to span the entire array is quadratic.
>             Multiplying that by 0.9 leaves it quadratic.
> 
> 
>         It's like in hashing: if you want to fill 99% of the available space
>         in a hash, then you take ages to find empty slots. But if you
>         fill it
>         only at 75-90%, then on average you need only one or two tries to
>         find an empty slot. So your time is linear, with a small
>         multiplicative constant. When the slots start to get mostly
>         full, you
>         change algorithm, copying the empty slots elsewhere.
> 
> 
>     Well I don't buy it. If you make a point, you need to be more
>     precise than such hand-waving. It's not like in hashing. It's like
>     in the algorithm we discuss. If you make a clear point that your
>     performance is better than O(n*n) by stopping at 90% then make it. I
>     didn't go through much formalism, but my napkin says you're firmly
>     in quadratic territory.
> 
>  Well he has a point that the number of trials required to find an empty depends not on the absolute number of empty items, but only the ratio of empties to fulls.   Even your own claim about average number of trials was n/k -- not sure how you got that though.

If you toss an N-sided dice hoping for a specific face to show up (and stopping afterwards), how many times do you have to toss it on average? I recall (without being sure) that you need to toss it a number of times proportional to N. Could anyone confirm or deny?

Now, assuming the above is true, say we decide to do the linear search for a fraction f of the n elements in the array. The average number of steps will be:

S = n * (1/n + 1/(n - 1) + 1/(n - 2) + ... + 1/(n - fn))

So we're looking at the harmonic series that does not start from 1, but instead starts from its (1-f)n'th term (you drop the first (1-f)n terms). Does it converge? (I don't know.)
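
For what it's worth, a quick numeric check of that sum in Python
(treating fn as an integer; the comparison column is just my own guess
at the growth rate, not something established in this thread):

import math

def S(n, f=0.9):
    # S = n * (1/n + 1/(n - 1) + ... + 1/(n - fn))
    fn = int(f * n)
    return n * sum(1.0 / (n - j) for j in range(fn + 1))

for n in (100, 1000, 10000, 100000):
    print(n, round(S(n)), round(n * math.log(10)))   # ln(1/(1 - 0.9)) = ln 10

Numerically S seems to grow like a constant times n (around 2.3*n for
f = 0.9), i.e. linearly rather than quadratically, but that's an
observation from the numbers, not a proof.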


Andrei
February 14, 2009
On Sat, Feb 14, 2009 at 2:20 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Bill Baxter wrote:
>>
>> On Sat, Feb 14, 2009 at 1:03 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>
>>    bearophile wrote:
>>
>>        Andrei Alexandrescu:
>>
>>            Say at some point there are k available (not taken) slots out
>> of
>>            "n". There is a k/n chance that a random selection finds an
>>            unoccupied slot. The average number of random trials needed
>>            to find
>>            an unoccupied slot is proportional to 1/(k/n) = n/k. So the
>>            total
>>            number of random trials to span the entire array is quadratic.
>>            Multiplying that by 0.9 leaves it quadratic.
>>
>>
>>        It's like in hashing: if you want to fill 99% of the available
>> space
>>        in a hash, then you take ages to find empty slots. But if you
>>        fill it
>>        only at 75-90%, then on average you need only one or two tries to
>>        find an empty slot. So your time is linear, with a small
>>        multiplicative constant. When the slots start to get mostly
>>        full, you
>>        change algorithm, copying the empty slots elsewhere.
>>
>>
>>    Well I don't buy it. If you make a point, you need to be more
>>    precise than such hand-waving. It's not like in hashing. It's like
>>    in the algorithm we discuss. If you make a clear point that your
>>    performance is better than O(n*n) by stopping at 90% then make it. I
>>    didn't go through much formalism, but my napkin says you're firmly
>>    in quadratic territory.
>>
>>  Well he has a point that the number of trials required to find an empty
>> depends not on the absolute number of empty items, but only the ratio of
>> empties to fulls.   Even your own claim about average number of trials was
>> n/k -- not sure how you got that though.
>
> If you toss an N-sided dice hoping for a specific face to show up (and stopping afterwards), how many times do you have to toss it on average? I recall (without being sure) that you need to toss it a number of times proportional to N. Could anyone confirm or deny?

The probability of *not* getting the number after t tries is (1-1/n)^t, so it's just a pure decaying exponential, the chance of getting it after t tries should be 1 minus that.  The average number of tries is going to be some kind of integral of that curve, and an integral of an exponential is still an exponential, so it seems unlikely to me that your memory was correct on this one.
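
If anyone wants to settle the die question empirically instead of by
squinting at integrals, a tiny simulation along these lines would do
(the values of n and the trial count are arbitrary):

import random

def avg_tosses(n, trials=100000):
    # average number of tosses of an n-sided die until a fixed face shows up
    total = 0
    for _ in range(trials):
        tosses = 1
        while random.randrange(n) != 0:   # face 0 plays the "specific face"
            tosses += 1
        total += tosses
    return total / trials

for n in (6, 20, 100):
    print(n, avg_tosses(n))   # compare the averages against n itself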

> Now, assuming the above is true, say we decide to do the linear search for a fraction f of the n elements in the array.

Maybe you were switching subjects (to your algorithm?) here, but just to be clear, Bearophile didn't say anything about doing a linear search.  His first phase is just to do repeated dart-throwing trials till an empty is found.

> The average number of steps will
> be:
>
> S = n * (1/n + 1/(n - 1) + 1/(n - 2) + ... + 1/(n - fn))
>
> So we're looking at the harmonic series that does not start from 1, but
> instead starts from its (1-f)n'th term (you drop the first (1-f)n terms). Does
> it converge? (I don't know.)

In the limit of what?  You seem to be asking if a finite series converges.

--bb
February 14, 2009
Bill Baxter wrote:
> On Sat, Feb 14, 2009 at 2:20 PM, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>> Bill Baxter wrote:
>>> On Sat, Feb 14, 2009 at 1:03 PM, Andrei Alexandrescu
>>> <SeeWebsiteForEmail@erdani.org>
>>> wrote:
>>>
>>>    bearophile wrote:
>>>
>>>        Andrei Alexandrescu:
>>>
>>>            Say at some point there are k available (not taken) slots out
>>> of
>>>            "n". There is a k/n chance that a random selection finds an
>>>            unoccupied slot. The average number of random trials needed
>>>            to find
>>>            an unoccupied slot is proportional to 1/(k/n) = n/k. So the
>>>            total
>>>            number of random trials to span the entire array is quadratic.
>>>            Multiplying that by 0.9 leaves it quadratic.
>>>
>>>
>>>        It's like in hashing: if you want to fill 99% of the available
>>> space
>>>        in a hash, then you take ages to find empty slots. But if you
>>>        fill it
>>>        only at 75-90%, then on average you need only one or two tries to
>>>        find an empty slot. So your time is linear, with a small
>>>        multiplicative constant. When the slots start to get mostly
>>>        full, you
>>>        change algorithm, copying the empty slots elsewhere.
>>>
>>>
>>>    Well I don't buy it. If you make a point, you need to be more
>>>    precise than such hand-waving. It's not like in hashing. It's like
>>>    in the algorithm we discuss. If you make a clear point that your
>>>    performance is better than O(n*n) by stopping at 90% then make it. I
>>>    didn't go through much formalism, but my napkin says you're firmly
>>>    in quadratic territory.
>>>
>>>  Well he has a point that the number of trials required to find an empty
>>> depends not on the absolute number of empty items, but only the ratio of
>>> empties to fulls.   Even your own claim about average number of trials was
>>> n/k -- not sure how you got that though.
>> If you toss an N-sided dice hoping for a specific face to show up (and
>> stopping afterwards), how many times do you have to toss it on average? I
>> recall (without being sure) that you need to toss it a number of times
>> proportional to N. Could anyone confirm or deny?
> 
> The probability of *not* getting the number after t tries is
> (1-1/n)^t, so it's just a pure decaying exponential, the chance of
> getting it after t tries should be 1 minus that.  The average number
> of tries is going to be some kind of integral of that curve, and an
> integral of an exponential is still an exponential, so it seems
> unlikely to me that your memory was correct on this one.

Your handwaving ain't much better than my memory. Hey, either somebody goes through the math over here or we can give up on the whole thing and use O(n) storage for the blessed thing.

Andrei
February 14, 2009
Andrei Alexandrescu:
>Well I don't buy it. If you make a point, you need to be more precise than such hand-waving.<

You are of course right, but I am not able to give you good math here.
(I have used that algorithm to extract all the possible uints, in a few minutes and with less than 1GB of RAM. That, plus my experience with hashes, tells me that this algorithm is good enough. If you want I can also implement it again and test it with a growing number of items, to see from the graph of the timings if it's indeed linear.)


Bill B.:
>.. the real gotcha is that he's using O(n) extra memory. 10% of n is still O(n) memory for shuffling the last 10% of the items.<

No, you are wrong: on real computers, using 110% of the memory needed to store your n items is generally acceptable.


Andrei Alexandrescu:
>Your handwaving ain't much better than my memory. Hey, either somebody goes through the math over here or we can give up on the whole thing and use O(n) storage for the blessed thing.<

We can implement both, and see what's better in practical situations.

Bye,
bearophile
February 14, 2009
Andrei Alexandrescu wrote:
> Yigal Chripun wrote:
>> Andrei Alexandrescu wrote:
>>> Andrei Alexandrescu wrote:
>>>> Andrei Alexandrescu wrote:
>>>>> Given an array of length m, return a range that iterates the array in
>>>>> random order such that the entire array is visited without going
>>>>> through the same element more than once. Consume minimal amounts of
>>>>> memory and time. Baseline solution: allocate an array of indices,
>>>>> shuffle it, then follow indices stored in the array.
>>>>
>>>> Ok, here's something that should work.
>>>>
>>>> Start with array a of length a.length, allocate a bitmap with one bit
>>>> per element in the array. Also the number of unselected elements left in
>>>> the array is kept, call it k.
>>>>
>>>> To implement next():
>>>>
>>>> 1. Walk from the beginning of the array to the first unselected
>>>> element.
>>>>
>>>> 2. Roll a fair dice with k faces. If the dice chooses a specific face,
>>>> choose that first unselected element. Otherwise continue.
>>>>
>>>> 3. Continue walking until the next unselected element.
>>>>
>>>> 4. Roll a fair dice with k-1 faces. If the dice chooses a specific
>>>> face, choose that unselected element. Otherwise continue from step 3
>>>> using dice with k-1, k-2, ..., 1 faces.
>>>>
>>>> This has O(n log n) complexity. There is one obvious optimization:
>>>> eliminate the first selected elements so we don't need to walk over
>>>> them at each iteration. It's unclear to me how this affects complexity.
>>>>
>>>>
>>>> Andrei
>>>
>>> This seems to work for forward ranges as well. I'll be darned.
>>>
>>> Andrei
>>
>> what do you mean by "If the dice chooses a specific face"?
>> I don't understand this wording..
>
> If you have a dice with 6 faces (the classic dice), a "specific face"
> means you bet the dice will show e.g. "1" after rolling.
>
> Andrei

oh, ok, thanks for clarifying
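
To make the quoted description concrete, here is one possible reading of
it as a Python sketch (the class name and the iterator protocol are my
own choices, not the proposed D interface):

import random

class RandomCover:
    # Keep a bitmap of selected slots; on each next(), walk the unselected
    # slots rolling a k-faced, then (k-1)-faced, then (k-2)-faced die, and
    # return the first slot whose die lands on the chosen face.
    def __init__(self, a):
        self.a = a
        self.selected = [False] * len(a)   # the bitmap
        self.k = len(a)                    # unselected elements left

    def __iter__(self):
        return self

    def __next__(self):
        if self.k == 0:
            raise StopIteration
        faces = self.k
        for i, taken in enumerate(self.selected):
            if taken:
                continue
            if random.randrange(faces) == 0:   # bet on one face of the die
                self.selected[i] = True
                self.k -= 1
                return self.a[i]
            faces -= 1
        raise AssertionError("unreachable: the 1-faced die always wins")

print(list(RandomCover([10, 20, 30, 40, 50])))   # each element exactly once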
February 14, 2009
On Sat, Feb 14, 2009 at 8:28 PM, bearophile <bearophileHUGS@lycos.com> wrote:
> Andrei Alexandrescu:
>>Well I don't buy it. If you make a point, you need to be more precise than such hand-waving.<
>
> You are of course right, but I am not able to give you good math here.
> (I have used that algorithm to extract all the possible uints, in a few minutes and with less than 1GB of RAM. That, plus my experience with hashes, tells me that this algorithm is good enough. If you want I can also implement it again and test it with a growing number of items, to see from the graph of the timings if it's indeed linear.)
>
>
> Bill B.:
>>.. the real gotcha is that he's using O(n) extra memory. 10% of n is still O(n) memory for shuffling the last 10% of the items.<
>
> No, you are wrong, in real computers using 110% of the memory needed to store your n items is generally acceptable.

Yah, I just meant it was a big-oh gotcha.  Though I guess, now that I
think of it, the bit array is using O(N) extra memory too.
I agree that big-oh doesn't always give you a very good picture of how
an algorithm will perform in practice.  Big-oh completely ignores very
important aspects like cache effects and physical memory limitations
on practical problem sizes.

I found a pdf[1] that has a summary of the probabilities associated
with finding empty slots in the context of hashing.
If I'm reading it right, if you have a load factor (fraction full) of
f, then you can expect it to take 1/(1-f) trials to find an empty slot
if you just keep picking randomly one after the other.
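
Back-of-envelope, that fits the sum discussed earlier: if a pick at load
factor f costs about 1/(1-f) trials, then taking the first 90% of n
slots costs roughly n times the average of 1/(1-x) for x from 0 to 0.9,
which is about n*ln(10), i.e. around 2.3*n trials in total.  (That
integral is my own arithmetic, not something from the pdf.)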

--bb
[1] http://www.cse.unsw.edu.au/~cs2011/lect/2711_HashProbe-4.pdf
February 14, 2009
Bill Baxter:
> if you have a load factor (fraction full) of
> f, then you can expect it to take 1/(1-f) trials to find an empty slot
> if you just keep picking randomly one after the other.

That goes well with my practical experience. So instead of 90% it may be better to use 85%; this percentage has to be tuned.

Bye,
bearophile
February 14, 2009
bearophile wrote:
> Andrei Alexandrescu:
>> Your handwaving ain't much better than my memory. Hey, either somebody goes through the math over here or we can give up on the whole thing and use O(n) storage for the blessed thing.<
> 
> We can implement both, and see what's better in practical situations.

Agreed. A worst-case quadratic algorithm can perform much better than that in many circumstances, and the proposed algorithm is tunable.

Or as an alternative, when you have hit (say) half the elements, you copy over the remaining elements and repeat. This gives you an amortized O(n log n) algorithm, with O(n log n) allocated memory.
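
A Python sketch of that halving variant, as I read it (all the names are
mine):

import random

def random_cover_halving(a):
    # Dart-throw until half of the current pool is taken, then compact the
    # untaken elements into a new pool of half the size and repeat.
    pool = list(a)
    while pool:
        taken = [False] * len(pool)
        count = 0
        target = (len(pool) + 1) // 2
        while count < target:
            i = random.randrange(len(pool))
            if not taken[i]:
                taken[i] = True
                count += 1
                yield pool[i]
        pool = [x for i, x in enumerate(pool) if not taken[i]]

print(list(random_cover_halving(range(10))))   # all ten values, random order

Each round's pool is half the size of the previous one, so the copies
form a shrinking geometric series.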

> Bye,
> bearophile
February 14, 2009
Christopher Wright:
> Agreed. Worst-case quadratic can perform much better in many circumstances, and the proposed algorithm is tunable.

I think it's not quadratic :-) (See the answer by Bill B.)


> Or as an alternative, when you have hit (say) half the elements, you copy over the remaining elements and repeat. This gives you an amortized O(n log n) algorithm, with O(n log n) allocated memory.

There are some variants of that algorithm designed to be recursive, like you say. I have tried them, and they are slower, etc. That's why I have shown here the simpler version. Sometimes more complex algorithms aren't better.

Bye,
bearophile