January 25, 2008 Re: Shuffle | ||||
---|---|---|---|---|
| ||||
Posted in reply to Frits van Bommel | Frits van Bommel wrote: > As mentioned by Oskar, this won't generate uniformly random lists. > However, both your algorithm and his version have an additional flaw: both shuffle the entire list when you only need a random prefix (i.e. you only need to select as many random files as will fit onto your second disk). > > Try this small modification to the original algorithm instead: > ----- > > /* The loop will normally quit via an exception when the target > * device is full. But if there are not enough files in the source > * to fill up the target, the loop ensures it will still eventually > * quit. > */ > while (files.length > 0) > { > auto j = std.random.rand() % files.length; > auto fromfile = files[j]; > auto tofile = std.path.join(todir, basename(fromfile)); > writefln("%s => %s", fromfile, tofile); > std.file.copy(fromfile, tofile); > > // Remove copied file from the list > files[j] = files[$-1]; > files = files[0 .. $-1]; > } > ----- You're right, but I prefer having the shuffle bit be separate so people will see how to do a shuffle. > Though technically, this will still prefer the files at the front of the list during each iteration because (rand() % files.length) is more likely to be a number <= (RAND_MAX/files.length) than a larger number. (In fact, for RAND_MAX < files.length the higher-ups will *never* be chosen during that iteration) > However, this shouldn't be a significant problem as long as your number of files is much less than RAND_MAX; and RAND_MAX appears to be uint.max for std.random, so for any reasonable music collection it should be close enough. You're right again. (This algorithm is obviously harder than it looks!) > To fix that, ignore the result of rand() if it's > ((RAND_MAX / files.length) * files.length), for example by replacing the first line of the loop with: > --- > uint r = std.random.max; > // assumes RAND_MAX == uint.max > auto max = (uint.max / files.length) * files.length; > while (r > max) > r = std.random.rand(); > auto j = r % files.length; > --- > (also, add "assert(files.length <= RAND_MAX);" before the loop if RAND_MAX < uint.max) I think it's (r >= max). |
January 25, 2008 Re: Shuffle | ||||
---|---|---|---|---|
| ||||
Posted in reply to BCS | BCS wrote:
> Oskar Linde wrote:
>> Walter Bright wrote:
>>
>>> /* Shuffle the files[] array
>>> */
>>
>>
>> To be nitpicky, your algorithm will not give a perfect shuffling. Consider shuffling the sequence 012. The algorithm has 3^3 = 27 different ways it can shuffle the sequence, and there exists 3! = 6 different permutations of the sequence. Since 27 isn't divisible by 6, the algorithm will give a slight bias towards some end sequences. (021,102 and 120 in this case). A quick fix is to change the first line of the loop body:
>>
>
> What would you get if you qsorted the list with rand used as the compare function?
At best, an O(N log N) shuffle instead of O(N).
--bb
|
January 25, 2008 Re: Shuffle | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright wrote: > Frits van Bommel wrote: [snip stuff I was right about :)] > >> To fix that, ignore the result of rand() if it's > ((RAND_MAX / files.length) * files.length), for example by replacing the first line of the loop with: >> --- >> uint r = std.random.max; >> // assumes RAND_MAX == uint.max >> auto max = (uint.max / files.length) * files.length; >> while (r > max) >> r = std.random.rand(); >> auto j = r % files.length; >> --- >> (also, add "assert(files.length <= RAND_MAX);" before the loop if RAND_MAX < uint.max) > > I think it's (r >= max). You're right, that loop condition was missing an '='. I was typing all that from memory of a class I took years ago, so I'm just glad I got the most important stuff right. |
January 26, 2008 Re: Shuffle | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | On Jan 25, 2008 8:51 PM, Walter Bright <newshound1@digitalmars.com> wrote: > I remember reading a criticism of the shuffle/swap algorithm and the fix a while back, but don't remember where I read it or what the fix was. I'm glad Oskar has set it straight here. > > Although this is just a dumb program, because it's going into a 'cookbook' that others may use for more serious applications, the algorithms should be correct. > > Kudos to Oskar. > http://www.codinghorror.com/blog/archives/001015.html Might have been that? Anders |
January 26, 2008 Re: Shuffle | ||||
---|---|---|---|---|
| ||||
Posted in reply to Anders Bergh | Anders Bergh wrote:
> Might have been that?
Yes, I think so!
|
January 26, 2008 Re: Shuffle | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Latest based on suggestions: /* Program to randomly copy music files from source to destination device. * Written in the D programming language. * Written by Walter Bright, http://www.digitalmars.com * Placed into the Public Domain. * Thanks to help from Oskar Linde, Anders Bergh, and Fritz van Bommel */ import std.file; import std.stdio; import std.string; import std.c.stdlib; import std.path; import std.random; int main(string[] args) { if (args.length != 3) { writefln("Usage: shuffle fromdir todir"); exit(1); } auto fromdir = args[1]; auto todir = args[2]; /* Recursively search for all the mp3 and wma files in directory fromdir * and put them into files[] */ string[] files; bool callback(DirEntry *de) { if (de.isdir) listdir(de.name, &callback); // recurse into subdirectories else { // Collect only files with mp3 and wma extensions auto ext = getExt(de.name); if (fnmatch(ext, "mp3") || fnmatch(ext, "wma")) files ~= de.name; } return true; // keep going } std.file.listdir(fromdir, &callback); auto n = files.length; writefln(n, " music files"); if (!n) return 0; /* Shuffle the files[] array using Durstenfeld's algorithm * based on the Fisher-Yates method */ auto max = (typeof(std.random.rand()).max / n) * n; while (--n) { /* Pick random r in range 0..max, discarding others * to eliminate modulo bias */ auto r = max; do r = std.random.rand(); while (r >= max); auto j = r % (n + 1); // pick element to swap with // swap [n] and [j] auto tmp = files[n]; files[n] = files[j]; files[j] = tmp; } /* Sequentially fill the target until done or it quits with an * exception when the device is full. */ foreach (fromfile; files) { auto tofile = std.path.join(todir, basename(fromfile)); writefln("%s => %s", fromfile, tofile); std.file.copy(fromfile, tofile); } writefln("Done"); return 0; } |
January 26, 2008 Re: Shuffle | ||||
---|---|---|---|---|
| ||||
Posted in reply to John Reimer | John Reimer wrote: > And what would be the Tango equivalent, I wonder? ;) (see http://pastie.caboo.se/143887 for a syntax highlighted edition) import tango.io.File, tango.io.Stdout, tango.io.FilePath, tango.io.FileScan; import Array = tango.core.Array; int main (char[][] args) { if (args.length != 3) { Stdout.formatln ("usage: shuffle fromdir todir"); return 1; } // set destination path, and add trailing separator as necessary auto dst = new FilePath; dst.path = args[2]; // recursively search for all the mp3, wma and w4a files in specified directory auto songs = new FileScan; songs (args[1], (FilePath fp, bool isDir) {return isDir || fp.suffix == ".mp3" || fp.suffix == ".wma"|| fp.suffix == ".m4a";}); // shuffle the files Stdout.formatln ("{} music files", songs.files.length); Array.shuffle (songs.files); // sequentially fill the target until done, or it quits with an // exception when the device is full foreach (song; songs.files) dst.file(song.file).copy(song); Stdout.formatln ("Done"); return 0; } -- Lars Ivar Igesund blog at http://larsivi.net DSource, #d.tango & #D: larsivi Dancing the Tango |
January 27, 2008 Re: Shuffle | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright さんは書きました:
> For fun, I ordered a new car stereo that would play music from an SD card or USB stick(rather than from CD), and installed it over the weekend. The problem is loading songs onto the SD card from my home music server, which I like to hear played at random.
>
> The solution is a simple D program, shuffle, which will randomly copy music files to an SD card until it fills up. Have some fun with it!
> ----------------------------
>
> /* Program to randomly copy music files from source to destination device.
> * Written in the D programming language.
> * Written by Walter Bright, http://www.digitalmars.com
> * Placed into the Public Domain.
> */
>
>
> import std.file;
> import std.stdio;
> import std.string;
> import std.c.stdlib;
> import std.path;
> import std.random;
>
> int main(string[] args)
> {
> if (args.length != 3)
> { writefln("Usage: shuffle fromdir todir");
> exit(1);
> }
> auto fromdir = args[1];
> auto todir = args[2];
>
> /* Recursively search for all the mp3 and wma files in directory fromdir
> * and put them into files[]
> */
> string[] files;
> bool callback(DirEntry *de)
> {
> if (de.isdir)
> listdir(de.name, &callback); // recurse into subdirectories
> else
> {
> // Collect only files with mp3 and wma extensions
> auto ext = getExt(de.name);
> if (fnmatch(ext, "mp3") || fnmatch(ext, "wma"))
> files ~= de.name;
> }
> return true; // keep going
> }
> std.file.listdir(fromdir, &callback);
>
> writefln(files.length, " music files");
>
> /* The loop will normally quit via an exception when the target device
> * is full. But if there are not enough files in the source to fill
> * up the target, the loop ensures it will still eventually quit.
> */
> for (size_t i = 0; i < files.length; i++)
> {
> auto j = std.random.rand() % files.length;
> auto fromfile = files[j];
> auto tofile = std.path.join(todir, basename(fromfile));
> writefln("%s => %s", fromfile, tofile);
> std.file.copy(fromfile, tofile);
> }
>
> writefln("Done");
> return 0;
> }
Appreciate the example, it actually came in quite handy. Quick question, how does one modify this to copy an entire directory? Additionaly, version(linux) std.file.copy() preserves the filestamps of the files copied. Is there a way to build this characteristic into the windows version?
Thanks
|
January 27, 2008 Re: Shuffle | ||||
---|---|---|---|---|
| ||||
Posted in reply to Tyro[a.c.edwards] | Tyro[a.c.edwards] さんは書きました:
> Walter Bright さんは書きました:
>> For fun, I ordered a new car stereo that would play music from an SD card or USB stick(rather than from CD), and installed it over the weekend. The problem is loading songs onto the SD card from my home music server, which I like to hear played at random.
>>
>> The solution is a simple D program, shuffle, which will randomly copy music files to an SD card until it fills up. Have some fun with it!
>> ----------------------------
>>
>> /* Program to randomly copy music files from source to destination device.
>> * Written in the D programming language.
>> * Written by Walter Bright, http://www.digitalmars.com
>> * Placed into the Public Domain.
>> */
>>
>>
>> import std.file;
>> import std.stdio;
>> import std.string;
>> import std.c.stdlib;
>> import std.path;
>> import std.random;
>>
>> int main(string[] args)
>> {
>> if (args.length != 3)
>> { writefln("Usage: shuffle fromdir todir");
>> exit(1);
>> }
>> auto fromdir = args[1];
>> auto todir = args[2];
>>
>> /* Recursively search for all the mp3 and wma files in directory fromdir
>> * and put them into files[]
>> */
>> string[] files;
>> bool callback(DirEntry *de)
>> {
>> if (de.isdir)
>> listdir(de.name, &callback); // recurse into subdirectories
>> else
>> {
>> // Collect only files with mp3 and wma extensions
>> auto ext = getExt(de.name);
>> if (fnmatch(ext, "mp3") || fnmatch(ext, "wma"))
>> files ~= de.name;
>> }
>> return true; // keep going
>> }
>> std.file.listdir(fromdir, &callback);
>>
>> writefln(files.length, " music files");
>>
>> /* The loop will normally quit via an exception when the target device
>> * is full. But if there are not enough files in the source to fill
>> * up the target, the loop ensures it will still eventually quit.
>> */
>> for (size_t i = 0; i < files.length; i++)
>> {
>> auto j = std.random.rand() % files.length;
>> auto fromfile = files[j];
>> auto tofile = std.path.join(todir, basename(fromfile));
>> writefln("%s => %s", fromfile, tofile);
>> std.file.copy(fromfile, tofile);
>> }
>>
>> writefln("Done");
>> return 0;
>> }
>
> Appreciate the example, it actually came in quite handy. Quick question, how does one modify this to copy an entire directory? Additionaly, version(linux) std.file.copy() preserves the filestamps of the files copied. Is there a way to build this characteristic into the windows version?
>
> Thanks
I thought there might be a way to copy the entire directory intact. I didn't find anyting in the library that allows this but it really isn't that important. I can recreat the directory structure as I go along.
Thanks.
Andrew
|
January 27, 2008 Re: Shuffle | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter Bright | Walter Bright wrote: > Latest based on suggestions: > > [snip start] > /* Shuffle the files[] array using Durstenfeld's algorithm > * based on the Fisher-Yates method > */ > auto max = (typeof(std.random.rand()).max / n) * n; > while (--n) > { > /* Pick random r in range 0..max, discarding others > * to eliminate modulo bias > */ > auto r = max; > do > r = std.random.rand(); > while (r >= max); > > auto j = r % (n + 1); // pick element to swap with > > // swap [n] and [j] > auto tmp = files[n]; > files[n] = files[j]; > files[j] = tmp; > } [snip rest] Wrong again. The calculationof "max" should be inside the loop, since it depends on "n" which is different for each iteration. |
Copyright © 1999-2021 by the D Language Foundation