June 16, 2012
Am Wed, 13 Jun 2012 18:57:51 -0700
schrieb Jonathan M Davis <jmdavisProg@gmx.com>:

> * I'm not terribly fond of having names like Variant and Version, but they _are_ local to UUID, so I guess that they're clear enough. But the fact that you have to put a note about it _not_ being std.variant.Variant definitely indicates that another name might be better. I know that the RFC uses the word variant, but it also indicates that type would probably be a better name, and if Variant is unclear enough that a note is needed, then perhaps it really isn't the right name.

Well I added the note as a result of this review. I think whether the name variant is confusing depends a lot on whether you've already used a UUID library or std.variant. Naming it 'Type' might confuse those who already worked with UUIDs. And 'Type' is a very generic name, even more than UUID and at some point we'll have other 'Type' symbols in phobos as well.

So I'd rather like to keep the name consistent with the RFC.
> 
> * nameBasedMd5 should probably be nameBasedMD5, and nameBasedSha1 should probably be nameBasedSHA1. The correct names are MD5 and SHA-1, not md5 and sha1, so it would be more correct to use nameBasedMD5 and nameBasedSHA1. With acronyms, the thing that most of Phobos does is to keep all of the letters the same case, so if the first letter is lowercase, then the whole thing is lowercase, and if the first letter is uppercase, then the whole thing is uppercase. I'm not sure if MD5 and SHA-1 are acronyms or not, but they're similar enough, that I'd say that the same naming scheme should be used, which you do in most cases, but you didn't with the enums for some reason.
yep seems I missed those when I refactored the names.

> * For clarity's sake, please always put const after the function signatures for all const functions. In same cases you do, and in others you don't.
OK (Imho it's no confusing though if const is part of a attribute list, as in '@safe nothrow const pure')
> 
> * If all that's preventing toString from being nothrow is idup, then just do this:
> 
> string toString() @safe const pure nothrow
> {
>     try
>         return _toString().idup;
>     catch(Exception e)
>         assert(0, "It should be impossible for idup to throw.");
> }
> 
> and if there's a bug report for idup, then list it in a comment so that we know to get rid of the try-catch block once that's been fixed (and if there isn't a bug report, one should be created).

_toString is already nothrow, so yes, it's only idups fault ;-) Added links to http://d.puremagic.com/issues/show_bug.cgi?id=5700

> 
> * Get rid of nilUUID. Just use UUID.init. It just seems like an extra, unnecessary complication when the two are identical.
OK

> 
> * Please use SHA-1 in the documentation rather than SHA1, since SHA-1 is the correct name. Also, make sure that it's SHA-1 and not just SHA. You're not always consistent (e.g. take a look at sha1UUID).
OK

> 
> * parseUUID should be clear on whether it takes both lowercase and uppercase hex digits, and I'd argue that it should definitely accept both. [0-9a-f] implies that it only accepts lowercase hex digits.
OK

> * Use std.exception's assertThrown. It'll make your tests cleaner. And if you can't do that, because you need to look at the exception itself, then use collectException. So,
> 
> thrown = false;
> try
>     //make sure 36 characters in total or we'll get a 'tooMuch' reason
>     id = UUID("{8ab3060e-2cba-4f23-b74c-b52db3bdf6}");
> catch(UUIDParserException e)
> {
>     assert(e.reason == UUIDParserException.Reason.invalidChar);
>     thrown = true;
> }
> assert(thrown);
> 
> becomes
> 
> //make sure 36 characters in total or we'll get a 'tooMuch' reason
> auto e =
> collectException!UUIDParserException(UUID("{8ab3060e-2cba-4f23-b74c-
> b52db3bdf6}")); assert(e !is null);
> assert(e.reason == UUIDParserException.Reason.invalidChar);
> 
> It's much shorter that way and less error-prone.

OK

> * You should probably change empty's body to
> 
> enum ubyte[16] emptyUUID = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]; return data == emptyUUID[];
> 
> I'm not sure whether it currently allocates or not if you do that (I _think_ that it'll do the heap allocation at compile time and then only have the value type at runtime), but regardless of whether it does or not, I'd expect that once it's fixed so that assigning an array literal to a static array, it definitely _won't_ allocate on the heap, whereas your current implementation will regardless of that fix.
> 
> And if that doesn't seem like a good change, then you can just make a module- level or struct variable which is immutable ubyte[16] and have empty use that. As it stands, it's always going to allocate the array literal on the heap. It's probably not a big deal, since empty probably won't be called all that often, but if we can easily avoid the extra heap allocation, we should.

I totally agree, we shouldn't allocate there.

> * I know that
> 
> void toString(scope void delegate(const(char)[]) sink)
> 
> is what has been previously discussed for a toString which doesn't allocate, but I have to wonder why we don't just use an output range. I suppose that that's a whole other discussion, but that's what I would have suggested rather than a delegate - though in some circumstances it probably _is_ easier to just create a nested function rather than a whole output range. Maybe we should be moving to having 3 overloads of toString (normal, accepts delegate, and accepts output range).

Probably, but as you said that's a whole other discussion :-)

> 
> * rndGen returns a Random (which is already aliased to Mt19937), and already initializes it with an unpredictable seed. So, why not just us rndGen in randomUUID? It becomes
> 
> return randomUUID(rndGen());
> 
> It's much shorter and just as good as far as I can tell.

Good catch. We'd have to change rndGen to use the new seed function though. And the documentation for Random says: "You may want to use it if [...] you don't care for the minutiae of the method being used"

I'm not sure, do we care about the method? We need something which generates 'good' random numbers, I don't think a LCG would be acceptable. But rndGen could default to something like that on embedded systems?

Then again, I know nothing about random number generators, so someone please tell me: Do we need  Mersenne twister for random UUIDS or could we use any RNG?

> 
> * Change randomValue in randomUUID to auto. The default random number generator _does_ use uint, but others could use ubyte, ushort, or ulong instead. I'd also be tempted to argue that you should make it immutable and use a different variable for each random value (you won't ever accidentally fail to set it to a new value that way - that I don't think that reusing variables is a good idea in general), but the benefits of that are debatable.
> 
> Actually, just make it a loop. Something like
> 
> foreach(size_t i; iota(0, 16, 4))
> {
>     randomGen.popFront();
>     immutable randomValue = randomGen.front;
>     u.data[i .. i + 4] = *cast(ubyte[4]*)&randomValue;
> }
> 
> That's probably slightly less efficient (though I wouldn't think by much), but it's much cleaner. And actually, since the randomValue isn't guarantee to be a uint, but your code needs it to be 4 bytes, you should probably change that line to
> 
> immutable uint randomValue = cast(uint)randomGen.front;

I don't think that's a good idea. If the randomGen.front the is e.g. ubyte, casting it to uint will produce 3 bytes which are not set to a random value. That's not acceptable.

------------
@trusted UUID randomUUID(RNG)(ref RNG randomGen) if(isUniformRNG!(RNG)
&&
    isIntegral!(typeof(RNG.front)) && typeof(RNG.front).sizeof <= 16)
{
    enum elemSize = typeof(RNG.front).sizeof;
    UUID u;
    foreach(size_t i; iota(0, 16, elemSize))
    {
        randomGen.popFront();
        immutable randomValue = randomGen.front;
        u.data[i .. i + elemSize] = *cast(ubyte[elemSize]*)&randomValue;
    }
------------
?

> 
> * I'd advise giving more descriptive names to parseError's parameters. It's true that it's a nested function and not in the actual API, but it would make the function easier to understand.
OK


> * You should probably use R or Range for template parameters which are ranges. That's the typical thing to do. T is usually used for generic types rather than for ranges. The code will be somewhat easier to read if you use R or Range given that that's how the rest of Phobos is.
OK

> 
> * In parseUUID, element should be a size_t, since it's used specifically for indexing. As it stands, the compiler is going to have to do extra conversions in some of the places that it's used (e.g. converting it to size_t when indexing).
OK

> 
> * In the unit tests, I'd argue that it would be cleaner to just use the result directly in an assertion rather than saving it if it's not going to be used outside of the assertion. For instance,
> 
> id = parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"d);
> assert(id.data == [86, 104, 18, 45, 157, 240, 73, 164, 173,
>         11, 185, 176, 165, 127, 136, 106]);
> 
> should probably be
> 
> assert(parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"d).data ==
>            [86, 104, 18, 45, 157, 240, 73, 164, 173, 11, 185, 176,
> 165, 127, 136, 106]);
> 
> It's generally shorter, cleaner, and avoids the risk of reusing the variable without setting it again.
OK

> * For your functions which can take range types, test with filter!"true" so that you get a range which isn't an array or string.
Does take!(string, int) return a slice? Should have thought about
that....

I can't use filter (and map) though:
std/algorithm.d(1141): Error: nested structs with constructors are not
yet supported in CTFE (Bug 6419) std/uuid.d(1273):        called from
here: filter("00000000-0000-0000-0000-000000000000"d)


> 
> * When testing various string types, you should take advantage of TypeTuple and std.conv.to to be thorough. For instance, instead of
> 
> id = parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"d);
> assert(id.data == [86, 104, 18, 45, 157, 240, 73, 164, 173,
>     11, 185, 176, 165, 127, 136, 106]);
> id = parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"w);
> assert(id.data == [86, 104, 18, 45, 157, 240, 73, 164, 173,
>     11, 185, 176, 165, 127, 136, 106]);
> 
> do something like
> 
> foreach(S; TypeTuple!(char, const char, immutable char,
>                       wchar, const wchar, immutable wchar,
>                       dchar, const dchar, immutable dchar))
> {
>     assert(parseUUID(to!S("5668122d-9df0-49a4-ad0b-b9b0a57f886a")).data
> == [86, 104, 18, 45, 157, 240, 73, 164, 173, 11, 185, 176, 165, 127,
> 136, 106]);
> }
> 
> And while you're at it, you could throw _all_ of your string tests in that loop save for those which wouldn't really make sense to put in there (and you probably don't want to put very many exception tests in the loop, since those get to be expensive). You get much better testing this way.
OK (I'll put the exception tests in the loop as well - There'll be ~100 exception tests in total and throwing/catching 100 exceptions takes ~120 msec here, so it should be fast enough)

> 
> However, do make sure that some of the exception tests use multiple string types even if they aren't in the TypeTuple loop, since some of your exceptions (in parseUUID in particular) depend on the type of range passed in.
OK

> 
> * I'm not sure if you do or not, but if you have a spot where you have to create a UUIDParserException from another exception but don't want to have to specify the file and line number, then you should create an overload of UUIDParserException which takes the Throwable first (and probably forwards to the other constructor to avoid code duplication). Exception does this, and it can be quite useful. But UUIDParserException can't be publicly constructed, so only do it if it's actually useful within std.uuid.
OK
> 
> * Personally, I'd prefer UUIDParsingException to UUIDParserException, since you're not writing a full-on parser but rather just doing parsing in some of your functions. But that's fairly subjective.
OK
> 
> - Jonathan M Davis

Thanks for your feedback!
June 16, 2012
Am Tue, 12 Jun 2012 13:46:17 +0200
schrieb Johannes Pfau <nospam@example.com>:

> Am Mon, 11 Jun 2012 13:12:49 +0200
> schrieb Johannes Pfau <nospam@example.com>:
> 
> > Am Sun, 10 Jun 2012 18:49:03 +0200
> > schrieb Johannes Pfau <nospam@example.com>:
> > 
> > > Am Sat, 09 Jun 2012 21:30:57 +0400
> > > schrieb Dmitry Olshansky <dmitry.olsh@gmail.com>:
> > > > Code: https://github.com/jpf91/phobos/blob/std.uuid/std/uuid.d API-Docs: http://dl.dropbox.com/u/24218791/d/src/uuid.html
> > > 
> > > I pushed these changes suggested by Dmitry Olshansky and Jonas Drewsen:
> > > 
> > > * Add documentation table
> > > * Rename UUID.isNil --> UUID.empty
> > > * Merge ParserException and IsufficientInputException into
> > >   UUIDParserException
> > > * Add note about std.variant.Variant
> > > * Rewrite example to avoid cast
> > > * Add non-working(commented out) variadic constructor
> > > * Rename extractRegex --> uuidRegex
> > > * Use std.algorithm.swap instead of swapRanges
> > 
> > * parseUUID now supports InputRanges with ElementType == dchar (all
> >   string types are still supported as well)
> > 
> > 
> 
> * randomUUID now creates the RNG only once per thread.
> * randomUUID lazily seeds the RNG once (per thread) on the first
>   function call
> * randomUUID now uses the new seed overload (requires
>   https://github.com/D-Programming-Language/phobos/pull/627 )
> * the unsafe randomUUID overload taking only a type has been removed.
> 

* Add changes suggested by Jonathan M Davis (see
  https://github.com/jpf91/phobos/commit/91cd1c1f4385cfe1cd868c0720aff257d9436822
  )

June 16, 2012
On 16.06.2012 15:30, Johannes Pfau wrote:
> Am Tue, 12 Jun 2012 13:46:17 +0200
> schrieb Johannes Pfau<nospam@example.com>:
>
>> Am Mon, 11 Jun 2012 13:12:49 +0200
>> schrieb Johannes Pfau<nospam@example.com>:
>>
>>> Am Sun, 10 Jun 2012 18:49:03 +0200
>>> schrieb Johannes Pfau<nospam@example.com>:
>>>
>>>> Am Sat, 09 Jun 2012 21:30:57 +0400
>>>> schrieb Dmitry Olshansky<dmitry.olsh@gmail.com>:
>>>>> Code: https://github.com/jpf91/phobos/blob/std.uuid/std/uuid.d
>>>>> API-Docs: http://dl.dropbox.com/u/24218791/d/src/uuid.html
>>>>
>>>> I pushed these changes suggested by Dmitry Olshansky and Jonas
>>>> Drewsen:
>>>>
>>>> * Add documentation table
>>>> * Rename UUID.isNil -->  UUID.empty
>>>> * Merge ParserException and IsufficientInputException into
>>>>    UUIDParserException
>>>> * Add note about std.variant.Variant
>>>> * Rewrite example to avoid cast
>>>> * Add non-working(commented out) variadic constructor
>>>> * Rename extractRegex -->  uuidRegex
>>>> * Use std.algorithm.swap instead of swapRanges
>>>
>>> * parseUUID now supports InputRanges with ElementType == dchar (all
>>>    string types are still supported as well)
>>>
>>>
>>
>> * randomUUID now creates the RNG only once per thread.
>> * randomUUID lazily seeds the RNG once (per thread) on the first
>>    function call
>> * randomUUID now uses the new seed overload (requires
>>    https://github.com/D-Programming-Language/phobos/pull/627 )
>> * the unsafe randomUUID overload taking only a type has been removed.
>>
>
> * Add changes suggested by Jonathan M Davis (see
>    https://github.com/jpf91/phobos/commit/91cd1c1f4385cfe1cd868c0720aff257d9436822
>    )

Surely, a better way then making a global is to routinely check for zeros.

Plus, you can cast .ptr to size_t* and compare word by word in 4 or 2 ops(!). You'd better watch out for __ctfe though, it deon't allow casting pointers this way.


-- 
Dmitry Olshansky
June 16, 2012
On 16.06.2012 21:06, Dmitry Olshansky wrote:
> On 16.06.2012 15:30, Johannes Pfau wrote:
>> Am Tue, 12 Jun 2012 13:46:17 +0200
>> schrieb Johannes Pfau<nospam@example.com>:
>>
>>> Am Mon, 11 Jun 2012 13:12:49 +0200
>>> schrieb Johannes Pfau<nospam@example.com>:
>>>
>>>> Am Sun, 10 Jun 2012 18:49:03 +0200
>>>> schrieb Johannes Pfau<nospam@example.com>:
>>>>
>>>>> Am Sat, 09 Jun 2012 21:30:57 +0400
>>>>> schrieb Dmitry Olshansky<dmitry.olsh@gmail.com>:
>>>>>> Code: https://github.com/jpf91/phobos/blob/std.uuid/std/uuid.d
>>>>>> API-Docs: http://dl.dropbox.com/u/24218791/d/src/uuid.html
>>>>>
>>>>> I pushed these changes suggested by Dmitry Olshansky and Jonas
>>>>> Drewsen:
>>>>>
>>>>> * Add documentation table
>>>>> * Rename UUID.isNil --> UUID.empty
>>>>> * Merge ParserException and IsufficientInputException into
>>>>> UUIDParserException
>>>>> * Add note about std.variant.Variant
>>>>> * Rewrite example to avoid cast
>>>>> * Add non-working(commented out) variadic constructor
>>>>> * Rename extractRegex --> uuidRegex
>>>>> * Use std.algorithm.swap instead of swapRanges
>>>>
>>>> * parseUUID now supports InputRanges with ElementType == dchar (all
>>>> string types are still supported as well)
>>>>
>>>>
>>>
>>> * randomUUID now creates the RNG only once per thread.
>>> * randomUUID lazily seeds the RNG once (per thread) on the first
>>> function call
>>> * randomUUID now uses the new seed overload (requires
>>> https://github.com/D-Programming-Language/phobos/pull/627 )
>>> * the unsafe randomUUID overload taking only a type has been removed.
>>>
>>
>> * Add changes suggested by Jonathan M Davis (see
>> https://github.com/jpf91/phobos/commit/91cd1c1f4385cfe1cd868c0720aff257d9436822
>>
>> )
>
> Surely, a better way then making a global is to routinely check for zeros.
>
> Plus, you can cast .ptr to size_t* and compare word by word in 4 or 2
> ops(!). You'd better watch out for __ctfe though, it deon't allow
> casting pointers this way.
>
>

Ah and another way to go about it is:
union {
	ubyte[16] uuid;
	size_t[16/size_t.sizeof] by_word;
}

-- 
Dmitry Olshansky
June 16, 2012
On Saturday, June 16, 2012 13:27:54 Johannes Pfau wrote:
> Am Wed, 13 Jun 2012 18:57:51 -0700
> So I'd rather like to keep the name consistent with the RFC.

I understand. I don't think that there's a really a good solution. So, if you think that keeping it as-is is best, then we can do that.

> _toString is already nothrow, so yes, it's only idups fault ;-) Added links to http://d.puremagic.com/issues/show_bug.cgi?id=5700

Please just put it as a comment above the try-catch rather than putting it in the ddoc. No one using the function needs to know that there's a workaround for nothrow. They just need to know that it's nothrow. It's the Phobos maintainers that need to know that there's a workaround. It seems fairly common to do something like

//@@@BUG@@@ workaround for bugzilla 5700

so that it's nicely greppable.

> > * parseUUID should be clear on whether it takes both lowercase and uppercase hex digits, and I'd argue that it should definitely accept both. [0-9a-f] implies that it only accepts lowercase hex digits.
> 
> OK

Make sure that you have tests which test UUIDs with both uppercase and lowercase letters. I only see tests with lowercase letters. You seem to have updated an example with some uppercase letters but not any tests from what I can see. Speaking of which, please make sure that you have unittest blocks with your examples in them to make sure that they compile. Personally, I always put

//Verify Example.
unittest
{
    //example code...
}

after a function with an example and make sure that the example and the code in the unittest block match exactly (save for indentation, since any indentation in the example in the ddoc ends up in the final ddoc, and we do want the indentation in the unittest block).

> > * rndGen returns a Random (which is already aliased to Mt19937), and already initializes it with an unpredictable seed. So, why not just us rndGen in randomUUID? It becomes
> > 
> > return randomUUID(rndGen());
> > 
> > It's much shorter and just as good as far as I can tell.
> 
> Good catch. We'd have to change rndGen to use the new seed function though. And the documentation for Random says: "You may want to use it if [...] you don't care for the minutiae of the method being used"
> 
> I'm not sure, do we care about the method? We need something which generates 'good' random numbers, I don't think a LCG would be acceptable. But rndGen could default to something like that on embedded systems?
> 
> Then again, I know nothing about random number generators, so someone please tell me: Do we need  Mersenne twister for random UUIDS or could we use any RNG?

Umm. My point was that as far as I can tell, what you're doing is _identical_ to what rndGen is doing.

Yours
-----------
@trusted UUID randomUUID()()
{
    static Mt19937 randomGen;
    static bool seeded = false;
    if(!seeded)
    {
        randomGen.seed(map!((a) => unpredictableSeed)(repeat(0)));
        seeded = true;
    }
    return randomUUID(randomGen);
}
-----------

vs

-----------
alias Mt19937 Random;

@property ref Random rndGen()
{
    static Random result;
    static bool initialized;
    if (!initialized)
    {
        result = Random(unpredictableSeed);
        initialized = true;
    }
    return result;
}
-----------

Really, the only differences are the fact that you're passing the Random into randomUUID and returning that rather than returing the Random, and you're using a map to do the seed for reasons that completely escape me. In fact, I'm not sure how that code even compiles. It creates an infinite range whose every value is zero and then maps all of those zeroes to unpredictableSeed. So, you've created an infinite range of unpredictableSeed results. But Mt19937's seed takes a single uint, not a range. So, I'm baffled as to how that line even compiles.

In any case, it looks like they're both using Random, and they're both seeding it with an unpredictableSeed. It's just that in noe case, it's using Random's constructor to do it, while in the other, seed is being called explicitly (thereby _re_seeding the Random).

So, I don't see what you're doing that's any different from just doing

return randomUUID(rndGen());

aside from the fact that you seem to have an impossible line of code for seeding your random number generator. And trying to compile that line on my computer actually fails (as I'd expect), so I don't know how it's compiling on yours.

> I don't think that's a good idea. If the randomGen.front the is e.g. ubyte, casting it to uint will produce 3 bytes which are not set to a random value. That's not acceptable.
> 
> ------------
> @trusted UUID randomUUID(RNG)(ref RNG randomGen) if(isUniformRNG!(RNG)
> &&
>     isIntegral!(typeof(RNG.front)) && typeof(RNG.front).sizeof <= 16)
> {
>     enum elemSize = typeof(RNG.front).sizeof;
>     UUID u;
>     foreach(size_t i; iota(0, 16, elemSize))
>     {
>         randomGen.popFront();
>         immutable randomValue = randomGen.front;
>         u.data[i .. i + elemSize] = *cast(ubyte[elemSize]*)&randomValue;
>     }
> ------------
> ?

Looks good except for the fact that the typeof(RNG.front).sizeof <= 16 is completely unnecessary. isIntgeral guarantees that the type is a built-in integral type, the largest of which is 8 bytes (and even if we added cent and ucent, it would still be only 16, not greater than 16).

> > * For your functions which can take range types, test with filter!"true" so that you get a range which isn't an array or string.
> 
> Does take!(string, int) return a slice? Should have thought about
> that....

No. It doesn't, because hasSlicing!string is false.

> I can't use filter (and map) though:
> std/algorithm.d(1141): Error: nested structs with constructors are not
> yet supported in CTFE (Bug 6419) std/uuid.d(1273):        called from
> here: filter("00000000-0000-0000-0000-000000000000"d)

filter!"true"("00000000-0000-0000-0000-000000000000") should compile just fine. You then pass that to your range-based uuid functions which accept ranges or dchar. What does CTFE have to do with it? That would only be an issue if you were trying to use filter in CTFE, and even if you have CTFE tests where you'd like to do that and can't due to that bug, there are plenty of tests that you could run which weren't CTFE. I think that one or both of us is misunderstanding something here.

> Thanks for your feedback!

Also, I just noticed that it looks like you don't have any tests which verify that empty is false when it's supposed to be. You should probably add a few.

And by the way, before this actually gets merged into Phobos (as I'm guessing that it will be, since no way seems to particularly dislike the module thus far), you should fix the ddoc to use LREF and XREF when referencing symbols elsewhere in the module and elsewhere in std respectively.

- Jonathan M Davis
June 17, 2012
Am Sat, 16 Jun 2012 16:57:25 -0700
schrieb Jonathan M Davis <jmdavisProg@gmx.com>:

> Umm. My point was that as far as I can tell, what you're doing is _identical_ to what rndGen is doing.
> 

Yes right now it's exactly the same. But there's no guarantee that the 'Random' alias (and therefore rndGen) will always use Mt19997.

The docs explicitly say:
--------------
The "default", "favorite", "suggested" random number generator type on the current platform. It is an alias for one of the previously-defined generators. You may want to use it if (1) you need to generate some nice random numbers, and (2) you don't care for the minutiae of the method being used.
--------------

so it could be an alias to a 'worse' RNG on some platforms and I'm not sure if that's OK.

I don't mind changing it to rndGen if someone can tell me that it's fine to use 'worse' RNGs for UUIDs, but boost used MersenneTwister and I don't want to change that unless I'm sure it won't cause problems.

> 
> So, I don't see what you're doing that's any different from just doing
> 
> return randomUUID(rndGen());
> 
> aside from the fact that you seem to have an impossible line of code for seeding your random number generator. And trying to compile that line on my computer actually fails (as I'd expect), so I don't know how it's compiling on yours.

Yes, that code requires a new overload for seed, see
https://github.com/D-Programming-Language/phobos/pull/627
And yes, it's generating a infinite range of unpredictableSeed (Mt19937
uses 624 values from that range)

I don't know anything about RNGs, but Andrew Talbot suggested in the thread called "Mersenne Twister Seeding and UUIDs" that we need this new, better seeding method.

> 
> > I don't think that's a good idea. If the randomGen.front the is e.g. ubyte, casting it to uint will produce 3 bytes which are not set to a random value. That's not acceptable.
> > 
> > ------------
> > @trusted UUID randomUUID(RNG)(ref RNG randomGen)
> > if(isUniformRNG!(RNG) &&
> >     isIntegral!(typeof(RNG.front)) && typeof(RNG.front).sizeof <=
> > 16) {
> >     enum elemSize = typeof(RNG.front).sizeof;
> >     UUID u;
> >     foreach(size_t i; iota(0, 16, elemSize))
> >     {
> >         randomGen.popFront();
> >         immutable randomValue = randomGen.front;
> >         u.data[i .. i + elemSize] =
> > *cast(ubyte[elemSize]*)&randomValue; }
> > ------------
> > ?
> 
> Looks good except for the fact that the typeof(RNG.front).sizeof <= 16 is completely unnecessary. isIntgeral guarantees that the type is a built-in integral type, the largest of which is 8 bytes (and even if we added cent and ucent, it would still be only 16, not greater than 16).

I'll make it a static assert. It might not be necessary, but it at least documents that elemSize can't be > 16 and it's another safety measure.

> > > * For your functions which can take range types, test with filter!"true" so that you get a range which isn't an array or string.
> > 
> > Does take!(string, int) return a slice? Should have thought about
> > that....
> 
> No. It doesn't, because hasSlicing!string is false.
> 
> > I can't use filter (and map) though:
> > std/algorithm.d(1141): Error: nested structs with constructors are
> > not yet supported in CTFE (Bug 6419) std/uuid.d(1273):
> > called from here: filter("00000000-0000-0000-0000-000000000000"d)
> 
> filter!"true"("00000000-0000-0000-0000-000000000000") should compile just fine. You then pass that to your range-based uuid functions which accept ranges or dchar. What does CTFE have to do with it? That would only be an issue if you were trying to use filter in CTFE, and even if you have CTFE tests where you'd like to do that and can't due to that bug, there are plenty of tests that you could run which weren't CTFE. I think that one or both of us is misunderstanding something here.

I _do_ have CTFE tests. parseUUID and UUID.this(string) are guaranteed to work in CTFE as well. And if I have to use a custom Range for the CTFE tests anyway, why not use it for all tests? (This also allows me to explicitly test Forward and Input Ranges)

> 
> > Thanks for your feedback!
> 
> Also, I just noticed that it looks like you don't have any tests which verify that empty is false when it's supposed to be. You should probably add a few.
There's one such test, but I can add some more.

> 
> And by the way, before this actually gets merged into Phobos (as I'm guessing that it will be, since no way seems to particularly dislike the module thus far), you should fix the ddoc to use LREF and XREF when referencing symbols elsewhere in the module and elsewhere in std respectively.

OK, I replaced some $(D ) with XREF/LREF but MYREF1,2,3 probably have to
stay. Those are needed for the table (at least it's done that way in
std.algorithm)


June 17, 2012
Am Sat, 16 Jun 2012 21:11:51 +0400
schrieb Dmitry Olshansky <dmitry.olsh@gmail.com>:

> On 16.06.2012 21:06, Dmitry Olshansky wrote:
> > On 16.06.2012 15:30, Johannes Pfau wrote:
> >> Am Tue, 12 Jun 2012 13:46:17 +0200
> >> schrieb Johannes Pfau<nospam@example.com>:
> >>
> >>> Am Mon, 11 Jun 2012 13:12:49 +0200
> >>> schrieb Johannes Pfau<nospam@example.com>:
> >>>
> >>>> Am Sun, 10 Jun 2012 18:49:03 +0200
> >>>> schrieb Johannes Pfau<nospam@example.com>:
> >>>>
> >>>>> Am Sat, 09 Jun 2012 21:30:57 +0400
> >>>>> schrieb Dmitry Olshansky<dmitry.olsh@gmail.com>:
> >>>>>> Code: https://github.com/jpf91/phobos/blob/std.uuid/std/uuid.d API-Docs: http://dl.dropbox.com/u/24218791/d/src/uuid.html
> >>>>>
> >>>>> I pushed these changes suggested by Dmitry Olshansky and Jonas Drewsen:
> >>>>>
> >>>>> * Add documentation table
> >>>>> * Rename UUID.isNil --> UUID.empty
> >>>>> * Merge ParserException and IsufficientInputException into
> >>>>> UUIDParserException
> >>>>> * Add note about std.variant.Variant
> >>>>> * Rewrite example to avoid cast
> >>>>> * Add non-working(commented out) variadic constructor
> >>>>> * Rename extractRegex --> uuidRegex
> >>>>> * Use std.algorithm.swap instead of swapRanges
> >>>>
> >>>> * parseUUID now supports InputRanges with ElementType == dchar (all string types are still supported as well)
> >>>>
> >>>>
> >>>
> >>> * randomUUID now creates the RNG only once per thread.
> >>> * randomUUID lazily seeds the RNG once (per thread) on the first
> >>> function call
> >>> * randomUUID now uses the new seed overload (requires
> >>> https://github.com/D-Programming-Language/phobos/pull/627 )
> >>> * the unsafe randomUUID overload taking only a type has been
> >>> removed.
> >>>
> >>
> >> * Add changes suggested by Jonathan M Davis (see https://github.com/jpf91/phobos/commit/91cd1c1f4385cfe1cd868c0720aff257d9436822
> >>
> >> )
> >
> > Surely, a better way then making a global is to routinely check for zeros.
> >
> > Plus, you can cast .ptr to size_t* and compare word by word in 4 or 2 ops(!). You'd better watch out for __ctfe though, it deon't allow casting pointers this way.
> >
> >
> 
> Ah and another way to go about it is:
> union {
> 	ubyte[16] uuid;
> 	size_t[16/size_t.sizeof] by_word;
> }
> 

Isn't that an optimization which should really be done by the compiler? It already knows that it's supposed to compare two ubyte[16]...

Also how could the union solution be used without having to copy the data?
June 17, 2012
On Sunday, June 17, 2012 09:42:40 Johannes Pfau wrote:
> Am Sat, 16 Jun 2012 16:57:25 -0700
> 
> schrieb Jonathan M Davis <jmdavisProg@gmx.com>:
> > Umm. My point was that as far as I can tell, what you're doing is _identical_ to what rndGen is doing.
> 
> Yes right now it's exactly the same. But there's no guarantee that the 'Random' alias (and therefore rndGen) will always use Mt19997.
> 
> The docs explicitly say:
> --------------
> The "default", "favorite", "suggested" random number generator type on the current platform. It is an alias for one of the previously-defined generators. You may want to use it if (1) you need to generate some nice random numbers, and (2) you don't care for the minutiae of the method being used.
> --------------
> 
> so it could be an alias to a 'worse' RNG on some platforms and I'm not sure if that's OK.
> 
> I don't mind changing it to rndGen if someone can tell me that it's fine to use 'worse' RNGs for UUIDs, but boost used MersenneTwister and I don't want to change that unless I'm sure it won't cause problems.

I seriously question that it will _ever_ be anything worse then Mt19997, but if you're that worried about it, maybe you should add a static assert that Random is Mt19997? If you want to leave it as-is, then you should probably at least put a comment in there as to why you're using Mt19997 instead of just using rndGen, otherwise, someone may come along and change it to use rndGen later.

> > So, I don't see what you're doing that's any different from just doing
> > 
> > return randomUUID(rndGen());
> > 
> > aside from the fact that you seem to have an impossible line of code for seeding your random number generator. And trying to compile that line on my computer actually fails (as I'd expect), so I don't know how it's compiling on yours.
> 
> Yes, that code requires a new overload for seed, see
> https://github.com/D-Programming-Language/phobos/pull/627
> And yes, it's generating a infinite range of unpredictableSeed (Mt19937
> uses 624 values from that range)
> 
> I don't know anything about RNGs, but Andrew Talbot suggested in the thread called "Mersenne Twister Seeding and UUIDs" that we need this new, better seeding method.

Ah, okay. Apparently I missed that pull request. I'll have to look it over, particularly since I seem to be pretty much the only one merging anything in of late (particularly since Andrei has been too busy to do much with D lately).

> > > I don't think that's a good idea. If the randomGen.front the is e.g. ubyte, casting it to uint will produce 3 bytes which are not set to a random value. That's not acceptable.
> > > 
> > > ------------
> > > @trusted UUID randomUUID(RNG)(ref RNG randomGen)
> > > if(isUniformRNG!(RNG) &&
> > > 
> > >     isIntegral!(typeof(RNG.front)) && typeof(RNG.front).sizeof <=
> > > 
> > > 16) {
> > > 
> > >     enum elemSize = typeof(RNG.front).sizeof;
> > >     UUID u;
> > >     foreach(size_t i; iota(0, 16, elemSize))
> > >     {
> > > 
> > >         randomGen.popFront();
> > >         immutable randomValue = randomGen.front;
> > >         u.data[i .. i + elemSize] =
> > > 
> > > *cast(ubyte[elemSize]*)&randomValue; }
> > > ------------
> > > ?
> > 
> > Looks good except for the fact that the typeof(RNG.front).sizeof <= 16 is completely unnecessary. isIntgeral guarantees that the type is a built-in integral type, the largest of which is 8 bytes (and even if we added cent and ucent, it would still be only 16, not greater than 16).
> 
> I'll make it a static assert. It might not be necessary, but it at least documents that elemSize can't be > 16 and it's another safety measure.

But it's _completely_ unnecessary. You've already guaranteed it with isIntegral. I guess that you can leave it there if you want to, but as far as I can see, it's pointless.

> > > > * For your functions which can take range types, test with filter!"true" so that you get a range which isn't an array or string.
> > > 
> > > Does take!(string, int) return a slice? Should have thought about
> > > that....
> > 
> > No. It doesn't, because hasSlicing!string is false.
> > 
> > > I can't use filter (and map) though:
> > > std/algorithm.d(1141): Error: nested structs with constructors are
> > > not yet supported in CTFE (Bug 6419) std/uuid.d(1273):
> > > called from here: filter("00000000-0000-0000-0000-000000000000"d)
> > 
> > filter!"true"("00000000-0000-0000-0000-000000000000") should compile just fine. You then pass that to your range-based uuid functions which accept ranges or dchar. What does CTFE have to do with it? That would only be an issue if you were trying to use filter in CTFE, and even if you have CTFE tests where you'd like to do that and can't due to that bug, there are plenty of tests that you could run which weren't CTFE. I think that one or both of us is misunderstanding something here.
> 
> I _do_ have CTFE tests. parseUUID and UUID.this(string) are guaranteed to work in CTFE as well. And if I have to use a custom Range for the CTFE tests anyway, why not use it for all tests? (This also allows me to explicitly test Forward and Input Ranges)

If you have to create other ranges for your tests anyway, then that's fine. But you seemed to be saying that filter didn't work at all, and it should.

> > > Thanks for your feedback!
> > 
> > Also, I just noticed that it looks like you don't have any tests which verify that empty is false when it's supposed to be. You should probably add a few.
> 
> There's one such test, but I can add some more.

You probably don't need a lot, but it does need to be tested, and I didn't see any tests for empty which were on a UUID which wasn't supposed to be empty.

> > And by the way, before this actually gets merged into Phobos (as I'm guessing that it will be, since no way seems to particularly dislike the module thus far), you should fix the ddoc to use LREF and XREF when referencing symbols elsewhere in the module and elsewhere in std respectively.
> 
> OK, I replaced some $(D ) with XREF/LREF but MYREF1,2,3 probably have to
> stay. Those are needed for the table (at least it's done that way in
> std.algorithm)

Stuff for the table is fine. It's primarily an issue of the ddoc directly on functions which you've been using $(D ) with, since that doesn't create links to anything.

- Jonathan M Davis
June 17, 2012
On 17.06.2012 12:04, Johannes Pfau wrote:
> Am Sat, 16 Jun 2012 21:11:51 +0400
> schrieb Dmitry Olshansky<dmitry.olsh@gmail.com>:
>>
>> Ah and another way to go about it is:
>> union {
>> 	ubyte[16] uuid;
>> 	size_t[16/size_t.sizeof] by_word;
>> }
>>
>
> Isn't that an optimization which should really be done by the compiler?
> It already knows that it's supposed to compare two ubyte[16]...

It knows that you compare two ubyte[16] that it.
It easily might miss the fact that one of them is always 0 in all 16 cells.

>
> Also how could the union solution be used without having to copy the
> data?

There is no copy it union, aka overlapped storage. In other words as these 16 bytes represented as (on 32bit) 4 size_t. They share memory location. It doesn't play nice with CTFE though, as it thinks unions to be plain struct last time I checked.

-- 
Dmitry Olshansky
June 17, 2012
Am Sun, 17 Jun 2012 12:15:00 +0400
schrieb Dmitry Olshansky <dmitry.olsh@gmail.com>:

> On 17.06.2012 12:04, Johannes Pfau wrote:
> > Am Sat, 16 Jun 2012 21:11:51 +0400
> > schrieb Dmitry Olshansky<dmitry.olsh@gmail.com>:
> >>
> >> Ah and another way to go about it is:
> >> union {
> >> 	ubyte[16] uuid;
> >> 	size_t[16/size_t.sizeof] by_word;
> >> }
> >>
> >
> > Isn't that an optimization which should really be done by the compiler? It already knows that it's supposed to compare two ubyte[16]...
> 
> It knows that you compare two ubyte[16] that it.
> It easily might miss the fact that one of them is always 0 in all 16
> cells.
> 
> >
> > Also how could the union solution be used without having to copy the data?
> 
> There is no copy it union, aka overlapped storage. In other words as these 16 bytes represented as (on 32bit) 4 size_t. They share memory location. It doesn't play nice with CTFE though, as it thinks unions to be plain struct last time I checked.
> 

Yes, I thought about using the union nested in the empty function, so a copy would have been needed:

bool empty()
{
    union
    {
         ubyte[16] uuid;
         ....
    }
    uuid = data; //copy
}

I didn't know that it's possible to make members of a union private
though, so I could use this in the UUID struct:
union
{
    ubyte[16] data;
    private size_t[16/size_t.sizeof] by_word;
}

which indeed wouldn't require copying. However, with this code added std.uuid fails some unrelated ctfe UUID comparison unittests...
1 2 3 4 5
Top | Discussion index | About this forum | D home