View mode: basic / threaded / horizontal-split · Log in · Help
June 16, 2012
Re: Review: std.uuid
Am Wed, 13 Jun 2012 18:57:51 -0700
schrieb Jonathan M Davis <jmdavisProg@gmx.com>:

> * I'm not terribly fond of having names like Variant and Version, but
> they _are_ local to UUID, so I guess that they're clear enough. But
> the fact that you have to put a note about it _not_ being
> std.variant.Variant definitely indicates that another name might be
> better. I know that the RFC uses the word variant, but it also
> indicates that type would probably be a better name, and if Variant
> is unclear enough that a note is needed, then perhaps it really isn't
> the right name.

Well I added the note as a result of this review. I think whether the
name variant is confusing depends a lot on whether you've already used
a UUID library or std.variant. Naming it 'Type' might confuse those who
already worked with UUIDs. And 'Type' is a very generic name, even more
than UUID and at some point we'll have other 'Type' symbols in phobos
as well.

So I'd rather like to keep the name consistent with the RFC.
> 
> * nameBasedMd5 should probably be nameBasedMD5, and nameBasedSha1
> should probably be nameBasedSHA1. The correct names are MD5 and
> SHA-1, not md5 and sha1, so it would be more correct to use
> nameBasedMD5 and nameBasedSHA1. With acronyms, the thing that most of
> Phobos does is to keep all of the letters the same case, so if the
> first letter is lowercase, then the whole thing is lowercase, and if
> the first letter is uppercase, then the whole thing is uppercase. I'm
> not sure if MD5 and SHA-1 are acronyms or not, but they're similar
> enough, that I'd say that the same naming scheme should be used,
> which you do in most cases, but you didn't with the enums for some
> reason.
yep seems I missed those when I refactored the names.

> * For clarity's sake, please always put const after the function
> signatures for all const functions. In same cases you do, and in
> others you don't.
OK (Imho it's no confusing though if const is part of a attribute
list, as in '@safe nothrow const pure')
> 
> * If all that's preventing toString from being nothrow is idup, then
> just do this:
> 
> string toString() @safe const pure nothrow
> {
>     try
>         return _toString().idup;
>     catch(Exception e)
>         assert(0, "It should be impossible for idup to throw.");
> }
> 
> and if there's a bug report for idup, then list it in a comment so
> that we know to get rid of the try-catch block once that's been fixed
> (and if there isn't a bug report, one should be created).

_toString is already nothrow, so yes, it's only idups fault ;-)
Added links to http://d.puremagic.com/issues/show_bug.cgi?id=5700

> 
> * Get rid of nilUUID. Just use UUID.init. It just seems like an
> extra, unnecessary complication when the two are identical.
OK

> 
> * Please use SHA-1 in the documentation rather than SHA1, since SHA-1
> is the correct name. Also, make sure that it's SHA-1 and not just
> SHA. You're not always consistent (e.g. take a look at sha1UUID).
OK

> 
> * parseUUID should be clear on whether it takes both lowercase and
> uppercase hex digits, and I'd argue that it should definitely accept
> both. [0-9a-f] implies that it only accepts lowercase hex digits.
OK

> * Use std.exception's assertThrown. It'll make your tests cleaner.
> And if you can't do that, because you need to look at the exception
> itself, then use collectException. So,
> 
> thrown = false;
> try
>     //make sure 36 characters in total or we'll get a 'tooMuch' reason
>     id = UUID("{8ab3060e-2cba-4f23-b74c-b52db3bdf6}");
> catch(UUIDParserException e)
> {
>     assert(e.reason == UUIDParserException.Reason.invalidChar);
>     thrown = true;
> }
> assert(thrown);
> 
> becomes
> 
> //make sure 36 characters in total or we'll get a 'tooMuch' reason
> auto e =
> collectException!UUIDParserException(UUID("{8ab3060e-2cba-4f23-b74c-
> b52db3bdf6}")); assert(e !is null);
> assert(e.reason == UUIDParserException.Reason.invalidChar);
> 
> It's much shorter that way and less error-prone.

OK

> * You should probably change empty's body to
> 
> enum ubyte[16] emptyUUID = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];
> return data == emptyUUID[];
> 
> I'm not sure whether it currently allocates or not if you do that (I
> _think_ that it'll do the heap allocation at compile time and then
> only have the value type at runtime), but regardless of whether it
> does or not, I'd expect that once it's fixed so that assigning an
> array literal to a static array, it definitely _won't_ allocate on
> the heap, whereas your current implementation will regardless of that
> fix.
> 
> And if that doesn't seem like a good change, then you can just make a
> module- level or struct variable which is immutable ubyte[16] and
> have empty use that. As it stands, it's always going to allocate the
> array literal on the heap. It's probably not a big deal, since empty
> probably won't be called all that often, but if we can easily avoid
> the extra heap allocation, we should.

I totally agree, we shouldn't allocate there.

> * I know that
> 
> void toString(scope void delegate(const(char)[]) sink)
> 
> is what has been previously discussed for a toString which doesn't
> allocate, but I have to wonder why we don't just use an output range.
> I suppose that that's a whole other discussion, but that's what I
> would have suggested rather than a delegate - though in some
> circumstances it probably _is_ easier to just create a nested
> function rather than a whole output range. Maybe we should be moving
> to having 3 overloads of toString (normal, accepts delegate, and
> accepts output range).

Probably, but as you said that's a whole other discussion :-)

> 
> * rndGen returns a Random (which is already aliased to Mt19937), and
> already initializes it with an unpredictable seed. So, why not just
> us rndGen in randomUUID? It becomes
> 
> return randomUUID(rndGen());
> 
> It's much shorter and just as good as far as I can tell.

Good catch. We'd have to change rndGen to use the new seed function
though. And the documentation for Random says: "You may want to use it
if [...] you don't care for the minutiae of the method being used"

I'm not sure, do we care about the method? We need something which
generates 'good' random numbers, I don't think a LCG would be
acceptable. But rndGen could default to something like that on embedded
systems?

Then again, I know nothing about random number generators, so someone
please tell me: Do we need  Mersenne twister for random UUIDS or could
we use any RNG?

> 
> * Change randomValue in randomUUID to auto. The default random number 
> generator _does_ use uint, but others could use ubyte, ushort, or
> ulong instead. I'd also be tempted to argue that you should make it
> immutable and use a different variable for each random value (you
> won't ever accidentally fail to set it to a new value that way - that
> I don't think that reusing variables is a good idea in general), but
> the benefits of that are debatable.
> 
> Actually, just make it a loop. Something like
> 
> foreach(size_t i; iota(0, 16, 4))
> {
>     randomGen.popFront();
>     immutable randomValue = randomGen.front;
>     u.data[i .. i + 4] = *cast(ubyte[4]*)&randomValue;
> }
> 
> That's probably slightly less efficient (though I wouldn't think by
> much), but it's much cleaner. And actually, since the randomValue
> isn't guarantee to be a uint, but your code needs it to be 4 bytes,
> you should probably change that line to
> 
> immutable uint randomValue = cast(uint)randomGen.front;

I don't think that's a good idea. If the randomGen.front the is e.g.
ubyte, casting it to uint will produce 3 bytes which are not set to a
random value. That's not acceptable.

------------
@trusted UUID randomUUID(RNG)(ref RNG randomGen) if(isUniformRNG!(RNG)
&&
   isIntegral!(typeof(RNG.front)) && typeof(RNG.front).sizeof <= 16)
{
   enum elemSize = typeof(RNG.front).sizeof;
   UUID u;
   foreach(size_t i; iota(0, 16, elemSize))
   {
       randomGen.popFront();
       immutable randomValue = randomGen.front;
       u.data[i .. i + elemSize] = *cast(ubyte[elemSize]*)&randomValue;
   }
------------
?

> 
> * I'd advise giving more descriptive names to parseError's
> parameters. It's true that it's a nested function and not in the
> actual API, but it would make the function easier to understand.
OK


> * You should probably use R or Range for template parameters which
> are ranges. That's the typical thing to do. T is usually used for
> generic types rather than for ranges. The code will be somewhat
> easier to read if you use R or Range given that that's how the rest
> of Phobos is.
OK

> 
> * In parseUUID, element should be a size_t, since it's used
> specifically for indexing. As it stands, the compiler is going to
> have to do extra conversions in some of the places that it's used
> (e.g. converting it to size_t when indexing).
OK

> 
> * In the unit tests, I'd argue that it would be cleaner to just use
> the result directly in an assertion rather than saving it if it's not
> going to be used outside of the assertion. For instance,
> 
> id = parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"d);
> assert(id.data == [86, 104, 18, 45, 157, 240, 73, 164, 173,
>         11, 185, 176, 165, 127, 136, 106]);
> 
> should probably be
> 
> assert(parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"d).data ==
>            [86, 104, 18, 45, 157, 240, 73, 164, 173, 11, 185, 176,
> 165, 127, 136, 106]);
> 
> It's generally shorter, cleaner, and avoids the risk of reusing the
> variable without setting it again.
OK

> * For your functions which can take range types, test with
> filter!"true" so that you get a range which isn't an array or string.
Does take!(string, int) return a slice? Should have thought about
that....

I can't use filter (and map) though:
std/algorithm.d(1141): Error: nested structs with constructors are not
yet supported in CTFE (Bug 6419) std/uuid.d(1273):        called from
here: filter("00000000-0000-0000-0000-000000000000"d)


> 
> * When testing various string types, you should take advantage of
> TypeTuple and std.conv.to to be thorough. For instance, instead of
> 
> id = parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"d);
> assert(id.data == [86, 104, 18, 45, 157, 240, 73, 164, 173,
>     11, 185, 176, 165, 127, 136, 106]);
> id = parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"w);
> assert(id.data == [86, 104, 18, 45, 157, 240, 73, 164, 173,
>     11, 185, 176, 165, 127, 136, 106]);
> 
> do something like
> 
> foreach(S; TypeTuple!(char, const char, immutable char,
>                       wchar, const wchar, immutable wchar,
>                       dchar, const dchar, immutable dchar))
> {
>     assert(parseUUID(to!S("5668122d-9df0-49a4-ad0b-b9b0a57f886a")).data
> == [86, 104, 18, 45, 157, 240, 73, 164, 173, 11, 185, 176, 165, 127, 
> 136, 106]);
> }
> 
> And while you're at it, you could throw _all_ of your string tests in
> that loop save for those which wouldn't really make sense to put in
> there (and you probably don't want to put very many exception tests
> in the loop, since those get to be expensive). You get much better
> testing this way.
OK (I'll put the exception tests in the loop as well - There'll be ~100
exception tests in total and throwing/catching 100 exceptions takes
~120 msec here, so it should be fast enough)

> 
> However, do make sure that some of the exception tests use multiple
> string types even if they aren't in the TypeTuple loop, since some of
> your exceptions (in parseUUID in particular) depend on the type of
> range passed in.
OK

> 
> * I'm not sure if you do or not, but if you have a spot where you
> have to create a UUIDParserException from another exception but don't
> want to have to specify the file and line number, then you should
> create an overload of UUIDParserException which takes the Throwable
> first (and probably forwards to the other constructor to avoid code
> duplication). Exception does this, and it can be quite useful. But
> UUIDParserException can't be publicly constructed, so only do it if
> it's actually useful within std.uuid.
OK
> 
> * Personally, I'd prefer UUIDParsingException to UUIDParserException,
> since you're not writing a full-on parser but rather just doing
> parsing in some of your functions. But that's fairly subjective.
OK
> 
> - Jonathan M Davis

Thanks for your feedback!
June 16, 2012
std.uuid: Update 4
Am Tue, 12 Jun 2012 13:46:17 +0200
schrieb Johannes Pfau <nospam@example.com>:

> Am Mon, 11 Jun 2012 13:12:49 +0200
> schrieb Johannes Pfau <nospam@example.com>:
> 
> > Am Sun, 10 Jun 2012 18:49:03 +0200
> > schrieb Johannes Pfau <nospam@example.com>:
> > 
> > > Am Sat, 09 Jun 2012 21:30:57 +0400
> > > schrieb Dmitry Olshansky <dmitry.olsh@gmail.com>:
> > > > Code: https://github.com/jpf91/phobos/blob/std.uuid/std/uuid.d
> > > > API-Docs: http://dl.dropbox.com/u/24218791/d/src/uuid.html
> > > 
> > > I pushed these changes suggested by Dmitry Olshansky and Jonas
> > > Drewsen:
> > > 
> > > * Add documentation table
> > > * Rename UUID.isNil --> UUID.empty
> > > * Merge ParserException and IsufficientInputException into
> > >   UUIDParserException
> > > * Add note about std.variant.Variant
> > > * Rewrite example to avoid cast
> > > * Add non-working(commented out) variadic constructor
> > > * Rename extractRegex --> uuidRegex
> > > * Use std.algorithm.swap instead of swapRanges
> > 
> > * parseUUID now supports InputRanges with ElementType == dchar (all
> >   string types are still supported as well)
> > 
> > 
> 
> * randomUUID now creates the RNG only once per thread.
> * randomUUID lazily seeds the RNG once (per thread) on the first
>   function call
> * randomUUID now uses the new seed overload (requires
>   https://github.com/D-Programming-Language/phobos/pull/627 )
> * the unsafe randomUUID overload taking only a type has been removed.
> 

* Add changes suggested by Jonathan M Davis (see
 https://github.com/jpf91/phobos/commit/91cd1c1f4385cfe1cd868c0720aff257d9436822
 )
June 16, 2012
Re: std.uuid: Update 4
On 16.06.2012 15:30, Johannes Pfau wrote:
> Am Tue, 12 Jun 2012 13:46:17 +0200
> schrieb Johannes Pfau<nospam@example.com>:
>
>> Am Mon, 11 Jun 2012 13:12:49 +0200
>> schrieb Johannes Pfau<nospam@example.com>:
>>
>>> Am Sun, 10 Jun 2012 18:49:03 +0200
>>> schrieb Johannes Pfau<nospam@example.com>:
>>>
>>>> Am Sat, 09 Jun 2012 21:30:57 +0400
>>>> schrieb Dmitry Olshansky<dmitry.olsh@gmail.com>:
>>>>> Code: https://github.com/jpf91/phobos/blob/std.uuid/std/uuid.d
>>>>> API-Docs: http://dl.dropbox.com/u/24218791/d/src/uuid.html
>>>>
>>>> I pushed these changes suggested by Dmitry Olshansky and Jonas
>>>> Drewsen:
>>>>
>>>> * Add documentation table
>>>> * Rename UUID.isNil -->  UUID.empty
>>>> * Merge ParserException and IsufficientInputException into
>>>>    UUIDParserException
>>>> * Add note about std.variant.Variant
>>>> * Rewrite example to avoid cast
>>>> * Add non-working(commented out) variadic constructor
>>>> * Rename extractRegex -->  uuidRegex
>>>> * Use std.algorithm.swap instead of swapRanges
>>>
>>> * parseUUID now supports InputRanges with ElementType == dchar (all
>>>    string types are still supported as well)
>>>
>>>
>>
>> * randomUUID now creates the RNG only once per thread.
>> * randomUUID lazily seeds the RNG once (per thread) on the first
>>    function call
>> * randomUUID now uses the new seed overload (requires
>>    https://github.com/D-Programming-Language/phobos/pull/627 )
>> * the unsafe randomUUID overload taking only a type has been removed.
>>
>
> * Add changes suggested by Jonathan M Davis (see
>    https://github.com/jpf91/phobos/commit/91cd1c1f4385cfe1cd868c0720aff257d9436822
>    )

Surely, a better way then making a global is to routinely check for zeros.

Plus, you can cast .ptr to size_t* and compare word by word in 4 or 2 
ops(!). You'd better watch out for __ctfe though, it deon't allow 
casting pointers this way.


-- 
Dmitry Olshansky
June 16, 2012
Re: std.uuid: Update 4
On 16.06.2012 21:06, Dmitry Olshansky wrote:
> On 16.06.2012 15:30, Johannes Pfau wrote:
>> Am Tue, 12 Jun 2012 13:46:17 +0200
>> schrieb Johannes Pfau<nospam@example.com>:
>>
>>> Am Mon, 11 Jun 2012 13:12:49 +0200
>>> schrieb Johannes Pfau<nospam@example.com>:
>>>
>>>> Am Sun, 10 Jun 2012 18:49:03 +0200
>>>> schrieb Johannes Pfau<nospam@example.com>:
>>>>
>>>>> Am Sat, 09 Jun 2012 21:30:57 +0400
>>>>> schrieb Dmitry Olshansky<dmitry.olsh@gmail.com>:
>>>>>> Code: https://github.com/jpf91/phobos/blob/std.uuid/std/uuid.d
>>>>>> API-Docs: http://dl.dropbox.com/u/24218791/d/src/uuid.html
>>>>>
>>>>> I pushed these changes suggested by Dmitry Olshansky and Jonas
>>>>> Drewsen:
>>>>>
>>>>> * Add documentation table
>>>>> * Rename UUID.isNil --> UUID.empty
>>>>> * Merge ParserException and IsufficientInputException into
>>>>> UUIDParserException
>>>>> * Add note about std.variant.Variant
>>>>> * Rewrite example to avoid cast
>>>>> * Add non-working(commented out) variadic constructor
>>>>> * Rename extractRegex --> uuidRegex
>>>>> * Use std.algorithm.swap instead of swapRanges
>>>>
>>>> * parseUUID now supports InputRanges with ElementType == dchar (all
>>>> string types are still supported as well)
>>>>
>>>>
>>>
>>> * randomUUID now creates the RNG only once per thread.
>>> * randomUUID lazily seeds the RNG once (per thread) on the first
>>> function call
>>> * randomUUID now uses the new seed overload (requires
>>> https://github.com/D-Programming-Language/phobos/pull/627 )
>>> * the unsafe randomUUID overload taking only a type has been removed.
>>>
>>
>> * Add changes suggested by Jonathan M Davis (see
>> https://github.com/jpf91/phobos/commit/91cd1c1f4385cfe1cd868c0720aff257d9436822
>>
>> )
>
> Surely, a better way then making a global is to routinely check for zeros.
>
> Plus, you can cast .ptr to size_t* and compare word by word in 4 or 2
> ops(!). You'd better watch out for __ctfe though, it deon't allow
> casting pointers this way.
>
>

Ah and another way to go about it is:
union {
	ubyte[16] uuid;
	size_t[16/size_t.sizeof] by_word;
}

-- 
Dmitry Olshansky
June 16, 2012
Re: Review: std.uuid
On Saturday, June 16, 2012 13:27:54 Johannes Pfau wrote:
> Am Wed, 13 Jun 2012 18:57:51 -0700
> So I'd rather like to keep the name consistent with the RFC.

I understand. I don't think that there's a really a good solution. So, if you 
think that keeping it as-is is best, then we can do that.

> _toString is already nothrow, so yes, it's only idups fault ;-)
> Added links to http://d.puremagic.com/issues/show_bug.cgi?id=5700

Please just put it as a comment above the try-catch rather than putting it in 
the ddoc. No one using the function needs to know that there's a workaround 
for nothrow. They just need to know that it's nothrow. It's the Phobos 
maintainers that need to know that there's a workaround. It seems fairly 
common to do something like

//@@@BUG@@@ workaround for bugzilla 5700

so that it's nicely greppable.

> > * parseUUID should be clear on whether it takes both lowercase and
> > uppercase hex digits, and I'd argue that it should definitely accept
> > both. [0-9a-f] implies that it only accepts lowercase hex digits.
> 
> OK

Make sure that you have tests which test UUIDs with both uppercase and 
lowercase letters. I only see tests with lowercase letters. You seem to have 
updated an example with some uppercase letters but not any tests from what I 
can see. Speaking of which, please make sure that you have unittest blocks 
with your examples in them to make sure that they compile. Personally, I 
always put

//Verify Example.
unittest
{
   //example code...
}

after a function with an example and make sure that the example and the code 
in the unittest block match exactly (save for indentation, since any 
indentation in the example in the ddoc ends up in the final ddoc, and we do 
want the indentation in the unittest block).

> > * rndGen returns a Random (which is already aliased to Mt19937), and
> > already initializes it with an unpredictable seed. So, why not just
> > us rndGen in randomUUID? It becomes
> > 
> > return randomUUID(rndGen());
> > 
> > It's much shorter and just as good as far as I can tell.
> 
> Good catch. We'd have to change rndGen to use the new seed function
> though. And the documentation for Random says: "You may want to use it
> if [...] you don't care for the minutiae of the method being used"
> 
> I'm not sure, do we care about the method? We need something which
> generates 'good' random numbers, I don't think a LCG would be
> acceptable. But rndGen could default to something like that on embedded
> systems?
> 
> Then again, I know nothing about random number generators, so someone
> please tell me: Do we need  Mersenne twister for random UUIDS or could
> we use any RNG?

Umm. My point was that as far as I can tell, what you're doing is _identical_ 
to what rndGen is doing.

Yours
-----------
@trusted UUID randomUUID()()
{
   static Mt19937 randomGen;
   static bool seeded = false;
   if(!seeded)
   {
       randomGen.seed(map!((a) => unpredictableSeed)(repeat(0)));
       seeded = true;
   }
   return randomUUID(randomGen);
}
-----------

vs

-----------
alias Mt19937 Random;

@property ref Random rndGen()
{
   static Random result;
   static bool initialized;
   if (!initialized)
   {
       result = Random(unpredictableSeed);
       initialized = true;
   }
   return result;
}
-----------

Really, the only differences are the fact that you're passing the Random into 
randomUUID and returning that rather than returing the Random, and you're 
using a map to do the seed for reasons that completely escape me. In fact, I'm 
not sure how that code even compiles. It creates an infinite range whose every 
value is zero and then maps all of those zeroes to unpredictableSeed. So, 
you've created an infinite range of unpredictableSeed results. But Mt19937's 
seed takes a single uint, not a range. So, I'm baffled as to how that line even 
compiles.

In any case, it looks like they're both using Random, and they're both seeding 
it with an unpredictableSeed. It's just that in noe case, it's using Random's 
constructor to do it, while in the other, seed is being called explicitly 
(thereby _re_seeding the Random).

So, I don't see what you're doing that's any different from just doing

return randomUUID(rndGen());

aside from the fact that you seem to have an impossible line of code for 
seeding your random number generator. And trying to compile that line on my 
computer actually fails (as I'd expect), so I don't know how it's compiling on 
yours.

> I don't think that's a good idea. If the randomGen.front the is e.g.
> ubyte, casting it to uint will produce 3 bytes which are not set to a
> random value. That's not acceptable.
> 
> ------------
> @trusted UUID randomUUID(RNG)(ref RNG randomGen) if(isUniformRNG!(RNG)
> &&
>     isIntegral!(typeof(RNG.front)) && typeof(RNG.front).sizeof <= 16)
> {
>     enum elemSize = typeof(RNG.front).sizeof;
>     UUID u;
>     foreach(size_t i; iota(0, 16, elemSize))
>     {
>         randomGen.popFront();
>         immutable randomValue = randomGen.front;
>         u.data[i .. i + elemSize] = *cast(ubyte[elemSize]*)&randomValue;
>     }
> ------------
> ?

Looks good except for the fact that the typeof(RNG.front).sizeof <= 16 is 
completely unnecessary. isIntgeral guarantees that the type is a built-in 
integral type, the largest of which is 8 bytes (and even if we added cent and 
ucent, it would still be only 16, not greater than 16).

> > * For your functions which can take range types, test with
> > filter!"true" so that you get a range which isn't an array or string.
> 
> Does take!(string, int) return a slice? Should have thought about
> that....

No. It doesn't, because hasSlicing!string is false.

> I can't use filter (and map) though:
> std/algorithm.d(1141): Error: nested structs with constructors are not
> yet supported in CTFE (Bug 6419) std/uuid.d(1273):        called from
> here: filter("00000000-0000-0000-0000-000000000000"d)

filter!"true"("00000000-0000-0000-0000-000000000000") should compile just fine. 
You then pass that to your range-based uuid functions which accept ranges or 
dchar. What does CTFE have to do with it? That would only be an issue if you 
were trying to use filter in CTFE, and even if you have CTFE tests where you'd 
like to do that and can't due to that bug, there are plenty of tests that you 
could run which weren't CTFE. I think that one or both of us is 
misunderstanding something here.

> Thanks for your feedback!

Also, I just noticed that it looks like you don't have any tests which verify 
that empty is false when it's supposed to be. You should probably add a few.

And by the way, before this actually gets merged into Phobos (as I'm guessing 
that it will be, since no way seems to particularly dislike the module thus 
far), you should fix the ddoc to use LREF and XREF when referencing symbols 
elsewhere in the module and elsewhere in std respectively.

- Jonathan M Davis
June 17, 2012
Re: Review: std.uuid
Am Sat, 16 Jun 2012 16:57:25 -0700
schrieb Jonathan M Davis <jmdavisProg@gmx.com>:

> Umm. My point was that as far as I can tell, what you're doing is
> _identical_ to what rndGen is doing.
> 

Yes right now it's exactly the same. But there's no guarantee that the
'Random' alias (and therefore rndGen) will always use Mt19997.

The docs explicitly say:
--------------
The "default", "favorite", "suggested" random number generator type on
the current platform. It is an alias for one of the previously-defined
generators. You may want to use it if (1) you need to generate some
nice random numbers, and (2) you don't care for the minutiae of the
method being used.
--------------

so it could be an alias to a 'worse' RNG on some platforms and I'm not
sure if that's OK.

I don't mind changing it to rndGen if someone can tell me that it's fine
to use 'worse' RNGs for UUIDs, but boost used MersenneTwister and I
don't want to change that unless I'm sure it won't cause problems.

> 
> So, I don't see what you're doing that's any different from just doing
> 
> return randomUUID(rndGen());
> 
> aside from the fact that you seem to have an impossible line of code
> for seeding your random number generator. And trying to compile that
> line on my computer actually fails (as I'd expect), so I don't know
> how it's compiling on yours.

Yes, that code requires a new overload for seed, see
https://github.com/D-Programming-Language/phobos/pull/627
And yes, it's generating a infinite range of unpredictableSeed (Mt19937
uses 624 values from that range)

I don't know anything about RNGs, but Andrew Talbot suggested in the
thread called "Mersenne Twister Seeding and UUIDs" that we need this
new, better seeding method. 

> 
> > I don't think that's a good idea. If the randomGen.front the is e.g.
> > ubyte, casting it to uint will produce 3 bytes which are not set to
> > a random value. That's not acceptable.
> > 
> > ------------
> > @trusted UUID randomUUID(RNG)(ref RNG randomGen)
> > if(isUniformRNG!(RNG) &&
> >     isIntegral!(typeof(RNG.front)) && typeof(RNG.front).sizeof <=
> > 16) {
> >     enum elemSize = typeof(RNG.front).sizeof;
> >     UUID u;
> >     foreach(size_t i; iota(0, 16, elemSize))
> >     {
> >         randomGen.popFront();
> >         immutable randomValue = randomGen.front;
> >         u.data[i .. i + elemSize] =
> > *cast(ubyte[elemSize]*)&randomValue; }
> > ------------
> > ?
> 
> Looks good except for the fact that the typeof(RNG.front).sizeof <=
> 16 is completely unnecessary. isIntgeral guarantees that the type is
> a built-in integral type, the largest of which is 8 bytes (and even
> if we added cent and ucent, it would still be only 16, not greater
> than 16).

I'll make it a static assert. It might not be necessary, but it at
least documents that elemSize can't be > 16 and it's another safety
measure.

> > > * For your functions which can take range types, test with
> > > filter!"true" so that you get a range which isn't an array or
> > > string.
> > 
> > Does take!(string, int) return a slice? Should have thought about
> > that....
> 
> No. It doesn't, because hasSlicing!string is false.
> 
> > I can't use filter (and map) though:
> > std/algorithm.d(1141): Error: nested structs with constructors are
> > not yet supported in CTFE (Bug 6419) std/uuid.d(1273):
> > called from here: filter("00000000-0000-0000-0000-000000000000"d)
> 
> filter!"true"("00000000-0000-0000-0000-000000000000") should compile
> just fine. You then pass that to your range-based uuid functions
> which accept ranges or dchar. What does CTFE have to do with it? That
> would only be an issue if you were trying to use filter in CTFE, and
> even if you have CTFE tests where you'd like to do that and can't due
> to that bug, there are plenty of tests that you could run which
> weren't CTFE. I think that one or both of us is misunderstanding
> something here.

I _do_ have CTFE tests. parseUUID and UUID.this(string) are guaranteed
to work in CTFE as well. And if I have to use a custom Range for the
CTFE tests anyway, why not use it for all tests? (This also allows me to
explicitly test Forward and Input Ranges)

> 
> > Thanks for your feedback!
> 
> Also, I just noticed that it looks like you don't have any tests
> which verify that empty is false when it's supposed to be. You should
> probably add a few.
There's one such test, but I can add some more.

> 
> And by the way, before this actually gets merged into Phobos (as I'm
> guessing that it will be, since no way seems to particularly dislike
> the module thus far), you should fix the ddoc to use LREF and XREF
> when referencing symbols elsewhere in the module and elsewhere in std
> respectively.

OK, I replaced some $(D ) with XREF/LREF but MYREF1,2,3 probably have to
stay. Those are needed for the table (at least it's done that way in
std.algorithm)
June 17, 2012
Re: std.uuid: Update 4
Am Sat, 16 Jun 2012 21:11:51 +0400
schrieb Dmitry Olshansky <dmitry.olsh@gmail.com>:

> On 16.06.2012 21:06, Dmitry Olshansky wrote:
> > On 16.06.2012 15:30, Johannes Pfau wrote:
> >> Am Tue, 12 Jun 2012 13:46:17 +0200
> >> schrieb Johannes Pfau<nospam@example.com>:
> >>
> >>> Am Mon, 11 Jun 2012 13:12:49 +0200
> >>> schrieb Johannes Pfau<nospam@example.com>:
> >>>
> >>>> Am Sun, 10 Jun 2012 18:49:03 +0200
> >>>> schrieb Johannes Pfau<nospam@example.com>:
> >>>>
> >>>>> Am Sat, 09 Jun 2012 21:30:57 +0400
> >>>>> schrieb Dmitry Olshansky<dmitry.olsh@gmail.com>:
> >>>>>> Code: https://github.com/jpf91/phobos/blob/std.uuid/std/uuid.d
> >>>>>> API-Docs: http://dl.dropbox.com/u/24218791/d/src/uuid.html
> >>>>>
> >>>>> I pushed these changes suggested by Dmitry Olshansky and Jonas
> >>>>> Drewsen:
> >>>>>
> >>>>> * Add documentation table
> >>>>> * Rename UUID.isNil --> UUID.empty
> >>>>> * Merge ParserException and IsufficientInputException into
> >>>>> UUIDParserException
> >>>>> * Add note about std.variant.Variant
> >>>>> * Rewrite example to avoid cast
> >>>>> * Add non-working(commented out) variadic constructor
> >>>>> * Rename extractRegex --> uuidRegex
> >>>>> * Use std.algorithm.swap instead of swapRanges
> >>>>
> >>>> * parseUUID now supports InputRanges with ElementType == dchar
> >>>> (all string types are still supported as well)
> >>>>
> >>>>
> >>>
> >>> * randomUUID now creates the RNG only once per thread.
> >>> * randomUUID lazily seeds the RNG once (per thread) on the first
> >>> function call
> >>> * randomUUID now uses the new seed overload (requires
> >>> https://github.com/D-Programming-Language/phobos/pull/627 )
> >>> * the unsafe randomUUID overload taking only a type has been
> >>> removed.
> >>>
> >>
> >> * Add changes suggested by Jonathan M Davis (see
> >> https://github.com/jpf91/phobos/commit/91cd1c1f4385cfe1cd868c0720aff257d9436822
> >>
> >> )
> >
> > Surely, a better way then making a global is to routinely check for
> > zeros.
> >
> > Plus, you can cast .ptr to size_t* and compare word by word in 4 or
> > 2 ops(!). You'd better watch out for __ctfe though, it deon't allow
> > casting pointers this way.
> >
> >
> 
> Ah and another way to go about it is:
> union {
> 	ubyte[16] uuid;
> 	size_t[16/size_t.sizeof] by_word;
> }
> 

Isn't that an optimization which should really be done by the compiler?
It already knows that it's supposed to compare two ubyte[16]...

Also how could the union solution be used without having to copy the
data?
June 17, 2012
Re: Review: std.uuid
On Sunday, June 17, 2012 09:42:40 Johannes Pfau wrote:
> Am Sat, 16 Jun 2012 16:57:25 -0700
> 
> schrieb Jonathan M Davis <jmdavisProg@gmx.com>:
> > Umm. My point was that as far as I can tell, what you're doing is
> > _identical_ to what rndGen is doing.
> 
> Yes right now it's exactly the same. But there's no guarantee that the
> 'Random' alias (and therefore rndGen) will always use Mt19997.
> 
> The docs explicitly say:
> --------------
> The "default", "favorite", "suggested" random number generator type on
> the current platform. It is an alias for one of the previously-defined
> generators. You may want to use it if (1) you need to generate some
> nice random numbers, and (2) you don't care for the minutiae of the
> method being used.
> --------------
> 
> so it could be an alias to a 'worse' RNG on some platforms and I'm not
> sure if that's OK.
> 
> I don't mind changing it to rndGen if someone can tell me that it's fine
> to use 'worse' RNGs for UUIDs, but boost used MersenneTwister and I
> don't want to change that unless I'm sure it won't cause problems.

I seriously question that it will _ever_ be anything worse then Mt19997, but 
if you're that worried about it, maybe you should add a static assert that 
Random is Mt19997? If you want to leave it as-is, then you should probably at 
least put a comment in there as to why you're using Mt19997 instead of just 
using rndGen, otherwise, someone may come along and change it to use rndGen 
later.

> > So, I don't see what you're doing that's any different from just doing
> > 
> > return randomUUID(rndGen());
> > 
> > aside from the fact that you seem to have an impossible line of code
> > for seeding your random number generator. And trying to compile that
> > line on my computer actually fails (as I'd expect), so I don't know
> > how it's compiling on yours.
> 
> Yes, that code requires a new overload for seed, see
> https://github.com/D-Programming-Language/phobos/pull/627
> And yes, it's generating a infinite range of unpredictableSeed (Mt19937
> uses 624 values from that range)
> 
> I don't know anything about RNGs, but Andrew Talbot suggested in the
> thread called "Mersenne Twister Seeding and UUIDs" that we need this
> new, better seeding method.

Ah, okay. Apparently I missed that pull request. I'll have to look it over, 
particularly since I seem to be pretty much the only one merging anything in 
of late (particularly since Andrei has been too busy to do much with D 
lately).

> > > I don't think that's a good idea. If the randomGen.front the is e.g.
> > > ubyte, casting it to uint will produce 3 bytes which are not set to
> > > a random value. That's not acceptable.
> > > 
> > > ------------
> > > @trusted UUID randomUUID(RNG)(ref RNG randomGen)
> > > if(isUniformRNG!(RNG) &&
> > > 
> > >     isIntegral!(typeof(RNG.front)) && typeof(RNG.front).sizeof <=
> > > 
> > > 16) {
> > > 
> > >     enum elemSize = typeof(RNG.front).sizeof;
> > >     UUID u;
> > >     foreach(size_t i; iota(0, 16, elemSize))
> > >     {
> > >     
> > >         randomGen.popFront();
> > >         immutable randomValue = randomGen.front;
> > >         u.data[i .. i + elemSize] =
> > > 
> > > *cast(ubyte[elemSize]*)&randomValue; }
> > > ------------
> > > ?
> > 
> > Looks good except for the fact that the typeof(RNG.front).sizeof <=
> > 16 is completely unnecessary. isIntgeral guarantees that the type is
> > a built-in integral type, the largest of which is 8 bytes (and even
> > if we added cent and ucent, it would still be only 16, not greater
> > than 16).
> 
> I'll make it a static assert. It might not be necessary, but it at
> least documents that elemSize can't be > 16 and it's another safety
> measure.

But it's _completely_ unnecessary. You've already guaranteed it with 
isIntegral. I guess that you can leave it there if you want to, but as far as 
I can see, it's pointless.

> > > > * For your functions which can take range types, test with
> > > > filter!"true" so that you get a range which isn't an array or
> > > > string.
> > > 
> > > Does take!(string, int) return a slice? Should have thought about
> > > that....
> > 
> > No. It doesn't, because hasSlicing!string is false.
> > 
> > > I can't use filter (and map) though:
> > > std/algorithm.d(1141): Error: nested structs with constructors are
> > > not yet supported in CTFE (Bug 6419) std/uuid.d(1273):
> > > called from here: filter("00000000-0000-0000-0000-000000000000"d)
> > 
> > filter!"true"("00000000-0000-0000-0000-000000000000") should compile
> > just fine. You then pass that to your range-based uuid functions
> > which accept ranges or dchar. What does CTFE have to do with it? That
> > would only be an issue if you were trying to use filter in CTFE, and
> > even if you have CTFE tests where you'd like to do that and can't due
> > to that bug, there are plenty of tests that you could run which
> > weren't CTFE. I think that one or both of us is misunderstanding
> > something here.
> 
> I _do_ have CTFE tests. parseUUID and UUID.this(string) are guaranteed
> to work in CTFE as well. And if I have to use a custom Range for the
> CTFE tests anyway, why not use it for all tests? (This also allows me to
> explicitly test Forward and Input Ranges)

If you have to create other ranges for your tests anyway, then that's fine. But 
you seemed to be saying that filter didn't work at all, and it should.

> > > Thanks for your feedback!
> > 
> > Also, I just noticed that it looks like you don't have any tests
> > which verify that empty is false when it's supposed to be. You should
> > probably add a few.
> 
> There's one such test, but I can add some more.

You probably don't need a lot, but it does need to be tested, and I didn't see 
any tests for empty which were on a UUID which wasn't supposed to be empty.

> > And by the way, before this actually gets merged into Phobos (as I'm
> > guessing that it will be, since no way seems to particularly dislike
> > the module thus far), you should fix the ddoc to use LREF and XREF
> > when referencing symbols elsewhere in the module and elsewhere in std
> > respectively.
> 
> OK, I replaced some $(D ) with XREF/LREF but MYREF1,2,3 probably have to
> stay. Those are needed for the table (at least it's done that way in
> std.algorithm)

Stuff for the table is fine. It's primarily an issue of the ddoc directly on 
functions which you've been using $(D ) with, since that doesn't create links 
to anything.

- Jonathan M Davis
June 17, 2012
Re: std.uuid: Update 4
On 17.06.2012 12:04, Johannes Pfau wrote:
> Am Sat, 16 Jun 2012 21:11:51 +0400
> schrieb Dmitry Olshansky<dmitry.olsh@gmail.com>:
>>
>> Ah and another way to go about it is:
>> union {
>> 	ubyte[16] uuid;
>> 	size_t[16/size_t.sizeof] by_word;
>> }
>>
>
> Isn't that an optimization which should really be done by the compiler?
> It already knows that it's supposed to compare two ubyte[16]...

It knows that you compare two ubyte[16] that it.
It easily might miss the fact that one of them is always 0 in all 16 cells.

>
> Also how could the union solution be used without having to copy the
> data?

There is no copy it union, aka overlapped storage. In other words as 
these 16 bytes represented as (on 32bit) 4 size_t. They share memory 
location. It doesn't play nice with CTFE though, as it thinks unions to 
be plain struct last time I checked.

-- 
Dmitry Olshansky
June 17, 2012
Re: std.uuid: Update 4
Am Sun, 17 Jun 2012 12:15:00 +0400
schrieb Dmitry Olshansky <dmitry.olsh@gmail.com>:

> On 17.06.2012 12:04, Johannes Pfau wrote:
> > Am Sat, 16 Jun 2012 21:11:51 +0400
> > schrieb Dmitry Olshansky<dmitry.olsh@gmail.com>:
> >>
> >> Ah and another way to go about it is:
> >> union {
> >> 	ubyte[16] uuid;
> >> 	size_t[16/size_t.sizeof] by_word;
> >> }
> >>
> >
> > Isn't that an optimization which should really be done by the
> > compiler? It already knows that it's supposed to compare two
> > ubyte[16]...
> 
> It knows that you compare two ubyte[16] that it.
> It easily might miss the fact that one of them is always 0 in all 16
> cells.
> 
> >
> > Also how could the union solution be used without having to copy the
> > data?
> 
> There is no copy it union, aka overlapped storage. In other words as 
> these 16 bytes represented as (on 32bit) 4 size_t. They share memory 
> location. It doesn't play nice with CTFE though, as it thinks unions
> to be plain struct last time I checked.
> 

Yes, I thought about using the union nested in the empty function, so a
copy would have been needed:

bool empty()
{
   union
   {
        ubyte[16] uuid;
        ....
   }
   uuid = data; //copy
}

I didn't know that it's possible to make members of a union private
though, so I could use this in the UUID struct:
union
{
   ubyte[16] data;
   private size_t[16/size_t.sizeof] by_word;
}

which indeed wouldn't require copying. However, with this code added
std.uuid fails some unrelated ctfe UUID comparison unittests...
1 2 3 4 5
Top | Discussion index | About this forum | D home