Review: std.uuid (page 3) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Review: std.uuid (page 3)

June 13, 2012

Re: Review: std.uuid

Posted by Kevin Cox
in reply to Kagamin

Kevin Cox

Posted in reply to Kagamin

Attachments:

text/html part

On Jun 13, 2012 7:23 AM, "Kagamin" <spam@here.lot> wrote:
>
> If we use all caps for abbreviations then the names should be SHA1UUID,
MD5UUID and UUIDVersion?

I believe tr naming scheme is acronyms have the same case.  So if an acronym is first it is all lowercase otherwise all uppercase.

June 14, 2012

Re: Review: std.uuid

Posted by Jonathan M Davis
in reply to Dmitry Olshansky

Jonathan M Davis

Posted in reply to Dmitry Olshansky

On Saturday, June 09, 2012 21:30:57 Dmitry Olshansky wrote:
> Code: https://github.com/jpf91/phobos/blob/std.uuid/std/uuid.d API-Docs: http://dl.dropbox.com/u/24218791/d/src/uuid.html

* I'm not terribly fond of having names like Variant and Version, but they _are_ local to UUID, so I guess that they're clear enough. But the fact that you have to put a note about it _not_ being std.variant.Variant definitely indicates that another name might be better. I know that the RFC uses the word variant, but it also indicates that type would probably be a better name, and if Variant is unclear enough that a note is needed, then perhaps it really isn't the right name.

* nameBasedMd5 should probably be nameBasedMD5, and nameBasedSha1 should probably be nameBasedSHA1. The correct names are MD5 and SHA-1, not md5 and sha1, so it would be more correct to use nameBasedMD5 and nameBasedSHA1. With acronyms, the thing that most of Phobos does is to keep all of the letters the same case, so if the first letter is lowercase, then the whole thing is lowercase, and if the first letter is uppercase, then the whole thing is uppercase. I'm not sure if MD5 and SHA-1 are acronyms or not, but they're similar enough, that I'd say that the same naming scheme should be used, which you do in most cases, but you didn't with the enums for some reason.

* For clarity's sake, please always put const after the function signatures for all const functions. In same cases you do, and in others you don't.

* If all that's preventing toString from being nothrow is idup, then just do this:

string toString() @safe const pure nothrow
{
    try
        return _toString().idup;
    catch(Exception e)
        assert(0, "It should be impossible for idup to throw.");
}

and if there's a bug report for idup, then list it in a comment so that we know to get rid of the try-catch block once that's been fixed (and if there isn't a bug report, one should be created).

* Get rid of nilUUID. Just use UUID.init. It just seems like an extra, unnecessary complication when the two are identical.

* Please use SHA-1 in the documentation rather than SHA1, since SHA-1 is the correct name. Also, make sure that it's SHA-1 and not just SHA. You're not always consistent (e.g. take a look at sha1UUID).

* parseUUID should be clear on whether it takes both lowercase and uppercase hex digits, and I'd argue that it should definitely accept both. [0-9a-f] implies that it only accepts lowercase hex digits.

* Use std.exception's assertThrown. It'll make your tests cleaner. And if you can't do that, because you need to look at the exception itself, then use collectException. So,

thrown = false;
try
    //make sure 36 characters in total or we'll get a 'tooMuch' reason
    id = UUID("{8ab3060e-2cba-4f23-b74c-b52db3bdf6}");
catch(UUIDParserException e)
{
    assert(e.reason == UUIDParserException.Reason.invalidChar);
    thrown = true;
}
assert(thrown);

becomes

//make sure 36 characters in total or we'll get a 'tooMuch' reason
auto e = collectException!UUIDParserException(UUID("{8ab3060e-2cba-4f23-b74c-
b52db3bdf6}"));
assert(e !is null);
assert(e.reason == UUIDParserException.Reason.invalidChar);

It's much shorter that way and less error-prone.

* You should probably change empty's body to

enum ubyte[16] emptyUUID = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]; return data == emptyUUID[];

I'm not sure whether it currently allocates or not if you do that (I _think_ that it'll do the heap allocation at compile time and then only have the value type at runtime), but regardless of whether it does or not, I'd expect that once it's fixed so that assigning an array literal to a static array, it definitely _won't_ allocate on the heap, whereas your current implementation will regardless of that fix.

And if that doesn't seem like a good change, then you can just make a module- level or struct variable which is immutable ubyte[16] and have empty use that. As it stands, it's always going to allocate the array literal on the heap. It's probably not a big deal, since empty probably won't be called all that often, but if we can easily avoid the extra heap allocation, we should.

* I know that

void toString(scope void delegate(const(char)[]) sink)

is what has been previously discussed for a toString which doesn't allocate, but I have to wonder why we don't just use an output range. I suppose that that's a whole other discussion, but that's what I would have suggested rather than a delegate - though in some circumstances it probably _is_ easier to just create a nested function rather than a whole output range. Maybe we should be moving to having 3 overloads of toString (normal, accepts delegate, and accepts output range).

* rndGen returns a Random (which is already aliased to Mt19937), and already initializes it with an unpredictable seed. So, why not just us rndGen in randomUUID? It becomes

return randomUUID(rndGen());

It's much shorter and just as good as far as I can tell.

* Change randomValue in randomUUID to auto. The default random number generator _does_ use uint, but others could use ubyte, ushort, or ulong instead. I'd also be tempted to argue that you should make it immutable and use a different variable for each random value (you won't ever accidentally fail to set it to a new value that way - that I don't think that reusing variables is a good idea in general), but the benefits of that are debatable.

Actually, just make it a loop. Something like

foreach(size_t i; iota(0, 16, 4))
{
    randomGen.popFront();
    immutable randomValue = randomGen.front;
    u.data[i .. i + 4] = *cast(ubyte[4]*)&randomValue;
}

That's probably slightly less efficient (though I wouldn't think by much), but it's much cleaner. And actually, since the randomValue isn't guarantee to be a uint, but your code needs it to be 4 bytes, you should probably change that line to

immutable uint randomValue = cast(uint)randomGen.front;

* I'd advise giving more descriptive names to parseError's parameters. It's true that it's a nested function and not in the actual API, but it would make the function easier to understand.

* You should probably use R or Range for template parameters which are ranges. That's the typical thing to do. T is usually used for generic types rather than for ranges. The code will be somewhat easier to read if you use R or Range given that that's how the rest of Phobos is.

* In parseUUID, element should be a size_t, since it's used specifically for indexing. As it stands, the compiler is going to have to do extra conversions in some of the places that it's used (e.g. converting it to size_t when indexing).

* In the unit tests, I'd argue that it would be cleaner to just use the result directly in an assertion rather than saving it if it's not going to be used outside of the assertion. For instance,

id = parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"d);
assert(id.data == [86, 104, 18, 45, 157, 240, 73, 164, 173,
        11, 185, 176, 165, 127, 136, 106]);

should probably be

assert(parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"d).data ==
           [86, 104, 18, 45, 157, 240, 73, 164, 173, 11, 185, 176, 165, 127,
136, 106]);

It's generally shorter, cleaner, and avoids the risk of reusing the variable without setting it again.

* For your functions which can take range types, test with filter!"true" so that you get a range which isn't an array or string.

* When testing various string types, you should take advantage of TypeTuple and std.conv.to to be thorough. For instance, instead of

id = parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"d);
assert(id.data == [86, 104, 18, 45, 157, 240, 73, 164, 173,
    11, 185, 176, 165, 127, 136, 106]);
id = parseUUID("5668122d-9df0-49a4-ad0b-b9b0a57f886a"w);
assert(id.data == [86, 104, 18, 45, 157, 240, 73, 164, 173,
    11, 185, 176, 165, 127, 136, 106]);

do something like

foreach(S; TypeTuple!(char, const char, immutable char,
                      wchar, const wchar, immutable wchar,
                      dchar, const dchar, immutable dchar))
{
    assert(parseUUID(to!S("5668122d-9df0-49a4-ad0b-b9b0a57f886a")).data ==
           [86, 104, 18, 45, 157, 240, 73, 164, 173, 11, 185, 176, 165, 127,
136, 106]);
}

And while you're at it, you could throw _all_ of your string tests in that loop save for those which wouldn't really make sense to put in there (and you probably don't want to put very many exception tests in the loop, since those get to be expensive). You get much better testing this way.

However, do make sure that some of the exception tests use multiple string types even if they aren't in the TypeTuple loop, since some of your exceptions (in parseUUID in particular) depend on the type of range passed in.

* I'm not sure if you do or not, but if you have a spot where you have to create a UUIDParserException from another exception but don't want to have to specify the file and line number, then you should create an overload of UUIDParserException which takes the Throwable first (and probably forwards to the other constructor to avoid code duplication). Exception does this, and it can be quite useful. But UUIDParserException can't be publicly constructed, so only do it if it's actually useful within std.uuid.

* Personally, I'd prefer UUIDParsingException to UUIDParserException, since you're not writing a full-on parser but rather just doing parsing in some of your functions. But that's fairly subjective.

- Jonathan M Davis

June 14, 2012

Re: Review: std.uuid

Posted by Jacob Carlborg
in reply to Jonathan M Davis

Jacob Carlborg

Posted in reply to Jonathan M Davis

On 2012-06-14 03:57, Jonathan M Davis wrote:
> On Saturday, June 09, 2012 21:30:57 Dmitry Olshansky wrote:
>> Code: https://github.com/jpf91/phobos/blob/std.uuid/std/uuid.d
>> API-Docs: http://dl.dropbox.com/u/24218791/d/src/uuid.html
>
> * I'm not terribly fond of having names like Variant and Version, but they
> _are_ local to UUID, so I guess that they're clear enough. But the fact that
> you have to put a note about it _not_ being std.variant.Variant definitely
> indicates that another name might be better. I know that the RFC uses the word
> variant, but it also indicates that type would probably be a better name, and
> if Variant is unclear enough that a note is needed, then perhaps it really
> isn't the right name.

I understand exactly what you're saying here and it's probably good to not have too many conflicting names in the standard library. But at the same time it feels like the whole point of modules go to waste and we're back to C where everything is in the same global namespace.

-- 
/Jacob Carlborg

June 14, 2012

Re: Review: std.uuid

Posted by Dmitry Olshansky
in reply to Jacob Carlborg

Dmitry Olshansky

Posted in reply to Jacob Carlborg

On 14.06.2012 10:35, Jacob Carlborg wrote:
> On 2012-06-14 03:57, Jonathan M Davis wrote:
>> On Saturday, June 09, 2012 21:30:57 Dmitry Olshansky wrote:
>>> Code: https://github.com/jpf91/phobos/blob/std.uuid/std/uuid.d
>>> API-Docs: http://dl.dropbox.com/u/24218791/d/src/uuid.html
>>
>> * I'm not terribly fond of having names like Variant and Version, but
>> they
>> _are_ local to UUID, so I guess that they're clear enough. But the
>> fact that
>> you have to put a note about it _not_ being std.variant.Variant
>> definitely
>> indicates that another name might be better. I know that the RFC uses
>> the word
>> variant, but it also indicates that type would probably be a better
>> name, and
>> if Variant is unclear enough that a note is needed, then perhaps it
>> really
>> isn't the right name.
>
> I understand exactly what you're saying here and it's probably good to
> not have too many conflicting names in the standard library. But at the
> same time it feels like the whole point of modules go to waste and we're
> back to C where everything is in the same global namespace.
>

It feels this way because by default we import all symbols. The good thing is that you don't care for conflicts as long as you don't touch the conflicting name.

One day we'd just have to use static import more often.

-- 
Dmitry Olshansky

June 14, 2012

Re: Review: std.uuid

Posted by Dmitry Olshansky
in reply to Dmitry Olshansky

Dmitry Olshansky

Posted in reply to Dmitry Olshansky

On 09.06.2012 21:30, Dmitry Olshansky wrote:
> The review process stalled long enough, let's kick start it with a small
> yet a valuable module that was there for quite some time.
>
> std.uuid by Johannes Pfau is a rehash of it's C++ twin from well known
> Boost library.
>
> The review cycle takes the usual 2 weeks starting today 9th June, ending
> at 23 June 00:00 UTC followed by a one week vote process ending at 30 June.
>

Attention: we changing the review period, it'll take 5 days less in review.
-----

Given the small scale of module with approval from Johannes, the review period is shortened:

Review ends at 18-19 00:00 UTC and similarly voting ends at 25-26 June.
-----


-- 
Dmitry Olshansky

June 14, 2012

Re: Review: std.uuid

Posted by Jacob Carlborg
in reply to Dmitry Olshansky

Jacob Carlborg

Posted in reply to Dmitry Olshansky

On 2012-06-14 08:57, Dmitry Olshansky wrote:

> It feels this way because by default we import all symbols. The good
> thing is that you don't care for conflicts as long as you don't touch
> the conflicting name.

That's a good thing.

> One day we'd just have to use static import more often.

Or aliases, or renamed imports.

-- 
/Jacob Carlborg

June 14, 2012

Re: Review: std.uuid

Posted by Jonathan M Davis
in reply to Jacob Carlborg

Jonathan M Davis

Posted in reply to Jacob Carlborg

On Thursday, June 14, 2012 09:41:53 Jacob Carlborg wrote:
> On 2012-06-14 08:57, Dmitry Olshansky wrote:
> > It feels this way because by default we import all symbols. The good thing is that you don't care for conflicts as long as you don't touch the conflicting name.
> 
> That's a good thing.

It is and it isn't. It makes it much easier to end up with conflicts. It would be less of a problem if not all symbols were imported by default. We _do_ have some great tools for dealing with conflicts, but the fact that everything gets imported by default when you import a module means that it's very easy to create conflicts, and when that's coupled with the fact that D is very quick to declare ambiguities and conflicts with overload sets and whatnot rather than trying to determine which one you really meant (like C++ would) makes it that much worse. There are definite pros to the situation (e.g. being so picky with overload sets makes it much harder to call the wrong function without knowing it), but there's no question that conflicts happen quite easily. It's not as big a deal with introducing a new module, since the programmer will deal with it when they import it for the first time, but it tends to break code when adding new stuff to old modules.

So, in general, if we can pick good names which don't conflict, we're better off. But if the best names are ones that are used in other modules, then that's what we'll end up going with. It has to be judged on a case-by-case basis.

> > One day we'd just have to use static import more often.
> 
> Or aliases, or renamed imports.

aliases are useless for dealing with conflicts as long as private aliases aren't hidden. At present, I'd argue that private aliases are bad practice unless you alias things to very esoteric names which aren't likely to conflict. It would really be nice to have that fixed though.

- Jonathan M Davis

June 14, 2012

Re: Review: std.uuid

Posted by Jacob Carlborg
in reply to Jonathan M Davis

Jacob Carlborg

Posted in reply to Jonathan M Davis

On 2012-06-14 09:49, Jonathan M Davis wrote:

> It is and it isn't. It makes it much easier to end up with conflicts. It would
> be less of a problem if not all symbols were imported by default. We _do_ have
> some great tools for dealing with conflicts, but the fact that everything gets
> imported by default when you import a module means that it's very easy to
> create conflicts, and when that's coupled with the fact that D is very quick to
> declare ambiguities and conflicts with overload sets and whatnot rather than
> trying to determine which one you really meant (like C++ would) makes it that
> much worse. There are definite pros to the situation (e.g. being so picky with
> overload sets makes it much harder to call the wrong function without knowing
> it), but there's no question that conflicts happen quite easily. It's not as
> big a deal with introducing a new module, since the programmer will deal with
> it when they import it for the first time, but it tends to break code when
> adding new stuff to old modules.
>
> So, in general, if we can pick good names which don't conflict, we're better
> off. But if the best names are ones that are used in other modules, then that's
> what we'll end up going with. It has to be judged on a case-by-case basis.

I was referring to that it's good that a conflict will not occur just by importing modules with conflicting names. You also have to _use_ them.

> aliases are useless for dealing with conflicts as long as private aliases
> aren't hidden. At present, I'd argue that private aliases are bad practice
> unless you alias things to very esoteric names which aren't likely to conflict.
> It would really be nice to have that fixed though.

Right, but that's a bug.

-- 
/Jacob Carlborg

June 14, 2012

Re: Review: std.uuid

Posted by Jonathan M Davis
in reply to Jacob Carlborg

Jonathan M Davis

Posted in reply to Jacob Carlborg

On Thursday, June 14, 2012 11:55:51 Jacob Carlborg wrote:
> > aliases are useless for dealing with conflicts as long as private aliases aren't hidden. At present, I'd argue that private aliases are bad practice unless you alias things to very esoteric names which aren't likely to conflict. It would really be nice to have that fixed though.
> 
> Right, but that's a bug.

Actually, it's not. It's by design. private never hides _anything_ and is always considered _after_ overload resolution. That's how it works in C++, and that's how it works in D. If we want it to change, we need to convince Walter. Personally, I think that we'd be better off if we made private hidden, but I don't know if we can convince Walter of that. Regardless, someone else will probably have to implement the changes. IIRC, someone had done something along those lines and created a pull request, but I don't know what happened with it.

- Jonathan M Davis

June 14, 2012

Re: Review: std.uuid

Posted by Jacob Carlborg
in reply to Jonathan M Davis

Jacob Carlborg

Posted in reply to Jonathan M Davis

On 2012-06-14 12:04, Jonathan M Davis wrote:

> Actually, it's not. It's by design. private never hides _anything_ and is
> always considered _after_ overload resolution. That's how it works in C++, and
> that's how it works in D. If we want it to change, we need to convince Walter.
> Personally, I think that we'd be better off if we made private hidden, but I
> don't know if we can convince Walter of that. Regardless, someone else will
> probably have to implement the changes. IIRC, someone had done something along
> those lines and created a pull request, but I don't know what happened with
> it.

Oh, yeah, I remember that now. I really don't a reason why private aliases shouldn't be hidden.

-- 
/Jacob Carlborg

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation