October 23, 2012 Uri class and parser

Hi all!

I've been working on a URI parser which takes a string, separates the parts and puts them in the correct properties. If a valid URI was provided, the (static) parser will return an instance of Uri.

I've commented all relevant lines of code and tested it using unittests.

Now what I'm wondering is whether it meets the Phobos requirements and standards. And of course whether you think I should do a pull request on GitHub!

My code can be found here, at the bottom of the already existing file uri.d:
https://github.com/MikevanDongen/phobos/blob/uri-parser/std/uri.d

Thanks,

Mike van Dongen.
October 24, 2012 Re: Uri class and parser

Posted in reply to Mike van Dongen

On 2012-10-23 22:47, Mike van Dongen wrote:
> Hi all!
>
> I've been working on a URI parser which takes a string and then
> separates the parts and puts them in the correct properties.
> If a valid URI was provided, the (static) parser will return an instance
> of Uri.
>
> I've commented all relevant lines of code and tested it using unittests.
>
> Now what I'm wondering is if it meets the phobos requirements and
> standards.
> And of course if you think I should do a pull request on GitHub!
>
> My code can be found here, at the bottom of the already existing file
> uri.d:
> https://github.com/MikevanDongen/phobos/blob/uri-parser/std/uri.d
>
> Thanks,
>
> Mike van Dongen.

I would have expected a few additional components, like:

* Domain
* Password
* Username
* Host
* Hash

A way to build a URI based on the components.

It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array.

A few stylistic issues. There are a lot of places where you haven't indented the code, at least judging by how it looks on GitHub.

I wouldn't put the private methods at the top.

--
/Jacob Carlborg
October 24, 2012 Re: Uri class and parser

Posted in reply to Mike van Dongen

On Tuesday, 23 October 2012 at 20:47:26 UTC, Mike van Dongen wrote:
> https://github.com/MikevanDongen/phobos/blob/uri-parser/std/uri.d

If you want to take any of the code from mine, feel free. It is struct Uri in my cgi.d:

https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff/blob/master/cgi.d#L1615

My thing includes relative linking and some more parsing too.

The ctRegex in there is cool when it works. However, if there's an error in an *other* part of the code (another module that doesn't even call it, something completely unrelated such as a typo in a local variable name), the compiler spews like 20 errors about ctRegex. That's annoying. But the bug is in the compiler and only makes other errors uglier, so I'm just ignoring it for now.
October 24, 2012 Re: Uri class and parser

Posted in reply to Jacob Carlborg

On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote:
> It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array.
BTW don't forget that this is legal:
?value&value=1&value=2
The appropriate type for the AA is
string[][string]
This is why my cgi.d has two functions, decodeVariables and decodeVariablesSingle, and two members, get and getArray (in the Cgi class; I didn't add them to the Uri struct).
decodeVariables returns the complete string[][string],
and the single versions only keep the last element of the string[], which gives a string[string] for convenience.
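To make the difference between the two return types concrete, here is a minimal sketch. The names parseQueryMulti and parseQuerySingle are hypothetical and are not the functions from cgi.d; percent-decoding is deliberately left out to keep the example short.

```d
import std.array : split;
import std.string : indexOf;

/// Hypothetical helper: collects every value seen for each key.
string[][string] parseQueryMulti(string query)
{
    string[][string] result;
    foreach (pair; query.split("&"))
    {
        if (pair.length == 0)
            continue; // skip empty segments such as in "a=1&&b=2"

        auto eq = pair.indexOf('=');
        // A key without '=' (e.g. "value") gets an empty string as its value.
        string key = eq < 0 ? pair : pair[0 .. eq];
        string value = eq < 0 ? "" : pair[eq + 1 .. $];
        result[key] ~= value;
    }
    return result;
}

/// Hypothetical helper: keeps only the last value of each key.
string[string] parseQuerySingle(string query)
{
    string[string] result;
    foreach (key, values; parseQueryMulti(query))
        result[key] = values[$ - 1];
    return result;
}

unittest
{
    assert(parseQueryMulti("value&value=1&value=2")["value"] == ["", "1", "2"]);
    assert(parseQuerySingle("value&value=1&value=2")["value"] == "2");
}
```

The single variant is lossy by design: it is convenient when each key is expected once, but repeated keys silently collapse to their last value.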
October 24, 2012 Re: Uri class and parser

Posted in reply to Jacob Carlborg

On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote:
> I would have expected a few additional components, like:
>
> * Domain
> * Password
> * Username
> * Host
> * Hash
>
> A way to build a URI based on the components.
> It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array.

I have a public domain URI parser here:

http://github.com/p0nce/gfm/blob/master/common/uri.d
October 24, 2012 Re: Uri class and parser

Posted in reply to Jacob Carlborg

On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote:
> I would have expected a few additional components, like:
>
> * Domain
> * Password
> * Username
> * Host
> * Hash
>
> A way to build a URI based on the components.
> It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array.

Thanks for the suggestions! I've added many, if not all, of them to the repo:

- Identifying/separating the username, password (together the userinfo), the domain and the port number from the authority.
- The hash can now also be get/set, and the same goes for the data in the query.

On Wednesday, 24 October 2012 at 12:47:15 UTC, Adam D. Ruppe wrote:
> On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote:
>> It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array.
>
> BTW don't forget that this is legal:
>
> ?value&value=1&value=2
>
> The appropriate type for the AA is
>
> string[][string]

It does not yet take into account the fact that multiple query elements can have the same name. I'll be working on that next.

On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote:
> A few stylistic issues. There are a lot of places where you haven't indented the code, at least how it looks like on github.
>
> I wouldn't put the private methods at the top.

As for the indentation, I use tabs with a size of 4 spaces. Viewing the code on GitHub (in Chromium) you'll see tabs of 8 spaces. I'm not sure what the Phobos standard is?

As all my code is part of a single class and the file std/uri.d already existed, I decided to 'just' append my code to the file. Should I perhaps put it in another file, as the private methods you mentioned are not relevant to my code?

You may be able to see the new getters by checking out this unittest:

    uri = Uri.parse("foo://username:password@example.com:8042/over/there/index.dtb?type=animal&name=narwhal&novalue#nose");
    assert(uri.scheme == "foo");
    assert(uri.authority == "username:password@example.com:8042");
    assert(uri.path == "over/there/index.dtb");
    assert(uri.pathAsArray == ["over", "there", "index.dtb"]);
    assert(uri.query == "type=animal&name=narwhal&novalue");
    assert(uri.queryAsArray == ["type": "animal", "name": "narwhal", "novalue": ""]);
    assert(uri.fragment == "nose");
    assert(uri.host == "example.com");
    assert(uri.port == 8042);
    assert(uri.username == "username");
    assert(uri.password == "password");
    assert(uri.userinfo == "username:password");
    assert(uri.queryAsArray["type"] == "animal");
    assert(uri.queryAsArray["novalue"] == "");
    assert("novalue" in uri.queryAsArray);
    assert(!("nothere" in uri.queryAsArray));
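For readers wondering how the authority might be broken into the parts listed above, the following is a rough sketch only. The Authority struct and splitAuthority function are hypothetical names, not code from the branch under review, and the sketch assumes a well-formed authority without IPv6 literals.

```d
import std.conv : to;
import std.string : indexOf, lastIndexOf;

/// Hypothetical container for the pieces of an authority component.
struct Authority
{
    string username;
    string password;
    string host;
    ushort port; // 0 means "no port given" in this sketch
}

/// Hypothetical helper: splits "user:pass@host:port" into its parts.
Authority splitAuthority(string authority)
{
    Authority a;
    string hostPort = authority;

    // Everything before the last '@' is the userinfo ("username:password").
    auto at = authority.lastIndexOf('@');
    if (at >= 0)
    {
        string userinfo = authority[0 .. at];
        hostPort = authority[at + 1 .. $];
        auto colon = userinfo.indexOf(':');
        a.username = colon < 0 ? userinfo : userinfo[0 .. colon];
        a.password = colon < 0 ? "" : userinfo[colon + 1 .. $];
    }

    // A trailing ":digits" is the port. to!ushort throws on an empty or
    // non-numeric port; a real parser would report that as an invalid URI.
    auto portColon = hostPort.lastIndexOf(':');
    if (portColon >= 0)
    {
        a.host = hostPort[0 .. portColon];
        a.port = hostPort[portColon + 1 .. $].to!ushort;
    }
    else
    {
        a.host = hostPort;
    }
    return a;
}

unittest
{
    auto a = splitAuthority("username:password@example.com:8042");
    assert(a.username == "username");
    assert(a.password == "password");
    assert(a.host == "example.com");
    assert(a.port == 8042);
}
```

Searching for the last '@' rather than the first is a choice made here so that an unescaped '@' inside the userinfo does not end up in the host.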
October 24, 2012 Re: Uri class and parser

Posted in reply to Mike van Dongen

On 2012-10-24 20:22, Mike van Dongen wrote:
> Thanks for the suggestions!
> I've added many, if not all, of them to the repo:
>
> - Identifying/separating the username, password (together the userinfo),
> the domain and the port number from the authority.
> - The hash now also can be get/set and the same thing goes for the data
> in the query

> As for the indentations, I use tabs with the size of 4 spaces.
> Viewing the code on Github (in Chromium) you'll see tabs of 8 spaces.
> I'm not sure what the phobos standard is?

OK, I'm using Firefox and it doesn't look particularly good on GitHub. The Phobos standard is to use spaces instead of tabs, four per indentation level.

> As all my code is part of a single class and the file std/uri.d already
> existed, I decided to 'just' append my code to the file. Should I
> perhaps put it in another file as the private methods you mentioned are
> not relevant to my code?

If some methods aren't used by the URI parser, you should remove them. If they're used, I would suggest you move them further down in the code, possibly to the bottom.

> You may be able to see the new getters by checking out this unittest:

Cool. It would be nice to have a way to set the query and path as an (associative) array as well.

Just a suggestion: I don't really see the point in having getters and setters that just forward to the instance variables. Just use public instance variables. The only reason to use getters and setters would be to be able to subclass and override them. But I think you could just make Uri a final class.

About path and query: I wonder whether it's best to return an (associative) array or a string by default. I would think it's more useful to return an (associative) array and then provide rawPath() and rawQuery(), which would return strings.

A nitpick: I'm not really an expert on URIs, but is "fragment" really the correct name for what I would call the "hash"? That would be "nose" in the example below.

> uri =
> Uri.parse("foo://username:password@example.com:8042/over/there/index.dtb?type=animal&name=narwhal&novalue#nose");
>
> assert(uri.scheme == "foo");
> assert(uri.authority == "username:password@example.com:8042");
> assert(uri.path == "over/there/index.dtb");
> assert(uri.pathAsArray == ["over", "there", "index.dtb"]);
> assert(uri.query == "type=animal&name=narwhal&novalue");
> assert(uri.queryAsArray == ["type": "animal", "name": "narwhal",
> "novalue": ""]);
> assert(uri.fragment == "nose");
> assert(uri.host == "example.com");
> assert(uri.port == 8042);
> assert(uri.username == "username");
> assert(uri.password == "password");
> assert(uri.userinfo == "username:password");
> assert(uri.queryAsArray["type"] == "animal");
> assert(uri.queryAsArray["novalue"] == "");
> assert("novalue" in uri.queryAsArray);
> assert(!("nothere" in uri.queryAsArray));

--
/Jacob Carlborg
October 24, 2012 Re: Uri class and parser

Posted in reply to Jacob Carlborg

On Wednesday, 24 October 2012 at 19:54:54 UTC, Jacob Carlborg wrote:
> A nitpick, I'm not really an expert on URI's but is "fragment" really the correct name for that I would call the "hash"? That would be "nose" in the example below.

Yes, that's the term in the standard.

http://en.wikipedia.org/wiki/Fragment_identifier

JavaScript calls it the hash though, but it is slightly different: the # symbol itself is not part of the fragment according to the standard.

But JavaScript's location.hash does return it.

URL: example.com/

>>> location.hash
""

>>> location.hash = "test"
"test"

URL changes to: example.com/#test

>>> location.hash;
"#test"

The fragment would technically just be "test" there.
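The distinction can be shown in a few lines of D. This is only an illustrative sketch; fragmentOf and hashOf are hypothetical names, with hashOf imitating the location.hash behaviour shown in the console session above.

```d
import std.string : indexOf;

/// Hypothetical helper: the fragment as the standard defines it,
/// i.e. everything after the first '#', without the '#' itself.
string fragmentOf(string uri)
{
    auto hash = uri.indexOf('#');
    return hash < 0 ? "" : uri[hash + 1 .. $];
}

/// Hypothetical helper imitating location.hash: the fragment with a
/// leading '#', or an empty string when there is no fragment.
string hashOf(string uri)
{
    auto fragment = fragmentOf(uri);
    return fragment.length ? "#" ~ fragment : "";
}

unittest
{
    assert(fragmentOf("example.com/#test") == "test");
    assert(hashOf("example.com/#test") == "#test");
    assert(hashOf("example.com/") == "");
}
```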
October 25, 2012 Re: Uri class and parser

Posted in reply to Adam D. Ruppe

On 2012-10-24 22:36, Adam D. Ruppe wrote:
> Yes, that's the term in the standard.
>
> http://en.wikipedia.org/wiki/Fragment_identifier
>
> Javascript calls it the hash though, but it is slightly different: the #
> symbol itself is not part of the fragment according to the standard.
>
> But javascript's location.hash does return it.
>
> URL: example.com/
>
>>>> location.hash
> ""
>
>>>> location.hash = "test"
> "test"
>
> URL changes to: example.com/#test
>
>>>> location.hash;
> "#test"
>
> The fragment would technically just be "test" there.

I've obviously done too much JavaScript :). Thanks for the clarification.

--
/Jacob Carlborg
October 25, 2012 Re: Uri class and parser

Posted in reply to Adam D. Ruppe

On Wednesday, 24 October 2012 at 20:36:51 UTC, Adam D. Ruppe wrote:
> On Wednesday, 24 October 2012 at 19:54:54 UTC, Jacob Carlborg wrote:
>> A nitpick, I'm not really an expert on URI's but is "fragment" really the correct name for that I would call the "hash"? That would be "nose" in the example below.
>
> Yes, that's the term in the standard.
>
> http://en.wikipedia.org/wiki/Fragment_identifier

The only reason I used "fragment" was that both the RFC and the Wikipedia page call it that. I hate to break protocol ;)

> Cool. It would be nice to have a way to set the query and path as an (associative) array as well.

Now it allows you to create/edit a URI. You can do so by using an array or a string, whichever you prefer. I also added a toString() method and changed the indentation to 4 spaces instead of 1 tab.

    uri = new Uri();
    uri.scheme = "foo";
    uri.username = "username";
    uri.password = "password";
    uri.host = "example.com";
    uri.port = 8042;
    uri.path = ["over", "there", "index.dtb"];
    uri.query = ["type": "animal", "name": "narwhal", "novalue": ""];
    uri.fragment = "nose";
    assert(uri.toString() == "foo://username:password@example.com:8042/over/there/index.dtb?novalue=&name=narwhal&type=animal#nose");
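One thing worth noting about the assertion above: the query comes out in a different order than it was written in the literal. D's associative arrays do not guarantee any iteration order, so a test that compares the full string may depend on the runtime's hashing. A possible way around that, sketched here with a hypothetical buildQuery function (not part of the posted code, and without percent-encoding), is to sort the keys before joining.

```d
import std.algorithm : map, sort;
import std.array : join;

/// Hypothetical helper: serialises a query AA with keys in sorted order,
/// so the output does not depend on associative array iteration order.
string buildQuery(string[string] params)
{
    auto keys = params.keys;
    sort(keys);
    return keys.map!(k => k ~ "=" ~ params[k]).join("&");
}

unittest
{
    auto query = ["type": "animal", "name": "narwhal", "novalue": ""];
    assert(buildQuery(query) == "name=narwhal&novalue=&type=animal");
}
```

Sorting loses the order the caller wrote the parameters in; if that order matters, an array of key/value pairs would be a better fit than an associative array.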