Uri class and parser
October 23, 2012
Hi all!

I've been working on a URI parser which takes a string and then separates the parts and puts them in the correct properties.
If a valid URI was provided, the (static) parser will return an instance of Uri.

I've commented all relevant lines of code and tested it using unittests.

Now what I'm wondering is whether it meets the Phobos requirements and standards.
And of course if you think I should do a pull request on GitHub!

My code can be found here, at the bottom of the already existing file uri.d:
https://github.com/MikevanDongen/phobos/blob/uri-parser/std/uri.d


Thanks,

Mike van Dongen.
October 24, 2012
On 2012-10-23 22:47, Mike van Dongen wrote:
> Hi all!
>
> I've been working on a URI parser which takes a string and then
> separates the parts and puts them in the correct properties.
> If a valid URI was provided, the (static) parser will return an instance
> of Uri.
>
> I've commented all relevant lines of code and tested it using unittests.
>
> Now what I'm wondering is if it meets the phobos requirements and
> standards.
> And of course if you think I should do a pull request on GitHub!
>
> My code can be found here, at the bottom of the already existing file
> uri.d:
> https://github.com/MikevanDongen/phobos/blob/uri-parser/std/uri.d
>
>
> Thanks,
>
> Mike van Dongen.

I would have expected a few additional components, like:

* Domain
* Password
* Username
* Host
* Hash

A way to build a URI based on its components.
It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array.

A few stylistic issues: there are a lot of places where you haven't indented the code, or at least that's how it looks on GitHub.

I wouldn't put the private methods at the top.

-- 
/Jacob Carlborg
October 24, 2012
On Tuesday, 23 October 2012 at 20:47:26 UTC, Mike van Dongen wrote:
> https://github.com/MikevanDongen/phobos/blob/uri-parser/std/uri.d

If you want to take any of the code from mine, feel free. It is struct Uri in my cgi.d:

https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff/blob/master/cgi.d#L1615

My thing includes relative linking and some more parsing too. The ctRegex in there is cool when it works. However, if there's an error in *another* part of the code (a different module that doesn't even call it, something completely unrelated such as a typo in a local variable name), the compiler spews about 20 errors about ctRegex.

That's annoying. But the bug is in the compiler and only makes other errors uglier so I'm just ignoring it for now.
October 24, 2012
On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote:
> It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array.

BTW don't forget that this is legal:

?value&value=1&value=2

The appropriate type for the AA is

string[][string]


This is why my cgi.d has two functions, decodeVariables and decodeVariablesSingle, and two members, get and getArray (in the Cgi class; I didn't add them to the Uri struct).

decodeVariables returns the complete string[][string]

and the single version only keeps the last element of each string[], which gives a string[string] for convenience.
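
For illustration, here is a minimal sketch of that idea. It is not the code from cgi.d: the names decodeQuery and decodeQuerySingle are hypothetical, and it assumes std.uri.decodeComponent is good enough for percent-decoding (it does not turn '+' into a space, for example).

```d
import std.algorithm.searching : findSplit;
import std.array : split;
import std.uri : decodeComponent;

/// Hypothetical helper (not from cgi.d or Phobos): collects every value
/// seen for each key, so repeated keys are kept in their original order.
string[][string] decodeQuery(string query)
{
    string[][string] result;
    foreach (pair; query.split("&"))
    {
        if (pair.length == 0)
            continue; // skip empty segments such as the middle of "a=1&&b=2"

        // parts[0] is the text before the first "=", parts[2] the text after.
        // A bare key like "value" has no "=", so its value is empty.
        auto parts = pair.findSplit("=");

        // decodeComponent throws a URIException on malformed percent-escapes.
        result[decodeComponent(parts[0])] ~= decodeComponent(parts[2]);
    }
    return result;
}

/// Hypothetical convenience version: keeps only the last value per key.
string[string] decodeQuerySingle(string query)
{
    string[string] result;
    foreach (key, values; decodeQuery(query))
        result[key] = values[$ - 1];
    return result;
}
```

With the query above, decodeQuery would give three entries under "value" (an empty string, "1" and "2"), while decodeQuerySingle would keep only "2".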
October 24, 2012
On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg
wrote:
>
> I would have expected a few additional components, like:
>
> * Domain
> * Password
> * Username
> * Host
> * Hash
>
> A way to build a URI based on its components.
> It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array.

I have a public domain URI parser here:
http://github.com/p0nce/gfm/blob/master/common/uri.d




October 24, 2012
On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote:
> I would have expected a few additional components, like:
>
> * Domain
> * Password
> * Username
> * Host
> * Hash
>
> A way to build a URI based on its components.
> It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array.

Thanks for the suggestions!
I've added many, if not all, of them to the repo:

- Identifying/separating the username, password (together the userinfo), the domain and the port number from the authority.
- The hash can now also be read and set, and the same goes for the data in the query.


On Wednesday, 24 October 2012 at 12:47:15 UTC, Adam D. Ruppe wrote:
> On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote:
>> It would be nice if there were methods for getting/setting the path component as an array. Also methods for getting/setting the query component as an associative array.
>
> BTW don't forget that this is legal:
>
> ?value&value=1&value=2
>
> The appropriate type for the AA is
>
> string[][string]

It does not yet take into account the fact that multiple query elements can have the same name. I'll be working on that next.


On Wednesday, 24 October 2012 at 07:38:58 UTC, Jacob Carlborg wrote:
> A few stylistic issues. There are a lot of places where you haven't indented the code, at least how it looks like on github.
>
> I wouldn't put the private methods at the top.

As for the indentations, I use tabs with the size of 4 spaces.
Viewing the code on Github (in Chromium) you'll see tabs of 8 spaces.
I'm not sure what the phobos standard is?

As all my code is part of a single class and the file std/uri.d already existed, I decided to 'just' append my code to the file. Should I perhaps put it in another file as the private methods you mentioned are not relevant to my code?


You may be able to see the new getters by checking out this unittest:

uri = Uri.parse("foo://username:password@example.com:8042/over/there/index.dtb?type=animal&name=narwhal&novalue#nose");
assert(uri.scheme == "foo");
assert(uri.authority == "username:password@example.com:8042");
assert(uri.path == "over/there/index.dtb");
assert(uri.pathAsArray == ["over", "there", "index.dtb"]);
assert(uri.query == "type=animal&name=narwhal&novalue");
assert(uri.queryAsArray == ["type": "animal", "name": "narwhal", "novalue": ""]);
assert(uri.fragment == "nose");
assert(uri.host == "example.com");
assert(uri.port == 8042);
assert(uri.username == "username");
assert(uri.password == "password");
assert(uri.userinfo == "username:password");
assert(uri.queryAsArray["type"] == "animal");
assert(uri.queryAsArray["novalue"] == "");
assert("novalue" in uri.queryAsArray);
assert(!("nothere" in uri.queryAsArray));
October 24, 2012
On 2012-10-24 20:22, Mike van Dongen wrote:

> Thanks for the suggestions!
> I've added many, if not all, of them to the repo:
>
> - Identifying/separating the username, password (together the userinfo),
> the domain and the port number from the authority.
> - The hash now also can be get/set and the same thing goes for the data
> in the query

> As for the indentations, I use tabs with the size of 4 spaces.
> Viewing the code on Github (in Chromium) you'll see tabs of 8 spaces.
> I'm not sure what the phobos standard is?

OK, I'm using Firefox and it doesn't look particularly good on GitHub. The Phobos standard is to use spaces instead of tabs, with an indentation size of 4.

> As all my code is part of a single class and the file std/uri.d already
> existed, I decided to 'just' append my code to the file. Should I
> perhaps put it in another file as the private methods you mentioned are
> not relevant to my code?

If some methods aren't used by the URI parser, you should remove them. If they are used, I would suggest you move them further down in the code, possibly to the bottom.

> You may be able to see the new getters by checking out this unittest:

Cool. It would be nice to have a way to set the query and path as an (associative) array as well.

Just a suggestion: I don't really see a point in having getters and setters that just forward to the instance variables. Just use public instance variables. The only reason to use getters and setters would be to be able to subclass and override them. But I think you could just make Uri a final class.

About path and query: I wonder whether it's best to return an (associative) array or a string by default. I would think it's more useful to return an (associative) array and then provide rawPath() and rawQuery(), which would return strings.

A nitpick: I'm not really an expert on URIs, but is "fragment" really the correct name for what I would call the "hash"? That would be "nose" in the example below.

> uri =
> Uri.parse("foo://username:password@example.com:8042/over/there/index.dtb?type=animal&name=narwhal&novalue#nose");
>
> assert(uri.scheme == "foo");
> assert(uri.authority == "username:password@example.com:8042");
> assert(uri.path == "over/there/index.dtb");
> assert(uri.pathAsArray == ["over", "there", "index.dtb"]);
> assert(uri.query == "type=animal&name=narwhal&novalue");
> assert(uri.queryAsArray == ["type": "animal", "name": "narwhal",
> "novalue": ""]);
> assert(uri.fragment == "nose");
> assert(uri.host == "example.com");
> assert(uri.port == 8042);
> assert(uri.username == "username");
> assert(uri.password == "password");
> assert(uri.userinfo == "username:password");
> assert(uri.queryAsArray["type"] == "animal");
> assert(uri.queryAsArray["novalue"] == "");
> assert("novalue" in uri.queryAsArray);
> assert(!("nothere" in uri.queryAsArray));


-- 
/Jacob Carlborg
October 24, 2012
On Wednesday, 24 October 2012 at 19:54:54 UTC, Jacob Carlborg wrote:
> A nitpick: I'm not really an expert on URIs, but is "fragment" really the correct name for what I would call the "hash"? That would be "nose" in the example below.

Yes, that's the term in the standard.

http://en.wikipedia.org/wiki/Fragment_identifier

JavaScript calls it the hash, though it is slightly different: the # symbol itself is not part of the fragment according to the standard.

But JavaScript's location.hash does return it.

URL: example.com/
>>> location.hash
""

>>> location.hash = "test"
"test"

URL changes to: example.com/#test

>>> location.hash;
"#test"



The fragment would technically just be "test" there.
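
In D terms, a minimal sketch of that distinction might look like this. The function name fragmentOf is made up for the example, and it only looks for the first '#' rather than doing full URI parsing.

```d
import std.algorithm.searching : findSplit;

/// Hypothetical helper: returns the fragment without the leading '#',
/// or null when the URI contains no '#' at all.
string fragmentOf(string uri)
{
    // parts[1] is the matched "#" (empty if there was none),
    // parts[2] is everything after it.
    auto parts = uri.findSplit("#");
    return parts[1].length ? parts[2] : null;
}
```

So for "example.com/#test" this returns "test", where location.hash would report "#test".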
October 25, 2012
On 2012-10-24 22:36, Adam D. Ruppe wrote:

> Yes, that's the term in the standard.
>
> http://en.wikipedia.org/wiki/Fragment_identifier
>
> Javascript calls it the hash though, but it is slightly different: the #
> symbol itself is not part of the fragment according to the standard.
>
> But javascript's location.hash does return it.
>
> URL: example.com/
> >>> location.hash
> ""
>
> >>> location.hash = "test"
> "test"
>
> URL changes to: example.com/#test
>
> >>> location.hash;
> "#test"
>
>
>
> The fragment would technically just be "test" there.

I've obviously done too much JavaScript :). Thanks for the clarification.

-- 
/Jacob Carlborg
October 25, 2012
On Wednesday, 24 October 2012 at 20:36:51 UTC, Adam D. Ruppe wrote:
> On Wednesday, 24 October 2012 at 19:54:54 UTC, Jacob Carlborg wrote:
>> A nitpick: I'm not really an expert on URIs, but is "fragment" really the correct name for what I would call the "hash"? That would be "nose" in the example below.
>
> Yes, that's the term in the standard.
>
> http://en.wikipedia.org/wiki/Fragment_identifier

The only reason I used "fragment" was that both the RFC and the Wikipedia page call it that. I hate to break protocol ;)

> Cool. It would be nice to have a way to set the query and path as an (associative) array as well.

Now it allows you to create/edit a URI. You can do so using an array or a string, whichever you prefer.
I also added a toString() method and fixed the indentation to 4 spaces, instead of 1 tab.

uri = new Uri();
uri.scheme = "foo";
uri.username = "username";
uri.password = "password";
uri.host = "example.com";
uri.port = 8042;
uri.path = ["over", "there", "index.dtb"];
uri.query = ["type": "animal", "name": "narwhal", "novalue": ""];
uri.fragment = "nose";
assert(uri.toString() == "foo://username:password@example.com:8042/over/there/index.dtb?novalue=&name=narwhal&type=animal#nose");
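
One caveat about the assertion above: D does not specify the iteration order of associative arrays, so the order of the query parameters in the expected string may not be stable across compilers or runtime versions. A possible way around that, sketched here with a hypothetical buildQuery helper that is not part of the Uri class, is to sort the keys before joining them:

```d
import std.algorithm.sorting : sort;
import std.array : join;
import std.uri : encodeComponent;

/// Hypothetical helper: builds a query string with keys in sorted order,
/// so the output does not depend on associative array iteration order.
string buildQuery(string[string] params)
{
    auto keys = params.keys; // freshly allocated array, safe to sort in place
    sort(keys);

    string[] pairs;
    foreach (key; keys)
        pairs ~= encodeComponent(key) ~ "=" ~ encodeComponent(params[key]);
    return pairs.join("&");
}
```

For the query used above, this sketch would always produce "name=narwhal&novalue=&type=animal".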
