March 12, 2011
On 12/03/11 05.30, Ary Manzana wrote:
> On 3/11/11 12:20 PM, Jonas Drewsen wrote:
>> Hi,
>>
>> So I've spent some time trying to wrap libcurl for D. There is a lot of
>> things that you can do with libcurl which I did not know so I'm starting
>> out small.
>>
>> For now I've created all the declarations for the latest public curl C
>> api. I have put that in the etc.c.curl module.
>>
>> On top of that I've created a more D like api as seen below. This is
>> located in the 'etc.curl' module. What you can see below currently works
>> but before proceeding further down this road I would like to get your
>> comments on it.
>
> I *love* it.
>
> All APIs should be like yours. One-liners for what you want right now.
> If it's a little more complex, some more lines. This is perfect.
>
> Congratulations!

Thank you! Words like these keep up the motivation.

/Jonas
March 12, 2011
On 11/03/11 17.33, Vladimir Panteleev wrote:
> On Fri, 11 Mar 2011 17:20:38 +0200, Jonas Drewsen <jdrewsen@nospam.com>
> wrote:
>
>> writeln( Http.get("http://www.google.com").content );
>
> Does this return a string? What if the page's encoding isn't UTF-8?
>
> Data should probably be returned as void[], similar to std.file.read.

Currently it returns a string, but should probably return void[] as you suggest.

Maybe the interface should be something like this to support various encodings (as std.file.readText does):

class Http {
	struct Result(S) {
		S content;
		...
	}
	static Result!S get(S = void[])(in string url);
}

Actually I just took a look at Andrei's std.stream2 suggestion and Http/Ftp... Transports would be pretty neat to have as well for reading formatted data.

I'll follow the newly spawned "Stream proposal" thread on this one :)

/Jonas
March 12, 2011
Jonas Drewsen wrote:

> On 11/03/11 22.21, Jesse Phillips wrote:
>> I'll make some comments on the API. Do we have to choose Http/Ftp...? The URI already contains this, I could see being able to specifically request one or the other for performance or so www.google.com works.
> 
> That is a good question.
> 
> The problem with creating a grand unified Curl class that does it all is that each protocol supports different things ie. http supports cookie handling and http redirection, ftp supports passive/active mode and dir listings and so on.
> 
> I think it would confuse the user of the API if e.g. he were allowed to set cookies on his ftp request.
> 
> The protocols supported (Http, Ftp,... classes) do have a base class Protocol that implements common things like timeouts etc.
> 
> 
>> And what about properties? They tend to be very nice instead of set methods. examples below.
> 
> Actually I thought of this and went the usual C++ way of _not_ using public properties but using accessor methods. Are public properties accepted as "the D way", and if so, what about the usual reasons for using accessor methods (like encapsulation and tolerance to future changes to the API)?
> 
> I do like the shorter onHeader/onContent much better though :)
> 
> /Jonas

Properties *are* accessor methods, with some sugar. In fact, you have already used them. Try it:

http.setReceiveHeaderCallback =  (string key, string value) {
    writeln(key ~ ":" ~ value);
};

Marking a function with @property just signals its intended use, in which case it's nicer to drop the get/set prefixes. Supposedly using parentheses with such declarations will be outlawed in the future, but I don't think that's the case currently.
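
As a minimal sketch of how a setter named onHeader could look once the set prefix is dropped: the Http class below is a hypothetical stand-in that only simulates receiving one header, and is not the actual etc.curl code.

```d
import std.stdio;

// Hypothetical stand-in for the Http class; NOT the real etc.curl API.
class Http
{
    private void delegate(string key, string value) headerHandler;

    // Setter property: callers write `http.onHeader = ...`.
    @property void onHeader(void delegate(string key, string value) dg)
    {
        headerHandler = dg;
    }

    // Simulates a transfer that receives a single header (assumed data).
    void perform()
    {
        if (headerHandler !is null)
            headerHandler("Content-Type", "text/html");
    }
}

void main()
{
    auto http = new Http;
    // Assignment syntax calls the setter above.
    http.onHeader = (string key, string value) {
        writeln(key ~ ":" ~ value);
    };
    http.perform();
}
```

Because the setter is still a method, the class keeps the usual accessor benefits (encapsulation, room to change the implementation later) while the call site reads like a field assignment.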

>> Jonas Drewsen Wrote:
>>
>>> //
>>> // Simple HTTP GET with sane defaults
>>> // provides the .content, .headers and .status
>>> //
>>> writeln( Http.get("http://www.google.com").content );
>>>
>>> //
>>> // GET with custom data receiver delegates
>>> //
>>> Http http = new Http("http://www.google.dk");
>>> http.setReceiveHeaderCallback( (string key, string value) {
>>> writeln(key ~ ":" ~ value);
>>> } );
>>> http.setReceiveCallback( (string data) { /* drop */ } );
>>> http.perform;
>>
>> http.onHeader = (string key, string value) {...};
>> http.onContent = (string data) { ... };
>> http.perform();

March 12, 2011
Jonas Drewsen Wrote:

> On 11/03/11 22.21, Jesse Phillips wrote:
> > I'll make some comments on the API. Do we have to choose Http/Ftp...? The URI already contains this, I could see being able to specifically request one or the other for performance or so www.google.com works.
> 
> That is a good question.
> 
> The problem with creating a grand unified Curl class that does it all is that each protocol supports different things ie. http supports cookie handling and http redirection, ftp supports passive/active mode and dir listings and so on.
> 
> I think it would confuse the user of the API if e.g. he were allowed to set cookies on his ftp request.
> 
> The protocols supported (Http, Ftp,... classes) do have a base class Protocol that implements common things like timeouts etc.

Ah. I guess I was just thinking that if you want to download some file, you don't really care where you are getting it from; you just have the URL and are ready to go.

> > And what about properties? They tend to be very nice instead of set methods. examples below.
> 
> Actually I thought of this and went the usual C++ way of _not_ using public properties but using accessor methods. Are public properties accepted as "the D way", and if so, what about the usual reasons for using accessor methods (like encapsulation and tolerance to future changes to the API)?
> 
> I do like the shorter onHeader/onContent much better though :)

D was originally very friendly with properties. Your code can at this moment be written:

http.setReceiveHeaderCallback = (string key, string value) {
        writeln(key ~ ":" ~ value);
};

But this is going to be deprecated in favor of the @property attribute. You are probably aware of properties in C#, so yes, D is fine with public fields and functions that look like public fields.

Otherwise this looks really good and I do hope to see it in Phobos.

March 12, 2011
On 12/03/11 20.44, Jesse Phillips wrote:
> Jonas Drewsen Wrote:
>
>> On 11/03/11 22.21, Jesse Phillips wrote:
>>> I'll make some comments on the API. Do we have to choose Http/Ftp...? The URI already contains this, I could see being able to specifically request one or the other for performance or so www.google.com works.
>>
>> That is a good question.
>>
>> The problem with creating a grand unified Curl class that does it all is
>> that each protocol supports different things ie. http supports cookie
>> handling and http redirection, ftp supports passive/active mode and dir
>> listings and so on.
>>
>> I think it would confuse the user of the API if e.g. he were allowed to
>> set cookies on his ftp request.
>>
>> The protocols supported (Http, Ftp,... classes) do have a base class
>> Protocol that implements common things like timouts etc.
>
> Ah. I guess I was just thinking about if you want to download some file, you don't really care where you are getting it from you just have the URL and are read to go.

There should definitely be a simple method based only on a URL. I'll put that in.


>>> And what about properties? They tend to be very nice instead of set methods. examples below.
>>
>> Actually I thought off this and went the usual C++ way of _not_ using
>> public properties but use accessor methods. Is public properties
>> accepted as "the D way" and if so what about the usual reasons about why
>> you should use accessor methods (like encapsulation and tolerance to
>> future changes to the API)?
>>
>> I do like the shorter onHeader/onContent much better though :)
>
> D was originally very friendly with properties. Your could can at this moment be written:
>
> http.setReceiveHeaderCallback = (string key, string value) {
>          writeln(key ~ ":" ~ value);
> };
>
> But is going to be deprecated for the use of the @property attribute. You are probably aware of properties in C#, so yes D is fine with public fields and functions that look like public fields.

Just tried the property stuff out but it seems a bit inconsistent. Maybe someone can enlighten me:

import std.stdio;

alias void delegate() deleg;

class T {
  private deleg tvalue;
  @property void prop(deleg dg) {
    tvalue = dg;
  }
  @property deleg prop() {
    return tvalue;
  }
}

void main(string[] args) {
  T t = new T;
  t.prop = { writeln("fda"); };

  // Seems a bit odd that assigning to a temporary (tvalue) suddenly
  // changes the behaviour.
  auto tvalue = t.prop;
  tvalue();     // Works as expected by printing fda
  t.prop();     // Just returns the delegate!

  // Shouldn't the @property attribute ensure that no () is needed
  // when using the property
  t.prop()(); // Works
}

/Jonas




> Otherwise this looks really good and I do hope to see it in Phobos.
>

March 12, 2011
On Saturday 12 March 2011 13:51:37 Jonas Drewsen wrote:
> On 12/03/11 20.44, Jesse Phillips wrote:
> > Jonas Drewsen Wrote:
> >> On 11/03/11 22.21, Jesse Phillips wrote:
> >>> I'll make some comments on the API. Do we have to choose Http/Ftp...? The URI already contains this, I could see being able to specifically request one or the other for performance or so www.google.com works.
> >> 
> >> That is a good question.
> >> 
> >> The problem with creating a grand unified Curl class that does it all is that each protocol supports different things ie. http supports cookie handling and http redirection, ftp supports passive/active mode and dir listings and so on.
> >> 
> >> I think it would confuse the user of the API if e.g. he were allowed to set cookies on his ftp request.
> >> 
> >> The protocols supported (Http, Ftp,... classes) do have a base class Protocol that implements common things like timouts etc.
> > 
> > Ah. I guess I was just thinking about if you want to download some file, you don't really care where you are getting it from you just have the URL and are read to go.
> 
> There should definitely be a simple method based only on an url. I'll put that in.
> 
> >>> And what about properties? They tend to be very nice instead of set methods. examples below.
> >> 
> >> Actually I thought off this and went the usual C++ way of _not_ using public properties but use accessor methods. Is public properties accepted as "the D way" and if so what about the usual reasons about why you should use accessor methods (like encapsulation and tolerance to future changes to the API)?
> >> 
> >> I do like the shorter onHeader/onContent much better though :)
> > 
> > D was originally very friendly with properties. Your could can at this moment be written:
> > 
> > http.setReceiveHeaderCallback = (string key, string value) {
> > 
> >          writeln(key ~ ":" ~ value);
> > 
> > };
> > 
> > But is going to be deprecated for the use of the @property attribute. You are probably aware of properties in C#, so yes D is fine with public fields and functions that look like public fields.
> 
> Just tried the property stuff out but it seems a bit inconsistent. Maybe someone can enlighten me:
> 
> import std.stdio;
> 
> alias void delegate() deleg;
> 
> class T {
>    private deleg tvalue;
>    @property void prop(deleg dg) {
>      tvalue = dg;
>    }
>    @property deleg prop() {
>      return tvalue;
>    }
> }
> 
> void main(string[] args) {
>    T t = new T;
>    t.prop = { writeln("fda"); };
> 
>    // Seems a bit odd that assigning to a temporary (tvalue) suddently
>    // changes the behaviour.
>    auto tvalue = t.prop;
>    tvalue();     // Works as expected by printing fda
>    t.prop();     // Just returns the delegate!
> 
>    // Shouldn't the @property attribute ensure that no () is needed
>    // when using the property
>    t.prop()(); // Works
> }

@property doesn't currently enforce much of anything. Things are in a transitional state with regard to properties. Originally, there was no such thing as @property: any function which had no parameters and returned a value could be used as a getter, and any function which returned nothing and took a single argument could be used as a setter. It was decided to make it more restrictive, so @property was added. Eventually, you will _only_ be able to use such functions as property functions if they are marked with @property, and you will _have_ to call them with the property syntax and will _not_ be able to call non-property functions with the property syntax. However, at the moment, the compiler doesn't enforce that. It will eventually, but there are several bugs with regard to property functions (they mostly work, but you found one of the cases where they don't), and it probably wouldn't be a good idea to enforce it until more of those bugs have been fixed.

- Jonathan M Davis
March 13, 2011
On 13/03/11 00.28, Jonathan M Davis wrote:
> On Saturday 12 March 2011 13:51:37 Jonas Drewsen wrote:
>> On 12/03/11 20.44, Jesse Phillips wrote:
>>> Jonas Drewsen Wrote:
>>>> On 11/03/11 22.21, Jesse Phillips wrote:
>>>>> I'll make some comments on the API. Do we have to choose Http/Ftp...?
>>>>> The URI already contains this, I could see being able to specifically
>>>>> request one or the other for performance or so www.google.com works.
>>>>
>>>> That is a good question.
>>>>
>>>> The problem with creating a grand unified Curl class that does it all is
>>>> that each protocol supports different things ie. http supports cookie
>>>> handling and http redirection, ftp supports passive/active mode and dir
>>>> listings and so on.
>>>>
>>>> I think it would confuse the user of the API if e.g. he were allowed to
>>>> set cookies on his ftp request.
>>>>
>>>> The protocols supported (Http, Ftp,... classes) do have a base class
>>>> Protocol that implements common things like timouts etc.
>>>
>>> Ah. I guess I was just thinking about if you want to download some file,
>>> you don't really care where you are getting it from you just have the
>>> URL and are read to go.
>>
>> There should definitely be a simple method based only on an url. I'll
>> put that in.
>>
>>>>> And what about properties? They tend to be very nice instead of set
>>>>> methods. examples below.
>>>>
>>>> Actually I thought off this and went the usual C++ way of _not_ using
>>>> public properties but use accessor methods. Is public properties
>>>> accepted as "the D way" and if so what about the usual reasons about why
>>>> you should use accessor methods (like encapsulation and tolerance to
>>>> future changes to the API)?
>>>>
>>>> I do like the shorter onHeader/onContent much better though :)
>>>
>>> D was originally very friendly with properties. Your could can at this
>>> moment be written:
>>>
>>> http.setReceiveHeaderCallback = (string key, string value) {
>>>
>>>           writeln(key ~ ":" ~ value);
>>>
>>> };
>>>
>>> But is going to be deprecated for the use of the @property attribute. You
>>> are probably aware of properties in C#, so yes D is fine with public
>>> fields and functions that look like public fields.
>>
>> Just tried the property stuff out but it seems a bit inconsistent. Maybe
>> someone can enlighten me:
>>
>> import std.stdio;
>>
>> alias void delegate() deleg;
>>
>> class T {
>>     private deleg tvalue;
>>     @property void prop(deleg dg) {
>>       tvalue = dg;
>>     }
>>     @property deleg prop() {
>>       return tvalue;
>>     }
>> }
>>
>> void main(string[] args) {
>>     T t = new T;
>>     t.prop = { writeln("fda"); };
>>
>>     // Seems a bit odd that assigning to a temporary (tvalue) suddently
>>     // changes the behaviour.
>>     auto tvalue = t.prop;
>>     tvalue();     // Works as expected by printing fda
>>     t.prop();     // Just returns the delegate!
>>
>>     // Shouldn't the @property attribute ensure that no () is needed
>>     // when using the property
>>     t.prop()(); // Works
>> }
>
> @property doesn't currently enforce much of anything. Things are in a transitory
> state with regards to property. Originally, there was no such thing as @property
> and any function which had no parameters and returned a value could be used as a
> getter and any function which returned nothing and took a single argument could
> be used as a setter. It was decided to make it more restrictive, so @property
> was added. Eventually, you will _only_ be able to use such functions as property
> functions if they are marked with @property, and you will _have_ to call them
> with the property syntax and will _not_ be able to call non-property functions
> with the property syntax. However, at the moment, the compiler doesn't enforce
> that. It will eventually, but there are several bugs with regards to property
> functions (they mostly work, but you found one of the cases where they don't),
> and it probably wouldn't be a good idea to enforce it until more of those bugs
> have been fixed.
>
> - Jonathan M Davis

Okay... nice to hear that this is coming up.

Thanks again!
/Jonas


March 13, 2011
Jonas Drewsen Wrote:

> Just tried the property stuff out but it seems a bit inconsistent. Maybe someone can enlighten me:
> 
> import std.stdio;
> 
> alias void delegate() deleg;
> 
> class T {
>    private deleg tvalue;
>    @property void prop(deleg dg) {
>      tvalue = dg;
>    }
>    @property deleg prop() {
>      return tvalue;
>    }
> }
> 
> void main(string[] args) {
>    T t = new T;
>    t.prop = { writeln("fda"); };
> 
>    // Seems a bit odd that assigning to a temporary (tvalue) suddently
>    // changes the behaviour.
>    auto tvalue = t.prop;
>    tvalue();     // Works as expected by printing fda
>    t.prop();     // Just returns the delegate!
> 
>    // Shouldn't the @property attribute ensure that no () is needed
>    // when using the property
>    t.prop()(); // Works
> }
> 
> /Jonas

Ah, yes. One of the big reasons for introducing @property was that returning delegates could be very confusing in terms of whether the delegate is called or returned from the function. Since the old system has not yet been ripped out, @property basically does nothing except under some conditions where it will complain that you have added a ().

So the situation should improve, but I really don't know how or when things will change.
March 13, 2011
Hi,

So I've been working a bit on the etc.curl module. Currently most of the HTTP functionality is done, along with some very simple FTP support.

I would very much like to know if this has a chance of getting into Phobos if I finish it with the current design. If not, then it will be for my own project only and doesn't need as much documentation or all the features.

https://github.com/jcd/phobos/tree/curl

I do know that the error handling is currently not good enough... WIP.

/Jonas


On 11/03/11 16.20, Jonas Drewsen wrote:
> Hi,
>
> So I've spent some time trying to wrap libcurl for D. There is a lot of
> things that you can do with libcurl which I did not know so I'm starting
> out small.
>
> For now I've created all the declarations for the latest public curl C
> api. I have put that in the etc.c.curl module.
>
> On top of that I've created a more D like api as seen below. This is
> located in the 'etc.curl' module. What you can see below currently works
> but before proceeding further down this road I would like to get your
> comments on it.
>
> //
> // Simple HTTP GET with sane defaults
> // provides the .content, .headers and .status
> //
> writeln( Http.get("http://www.google.com").content );
>
> //
> // GET with custom data receiver delegates
> //
> Http http = new Http("http://www.google.dk");
> http.setReceiveHeaderCallback( (string key, string value) {
> writeln(key ~ ":" ~ value);
> } );
> http.setReceiveCallback( (string data) { /* drop */ } );
> http.perform;
>
> //
> // POST with some timeouts
> //
> http.setUrl("http://www.testing.com/test.cgi");
> http.setReceiveCallback( (string data) { writeln(data); } );
> http.setConnectTimeout(1000);
> http.setDataTimeout(1000);
> http.setDnsTimeout(1000);
> http.setPostData("The quick....");
> http.perform;
>
> //
> // PUT with data sender delegate
> //
> string msg = "Hello world";
> size_t len = msg.length; /* using chunked transfer if omitted */
>
> http.setSendCallback( delegate size_t(char[] data) {
> if (msg.empty) return 0;
> auto l = msg.length;
> data[0..l] = msg[0..$];
> msg.length = 0;
> return l;
> },
> HttpMethod.put, len );
> http.perform;
>
> //
> // HTTPS
> //
> writeln(Http.get("https://mail.google.com").content);
>
> //
> // FTP
> //
> writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
> "./downloaded-file"));
>
>
> // ... authentication, cookies, interface select, progress callback
> // etc. is also implemented this way.
>
>
> /Jonas

March 13, 2011
On 3/11/11 9:20 AM, Jonas Drewsen wrote:
> Hi,
>
> So I've spent some time trying to wrap libcurl for D. There is a lot of
> things that you can do with libcurl which I did not know so I'm starting
> out small.
>
> For now I've created all the declarations for the latest public curl C
> api. I have put that in the etc.c.curl module.

Great! Could you please create a pull request for that?

> On top of that I've created a more D like api as seen below. This is
> located in the 'etc.curl' module. What you can see below currently works
> but before proceeding further down this road I would like to get your
> comments on it.
>
> //
> // Simple HTTP GET with sane defaults
> // provides the .content, .headers and .status
> //
> writeln( Http.get("http://www.google.com").content );

Sweet. As has been discussed, often the content is not text so you may want to have content return ubyte[] and add a new property such as "textContent" or "text".

> //
> // GET with custom data receiver delegates
> //
> Http http = new Http("http://www.google.dk");

You'll probably need to justify the existence of a class hierarchy and what overridable methods there are. In particular, since you seem to offer hooks via delegates, probably classes wouldn't be needed at all. (FWIW I would've done the same; I wouldn't want to inherit just to intercept the headers etc.)

> http.setReceiveHeaderCallback( (string key, string value) {
> writeln(key ~ ":" ~ value);
> } );
> http.setReceiveCallback( (string data) { /* drop */ } );
> http.perform;

As discussed, properties may be better here than setXxx and getXxx. The setReceiveCallback hook should take a ubyte[]. The setReceiveHeaderCallback should take a const(char)[]. That way you won't need to copy all headers, safely leaving that option to the client.
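
To illustrate the const(char)[] point, here is a rough sketch. HeaderCallback and deliverHeaders are made-up names that only simulate a library reusing one internal buffer; they are not part of etc.curl. The client copies with .idup only when it wants to keep the data past the callback.

```d
import std.stdio;

// Hypothetical callback type; the real etc.curl signature may differ.
alias HeaderCallback = void delegate(const(char)[] key, const(char)[] value);

// Simulates a library that hands out slices of a buffer it later reuses.
void deliverHeaders(HeaderCallback cb)
{
    char[] buffer = "Content-Type".dup;
    cb(buffer, "text/html");
    buffer[] = 'x'; // the library overwrites its buffer after the callback
}

void main()
{
    string[] kept;
    deliverHeaders((const(char)[] key, const(char)[] value) {
        // Copy only because the key is kept beyond the callback's lifetime.
        kept ~= key.idup;
    });
    writeln(kept);
}
```

A callback that merely inspects or prints the header never pays for a copy, while one that stores it makes the copy explicit.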

> //
> // POST with some timeouts
> //
> http.setUrl("http://www.testing.com/test.cgi");
> http.setReceiveCallback( (string data) { writeln(data); } );
> http.setConnectTimeout(1000);
> http.setDataTimeout(1000);
> http.setDnsTimeout(1000);
> http.setPostData("The quick....");
> http.perform;

setPostData -> setTextPostData, and then changing everything to properties would make it something like textPostData. Or wait, there could be some overloading going on... Anyway, the basic idea is that generally get and post data could be raw bytes, and the user could elect to transfer strings instead.

> //
> // PUT with data sender delegate
> //
> string msg = "Hello world";
> size_t len = msg.length; /* using chunked transfer if omitted */
>
> http.setSendCallback( delegate size_t(char[] data) {
> if (msg.empty) return 0;
> auto l = msg.length;
> data[0..l] = msg[0..$];
> msg.length = 0;
> return l;
> },
> HttpMethod.put, len );
> http.perform;

The callback would take ubyte[].

> //
> // HTTPS
> //
> writeln(Http.get("https://mail.google.com").content);
>
> //
> // FTP
> //
> writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
> "./downloaded-file"));
>
>
> // ... authentication, cookies, interface select, progress callback
> // etc. is also implemented this way.
>
>
> /Jonas

This is all very encouraging. I think this API covers nicely a variety of needs. We need to make sure everything interacts well with threads, in particular that one can shut down a transfer (or the entire library) from a thread or callback and have the existing transfer(s) throw an exception immediately.

Regarding a range interface, it would be great if you allowed e.g.

foreach (line; Http.get("https://mail.google.com").byLine()) {
   ...
}

The data transfer should happen concurrently with the foreach code. The type of line is char[] or const(char)[]. Similarly, there would be a byChunk interface that transfers in ubyte[] chunks.
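
For comparison, std.stdio.File already offers ranges with these names and element types. The sketch below uses a local scratch file (the file name is an arbitrary assumption) purely to show the shape of the interface; it says nothing about how a network-backed version would be implemented or made concurrent.

```d
static import std.file;
import std.stdio;

void main()
{
    enum path = "byline_demo.txt"; // hypothetical scratch file name
    std.file.write(path, "first line\nsecond line\n");
    scope (exit) std.file.remove(path);

    {
        auto file = File(path);
        // byLine yields char[] slices of a buffer that is reused per iteration.
        foreach (line; file.byLine())
            writeln(line);
    }
    {
        auto file = File(path);
        // byChunk yields ubyte[] blocks of at most the requested size.
        foreach (chunk; file.byChunk(8))
            writeln(chunk.length);
    }
}
```

As with the file versions, a caller that wants to keep a line or chunk beyond the current iteration would need to copy it.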

Also we need a head() method for the corresponding command.


Andrei