March 14, 2011
On 13/03/11 23.44, Andrei Alexandrescu wrote:
> On 3/11/11 9:20 AM, Jonas Drewsen wrote:
>> Hi,
>>
>> So I've spent some time trying to wrap libcurl for D. There is a lot of
>> things that you can do with libcurl which I did not know so I'm starting
>> out small.
>>
>> For now I've created all the declarations for the latest public curl C
>> api. I have put that in the etc.c.curl module.
>
> Great! Could you please create a pull request for that?

Will do as soon as I've figured out how to create a pull request for a single file in a branch. Does anyone know how to do that on github? Or should I just create a pull request including the etc.curl wrapper as well?

>> On top of that I've created a more D like api as seen below. This is
>> located in the 'etc.curl' module. What you can see below currently works
>> but before proceeding further down this road I would like to get your
>> comments on it.
>>
>> //
>> // Simple HTTP GET with sane defaults
>> // provides the .content, .headers and .status
>> //
>> writeln( Http.get("http://www.google.com").content );
>
> Sweet. As has been discussed, often the content is not text so you may
> want to have content return ubyte[] and add a new property such as
> "textContent" or "text".

I've already changed it to void[] as done in the std.file module. Is ubyte[] better suited?

I'll add a text property as well.


>> //
>> // GET with custom data receiver delegates
>> //
>> Http http = new Http("http://www.google.dk");
>
> You'll probably need to justify the existence of a class hierarchy and
> what overridable methods there are. In particular, since you seem to
> offer hooks via delegates, probably classes wouldn't be needed at all.
> (FWIW I would've done the same; I wouldn't want to inherit just to
> intercept the headers etc.)
>
>> http.setReceiveHeaderCallback( (string key, string value) {
>> writeln(key ~ ":" ~ value);
>> } );
>> http.setReceiveCallback( (string data) { /* drop */ } );
>> http.perform;
>
> As discussed, properties may be better here than setXxx and getXxx. The
> setReceiveCallback hook should take a ubyte[]. The
> setReceiveHeaderCallback should take a const(char)[]. That way you won't
> need to copy all headers, leaving safely that option to the client.

I've already replaced the set/get methods with properties and renamed them. I hadn't thought of using const(char)[]; thanks for the hint.


>> //
>> // POST with some timouts
>> //
>> http.setUrl("http://www.testing.com/test.cgi");
>> http.setReceiveCallback( (string data) { writeln(data); } );
>> http.setConnectTimeout(1000);
>> http.setDataTimeout(1000);
>> http.setDnsTimeout(1000);
>> http.setPostData("The quick....");
>> http.perform;
>
> setPostData -> setTextPostData, and then changing everything to
> properties would make it something like textPostData. Or wait, there
> could be some overloading going on... Anyway, the basic idea is that
> generally get and post data could be raw bytes, and the user could elect
> to transfer strings instead.

I'll make sure both text and byte[]/void[] versions will be available.

>> //
>> // PUT with data sender delegate
>> //
>> string msg = "Hello world";
>> size_t len = msg.length; /* using chuncked transfer if omitted */
>>
>> http.setSendCallback( delegate size_t(char[] data) {
>> if (msg.empty) return 0;
>> auto l = msg.length;
>> data[0..l] = msg[0..$];
>> msg.length = 0;
>> return l;
>> },
>> HttpMethod.put, len );
>> http.perform;
>
> The callback would take ubyte[].

Already fixed.


>> //
>> // HTTPS
>> //
>> writeln(Http.get("https://mail.google.com").content);
>>
>> //
>> // FTP
>> //
>> writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
>> "./downloaded-file"));
>>
>>
>> // ... authenication, cookies, interface select, progress callback
>> // etc. is also implemented this way.
>>
>>
>> /Jonas
>
> This is all very encouraging. I think this API covers nicely a variety
> of needs. We need to make sure everything interacts well with threads,
> in particular that one can shut down a transfer (or the entire library)
> from a thread or callback and have the existing transfer(s) throw an
> exception immediately.

I'll have a look at it.


> Regarding a range interface, it would be great if you allowed e.g.
>
> foreach (line; Http.get("https://mail.google.com").byLine()) {
> ...
> }
>
> The data transfer should happen concurrently with the foreach code. The
> type of line is char[] or const(char)[]. Similarly, there would be a
> byChunk interface that transfers in ubyte[] chunks.
>
> Also we need a head() method for the corresponding command.
>
> Andrei

That would be neat. What do you mean by the data transfer happening concurrently with the foreach code?
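
One possible reading, as a minimal sketch: the transfer runs in a worker thread and hands lines over to a range, so the foreach body runs while more data is still arriving. The names LineRange and fakeTransfer are made up for illustration, no network or libcurl call is involved, and std.concurrency is only one of several ways to do the hand-over.

```d
import std.concurrency : receive, send, spawn, thisTid, Tid;
import std.stdio : writeln;

// Hypothetical stand-in for a libcurl transfer running in its own thread.
// It sends each line to the owner thread and a bool marker when it is done.
void fakeTransfer(Tid owner)
{
    foreach (line; ["first line", "second line", "third line"])
        owner.send(line);
    owner.send(true);
}

// Input range whose popFront blocks until the worker delivers the next line.
struct LineRange
{
    private string current;
    private bool done;

    static LineRange start()
    {
        spawn(&fakeTransfer, thisTid);
        LineRange r;
        r.popFront(); // wait for the first line or the end marker
        return r;
    }

    @property bool empty() const { return done; }
    @property string front() const { return current; }

    void popFront()
    {
        receive(
            (string line) { current = line; },
            (bool finished) { done = true; }
        );
    }
}

void main()
{
    // The worker keeps producing while the loop body runs.
    foreach (line; LineRange.start())
        writeln(line);
}
```

A byChunk variant could follow the same pattern with immutable(ubyte)[] messages instead of strings.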


/Jonas
March 14, 2011
On Monday 14 March 2011 02:16:12 Jonas Drewsen wrote:
> On 13/03/11 23.44, Andrei Alexandrescu wrote:
> > On 3/11/11 9:20 AM, Jonas Drewsen wrote:
> >> Hi,
> >> 
> >> So I've spent some time trying to wrap libcurl for D. There is a lot of things that you can do with libcurl which I did not know so I'm starting out small.
> >> 
> >> For now I've created all the declarations for the latest public curl C api. I have put that in the etc.c.curl module.
> > 
> > Great! Could you please create a pull request for that?
> 
> Will do as soon as I've figured out howto create a pull request for a
> single file in a branch. Anyone knows how to do that on github? Or
> should I just create a pull request including the etc.curl wrapper as well?

You can't. A pull request is for an entire branch. It pulls _everything_ from that branch which differs from the one being merged with. git cares about commits, not files. And pulling from another repository pulls all of the commits which you don't have. So, if you want to do a pull request, you create a branch with exactly the commits that you wanted merged in on it. No more, no less.

> >> On top of that I've created a more D like api as seen below. This is located in the 'etc.curl' module. What you can see below currently works but before proceeding further down this road I would like to get your comments on it.
> >> 
> >> //
> >> // Simple HTTP GET with sane defaults
> >> // provides the .content, .headers and .status
> >> //
> >> writeln( Http.get("http://www.google.com").content );
> > 
> > Sweet. As has been discussed, often the content is not text so you may want to have content return ubyte[] and add a new property such as "textContent" or "text".
> 
> I've already changed it to void[] as done in the std.file module. Is ubyte[] better suited?

That's debatable. Some would argue one way, some another. Personally, I'd argue ubyte[]. I don't like void[] one bit. Others would agree with me, and yet others would disagree. I don't think that there's really a general agreement on whether void[] or ubyte[] is better when it comes to reading binary data like that.

- Jonathan M Davis
March 14, 2011
Jonas Drewsen wrote:
>Hi,
>
>   So I've been working a bit on the etc.curl module. Currently most of
>the HTTP functionality is done and some very simple Ftp.
>
>I would very much like to know if this has a chance of getting in phobos if I finish it with the current design. If not then it will be for my own project only and doesn't need as much documentation or all the features.
>
>https://github.com/jcd/phobos/tree/curl
>
>I do know that the error handling is currently not good enough... WIP.
>
>/Jonas
>
>
>On 11/03/11 16.20, Jonas Drewsen wrote:
>> Hi,
>>
>> So I've spent some time trying to wrap libcurl for D. There is a lot of things that you can do with libcurl which I did not know so I'm starting out small.
>>
>> For now I've created all the declarations for the latest public curl C api. I have put that in the etc.c.curl module.
>>
>> On top of that I've created a more D like api as seen below. This is located in the 'etc.curl' module. What you can see below currently works but before proceeding further down this road I would like to get your comments on it.
>>
>> //
>> // Simple HTTP GET with sane defaults
>> // provides the .content, .headers and .status
>> //
>> writeln( Http.get("http://www.google.com").content );
>>
>> //
>> // GET with custom data receiver delegates
>> //
>> Http http = new Http("http://www.google.dk");
>> http.setReceiveHeaderCallback( (string key, string value) {
>> writeln(key ~ ":" ~ value);
>> } );
>> http.setReceiveCallback( (string data) { /* drop */ } );
>> http.perform;
>>
>> //
>> // POST with some timouts
>> //
>> http.setUrl("http://www.testing.com/test.cgi");
>> http.setReceiveCallback( (string data) { writeln(data); } );
>> http.setConnectTimeout(1000);
>> http.setDataTimeout(1000);
>> http.setDnsTimeout(1000);
>> http.setPostData("The quick....");
>> http.perform;
>>
>> //
>> // PUT with data sender delegate
>> //
>> string msg = "Hello world";
>> size_t len = msg.length; /* using chuncked transfer if omitted */
>>
>> http.setSendCallback( delegate size_t(char[] data) {
>> if (msg.empty) return 0;
>> auto l = msg.length;
>> data[0..l] = msg[0..$];
>> msg.length = 0;
>> return l;
>> },
>> HttpMethod.put, len );
>> http.perform;
>>
>> //
>> // HTTPS
>> //
>> writeln(Http.get("https://mail.google.com").content);
>>
>> //
>> // FTP
>> //
>> writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
>> "./downloaded-file"));
>>
>>
>> // ... authenication, cookies, interface select, progress callback
>> // etc. is also implemented this way.
>>
>>
>> /Jonas
>
Hi,
I really like the API. A few comments:

You use the internal curl progress meter. According to the documentation (it's a little hidden, look at CURLOPT_NOPROGRESS) the progress meter is likely to be removed in future curl versions. The download progress should be easy to reimplement, although you'd have to parse the Content-Length header. Upload shouldn't be too difficult either (one problem: what does curl pass as ultotal/dltotal when chunked encoding is used or the total size is not known?). Then we could also use different delegates for upload/download.
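
A minimal sketch of what tracking download progress by hand could look like, assuming the wrapper passes raw header lines and received chunks to callbacks. DownloadProgress, onHeader and onData are hypothetical names, not part of etc.curl, and a total of 0 stands for "unknown" (for example with chunked encoding).

```d
import std.algorithm.searching : findSplit;
import std.conv : to;
import std.stdio : writefln;
import std.string : strip;
import std.uni : icmp;

struct DownloadProgress
{
    size_t total;    // stays 0 while the size is unknown
    size_t received;

    // Feed every raw header line, as a header callback would deliver it.
    void onHeader(const(char)[] line)
    {
        auto parts = line.findSplit(":");
        // Header names are case-insensitive, hence icmp.
        if (parts[1].length != 0 && icmp(parts[0].strip, "Content-Length") == 0)
            total = parts[2].strip.to!size_t;
    }

    // Feed every received chunk of body data.
    void onData(const(ubyte)[] chunk) { received += chunk.length; }

    bool known() const { return total != 0; }
    double fraction() const { return known ? cast(double) received / total : 0; }
}

void main()
{
    DownloadProgress p;
    p.onHeader("Content-Type: text/plain\r\n");
    p.onHeader("content-length: 8\r\n");
    p.onData([1, 2, 3, 4]);
    writefln("%s of %s bytes", p.received, p.total);
}
```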

The callback interface suits curl best and I actually like it, but how
will it interact with streams? As an example: If someone wrote a
stream/filter that decoded gzip for files it should be usable with
the http streams as well. But files/filestreams have a pull
interface (no callbacks, stream.read() in a loop). So how could a gzip
stream be written without too much code duplication, supporting both files and
the http stuff?

Do you plan to add some kind of support for header parsing? I think
something like what the .net webclient uses
( http://msdn.microsoft.com/en-us/library/system.net.webclient(v=VS.100).aspx )
would be great. Especially the HeaderCollection supporting headers as
strings and as data types (for both parsing and formatting), but
without a class hierarchy for the headers, using templates instead.

I've written D parsers/formatters for almost all headers in
rfc2616 (1 or 2 might be missing) and for a few additional commonly
used headers (Content-Disposition, cookie headers). The parsers are
written with ragel and are to be used with curl (continuations must be
removed and the parsers always take 1 line of input, just as you get it
from curl). Right now only the client side is implemented (no parsers
for headers which can only be sent from client-->server ). However, I
need to add some more documentation to the parsers, need to do
some refactoring and I've got absolutely no time for that in the next 2
weeks ('abitur' final exams). But if you could wait 2 weeks or if
you wanted to do the refactoring yourself, I would be happy to
contribute that code.


-- 
Johannes Pfau


March 14, 2011
On Mon, 14 Mar 2011 02:36:07 -0700, Jonathan M Davis wrote:

> On Monday 14 March 2011 02:16:12 Jonas Drewsen wrote:
>> On 13/03/11 23.44, Andrei Alexandrescu wrote:
>> > On 3/11/11 9:20 AM, Jonas Drewsen wrote:
>> >> Hi,
>> >> 
>> >> So I've spent some time trying to wrap libcurl for D. There is a lot of things that you can do with libcurl which I did not know so I'm starting out small.
>> >> 
>> >> For now I've created all the declarations for the latest public curl C api. I have put that in the etc.c.curl module.
>> > 
>> > Great! Could you please create a pull request for that?
>> 
>> Will do as soon as I've figured out howto create a pull request for a single file in a branch. Anyone knows how to do that on github? Or should I just create a pull request including the etc.curl wrapper as well?
> 
> You can't. A pull request is for an entire branch. It pulls _everything_ from that branch which differs from the one being merged with. git cares about commits, not files. And pulling from another repository pulls all of the commits which you don't have. So, if you want to do a pull request, you create a branch with exactly the commits that you wanted merged in on it. No more, no less.
> 
>> >> On top of that I've created a more D like api as seen below. This is located in the 'etc.curl' module. What you can see below currently works but before proceeding further down this road I would like to get your comments on it.
>> >> 
>> >> //
>> >> // Simple HTTP GET with sane defaults
>> >> // provides the .content, .headers and .status
>> >> //
>> >> writeln( Http.get("http://www.google.com").content );
>> > 
>> > Sweet. As has been discussed, often the content is not text so you may want to have content return ubyte[] and add a new property such as "textContent" or "text".
>> 
>> I've already changed it to void[] as done in the std.file module. Is ubyte[] better suited?
> 
> That's debatable. Some would argue one way, some another. Personally, I'd argue ubyte[]. I don't like void[] one bit. Others would agree with me, and yet others would disagree. I don't think that there's really a general agreement on whether void[] or ubyte[] is better when it comes to reading binary data like that.

I also think ubyte[] is best, because:

1. It can be used directly.  (You can't get an element from a void[] array without casting it to something else first.)

2. There are no assumptions about the type of data contained in the array.  (char[] arrays are assumed to be UTF-8 encoded.)

3. ubyte[] arrays are (AFAIK) not scanned by the GC.  (void[] arrays may contain pointers and must therefore be scanned.)

I think the rule of thumb should be:  If the array contains raw data of unspecified type, but no pointers or references, use ubyte[].

void[] is very useful for input parameters, however, since all arrays are implicitly castable to void[]:

  void writeData(void[] data) { ... }

  writeData("Hello World!");
  writeData([1, 2, 3, 4]);

-Lars
March 14, 2011
On Mon, 14 Mar 2011 07:20:26 -0400, Lars T. Kyllingstad <public@kyllingen.nospamnet> wrote:

> On Mon, 14 Mar 2011 02:36:07 -0700, Jonathan M Davis wrote:
>
>> On Monday 14 March 2011 02:16:12 Jonas Drewsen wrote:
>>> On 13/03/11 23.44, Andrei Alexandrescu wrote:
>>> > On 3/11/11 9:20 AM, Jonas Drewsen wrote:
>>> >> Hi,
>>> >>
>>> >> So I've spent some time trying to wrap libcurl for D. There is a lot
>>> >> of things that you can do with libcurl which I did not know so I'm
>>> >> starting out small.
>>> >>
>>> >> For now I've created all the declarations for the latest public curl
>>> >> C api. I have put that in the etc.c.curl module.
>>> >
>>> > Great! Could you please create a pull request for that?
>>>
>>> Will do as soon as I've figured out howto create a pull request for a
>>> single file in a branch. Anyone knows how to do that on github? Or
>>> should I just create a pull request including the etc.curl wrapper as
>>> well?
>>
>> You can't. A pull request is for an entire branch. It pulls _everything_
>> from that branch which differs from the one being merged with. git cares
>> about commits, not files. And pulling from another repository pulls all
>> of the commits which you don't have. So, if you want to do a pull
>> request, you create a branch with exactly the commits that you wanted
>> merged in on it. No more, no less.
>>
>>> >> On top of that I've created a more D like api as seen below. This is
>>> >> located in the 'etc.curl' module. What you can see below currently
>>> >> works but before proceeding further down this road I would like to
>>> >> get your comments on it.
>>> >>
>>> >> //
>>> >> // Simple HTTP GET with sane defaults
>>> >> // provides the .content, .headers and .status
>>> >> //
>>> >> writeln( Http.get("http://www.google.com").content );
>>> >
>>> > Sweet. As has been discussed, often the content is not text so you
>>> > may want to have content return ubyte[] and add a new property such
>>> > as "textContent" or "text".
>>>
>>> I've already changed it to void[] as done in the std.file module. Is
>>> ubyte[] better suited?
>>
>> That's debatable. Some would argue one way, some another. Personally,
>> I'd argue ubyte[]. I don't like void[] one bit. Others would agree with
>> me, and yet others would disagree. I don't think that there's really a
>> general agreement on whether void[] or ubyte[] is better when it comes
>> to reading binary data like that.
>
> I also think ubyte[] is best, because:
>
> 1. It can be used directly.  (You can't get an element from a void[]
> array without casting it to something else first.)
>
> 2. There are no assumptions about the type of data contained in the
> array.  (char[] arrays are assumed to be UTF-8 encoded.)
>
> 3. ubyte[] arrays are (AFAIK) not scanned by the GC.  (void[] arrays may
> contain pointers and must therefore be scanned.)

This isn't exactly true. Arrays *created* as void[] will be scanned. Arrays created as ubyte[] and then cast to void[] will not be scanned.

However, it is far too easy while dealing with a void[] array to have it mysteriously flip its bit to scan-able.

> I think the rule of thumb should be:  If the array contains raw data of
> unspecified type, but no pointers or references, use ubyte[].
>
> void[] is very useful for input parameters, however, since all arrays are
> implicitly castable to void[]:
>
>   void writeData(void[] data) { ... }
>
>   writeData("Hello World!");
>   writeData([1, 2, 3, 4]);

I think (and this differs from my previous opinion) const(void)[] should be used for input parameters where any array type could be passed in. However, ubyte[] should be used for output parameters and for internal storage. void[] just has too many pitfalls to be used anywhere except where its implicit casting ability is useful.
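
A small sketch of that split, with made-up function names: const(void)[] on the input side accepts any array (including string literals, which are immutable and so would not convert to a plain void[] parameter), while ubyte[] on the output side can be indexed without a cast.

```d
import std.stdio : writeln;

// Input: every T[] converts implicitly to const(void)[].
// The length of a void array is counted in bytes.
size_t byteLength(const(void)[] data)
{
    return data.length;
}

// Output: hand back raw bytes as ubyte[], which callers can index directly.
ubyte[] firstBytes(const(void)[] data, size_t n)
{
    auto bytes = cast(const(ubyte)[]) data;
    return bytes[0 .. n].dup;
}

void main()
{
    int[] numbers = [1, 2, 3, 4];
    writeln(byteLength("Hello World!")); // string literal is accepted
    writeln(byteLength(numbers));        // so is an int[]

    ubyte[] head = firstBytes("Hello World!", 5);
    writeln(head[0]); // element access without casting
}
```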

-Steve
March 14, 2011
On 14/03/11 12.10, Johannes Pfau wrote:
> Jonas Drewsen wrote:
>> Hi,
>>
>>    So I've been working a bit on the etc.curl module. Currently most
>> of the HTTP functionality is done and some very simple Ftp.
>>
>> I would very much like to know if this has a chance of getting in
>> phobos if I finish it with the current design. If not then it will be
>> for my own project only and doesn't need as much documentation or all
>> the features.
>>
>> https://github.com/jcd/phobos/tree/curl
>>
>> I do know that the error handling is currently not good enough... WIP.
>>
>> /Jonas
>>
>>
>> On 11/03/11 16.20, Jonas Drewsen wrote:
>>> Hi,
>>>
>>> So I've spent some time trying to wrap libcurl for D. There is a lot
>>> of things that you can do with libcurl which I did not know so I'm
>>> starting out small.
>>>
>>> For now I've created all the declarations for the latest public curl
>>> C api. I have put that in the etc.c.curl module.
>>>
>>> On top of that I've created a more D like api as seen below. This is
>>> located in the 'etc.curl' module. What you can see below currently
>>> works but before proceeding further down this road I would like to
>>> get your comments on it.
>>>
>>> //
>>> // Simple HTTP GET with sane defaults
>>> // provides the .content, .headers and .status
>>> //
>>> writeln( Http.get("http://www.google.com").content );
>>>
>>> //
>>> // GET with custom data receiver delegates
>>> //
>>> Http http = new Http("http://www.google.dk");
>>> http.setReceiveHeaderCallback( (string key, string value) {
>>> writeln(key ~ ":" ~ value);
>>> } );
>>> http.setReceiveCallback( (string data) { /* drop */ } );
>>> http.perform;
>>>
>>> //
>>> // POST with some timouts
>>> //
>>> http.setUrl("http://www.testing.com/test.cgi");
>>> http.setReceiveCallback( (string data) { writeln(data); } );
>>> http.setConnectTimeout(1000);
>>> http.setDataTimeout(1000);
>>> http.setDnsTimeout(1000);
>>> http.setPostData("The quick....");
>>> http.perform;
>>>
>>> //
>>> // PUT with data sender delegate
>>> //
>>> string msg = "Hello world";
>>> size_t len = msg.length; /* using chuncked transfer if omitted */
>>>
>>> http.setSendCallback( delegate size_t(char[] data) {
>>> if (msg.empty) return 0;
>>> auto l = msg.length;
>>> data[0..l] = msg[0..$];
>>> msg.length = 0;
>>> return l;
>>> },
>>> HttpMethod.put, len );
>>> http.perform;
>>>
>>> //
>>> // HTTPS
>>> //
>>> writeln(Http.get("https://mail.google.com").content);
>>>
>>> //
>>> // FTP
>>> //
>>> writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
>>> "./downloaded-file"));
>>>
>>>
>>> // ... authenication, cookies, interface select, progress callback
>>> // etc. is also implemented this way.
>>>
>>>
>>> /Jonas
>>
> Hi,
> I really like the API. A few comments:
>
> You use the internal curl progress meter. According to the
> documentation (It's a little hidden, look at CURLOPT_NOPROGRESS) the
> progress meter is likely to removed in future curl versions. The
> download progress should be easy to reimplement, although you'd have to
> parse the Content-Length header. Upload shouldn't be to difficult either
> (One problem: What does curl pass as ultotal/dltotal when chunked
> encoding is used or the total size is not known?). Then we could also
> use different delegates for upload/download.

I did see the notice about the likely future removal of the progress meter but decided to wrap it anyway. Maybe I should just leave it out of an initial version. As you say, it is pretty simple to implement ourselves.

> The callback interface suits curl best and I actually like it, but how
> will it interact with streams? As an example: If someone wrote a
> stream/filter that decoded gzip for files it should be usable with
> the http streams as well. But files/ filestreams have a pull
> interface (no callbacks, stream.read() in a loop). So how could a gzip
> stream be written without to much code duplication supporting files and
> the http stuff?

If we take Andrei's stream proposal as the base of a new streaming design then http would just be another Transport. Files have a pull interface that blocks until data is read. The same could be done for the Http class.

What I would really like is for the stream design to support non-blocking as mentioned in the stream proposal. Just have to figure out how the streaming API should behave in such cases I guess.


> Do you plan to add some kind of support for header parsing? I think
> something like what the .net webclient uses
> ( http://msdn.microsoft.com/en-us/library/system.net.webclient(v=VS.100).aspx )
> would be great. Especially the HeaderCollection supporting headers as
> strings and as data types (for both parsing and formatting), but
> without a class hierarchy for the headers, using templates instead.

It would be nice to be able to get/set headers by string and enums (http://msdn.microsoft.com/en-us/library/system.net.httprequestheader.aspx). But I cannot see that .net is using datatypes or templates for it. Could you give me a pointer please?


> I've written D parsers/formatters for almost all headers in
> rfc2616 (1 or 2 might be missing) and for a few additional commonly
> used headers (Content-Disposition, cookie headers). The parsers are
> written with ragel and are to be used with curl (continuations must be
> removed and the parsers always take 1 line of input, just as you get it
> from curl). Right now only the client side is implemented (no parsers
> for headers which can only be sent from client-->server ). However, I
> need to add some more documentation to the parsers, need to do
> some refactoring and I've got absolutely no time for that in the next 2
> weeks ('abitur' final exams). But if you could wait 2 weeks or if
> you wanted to do the refactoring yourself, I would be happy to
> contribute that code.

That sounds very interesting. I would very much like to see the code and see if it fits in.



March 14, 2011
On 2011-03-13 22:39, Jonas Drewsen wrote:
> Hi,
>
> So I've been working a bit on the etc.curl module. Currently most of the
> HTTP functionality is done and some very simple Ftp.
>
> I would very much like to know if this has a chance of getting in phobos
> if I finish it with the current design. If not then it will be for my
> own project only and doesn't need as much documentation or all the
> features.
>
> https://github.com/jcd/phobos/tree/curl
>
> I do know that the error handling is currently not good enough... WIP.
>
> /Jonas
>
>
> On 11/03/11 16.20, Jonas Drewsen wrote:
>> Hi,
>>
>> So I've spent some time trying to wrap libcurl for D. There is a lot of
>> things that you can do with libcurl which I did not know so I'm starting
>> out small.
>>
>> For now I've created all the declarations for the latest public curl C
>> api. I have put that in the etc.c.curl module.
>>
>> On top of that I've created a more D like api as seen below. This is
>> located in the 'etc.curl' module. What you can see below currently works
>> but before proceeding further down this road I would like to get your
>> comments on it.
>>
>> //
>> // Simple HTTP GET with sane defaults
>> // provides the .content, .headers and .status
>> //
>> writeln( Http.get("http://www.google.com").content );
>>
>> //
>> // GET with custom data receiver delegates
>> //
>> Http http = new Http("http://www.google.dk");
>> http.setReceiveHeaderCallback( (string key, string value) {
>> writeln(key ~ ":" ~ value);
>> } );
>> http.setReceiveCallback( (string data) { /* drop */ } );
>> http.perform;
>>
>> //
>> // POST with some timouts
>> //
>> http.setUrl("http://www.testing.com/test.cgi");
>> http.setReceiveCallback( (string data) { writeln(data); } );
>> http.setConnectTimeout(1000);
>> http.setDataTimeout(1000);
>> http.setDnsTimeout(1000);
>> http.setPostData("The quick....");
>> http.perform;
>>
>> //
>> // PUT with data sender delegate
>> //
>> string msg = "Hello world";
>> size_t len = msg.length; /* using chuncked transfer if omitted */
>>
>> http.setSendCallback( delegate size_t(char[] data) {
>> if (msg.empty) return 0;
>> auto l = msg.length;
>> data[0..l] = msg[0..$];
>> msg.length = 0;
>> return l;
>> },
>> HttpMethod.put, len );
>> http.perform;
>>
>> //
>> // HTTPS
>> //
>> writeln(Http.get("https://mail.google.com").content);
>>
>> //
>> // FTP
>> //
>> writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
>> "./downloaded-file"));
>>
>>
>> // ... authenication, cookies, interface select, progress callback
>> // etc. is also implemented this way.
>>
>>
>> /Jonas

I thought that the "etc" package was for C bindings and would expect the "curl" module to be placed in std.curl or std.net.curl.

-- 
/Jacob Carlborg
March 14, 2011
On 14/03/11 13.28, Steven Schveighoffer wrote:
> On Mon, 14 Mar 2011 07:20:26 -0400, Lars T. Kyllingstad
> <public@kyllingen.nospamnet> wrote:
>
>> On Mon, 14 Mar 2011 02:36:07 -0700, Jonathan M Davis wrote:
>>
>>> On Monday 14 March 2011 02:16:12 Jonas Drewsen wrote:
>>>> On 13/03/11 23.44, Andrei Alexandrescu wrote:
>>>> > On 3/11/11 9:20 AM, Jonas Drewsen wrote:
>>>> >> Hi,
>>>> >>
>>>> >> So I've spent some time trying to wrap libcurl for D. There is a lot
>>>> >> of things that you can do with libcurl which I did not know so I'm
>>>> >> starting out small.
>>>> >>
>>>> >> For now I've created all the declarations for the latest public curl
>>>> >> C api. I have put that in the etc.c.curl module.
>>>> >
>>>> > Great! Could you please create a pull request for that?
>>>>
>>>> Will do as soon as I've figured out howto create a pull request for a
>>>> single file in a branch. Anyone knows how to do that on github? Or
>>>> should I just create a pull request including the etc.curl wrapper as
>>>> well?
>>>
>>> You can't. A pull request is for an entire branch. It pulls _everything_
>>> from that branch which differs from the one being merged with. git cares
>>> about commits, not files. And pulling from another repository pulls all
>>> of the commits which you don't have. So, if you want to do a pull
>>> request, you create a branch with exactly the commits that you wanted
>>> merged in on it. No more, no less.
>>>
>>>> >> On top of that I've created a more D like api as seen below. This is
>>>> >> located in the 'etc.curl' module. What you can see below currently
>>>> >> works but before proceeding further down this road I would like to
>>>> >> get your comments on it.
>>>> >>
>>>> >> //
>>>> >> // Simple HTTP GET with sane defaults
>>>> >> // provides the .content, .headers and .status
>>>> >> //
>>>> >> writeln( Http.get("http://www.google.com").content );
>>>> >
>>>> > Sweet. As has been discussed, often the content is not text so you
>>>> > may want to have content return ubyte[] and add a new property such
>>>> > as "textContent" or "text".
>>>>
>>>> I've already changed it to void[] as done in the std.file module. Is
>>>> ubyte[] better suited?
>>>
>>> That's debatable. Some would argue one way, some another. Personally,
>>> I'd argue ubyte[]. I don't like void[] one bit. Others would agree with
>>> me, and yet others would disagree. I don't think that there's really a
>>> general agreement on whether void[] or ubyte[] is better when it comes
>>> to reading binary data like that.
>>
>> I also think ubyte[] is best, because:
>>
>> 1. It can be used directly. (You can't get an element from a void[]
>> array without casting it to something else first.)
>>
>> 2. There are no assumptions about the type of data contained in the
>> array. (char[] arrays are assumed to be UTF-8 encoded.)
>>
>> 3. ubyte[] arrays are (AFAIK) not scanned by the GC. (void[] arrays may
>> contain pointers and must therefore be scanned.)
>
> This isn't exactly true. arrays *created* as void[] will be scanned.
> Arrays created as ubyte[] and then cast to void[] will not be scanned.
>
> However, it is far too easy while dealing with a void[] array to have it
> mysteriously flip its bit to scan-able.
>
>> I think the rule of thumb should be: If the array contains raw data of
>> unspecified type, but no pointers or references, use ubyte[].
>>
>> void[] is very useful for input parameters, however, since all arrays are
>> implicitly castable to void[]:
>>
>> void writeData(void[] data) { ... }
>>
>> writeData("Hello World!");
>> writeData([1, 2, 3, 4]);
>
> I think (and this differs from my previous opinion) const(void)[] should
> be used for input parameters where any array type could be passed in.
> However, ubyte[] should be used for output parameters and for internal
> storage. void[] just has too many pitfalls to be used anywhere but where
> its implicit casting ability is useful.
>
> -Steve

const(ubyte)[] for input
void[] for output

That sounds reasonable. I guess that if everybody can agree on this then all of Phobos (e.g. std.file) should use the same types?

/Jonas

March 14, 2011
On 13/03/11 23.44, Andrei Alexandrescu wrote:
> You'll probably need to justify the existence of a class hierarchy and
> what overridable methods there are. In particular, since you seem to
> offer hooks via delegates, probably classes wouldn't be needed at all.
> (FWIW I would've done the same; I wouldn't want to inherit just to
> intercept the headers etc.)

Missed this one in my last reply.

Ftp/Http etc. are all inheriting from a Protocol class. The Protocol class defines common settings (@properties) for all protocols e.g. dnsTimeout, connectTimeout, networkInterface, url, port selection.

I could make these into a mixin and thereby get rid of the inheritance of course.
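
Roughly what I have in mind for the mixin variant, as a sketch only. ProtocolSettings is a made-up name, only two of the settings are shown, and the millisecond unit for the timeout is an assumption.

```d
import std.stdio : writeln;

// Common settings shared by all protocols, without a base class.
mixin template ProtocolSettings()
{
    private string url_;
    private uint connectTimeout_; // milliseconds (assumed unit)

    @property void url(string value) { url_ = value; }
    @property string url() const { return url_; }

    @property void connectTimeout(uint ms) { connectTimeout_ = ms; }
    @property uint connectTimeout() const { return connectTimeout_; }
}

// Each protocol gets its own copy of the fields and properties.
struct Http { mixin ProtocolSettings; }
struct Ftp { mixin ProtocolSettings; }

void main()
{
    Http http;
    http.url = "http://www.google.dk";
    http.connectTimeout = 1000;

    Ftp ftp;
    ftp.url = "ftp://ftp.digitalmars.com/sieve.ds";

    writeln(http.url, " ", http.connectTimeout, " ", ftp.url);
}
```

The drawback is that Http and Ftp then share no common type, so anything that wants to treat them uniformly has to be a template.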

I think that keeping Protocol as an abstract base class would benefit e.g. the integration with streams. In that case we could simply create a CurlTransport that contains a reference to a Protocol-derived object (Http, Ftp...).

Or would it be better to have specific HttpTransport, FtpTransport?


/Jonas
March 14, 2011
Jonas Drewsen wrote:
>> Do you plan to add some kind of support for header parsing? I think
>> something like what the .net webclient uses
>> ( http://msdn.microsoft.com/en-us/library/system.net.webclient(v=VS.100).aspx )
>> would be great. Especially the HeaderCollection supporting headers as
>> strings and as data types (for both parsing and formatting), but
>> without a class hierarchy for the headers, using templates instead.
>
>It would be nice to be able to get/set headers by string and enums
>(http://msdn.microsoft.com/en-us/library/system.net.httprequestheader.aspx).
>But I cannot see that .net is using datatypes or templates for it.
>Could you give me a pointer please?
>

You're right, I didn't look closely enough at the .net documentation. I thought HttpRequestHeader was a class. What I meant for D was something like this:

struct ETagHeader
{
    // Data members
    bool Weak = false;
    string Value;

    // All header structs provide these
    enum string Key = "ETag";

    static ETagHeader parse(string value)
    {
        // parser logic here
    }

    void format(W)(W writer)
        if (isOutputRange!(W, string))
    {
        if (Weak)
            writer.put("W/");
        assert(Value != "");
        writer.put(quote(Value));
    }
}

Then we can offer methods like these:

void setHeader(T)(T header)
    if (isHeader!T)
{
    headers[T.Key] = formatHeader(header);
}

T getHeader(T)()
    if (isHeader!T)
{
    if (T.Key !in headers)
        throw new Exception("Header not set: " ~ T.Key);
    return T.parse(headers[T.Key]);
}

So user code wouldn't have to deal with header parsing / formatting:
auto etag = client.getHeader!ETagHeader();
assert(etag.Weak);
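
To make the idea concrete, here is a self-contained sketch that compiles on its own. The parsing is deliberately simplified (optional W/ prefix plus a quoted string) and is not the ragel-based parser; Client, isHeader and the lowercase member names are placeholders for illustration.

```d
import std.algorithm.searching : startsWith;
import std.array : appender;
import std.range.primitives : isOutputRange;

struct ETagHeader
{
    bool weak;
    string value;
    enum string key = "ETag";

    // Simplified parsing: optional W/ prefix, then a quoted string.
    static ETagHeader parse(string raw)
    {
        ETagHeader h;
        if (raw.startsWith("W/"))
        {
            h.weak = true;
            raw = raw[2 .. $];
        }
        if (raw.length >= 2 && raw[0] == '"' && raw[$ - 1] == '"')
            raw = raw[1 .. $ - 1];
        h.value = raw;
        return h;
    }

    // Writes the header value into any output range of strings.
    void format(W)(ref W writer) const
        if (isOutputRange!(W, string))
    {
        if (weak)
            writer.put("W/");
        writer.put('"');
        writer.put(value);
        writer.put('"');
    }
}

// A type counts as a header if it has a string key and a matching parse().
enum isHeader(T) = is(typeof(T.key) : string) && is(typeof(T.parse("")) == T);

struct Client
{
    string[string] headers;

    void setHeader(T)(T header)
        if (isHeader!T)
    {
        auto buf = appender!string();
        header.format(buf);
        headers[T.key] = buf.data;
    }

    T getHeader(T)()
        if (isHeader!T)
    {
        auto stored = T.key in headers;
        if (stored is null)
            throw new Exception("header not set: " ~ T.key);
        return T.parse(*stored);
    }
}

void main()
{
    Client client;
    client.setHeader(ETagHeader(true, "abc"));
    auto etag = client.getHeader!ETagHeader();
    assert(etag.weak && etag.value == "abc");
}
```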

>> I've written D parsers/formatters for almost all headers in
>> rfc2616 (1 or 2 might be missing) and for a few additional commonly
>> used headers (Content-Disposition, cookie headers). The parsers are
>> written with ragel and are to be used with curl (continuations must
>> be removed and the parsers always take 1 line of input, just as you
>> get it from curl). Right now only the client side is implemented (no
>> parsers for headers which can only be sent from client-->server ).
>> However, I need to add some more documentation to the parsers, need
>> to do some refactoring and I've got absolutely no time for that in
>> the next 2 weeks ('abitur' final exams). But if you could wait 2
>> weeks or if you wanted to do the refactoring yourself, I would be
>> happy to contribute that code.
>
>That sounds very interesting. I would very much like to see the code and see if fits in.

Ok, here it is, but it seriously needs to be refactored and documented: https://gist.github.com/869324

-- 
Johannes Pfau