std.net.curl get webpage asia font issue

Jun 07, 2012

Sam Hu


Greetings!

The documentation on this website provides an example of how to get webpage information with std.net.curl. It is quite straightforward:

[code]
import std.net.curl, std.stdio;

void main(){

    // Return a string containing the content specified by an URL
    string content = get("dlang.org");

    writefln("%s\n",content);

    readln;
}
[/code]

When I change get("dlang.org") to get("yahoo.com"), everything works fine; but when I change it to get("yahoo.com.cn"), I get a runtime error complaining about a bad GBK encoding.

So my very simple question is: how do I retrieve information from a webpage that may contain Asian text (such as Chinese)?

Thanks in advance for your help.

Regards,
Sam

On 07/06/12 02:57, Sam Hu wrote:
> string content = get("dlang.org");
> writefln("%s\n",content);
>
> So my very simple question is how to retrieve information from a webpage which could possibily contains asia font (like Chinese font)?

I'm not really sure, but try:

wstring content = get("dlang.org");

Also make sure your terminal is set up for Unicode.

June 07, 2012

Re: std.net.curl get webpage asia font issue

Posted by Dmitry Olshansky
in reply to Sam Hu


On 07.06.2012 10:57, Sam Hu wrote:
> Greeting!
>
> The document on this website provide an example on how to get webpage
> information by std.net.curl.It is quite straightforward:
>
> [code]
> import std.net.curl, std.stdio;
>
> void main(){
>
> // Return a string containing the content specified by an URL
> string content = get("dlang.org");

It's simple: on this line you "convert" whatever the site content was to Unicode. The problem is that the "convert" is either broken or simply a cast, whereas it should re-encode the source as Unicode. So the workaround is to get the content as an array of bytes and decode it yourself.

>
> writefln("%s\n",content);
>
> readln;
> }
> [/code]
>
> When I change get("dlang.org") to get("yahoo.com"),everything goes
> fine;but when I change to get("yahoo.com.cn"),a runtime error said bad
> gbk encoding bla...
>
> So my very simple question is how to retrieve information from a webpage
> which could possibily contains asia font (like Chinese font)?
>
I think it's not a "font" problem but an encoding problem.

> Thanks for your help in advance.
>
> Regards,
> Sam


-- 
Dmitry Olshansky
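
To illustrate the point that this is an encoding problem rather than a font problem, here is a minimal sketch. It assumes the page is served as GBK; the helper name `isValidUtf8` is hypothetical and not part of Phobos. The same character has different byte sequences in UTF-8 and GBK, and the GBK bytes are not well-formed UTF-8, so they cannot simply be treated as a D `string`.

```d
import std.stdio : writeln;
import std.utf : validate, UTFException;

/// Hypothetical helper: returns true if `data` is well-formed UTF-8.
bool isValidUtf8(const(ubyte)[] data)
{
    try
    {
        // validate throws UTFException on malformed input.
        validate(cast(const(char)[]) data);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // The character U+4E2D encoded two different ways.
    immutable ubyte[] asUtf8 = [0xE4, 0xB8, 0xAD]; // UTF-8
    immutable ubyte[] asGbk  = [0xD6, 0xD0];       // GBK

    writeln(isValidUtf8(asUtf8)); // prints "true"
    writeln(isValidUtf8(asGbk));  // prints "false"
}
```

A terminal or text control with a perfectly good Chinese font would still show garbage (or the program would throw) if the GBK bytes were interpreted as UTF-8.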

On Thursday, 7 June 2012 at 10:43:32 UTC, Dmitry Olshansky wrote:
>> string content = get("dlang.org");
>
> It's simple this line you "convert" whatever site content was to unicode. Problem is that "convert" is either broken or it's simply a cast whereas it should re-encode source as unicode. So the way around is to get it to array of bytes and decode yourself.
>

Thanks. May I know how? I would appreciate a short code snippet.

On Thursday, 7 June 2012 at 10:38:53 UTC, Kevin wrote:
> On 07/06/12 02:57, Sam Hu wrote:
>> string content = get("dlang.org");
>> writefln("%s\n",content);
>>
>> So my very simple question is how to retrieve information from a
>> webpage which could possibily contains asia font (like Chinese font)?
>
> I'm not really sure but try:
> wstring content = get("dlang.org");
>
> Also make sure your terminal is set up for unicode.

Sorry, no, it does not work. I also tried printing the content to a DFL TextBox control, but the issue is the same.

On 08.06.2012 5:03, Sam Hu wrote:
> On Thursday, 7 June 2012 at 10:43:32 UTC, Dmitry Olshansky wrote:
>>> string content = get("dlang.org");
>>
>> It's simple this line you "convert" whatever site content was to
>> unicode. Problem is that "convert" is either broken or it's simply a
>> cast whereas it should re-encode source as unicode. So the way around
>> is to get it to array of bytes and decode yourself.
>>
>
> Thanks.May I know how ?Appreciated a piece of code segment.

Seems like:

ubyte[] data = get!(AutoProtocol, ubyte)("your-site.cn"); // should work; sorry, I'm on Windows and curl doesn't work here for me

Then you work with your data, decode it and so on. At the very least, this:

writeln(data); // will not throw, but will print bytes

-- 
Dmitry Olshansky
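
The suggestion above can be fleshed out into a small program. This is only a sketch under stated assumptions: it needs network access and a working libcurl, the URL is the same placeholder used in the post, and `hexPreview` is a hypothetical helper introduced here for illustration. Requesting `ubyte` as the element type should return the response body as raw bytes, which can then be inspected before any decoding is attempted.

```d
import std.algorithm : min;
import std.format : format;
import std.net.curl : get, AutoProtocol;
import std.stdio : writeln;

/// Hypothetical helper: formats up to `limit` leading bytes
/// as space-separated, two-digit hexadecimal values.
string hexPreview(const(ubyte)[] data, size_t limit = 16)
{
    return format("%(%02X %)", data[0 .. min(limit, data.length)]);
}

void main()
{
    // Placeholder URL from the post; substitute the page you need.
    // Asking for ubyte elements yields the body as undecoded bytes.
    ubyte[] data = get!(AutoProtocol, ubyte)("your-site.cn");

    writeln(data.length, " bytes received");
    writeln(hexPreview(data));
}
```

Turning GBK bytes into a D `string` is a separate step. As far as I know, Phobos's std.encoding does not include a GBK scheme, so that step would likely need an external converter such as iconv; treat that as an assumption to verify for your platform.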
