June 07, 2012 std.net.curl get webpage asia font issue
Greetings!

The documentation on this website provides an example of how to get webpage information with std.net.curl. It is quite straightforward:

[code]
import std.net.curl, std.stdio;

void main(){

    // Return a string containing the content specified by a URL
    string content = get("dlang.org");

    writefln("%s\n",content);

    readln;
}
[/code]

When I change get("dlang.org") to get("yahoo.com"), everything works fine; but when I change it to get("yahoo.com.cn"), I get a runtime error complaining about a bad GBK encoding.

So my very simple question is: how can I retrieve information from a webpage which could possibly contain Asian fonts (like Chinese fonts)?

Thanks in advance for your help.

Regards,
Sam
June 07, 2012 Re: std.net.curl get webpage asia font issue
Posted in reply to Sam Hu
On 07/06/12 02:57, Sam Hu wrote:
> string content = get("dlang.org");
> writefln("%s\n",content);
>
> So my very simple question is how to retrieve information from a webpage which could possibly contains asia font (like Chinese font)?
I'm not really sure but try:
wstring content = get("dlang.org");
Also make sure your terminal is set up for unicode.
June 07, 2012 Re: std.net.curl get webpage asia font issue
Posted in reply to Sam Hu
On 07.06.2012 10:57, Sam Hu wrote:
> Greeting!
>
> The document on this website provide an example on how to get webpage
> information by std.net.curl.It is quite straightforward:
>
> [code]
> import std.net.curl, std.stdio;
>
> void main(){
>
> // Return a string containing the content specified by an URL
> string content = get("dlang.org");

It's simple: in this line you "convert" whatever the site content was to Unicode. The problem is that the "convert" is either broken or is simply a cast, whereas it should re-encode the source as Unicode. So the way around it is to get the content as an array of bytes and decode it yourself.

>
> writefln("%s\n",content);
>
> readln;
> }
> [/code]
>
> When I change get("dlang.org") to get("yahoo.com"),everything goes
> fine;but when I change to get("yahoo.com.cn"),a runtime error said bad
> gbk encoding bla...
>
> So my very simple question is how to retrieve information from a webpage
> which could possibly contains asia font (like Chinese font)?
>

I think it's not a "font" problem but an encoding problem.

> Thanks for your help in advance.
>
> Regards,
> Sam

--
Dmitry Olshansky
June 08, 2012 Re: std.net.curl get webpage asia font issue
Posted in reply to Dmitry Olshansky
On Thursday, 7 June 2012 at 10:43:32 UTC, Dmitry Olshansky wrote:
>> string content = get("dlang.org");
>
> It's simple this line you "convert" whatever site content was to unicode. Problem is that "convert" is either broken or it's simply a cast whereas it should re-encode source as unicode. So the way around is to get it to array of bytes and decode yourself.
>
Thanks. May I know how? A short code snippet would be appreciated.
June 08, 2012 Re: std.net.curl get webpage asia font issue
Posted in reply to Kevin
On Thursday, 7 June 2012 at 10:38:53 UTC, Kevin wrote:
> On 07/06/12 02:57, Sam Hu wrote:
>> string content = get("dlang.org");
>> writefln("%s\n",content);
>>
>> So my very simple question is how to retrieve information from a
>> webpage which could possibly contains asia font (like Chinese font)?
>
> I'm not really sure but try:
> wstring content = get("dlang.org");
>
> Also make sure your terminal is set up for unicode.
Sorry, no, it does not work. I tried printing the content to a DFL
TextBox control, but I still get the same issue.
June 08, 2012 Re: std.net.curl get webpage asia font issue
Posted in reply to Sam Hu
On 08.06.2012 5:03, Sam Hu wrote:
> On Thursday, 7 June 2012 at 10:43:32 UTC, Dmitry Olshansky wrote:
>>> string content = get("dlang.org");
>>
>> It's simple this line you "convert" whatever site content was to
>> unicode. Problem is that "convert" is either broken or it's simply a
>> cast whereas it should re-encode source as unicode. So the way around
>> is to get it to array of bytes and decode yourself.
>>
>
> Thanks.May I know how ?Appreciated a piece of code segment.

It seems like

ubyte[] data = get!(AutoProtocol, ubyte)("your-site.cn"); // should work; sorry, I'm on Windows and curl doesn't work here for me

Then you work with your data, decode it and so on. At the very least, this:

writeln(data); // will not throw, but will print the raw bytes

--
Dmitry Olshansky
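To make the byte-based suggestion above a bit more concrete, here is a minimal sketch, not a definitive solution. It fetches the page as raw bytes with get!(AutoProtocol, ubyte) and only treats them as text if they validate as UTF-8. The helper name isValidUtf8 and the command-line handling are my own assumptions for illustration. As far as I know, Phobos's std.encoding does not ship a GBK codec, so the actual GBK-to-Unicode step would need an external tool such as iconv or the Windows API; that part is deliberately left out.

```d
import std.net.curl : get, AutoProtocol;
import std.stdio : writeln;
import std.utf : validate, UTFException;

/// Hypothetical helper (not part of Phobos): reports whether the bytes
/// form well-formed UTF-8.
bool isValidUtf8(const(ubyte)[] data)
{
    try
    {
        // std.utf.validate throws UTFException on malformed input.
        validate(cast(const(char)[]) data);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main(string[] args)
{
    // URL taken from the command line; "dlang.org" is just a default.
    auto url = args.length > 1 ? args[1] : "dlang.org";

    // Fetch raw bytes, so no encoding is assumed at this point.
    ubyte[] data = get!(AutoProtocol, ubyte)(url);

    if (isValidUtf8(data))
        writeln(cast(const(char)[]) data); // safe to print as text
    else
        writeln("Not valid UTF-8 (", data.length,
                " bytes); decode using the page's declared charset, e.g. GBK.");
}
```

Checking the bytes first avoids the runtime error from the original post: a GBK page simply takes the second branch, and the bytes stay available for whatever decoder you choose.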
Copyright © 1999-2021 by the D Language Foundation