Thread overview
Any library with string encoding/decoding support?
Jan 20, 2014
ilya-stromberg
Jan 20, 2014
Adam D. Ruppe
Jan 20, 2014
MGW
Jan 21, 2014
FreeSlave
January 20, 2014
Do you know of any library with string encoding/decoding support? I need more encodings than `std.encoding` provides.
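
For reference, converting with `std.encoding` alone looks roughly like the sketch below. It assumes the input is Windows-1252, one of the encodings `std.encoding` handles; the helper name `cp1252ToUtf8` is made up for illustration.

```d
import std.encoding : transcode, Windows1252String;

// Hypothetical helper: convert Windows-1252 bytes to a UTF-8 string using only Phobos.
string cp1252ToUtf8(immutable(ubyte)[] raw) {
    string result;
    // Reinterpret the raw bytes as Windows-1252 code units, then transcode to UTF-8.
    transcode(cast(Windows1252String) raw, result);
    return result;
}

unittest {
    // "café" in Windows-1252: the final byte 0xE9 is 'é'
    immutable(ubyte)[] raw = [0x63, 0x61, 0x66, 0xE9];
    assert(cp1252ToUtf8(raw) == "café");
}
```

This only works for encodings that have a matching string type in `std.encoding`, which is the limitation the question is about.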
January 20, 2014
On Monday, 20 January 2014 at 08:33:09 UTC, ilya-stromberg wrote:
> Do you know of any library with string encoding/decoding support? I need more encodings than `std.encoding` provides.

I wrote one that does a little bit more decoding, but has no encoding support at all. (I wrote it for my web scraper and email reader, so all I cared about was getting the data to UTF-8.)

https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff/blob/master/characterencodings.d

auto s = convertToUtf8(your_raw_data, "current_encoding");


If you want something full-featured, GNU iconv isn't hard to use from D:


import core.stdc.errno;

extern(C) {
    alias void* iconv_t;
    iconv_t iconv_open(const char* tocode, const char* fromcode);
    int iconv_close(iconv_t cd);

    // on glibc systems iconv is part of libc, so this pragma may need to be removed there
    pragma(lib, "iconv");

    size_t iconv(iconv_t cd,
                 char** inbuf, size_t* inbytesleft,
                 char** outbuf, size_t* outbytesleft);
}

    // D string literals are zero-terminated, so they can be passed to C directly
    auto i = iconv_open("UTF-8", "CP1252");
    if(i == cast(iconv_t) -1) throw new Exception("iconv open failed");
    scope(exit) iconv_close(i);

    // get input pointer and length ready
    // (content is assumed to be an array holding the raw input bytes)
    char* input = cast(char*) content.ptr;
    size_t inputLength = content.length;

    // allocate an output buffer with 4x the size of the input buffer;
    // keep it around as a slice and get a pointer to it for the lib
    auto startingOutputBuffer = new char[content.length * 4];
    char* outputBuffer = startingOutputBuffer.ptr;
    size_t outputLength = startingOutputBuffer.length;

    while(inputLength) {
        auto ret = iconv(i, &input, &inputLength, &outputBuffer, &outputLength);
        if(ret == cast(size_t) -1) {
            // check errno. EILSEQ (84 on Linux) means the input is not valid in the given charset
            throw new Exception("iconv conversion failed");
        }
    }

    // outputLength now holds the number of bytes remaining in the output buffer,
    // so the converted size is the original buffer size minus the remaining size
    outputLength = startingOutputBuffer.length - outputLength;

    // then slice it to get the result (idup makes the immutable copy a string needs)
    string convertedContent = startingOutputBuffer[0 .. outputLength].idup;




Note that iconv is, I think, GPL licensed.
January 20, 2014
On Monday, 20 January 2014 at 08:33:09 UTC, ilya-stromberg wrote:
> Do you know of any library with string encoding/decoding support? I need more encodings than `std.encoding` provides.

A library for working with Qt:
https://github.com/MGWL/QtE-Qt_for_Dlang_and_Forth


Working with Qt and its QTextCodec class.
---------------------------------------

// Compile:
// ------------------------------
// Linux:    dmd ex1.d qte.d -L-ldl
// Windows:  dmd ex1.d qte.d
// ------------------------------

import qte;                             // Work with Qt
import core.runtime;                    // Startup parameters
import std.stdio;                       // writeln();

int main(string[] args) {
    QApplication app;       // Application
    QTextCodec UTF_8;
    QTextCodec WIN_1251;
    QTextCodec IBM866;
    QString tmpQs;
    QByteArray ba;
    QLabel label;

    // Check the arguments. With '--debug', QtE prints warning messages while loading
    bool fDebug; fDebug = false; foreach (arg; args[0 .. args.length])  { if (arg=="--debug") fDebug = true; }

    // Load the Qt libraries. fDebug: false = warnings disabled, true = warnings enabled
    int rez = LoadQt( dll.Core | dll.Gui | dll.QtE, fDebug); if (rez==1) return 1;

    // Init Qt. Last parameter: true = GUI app, false = console app
    app = new QApplication; (app.adrQApplication())(cast(void*)app.bufObj, &Runtime.cArgs.argc, Runtime.cArgs.argv, true);

    // Init the internal encodings. All codecs are Qt QTextCodec objects
    tmpQs = new QString();
    UTF_8 = new QTextCodec("UTF-8");            // Linux
    WIN_1251 = new QTextCodec("Windows-1251");  // Windows
    IBM866 = new QTextCodec("IBM 866");         // DOS

    // Create the string "Hello from QtE.d" in Russian
    tmpQs.toUnicode(cast(char*)("<h2>Привет из <font color=red size=5>QtE.d</font></h2>".ptr), UTF_8);

    // QLabel
    label = new QLabel(null);
    label.setText(tmpQs); label.setAlignment(QtE.AlignmentFlag.AlignCenter); // Write text and alignment
    label.resize(300, 130); // Size label

    // Example: DOS console
    ba = new QByteArray(cast(char*)("Привет из QtE.d - обратите внимание на перекодировку в DOS".ptr));  // this text is in UTF-8
    tmpQs.toUnicode(cast(char*)ba.data(), UTF_8);  // to Unicode
    version(Windows) {  // DOS console window on Windows
        tmpQs.fromUnicode(cast(char*)ba.data(), IBM866);
    }
    version(linux) {    // Linux works in UTF-8
        tmpQs.fromUnicode(cast(char*)ba.data(), UTF_8);
    }
    printf("%s", ba.data());

    label.show();

    return app.exec();
}
January 21, 2014
iconv as a library is under the LGPL; iconv as a utility is under the GPL. Note that iconv is not portable even on Linux, since different distros may have different implementations.
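
One place the difference shows up is linking. A rough sketch, assuming that glibc provides iconv inside libc and that other platforms ship a separate libiconv exporting the same symbol names (not true everywhere, so treat the `else` branch as a guess):

```d
// Declarations follow the POSIX iconv interface.
extern(C) {
    alias iconv_t = void*;
    iconv_t iconv_open(const char* tocode, const char* fromcode);
    int iconv_close(iconv_t cd);
}

// glibc has iconv built into libc; elsewhere this assumes a separate libiconv.
version (CRuntime_Glibc) {}
else pragma(lib, "iconv");

unittest {
    // Opening a converter is the cheapest check that linking and the charset names work.
    auto cd = iconv_open("UTF-8", "CP1252");
    assert(cd != cast(iconv_t) -1);
    iconv_close(cd);
}
```

The accepted charset names can also differ between implementations, so "CP1252" here is only an example.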

Qt is not an option because it's unstable with D. It's also a redundant dependency. And as far as I know, Qt itself uses platform-dependent functions to work with encodings: iconv on Linux and Windows-specific functions on Windows.