[Issue 10868] New: std.string.translate should take an optional buffer
August 22, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10868

           Summary: std.string.translate should take an optional buffer
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: andrej.mitrovich@gmail.com
        ReportedBy: andrej.mitrovich@gmail.com


--- Comment #0 from Andrej Mitrovic <andrej.mitrovich@gmail.com> 2013-08-21 18:46:42 PDT ---
translate is useful for interfacing with other languages that require special string literal escaping rules (e.g. Tcl); however, these escaped strings typically have to be wrapped in some outer block.

Currently translate doesn't accept a user-provided buffer, which leads to inefficient code like this:

-----
import std.array;
import std.string;

void main()
{
    string[dchar] table = ['{' : `\{`, '}' : `\}`];
    string input = `{ foobar }`;

    // multiple allocations: one in translate, another in format
    auto res = format(`"%s"`, input.translate(table));

    assert(res == `"\{ foobar \}"`);
}
-----

A more efficient approach is to allow passing a custom buffer to translate:

-----
import std.array;
import std.string;

void main()
{
    string[dchar] table = ['{' : `\{`, '}' : `\}`];
    auto buffer = appender!(dchar[])();

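    buffer ~= '"';  // opening quote of the outer wrapper
    string input = `{ foobar }`;
    input.translate(table, null, buffer);  // escaped text goes straight into buffer
    buffer ~= '"';  // closing quote of the outer wrapper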
    assert(buffer.data == `"\{ foobar \}"`);
}
-----

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
August 22, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10868


Jonathan M Davis <jmdavisProg@gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg@gmx.com


--- Comment #1 from Jonathan M Davis <jmdavisProg@gmx.com> 2013-08-21 18:55:37 PDT ---
I think that we should move towards having pretty much any Phobos function which allocates an array (including strings) have an overload which takes an output range to write to instead. However, we should probably make sure that output ranges are appropriately ironed out before we do that heavily (e.g. we really need to have a clear way of asking whether an output range is full). Input ranges are used heavily, but output ranges have had much less attention and may need additional work before they're used that heavily.
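
As a minimal sketch of the overload pattern described above (an output-range version plus an allocating wrapper built on it), the following uses two hypothetical helpers, escapeBracesInto and escapeBraces. Neither is a Phobos function; both names and signatures are assumptions made purely for illustration.

```d
import std.array : appender;
import std.range : isOutputRange, put;

// Hypothetical helper (not part of Phobos): writes `input` to `sink`,
// prefixing every brace with a backslash. It never allocates by itself.
void escapeBracesInto(Sink)(const(char)[] input, ref Sink sink)
    if (isOutputRange!(Sink, char))
{
    foreach (char c; input)
    {
        if (c == '{' || c == '}')
            put(sink, '\\');
        put(sink, c);
    }
}

// Hypothetical allocating wrapper built on the output-range version.
string escapeBraces(const(char)[] input)
{
    auto sink = appender!string();
    escapeBracesInto(input, sink);
    return sink.data;
}

void main()
{
    // Allocating form: convenient, but the result is a fresh string.
    assert(escapeBraces("{ foobar }") == `\{ foobar \}`);

    // Output-range form: the caller owns the buffer and can wrap the
    // escaped text without an intermediate string.
    auto buffer = appender!string();
    put(buffer, '"');
    escapeBracesInto("{ foobar }", buffer);
    put(buffer, '"');
    assert(buffer.data == `"\{ foobar \}"`);
}
```

With this split, the allocating function stays a thin convenience layer, and callers who care about allocations use the output-range form directly.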

August 22, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10868



--- Comment #2 from Andrej Mitrovic <andrej.mitrovich@gmail.com> 2013-08-21 18:59:39 PDT ---
(In reply to comment #1)
> I think that we should move towards having pretty much any Phobos function which allocates an array (including strings) have an overload which takes an output range to write to instead. However, we should probably make sure that output ranges are appropriately ironed out before we do that heavily (e.g. we really need to have a clear way of asking whether an output range is full). Input ranges are used heavily, but output ranges have had much less attention and may need additional work before they're used that heavily.

Hmm, is there any way this .full check can be added later once the output range API is finalized? For now if .put works on the buffer I think we're ok, since most buffers are likely created with appender().

If it's not in the NG already, could you bring this OutputRange topic up? We've already made a promise to make Phobos more memory-friendly (one way is providing an optional buffer), but not much has happened since the end of DConf.

August 22, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10868



--- Comment #3 from Jonathan M Davis <jmdavisProg@gmx.com> 2013-08-21 22:59:34 PDT ---
IIRC, it's come up before (with the digest stuff? - I know the issue of a function to "finish" came up for that), but I don't recall any resolution of it. I believe that with regards to arrays at least, if you use them as an output range, you can check for full by checking whether length == 0, and length effectively ends up telling you how much more you can write. But even if that holds in general, not all ranges can tell you how much can be written to them as opposed to whether they're full. IIRC, that led to discussions of the case where you try to put more elements than can fit at once (e.g. if you have 1 char left in a char[] and try to put a character that takes up 3 code units). That almost requires that put throw in some cases rather than letting you check for space ahead of time, which is problematic (though maybe put could be changed to return whether it succeeded, in order to deal with the case where you might put more than can fit and can't check first).

In any case, it obviously gets a bit nasty, and we probably should discuss it again. I'll try to organize my thoughts on it a bit and create another thread on the topic, since we do need to iron out these issues if we want to use output ranges heavily (which we should be moving to, at least as an overload for many/most functions which allocate). Output ranges have suffered from a lack of attention and probably should have been discussed or thought through more before being added. But at least output ranges are used so much less than input ranges that it's probably nowhere near as big a deal to break their API if we have to (especially if at least some of the missing functions can be given generic versions via UFCS).
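
To make the array case above concrete, here is a small sketch of a plain slice used as an output range. It assumes only std.range's put; the variable names storage and sink are illustrative. Each put writes to the front of the slice and then shrinks it, so the remaining length is the remaining capacity and a length of 0 means the slice is full.

```d
import std.range : put;

void main()
{
    int[] storage = new int[](3);
    int[] sink = storage;       // shrinking view used for writing

    put(sink, 10);
    assert(sink.length == 2);   // two slots left

    put(sink, 20);
    put(sink, 30);
    assert(sink.length == 0);   // no room left: the slice is "full"

    // The written data is visible through the original slice.
    assert(storage == [10, 20, 30]);
}
```

This only works because a slice knows its own length; a generic output range offers no such query, which is the gap the comment describes.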

August 22, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10868


Andrej Mitrovic <andrej.mitrovich@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |pull


--- Comment #4 from Andrej Mitrovic <andrej.mitrovich@gmail.com> 2013-08-22 07:38:06 PDT ---
https://github.com/D-Programming-Language/phobos/pull/1501

August 22, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10868


Brad Anderson <eco@gnuk.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eco@gnuk.net


--- Comment #5 from Brad Anderson <eco@gnuk.net> 2013-08-22 09:06:38 PDT ---
(In reply to comment #1)
> I think that we should move towards having pretty much any Phobos function which allocates an array (including strings) have an overload which takes an output range to write to instead. However, we should probably make sure that output ranges are appropriately ironed out before we do that heavily (e.g. we really need to have a clear way of asking whether an output range is full). Input ranges are used heavily, but output ranges have had much less attention and may need additional work before they're used that heavily.

Monarch's https://github.com/D-Programming-Language/phobos/pull/1439 improves the Output Range situation quite a bit.

August 25, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10868


monarchdodra@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |monarchdodra@gmail.com


--- Comment #6 from monarchdodra@gmail.com 2013-08-25 01:53:53 PDT ---
(In reply to comment #3)
> IIRC, it's come up before (with the digest stuff? - I know the issue of a function to "finish" came up for that), but I don't recall any resolution of it.

I remember having brought up the subject a few times before. It never gets discussed a whole lot though. The latest is here: http://forum.dlang.org/thread/gqsfiatbhdqwcptkoqua@forum.dlang.org

> I believe that with regards to arrays at least, if you use them as an
> output range, you can check for full by checking whether length == 0, and
> length effectively ends up telling you how much more you can write. But even
> if that holds in general, not all ranges can tell you how much can be written
> to them as opposed to whether they're full. IIRC, that led to discussions of
> the case where you try to put more elements than can fit at once (e.g. if you
> have 1 char left in a char[] and try to put a character that takes up 3 code
> units). That almost requires that put throw in some cases rather than letting
> you check for space ahead of time, which is problematic (though maybe put
> could be changed to return whether it succeeded, in order to deal with the
> case where you might put more than can fit and can't check first).

There are simpler fail cases than that, since an output range can accept ranges
of elements itself. Meaning you can do:
    int[] a = [0, 0];
    a.put([1, 2, 3]);

So as you did mention, basically, the problem is that even if the range is not full, that doesn't guarantee that what you want to cram into it will fit.

I think going the *sformat* road might be simplest? It is *strongly* recommended to use an infinite sink when an output range is passed. You are free to use some sort of finite storage, but if it runs out of space, then it's an error.
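
A rough sketch of the difference between the two kinds of sink, assuming only std.array's appender and std.range's put (the variable names are illustrative). A fixed-size slice has to be checked by the caller before a whole chunk is written, whereas an appender grows on demand and so behaves like an effectively unbounded sink.

```d
import std.array : appender;
import std.range : put;

void main()
{
    int[] chunk = [1, 2, 3];

    // Finite storage: the slice is not full, yet the chunk does not fit.
    int[] storage = [0, 0];
    int[] fixedSink = storage;
    assert(fixedSink.length != 0);             // not full
    assert(chunk.length > fixedSink.length);   // but too small for the chunk,
                                               // so the caller must not put it

    // Growable sink: appender allocates as needed, so any chunk fits.
    auto growing = appender!(int[])();
    put(growing, chunk);
    put(growing, 4);
    assert(growing.data == [1, 2, 3, 4]);
}
```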

August 29, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10868



--- Comment #7 from github-bugzilla@puremagic.com 2013-08-29 00:30:27 PDT ---
Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/5ec5c213b0b47eeccfa9a335da362a29cb1e0fdf
Fixes Issue 10868 - std.string.translate should have an overload that takes an
output buffer.

https://github.com/D-Programming-Language/phobos/commit/d6b8a21157caec2025ce70bfa259c4c912a969b7
Merge pull request #1501 from AndrejMitrovic/Fix10868

Issue 10868 - std.string.translate should have an overload that takes an output buffer.

September 19, 2013
http://d.puremagic.com/issues/show_bug.cgi?id=10868


Andrej Mitrovic <andrej.mitrovich@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

