December 04, 2013 Use case: eliminate hidden allocations in buildPath | ||||
---|---|---|---|---|
| ||||
Hello,

Walter and I were talking about eliminating the surreptitious allocations in buildPath:

http://dlang.org/phobos/std_path.html#.buildPath

We'd need to keep the existing version working, so we're looking at adding one or more new overloads. We're looking at giving the user the option to control any needed memory allocation (or even arrange things such that there's no memory allocated at all).

It's a generous design space, so although we have a couple of ideas let's hear others first.

Thanks,

Andrei
|
December 05, 2013 Re: Use case: eliminate hidden allocations in buildPath | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 2013-12-04 23:14:48 +0000, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said:

> Walter and I were talking about eliminating the surreptitious allocations in buildPath:
>
> http://dlang.org/phobos/std_path.html#.buildPath
>
> We'd need to keep the existing version working, so we're looking at adding one or more new overloads. We're looking at giving the user the option to control any needed memory allocation (or even arrange things such that there's no memory allocated at all).
>
> It's a generous design space, so although we have a couple of ideas let's hear others first.

Allow an allocator as the first argument. Then pass an allocator that uses preallocated memory (or any other strategy that does not really need to allocate). While technically the allocator still "allocates" memory, since you control the allocator you can redefine "allocate" to not allocate.

Here's a funny thought: allow plain arrays to be *typed* allocators through UFCS, just like arrays are ranges. If you have an array of chars, then "allocating" from it will simply return a slice and "bump the pointer" by becoming the remaining unused slice.

The big problem with buildPath is that it won't work with overloading because your allocator has the same type as the other parameters. You'll need to wrap it in some kind of allocator shell. :-/

--
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca
|
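The "array as allocator" idea can be sketched in a few lines. This is only an illustration under assumptions: the function name `allocate` and its signature are made up here and are not part of Phobos or of any allocator proposal.

```d
// Hypothetical sketch, not a Phobos API: a plain char[] used as a bump allocator.
// `allocate` hands out the front of the arena and shrinks the arena to what is left.
char[] allocate(ref char[] arena, size_t n)
{
    assert(n <= arena.length, "arena exhausted");
    char[] block = arena[0 .. n];
    arena = arena[n .. $];   // "bump the pointer"
    return block;
}

unittest
{
    char[16] storage;              // preallocated, lives on the stack
    char[] arena = storage[];

    char[] a = arena.allocate(4);  // UFCS call, no heap allocation
    a[] = "usr/";

    assert(a == "usr/");
    assert(arena.length == 12);
    assert(a.ptr + a.length == arena.ptr);
}
```

Because the arena is just a `char[]`, it has the same type as a path segment, which is the overloading problem the post points out; a wrapper type would be needed to tell the two apart.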
December 05, 2013 Re: Use case: eliminate hidden allocations in buildPath | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Wed, Dec 04, 2013 at 03:14:48PM -0800, Andrei Alexandrescu wrote:

> Hello,
>
> Walter and I were talking about eliminating the surreptitious allocations in buildPath:
>
> http://dlang.org/phobos/std_path.html#.buildPath
>
> We'd need to keep the existing version working, so we're looking at adding one or more new overloads. We're looking at giving the user the option to control any needed memory allocation (or even arrange things such that there's no memory allocated at all).
>
> It's a generous design space, so although we have a couple of ideas let's hear others first.
[...]

What about a new overload that takes an output range instead of returning a string? The caller can then create the appropriate output range that does whatever is desired (malloc a buffer, use a static fixed-size buffer, etc.), and buildPath doesn't have to care about the implementation details. So, something like:

    void buildPath(OutputRange,Range)(OutputRange output, Range segments)
        if (isOutputRange!(OutputRange, ElementType!Range))
    { ... }

//

On a related note, I wonder if it's profitable to extend the concept of an output range to include void delegates that take range elements as arguments. The current toString overload looked for by std.format, for example, has this signature:

    void toString(scope void delegate(const(char)[] s) dg);

The delegate here essentially behaves like an output range (it takes snippets of string data and presumably appends them to some buffer somewhere). So why not extend the concept of output ranges to include such delegates, so that we can write:

    void toString(R)(R r) if (isOutputRange!R);

Then any output range can be used as the target of a string conversion, e.g., writing straight to a file or network socket without the unnecessary buffering that returning a string implies.

"But!" I hear you cry, "you can't call a delegate as dg.put(...), which you have to if it is to conform to the output range interface!"

To that I'd say:

    module std.range;
    ...
    void put(R,T)(R dg, T data)
        if (is(R == void delegate(T)) || is(R == void function(T)))
    {
        dg(data);
    }

If I'm not mistaken, this should make isOutputRange true for functions and delegates that take the appropriate argument types. This then allows us to use buildPath with no hidden allocations, for example:

    char[1024] buffer;
    void appendToBuf(const(char)[] data) { ... }

    void main() {
        buildPath(&appendToBuf, "usr", "local", "share", "filename");
    }

T

--
Three out of two people have difficulties with fractions. -- Dirk Eddelbuettel
|
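For reference, a minimal sketch of what an output range overload could look like, assuming a static fixed-size buffer as the sink. The names `buildPathTo` and `FixedSink` are hypothetical, and the joining logic is deliberately simpler than what std.path.buildPath does (no handling of rooted or empty segments).

```d
import std.path : dirSeparator;
import std.range : isOutputRange, put;

// Hypothetical name and signature, not a Phobos function. Unlike std.path.buildPath,
// this sketch only joins segments; it does not handle rooted or empty segments.
void buildPathTo(Sink)(ref Sink sink, const(char)[][] segments...)
if (isOutputRange!(Sink, const(char)[]))
{
    foreach (i, segment; segments)
    {
        if (i > 0)
            put(sink, dirSeparator);
        put(sink, segment);
    }
}

// Hypothetical fixed-capacity sink: writes into caller-owned memory.
struct FixedSink
{
    char[] buffer;
    size_t used;

    void put(const(char)[] text)
    {
        assert(used + text.length <= buffer.length, "buffer too small");
        buffer[used .. used + text.length] = text[];
        used += text.length;
    }

    const(char)[] data() const { return buffer[0 .. used]; }
}

unittest
{
    char[1024] storage;                 // caller decides where the bytes live
    auto sink = FixedSink(storage[]);
    buildPathTo(sink, "usr", "local", "share", "filename");

    // The concatenation below allocates, but only to build the expected value.
    assert(sink.data == "usr" ~ dirSeparator ~ "local" ~ dirSeparator
                      ~ "share" ~ dirSeparator ~ "filename");
}
```

As far as I can tell, std.range.put in current Phobos already falls back to calling the sink as `r(e)` when it has no `put` member, so a delegate taking `const(char)[]` should satisfy the same constraint.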
December 05, 2013 Re: Use case: eliminate hidden allocations in buildPath | ||||
---|---|---|---|---|
| ||||
On Wednesday, December 04, 2013 16:38:54 H. S. Teoh wrote:
> What about a new overload that takes an output range instead of returning a string?
I would have thought that that would be the obvious way to solve the problem. In general, I think that when a function allocates any kind of string or array which it returns, we should overload it with a version that takes an output range. In many cases, it would probably even make sense to make the default just use std.array.appender and make it so that only the output range overload is really doing any work.
What to do in cases of allocation where we're not dealing with arrays being returned is a tougher question, and I think that that starts going into custom allocator territory, but for arrays, output ranges are definitely the way to go IMHO.
- Jonathan M Davis
|
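The pattern described above (the output range overload does the work, the string-returning version only wraps it with std.array.appender) might look roughly like this. The function names `writeWithExt` and `withExt` are invented for the illustration and do not exist in std.path.

```d
import std.array : appender;
import std.range : isOutputRange, put;

// Hypothetical helper names, for illustration only.
// The output range overload does all the work and never allocates on its own.
void writeWithExt(Sink)(ref Sink sink, const(char)[] base, const(char)[] ext)
if (isOutputRange!(Sink, const(char)[]))
{
    put(sink, base);
    put(sink, ".");
    put(sink, ext);
}

// The string-returning version just forwards to it through an appender.
string withExt(const(char)[] base, const(char)[] ext)
{
    auto sink = appender!string();
    writeWithExt(sink, base, ext);
    return sink.data;
}

unittest
{
    assert(withExt("notes", "txt") == "notes.txt");
}
```

Callers who do not care about allocation keep the convenient form, while callers who do can pass their own sink to the first overload.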
December 05, 2013 Re: Use case: eliminate hidden allocations in buildPath | ||||
---|---|---|---|---|
| ||||
| On 5 December 2013 11:02, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Wednesday, December 04, 2013 16:38:54 H. S. Teoh wrote:
> > What about a new overload that takes an output range instead of returning a string?
>
> I would have thought that that would be the obvious way to solve the problem. In general, I think that when a function allocates any kind of string or array which it returns, we should overload it with a version that takes an output range. In many cases, it would probably even make sense to make the default just use std.array.appender and make it so that only the output range overload is really doing any work.

This seems the intuitive approach to me.

> What to do in cases of allocation where we're not dealing with arrays being returned is a tougher question, and I think that that starts going into custom allocator territory, but for arrays, output ranges are definitely the way to go IMHO.

You mean for internal working buffers? It gets tricky.

In my experience, many functions can use alloca, with a fallback in the event of very large workspaces. In the event of very large workspaces, it is often possible to break the process into stages which allow an iterative use of a smaller alloca-ed buffer if the function doesn't perform highly random access on the working data.

In the event of a large workspace and heavily randomised access to the working dataset (i.e., difficult to avoid one large allocation), then and only then, maybe start thinking about receiving an allocator to work with (default arg to a GC allocator), or, if it's a large complex process, break the process into sub-functions which the user can issue, and they will then inherit ownership and management of the workspace, which they can treat how they like.

These are some of my approaches. Almost all code I write must never allocate, except for functions that produce new allocations, where either receiving an output range or an allocator seems intuitive.

The most important question to ask is 'can this function possibly be called recursively?'. Many library functions are effectively leaf functions, and can't lead back into user code via any path, which means alloca is safe to use liberally. If there is potential for a recursive call, then I guess you need to start preferring receiving allocators or output ranges, unless the alloca is in the order of a normal function's stack usage (<1kb-ish?).
|
December 05, 2013 Re: Use case: eliminate hidden allocations in buildPath | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin
| On 5 December 2013 10:13, Michel Fortin <michel.fortin@michelf.ca> wrote:

> On 2013-12-04 23:14:48 +0000, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said:
>
>> Walter and I were talking about eliminating the surreptitious allocations in buildPath:
>>
>> http://dlang.org/phobos/std_path.html#.buildPath
>>
>> We'd need to keep the existing version working, so we're looking at adding one or more new overloads. We're looking at giving the user the option to control any needed memory allocation (or even arrange things such that there's no memory allocated at all).
>>
>> It's a generous design space, so although we have a couple of ideas let's hear others first.
>
> Allow an allocator as the first argument. Then pass an allocator that uses preallocated memory (or any other strategy that does not really need to allocate). While technically the allocator still "allocates" memory, since you control the allocator you can redefine "allocate" to not allocate.

Allocator as the first argument? This is so you can use UFCS on the allocator to make the call?

I think it maybe makes sense for a function that takes an output range to receive it as the first argument for this reason:

    outputRange.myFunction(); // I'm not sure if I like this, not sure it's communicating the right process... but maybe people find it convenient?

An overload that receives an allocator though, I'd probably make the allocator the last argument; this way it can have a default arg that is the default GC allocator, thus eliminating the other (original?) overload of the function which receives neither an output range nor an allocator (using the GC intrinsically).

> Here's a funny thought: allow plain arrays to be *typed* allocators through UFCS, just like arrays are ranges. If you have an array of chars, then "allocating" from it will simply return a slice and "bump the pointer" by becoming the remaining unused slice.
>
> The big problem with buildPath is that it won't work with overloading because your allocator has the same type as the other parameters. You'll need to wrap it in some kind of allocator shell. :-/

I'm not sure I like this idea. I think I'd prefer to see 2 overloads, one that receives an output range, and one that receives an allocator (which may default to a GC allocator?).
|
December 05, 2013 Re: Use case: eliminate hidden allocations in buildPath | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manu | On 2013-12-05 02:24:02 +0000, Manu <turkeyman@gmail.com> said:

> Allocator as the first argument? This is so you can use UFCS on the
> allocator to make the call?

Haha, no. That's because buildPath's arguments are (const(C[])[] paths...) with a "..." at the end. Can we put an argument after the variadic argument? I didn't check, but I thought it was impossible... thus the allocator as the first argument.

--
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca
|
December 05, 2013 Re: Use case: eliminate hidden allocations in buildPath | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin
| On 5 December 2013 13:14, Michel Fortin <michel.fortin@michelf.ca> wrote:
> On 2013-12-05 02:24:02 +0000, Manu <turkeyman@gmail.com> said:
>
>> Allocator as the first argument? This is so you can use UFCS on the
>> allocator to make the call?
>
> Haha, no. That's because buildPath's arguments are (const(C[])[] paths...) with a "..." at the end. Can we put an argument after the variadic argument? I didn't check, but I thought it was impossible... thus the allocator as the first argument.
Oh yeah! :P
I was thinking in a more general sense, as a principle to be applied across
phobos, not just for this one function.
|
December 05, 2013 Re: Use case: eliminate hidden allocations in buildPath | ||||
---|---|---|---|---|
| ||||
Posted in reply to Michel Fortin | On 2013-12-05 04:14, Michel Fortin wrote:

> Haha, no. That's because buildPath's arguments are (const(C[])[]
> paths...) with a "..." at the end. Can we put an argument after the
> variadic argument? I didn't check, but I thought it was impossible...
> thus the allocator as the first argument.

No, that's not possible. One could think it would be, as long as there is no conflict in the types.

--
/Jacob Carlborg
|
December 05, 2013 Re: Use case: eliminate hidden allocations in buildPath | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Wednesday, 4 December 2013 at 23:14:48 UTC, Andrei Alexandrescu wrote:

> Hello,
>
> Walter and I were talking about eliminating the surreptitious allocations in buildPath:
>
> http://dlang.org/phobos/std_path.html#.buildPath
>
> We'd need to keep the existing version working, so we're looking at adding one or more new overloads. We're looking at giving the user the option to control any needed memory allocation (or even arrange things such that there's no memory allocated at all).
>
> It's a generous design space, so although we have a couple of ideas let's hear others first.
>
> Thanks,
>
> Andrei

Use an output range. It's the generic D approach, and what we already do for the string functions such as std.string.translate:

http://dlang.org/phobos/std_string.html#.translate

(look down for the output range overloads).

Anything "allocator" related should be carried by the output range itself. The function itself should neither care nor know about any of that.
|
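To make the std.string.translate reference concrete, here is a small usage sketch of its output range overload, written from my reading of the std.string documentation. The buffer here is an appender, but any output range of characters should do.

```d
import std.array : appender;
import std.string : translate;

unittest
{
    dchar[dchar] table = ['e' : '5', 'o' : '7', '5' : 'q'];
    auto buffer = appender!(dchar[])();

    // Output range overload: the result goes into `buffer`, nothing is returned.
    translate("hello world", table, null, buffer);
    assert(buffer.data == "h5ll7 w7rld");
}
```

The function never learns where the characters end up; swapping the appender for a sink backed by preallocated memory would change the allocation behaviour without touching translate itself.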
Copyright © 1999-2021 by the D Language Foundation