Use case: eliminate hidden allocations in buildPath - D Programming Language Discussion Forum

December 04, 2013

Use case: eliminate hidden allocations in buildPath

Posted by Andrei Alexandrescu

Hello,


Walter and I were talking about eliminating the surreptitious allocations in buildPath:

http://dlang.org/phobos/std_path.html#.buildPath

We'd need to keep the existing version working, so we're looking at adding one or more new overloads. We're looking at giving the user the option to control any needed memory allocation (or even arrange things such that there's no memory allocated at all).

It's a generous design space, so although we have a couple of ideas let's hear others first.


Thanks,

Andrei

December 05, 2013

Re: Use case: eliminate hidden allocations in buildPath

Posted by Michel Fortin
in reply to Andrei Alexandrescu

On 2013-12-04 23:14:48 +0000, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said:

> Walter and I were talking about eliminating the surreptitious allocations in buildPath:
> 
> http://dlang.org/phobos/std_path.html#.buildPath
> 
> We'd need to keep the existing version working, so we're looking at adding one or more new overloads. We're looking at giving the user the option to control any needed memory allocation (or even arrange things such that there's no memory allocated at all).
> 
> It's a generous design space, so although we have a couple of ideas let's hear others first.

Allow an allocator as the first argument. Then pass an allocator that uses preallocated memory (or any other strategy that does not really need to allocate). While technically the allocator still "allocates" memory, since you control the allocator you can redefine "allocate" to not allocate.

Here's a funny thought: allow plain arrays to be *typed* allocators through UFCS, just like arrays are ranges. If you have an array of chars, then "allocating" from it will simply return a slice and "bump the pointer" by becoming the remaining unused slice. The big problem with buildPath is that it won't work with overloading because your allocator has the same type as the other parameters. You'll need to wrap it in some kind of allocator shell. :-/
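
A minimal sketch of the "array as bump allocator" idea, assuming a free function called through UFCS. The name `allocate` and its signature are hypothetical and not part of Phobos; the sketch only shows how a slice could hand out its front and shrink to the unused remainder.

```d
// Hypothetical helper (not a Phobos API): treat a slice as a bump allocator.
// Returns the first n elements and shrinks `pool` to the unused remainder.
// Returns null when the pool cannot satisfy the request.
T[] allocate(T)(ref T[] pool, size_t n)
{
    if (n > pool.length)
        return null;
    auto result = pool[0 .. n];
    pool = pool[n .. $];   // "bump the pointer"
    return result;
}

void main()
{
    char[64] storage;          // preallocated memory, here on the stack
    char[] pool = storage[];

    auto a = pool.allocate(5); // UFCS call on a plain array
    a[] = "hello";
    auto b = pool.allocate(3);

    assert(a == "hello");
    assert(b.length == 3);
    assert(pool.length == 56);          // 64 - 5 - 3 left over
    assert(pool.allocate(100) is null); // request too large
}
```

Because the pool has the same type as the path segments, an overload taking it directly would be ambiguous, which is the wrapping problem mentioned above.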

-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca

December 05, 2013

Re: Use case: eliminate hidden allocations in buildPath

Posted by H. S. Teoh
in reply to Andrei Alexandrescu

On Wed, Dec 04, 2013 at 03:14:48PM -0800, Andrei Alexandrescu wrote:
> Hello,
> 
> 
> Walter and I were talking about eliminating the surreptitious allocations in buildPath:
> 
> http://dlang.org/phobos/std_path.html#.buildPath
> 
> We'd need to keep the existing version working, so we're looking at adding one or more new overloads. We're looking at giving the user the option to control any needed memory allocation (or even arrange things such that there's no memory allocated at all).
> 
> It's a generous design space, so although we have a couple of ideas let's hear others first.
[...]

What about a new overload that takes an output range instead of returning a string? The caller can then create the appropriate output range that does whatever is desired (malloc a buffer, use a static fixed-size buffer, etc.), and buildPath doesn't have to care about the implementation details. So, something like:

	void buildPath(OutputRange,Range)(OutputRange output,
					  Range segments)
		if (isOutputRange!(OutputRange, ElementType!Range))
	{ ... }

//

On a related note, I wonder if it's profitable to extend the concept of an output range to include void delegates that take range elements as arguments. The current toString overload looked for by std.format, for example, has this signature:

	void toString(scope void delegate(const(char)[] s) dg);

The delegate here essentially behaves like an output range (it takes snippets of string data and presumably appends them to some buffer somewhere). So why not extend the concept of output ranges to include such delegates, so that we can write:

	void toString(R)(R r) if (isOutputRange!(R, const(char)[]));

Then any output range can be used as the target of a string conversion, e.g., writing straight to a file or network socket without the unnecessary buffering that returning a string implies.

"But!" I hear you cry, "you can't call a delegate as dg.put(...), which you have to if it is to conform to the output range interface!" To that I'd say:

	module std.range;
	...
	void put(R,T)(R dg, T data)
		if (is(R == void delegate(T)) ||
		    is(R == void function(T)))
	{
		dg(data);
	}

If I'm not mistaken, this should make isOutputRange true for functions and delegates that take the appropriate argument types.

This then allows us to use buildPath with no hidden allocations, for example:

	char[1024] buffer;

	void appendToBuf(const(char)[] data) { ... }

	void main() {
		buildPath(&appendToBuf, "usr", "local", "share", "filename");
	}
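
To make the output-range idea concrete, here is a hedged sketch that writes into a caller-owned fixed buffer. `FixedSink` and `buildPathTo` are hypothetical names, not Phobos symbols, and the joining logic is deliberately simplified: it only inserts '/' between segments and ignores rooted segments and platform separators, which the real buildPath handles.

```d
import std.range.primitives : put;

// Hypothetical sink: appends into storage owned by the caller.
struct FixedSink
{
    char[] buf;   // caller-owned storage
    size_t used;  // number of chars written so far

    void put(const(char)[] s)
    {
        assert(used + s.length <= buf.length, "buffer too small");
        buf[used .. used + s.length] = s[];
        used += s.length;
    }

    const(char)[] data() const { return buf[0 .. used]; }
}

// Hypothetical, simplified stand-in for an output-range overload of buildPath.
void buildPathTo(Sink)(ref Sink sink, const(char)[][] segments...)
{
    foreach (i, seg; segments)
    {
        if (i > 0)
            put(sink, "/");
        put(sink, seg);
    }
}

void main()
{
    char[1024] storage;
    auto sink = FixedSink(storage[]);
    buildPathTo(sink, "usr", "local", "share", "filename");
    assert(sink.data == "usr/local/share/filename");
}
```

The function never decides where the characters go; swapping `FixedSink` for another sink changes the allocation strategy without touching `buildPathTo`.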

T

-- 
Three out of two people have difficulties with fractions. -- Dirk Eddelbuettel

December 05, 2013

Re: Use case: eliminate hidden allocations in buildPath

Posted by Jonathan M Davis

On Wednesday, December 04, 2013 16:38:54 H. S. Teoh wrote:
> What about a new overload that takes an output range instead of returning a string?

I would have thought that that would be the obvious way to solve the problem. In general, I think that when a function allocates any kind of string or array which it returns, we should overload it with a version that takes an output range. In many cases, it would probably even make sense to make the default just use std.array.appender and make it so that only the output range overload is really doing any work.
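
A rough sketch of that layering, assuming a simplified join with '/' only. `joinPathTo` and `joinPath` are hypothetical names chosen to avoid implying anything about the actual Phobos implementation; the point is only that the string-returning version can be a thin wrapper around the output-range version.

```d
import std.array : appender;
import std.range.primitives : put;

// Hypothetical worker: all of the logic lives in the output-range version.
void joinPathTo(Sink)(ref Sink sink, const(char)[][] segments...)
{
    foreach (i, seg; segments)
    {
        if (i > 0)
            put(sink, '/');
        put(sink, seg);
    }
}

// Hypothetical convenience overload: allocates through std.array.appender
// and forwards everything else to the worker above.
string joinPath(const(char)[][] segments...)
{
    auto app = appender!string();
    joinPathTo(app, segments);
    return app.data;
}

void main()
{
    assert(joinPath("usr", "local", "bin") == "usr/local/bin");
}
```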

What to do in cases of allocation where we're not dealing with arrays being returned is a tougher question, and I think that that starts going into custom allocator territory, but for arrays, output ranges are definitely the way to go IMHO.

- Jonathan M Davis

December 05, 2013

Re: Use case: eliminate hidden allocations in buildPath

Posted by Manu

On 5 December 2013 11:02, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Wednesday, December 04, 2013 16:38:54 H. S. Teoh wrote:
> > What about a new overload that takes an output range instead of returning a string?
>
> I would have thought that that would be the obvious way to solve the problem. In general, I think that when a function allocates any kind of string or array which it returns, we should overload it with a version that takes an output range. In many cases, it would probably even make sense to make the default just use std.array.appender and make it so that only the output range overload is really doing any work.

This seems the intuitive approach to me.

> What to do in cases of allocation where we're not dealing with arrays being returned is a tougher question, and I think that that starts going into custom allocator territory, but for arrays, output ranges are definitely the way to go IMHO.

You mean for internal working buffers?
It gets tricky. In my experience, many functions can use alloca, with a fallback in the event of very large workspaces.
In the event of very large workspaces, it is often possible to break the process into stages which allow an iterative use of a smaller alloca-ed buffer if the function doesn't perform highly random access on the working data.
In the event of a large workspace and heavily randomised access to the working dataset (i.e., difficult to avoid one large allocation), then and only then, maybe start thinking about receiving an allocator to work with (default arg to a GC allocator), or, if it's a large complex process, break the process into sub-functions which the user can issue, and they will then inherit ownership and management of the workspace, which they can treat how they like.

These are some of my approaches. Almost all code I write must never allocate, except for functions that produce new allocations, where either receiving an output range or an allocator seems intuitive.
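
A small sketch of the "stack workspace with a fallback" pattern described above. It is an illustration only: it swaps alloca for a fixed-size stack array, falls back to malloc for large inputs, and the function `isPalindrome` and the 256-byte limit are made-up examples rather than anything from Phobos.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical leaf function that needs a scratch buffer as large as its input.
// Small inputs use stack storage; large inputs fall back to malloc/free.
bool isPalindrome(const(char)[] s)
{
    enum stackLimit = 256;             // assumed threshold for illustration
    char[stackLimit] stackBuf = void;  // uninitialized stack workspace
    char[] work;

    const onHeap = s.length > stackLimit;
    if (onHeap)
    {
        auto p = cast(char*) malloc(s.length);
        if (p is null)
            assert(0, "out of memory");
        work = p[0 .. s.length];
    }
    else
    {
        work = stackBuf[0 .. s.length];
    }
    scope (exit) if (onHeap) free(work.ptr);

    // Copy the input without spaces into the workspace.
    size_t n;
    foreach (c; s)
        if (c != ' ')
            work[n++] = c;

    // Compare the compacted copy against its mirror image.
    foreach (i; 0 .. n / 2)
        if (work[i] != work[n - 1 - i])
            return false;
    return true;
}

void main()
{
    assert(isPalindrome("taco cat"));
    assert(!isPalindrome("hello"));
}
```

The GC is never involved on either path, and the common small case touches no allocator at all.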

The most important question to ask is 'can this function possibly be called recursively?'. Many library functions are effectively leaf functions, and can't lead back into user code via any path, which means alloca is safe to use liberally. If there is potential for a recursive call, then I guess you need to start preferring receiving allocators or output ranges, unless the alloca is in the order of a normal function's stack usage (<1kb-ish?).

December 05, 2013

Re: Use case: eliminate hidden allocations in buildPath

Posted by Manu
in reply to Michel Fortin

On 5 December 2013 10:13, Michel Fortin <michel.fortin@michelf.ca> wrote:

> On 2013-12-04 23:14:48 +0000, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> said:
>
>> Walter and I were talking about eliminating the surreptitious allocations in buildPath:
>>
>> http://dlang.org/phobos/std_path.html#.buildPath
>>
>> We'd need to keep the existing version working, so we're looking at adding one or more new overloads. We're looking at giving the user the option to control any needed memory allocation (or even arrange things such that there's no memory allocated at all).
>>
>> It's a generous design space, so although we have a couple of ideas let's hear others first.
>
> Allow an allocator as the first argument. Then pass an allocator that uses preallocated memory (or any other strategy that does not really need to allocate). While technically the allocator still "allocates" memory, since you control the allocator you can redefine "allocate" to not allocate.

Allocator as the first argument? This is so you can use UFCS on the allocator to make the call?
I think it maybe makes sense for a function that takes an output range to receive it as the first argument for this reason:
 outputRange.myFunction(); // I'm not sure if I like this, not sure it's communicating the right process... but maybe people find it convenient?

An overload that receives an allocator though, I'd probably make the allocator the last argument, this way it can have a default arg that is the default GC allocator, thus eliminating the other (original?) overload of the function which receives neither an output range nor an allocator (using the GC intrinsically).

> Here's a funny thought: allow plain arrays to be *typed* allocators through UFCS, just like arrays are ranges. If you have an array of chars, then "allocating" from it will simply return a slice and "bump the pointer" by becoming the remaining unused slice. The big problem with buildPath is that it won't work with overloading because your allocator has the same type as the other parameters. You'll need to wrap it in some kind of allocator shell. :-/

I'm not sure I like this idea. I think I'd prefer to see 2 overloads, one that receives an output range, and one that receives an allocator (which may default to a GC allocator?).

December 05, 2013

Re: Use case: eliminate hidden allocations in buildPath

Posted by Michel Fortin
in reply to Manu

On 2013-12-05 02:24:02 +0000, Manu <turkeyman@gmail.com> said:

> Allocator as the first argument? This is so you can use UFCS on the
> allocator to make the call?

Haha, no. That's because buildPath's arguments are (const(C[])[] paths...) with a "..." at the end. Can we put an argument after the variadic argument? I didn't check, but I thought it was impossible... thus the allocator as the first argument.

-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca

December 05, 2013

Re: Use case: eliminate hidden allocations in buildPath

Posted by Manu
in reply to Michel Fortin

On 5 December 2013 13:14, Michel Fortin <michel.fortin@michelf.ca> wrote:

> On 2013-12-05 02:24:02 +0000, Manu <turkeyman@gmail.com> said:
>
>> Allocator as the first argument? This is so you can use UFCS on the allocator to make the call?
>
> Haha, no. That's because buildPath's arguments are (const(C[])[] paths...) with a "..." at the end. Can we put an argument after the variadic argument? I didn't check, but I thought it was impossible... thus the allocator as the first argument.


Oh yeah! :P
I was thinking in a more general sense, as a principle to be applied across phobos, not just for this one function.

December 05, 2013

Re: Use case: eliminate hidden allocations in buildPath

Posted by Jacob Carlborg
in reply to Michel Fortin

On 2013-12-05 04:14, Michel Fortin wrote:

> Haha, no. That's because buildPath's arguments are (const(C[])[] paths...) with a "..." at the end. Can we put an argument after the variadic argument? I didn't check, but I thought it was impossible... thus the allocator as the first argument.

No, that's not possible. One could think it would be, as long as there is no conflict in the types.

-- 
/Jacob Carlborg

December 05, 2013

Re: Use case: eliminate hidden allocations in buildPath

Posted by monarch_dodra
in reply to Andrei Alexandrescu

On Wednesday, 4 December 2013 at 23:14:48 UTC, Andrei Alexandrescu wrote:
> Hello,
>
>
> Walter and I were talking about eliminating the surreptitious allocations in buildPath:
>
> http://dlang.org/phobos/std_path.html#.buildPath
>
> We'd need to keep the existing version working, so we're looking at adding one or more new overloads. We're looking at giving the user the option to control any needed memory allocation (or even arrange things such that there's no memory allocated at all).
>
> It's a generous design space, so although we have a couple of ideas let's hear others first.
>
>
> Thanks,
>
> Andrei

Use an output range. It's the generic D approach, and what we already do for the string functions such as std.string.translate:
http://dlang.org/phobos/std_string.html#.translate
(look down for the output range overloads).

Anything "allocator" related should be carried by the output range itself. The function itself should not care nor know about any of that.
