June 23, 2018
On Friday, 22 June 2018 at 21:37:07 UTC, Walter Bright wrote:
>   text
>   begat textImpl
>   begat to
>   begat toImpl
>   begat toStr
>   begat formatValue
>   begat formatValueImpl
>   begat put
>   begat doPut
>   begat Appender.put
>   begat Appender.put

Hmm, this looks suboptimal for more than one reason. Wouldn't `text` benefit from using an appender in its implementation, rather than letting every invocation of `to` have its own? I.e. there ought to be one Appender per `text` invocation, rather than per one `text` argument.

textImpl seems to already have a branch for >1 arguments, but it still invokes to!S(arg) for them.

June 24, 2018
On Friday, 22 June 2018 at 22:36:23 UTC, Adam D. Ruppe wrote:
> I can't for at last two weeks, I'm about to fly tomorrow... and I'm getting married in August too so my life is kinda hectic.

Congrats!
June 24, 2018
On Saturday, 23 June 2018 at 20:17:01 UTC, Vladimir Panteleev wrote:
> On Friday, 22 June 2018 at 21:37:07 UTC, Walter Bright wrote:
>>   text
>>   begat textImpl
>>   begat to
>>   begat toImpl
>>   begat toStr
>>   begat formatValue
>>   begat formatValueImpl
>>   begat put
>>   begat doPut
>>   begat Appender.put
>>   begat Appender.put
>
> Hmm, this looks suboptimal for more than one reason. Wouldn't `text` benefit from using an appender in its implementation, rather than letting every invocation of `to` have its own? I.e. there ought to be one Appender per `text` invocation, rather than per one `text` argument.
>
> textImpl seems to already have a branch for >1 arguments, but it still invokes to!S(arg) for them.

`text` has an appender [0] but it calls `to!` for every non-integer, so it has effectively O(n+1) GC allocations. We should be able to get this down to O(1).

The initially obvious way to do this would be allowing `to!` to take an `OutputRange`... but this would not improve the OP issue Walter's had with trying to trace template expansions. (in fact it would make it worse and add a lot of template bloat)

So my idea is to have a lower level API that is `@safe @nogc pure nothrow`, takes already allocated memory and writes to it like:

---
// Reterning W[] here with a slice of the original buffer so you can use it in tests easily.
@safe @nogc pure nothrow W[] writeCharsTo(T, W)(T input, scope W[] buffer);
---

This could then be used in both `to!` and `text`, or directly if the user wanted a @nogc option (ie json generator, or use with std.io etc).

I've done some benchmarks with integers and `to!` becomes almost twice as fast with `dmd -O -release -inline` and still ~30% faster with `ldc2 -O3 -release` (as its zero copy unlike the current `toChars` InputRange which copies when used in `to!`)

Early results of a `text` benchmark show even better speed improvements.

I will clean up the benchmark code I have and post it for you to see tonight.

In principle would this type of addition be accepted into phobos?

Thanks,
David

[0] https://github.com/dlang/phobos/blob/master/std/conv.d#L4243
June 25, 2018
On Friday, 22 June 2018 at 21:37:07 UTC, Walter Bright wrote:
>
> If someone wants to figure out how to build an uncomplicated, straightforward, efficient implementation of text() that we can be proud of, that would be a nice contribution.

Here [0] is part of a benchmark for implementing `to!string` with the lower level api design I was talking about in my other post [1].

On my machine i get these results:

$ dmd -dip1000
to!(string):        17 secs, 88 ms, 322 μs, and 1 hnsec
newTo!(string):     8 secs, 785 ms, 325 μs, and 1 hnsec
@nogc writeCharsTo: 4 secs, 404 ms, 21 μs, and 7 hnsecs

$ dmd -dip1000 -O -release -inline
to!(string):        9 secs, 396 ms, 688 μs, and 1 hnsec
newTo!(string):     4 secs, 640 ms, and 698 μs
@nogc writeCharsTo: 1 sec, 288 ms, 176 μs, and 9 hnsecs

$ ldc2 -dip1000 -O3 -release
to!(string):        4 secs, 489 ms, 52 μs, and 3 hnsecs
newTo!(string):     3 secs, 781 ms, 990 μs, and 5 hnsecs
@nogc writeCharsTo: 752 ms, 989 μs, and 5 hnsecs

I will upload the benchmark for `text` tonight.

Also other points to make are:
   Using this low level api will work in -betterC code.
   It might even fit into utiliD [2].

[0] https://run.dlang.io/gist/97639a90931ad8e6e3a1ea564c46d710?args=-O%20-release%20-inline%20-dip1000
[1] https://forum.dlang.org/post/erxtrmnqbcbtgokoutbc@forum.dlang.org
[2] https://github.com/JinShil/utiliD

June 26, 2018
On Monday, 25 June 2018 at 01:03:03 UTC, David Bennett wrote:
> On Friday, 22 June 2018 at 21:37:07 UTC, Walter Bright wrote:
>>
>> If someone wants to figure out how to build an uncomplicated, straightforward, efficient implementation of text() that we can be proud of, that would be a nice contribution.
>
>
> I will upload the benchmark for `text` tonight.
>


I have made a more complete benchmark [0] so we can see the speed of each basic type.

Here's the benchmark results for dmd:

|-----------------------------------------------------------------------|
| <convert>       .to!(string) |   .newTo!(string) | .writeCharsTo(buf) |
|-----------------------------------------------------------------------|
| ubyte    :           1856 ms |           1037 ms |             201 ms |
| ushort   :           2336 ms |           1350 ms |             382 ms |
| uint     :           3415 ms |           1538 ms |             669 ms |
| ulong    :           6171 ms |           2275 ms |            1398 ms |
| byte     :           1908 ms |           1044 ms |             227 ms |
| short    :           2234 ms |           1364 ms |             406 ms |
| int      :           3560 ms |           1603 ms |             629 ms |
| long     :           6276 ms |           2219 ms |            1369 ms |
| char     :           3013 ms |           1061 ms |             181 ms |
| wchar    :           3285 ms |           1043 ms |             242 ms |
| dchar    :           3211 ms |           1048 ms |             277 ms |
|-----------------------------------------------------------------------|

|-----------------------------------------------------------------------|
| <concat>           text(...) |      newText(...) | .writeCharsTo(buf) |
|-----------------------------------------------------------------------|
| ubyte    :           2643 ms |            937 ms |             254 ms |
| ushort   :           3237 ms |           1173 ms |             530 ms |
| uint     :           5144 ms |           1448 ms |             738 ms |
| ulong    :           9854 ms |           2126 ms |            1427 ms |
| byte     :           2805 ms |            951 ms |             310 ms |
| short    :           3272 ms |           1268 ms |             583 ms |
| int      :           5548 ms |           1456 ms |             819 ms |
| long     :          10303 ms |           2182 ms |            1488 ms |
| char     :           3853 ms |            891 ms |             258 ms |
| wchar    :           4151 ms |            940 ms |             263 ms |
| dchar    :           3945 ms |            949 ms |             352 ms |
|-----------------------------------------------------------------------|

And ldc2:

|-----------------------------------------------------------------------|
| <convert>       .to!(string) |   .newTo!(string) | .writeCharsTo(buf) |
|-----------------------------------------------------------------------|
| ubyte    :           1003 ms |           1011 ms |              88 ms |
| ushort   :            959 ms |            772 ms |             163 ms |
| uint     :           1519 ms |           1237 ms |             382 ms |
| ulong    :           2030 ms |           1484 ms |             735 ms |
| byte     :            899 ms |            886 ms |             121 ms |
| short    :           1115 ms |           1061 ms |             189 ms |
| int      :           1267 ms |           1041 ms |             390 ms |
| long     :           2030 ms |           1596 ms |             852 ms |
| char     :           1958 ms |            755 ms |              28 ms |
| wchar    :           1945 ms |            836 ms |             129 ms |
| dchar    :           2141 ms |            850 ms |             142 ms |
|-----------------------------------------------------------------------|

|-----------------------------------------------------------------------|
| <concat>           text(...) |      newText(...) | .writeCharsTo(buf) |
|-----------------------------------------------------------------------|
| ubyte    :           1248 ms |            448 ms |             164 ms |
| ushort   :           1676 ms |            612 ms |             290 ms |
| uint     :           1491 ms |            732 ms |             445 ms |
| ulong    :           2343 ms |           1074 ms |            1095 ms |
| byte     :           1288 ms |            506 ms |             143 ms |
| short    :           1494 ms |            656 ms |             311 ms |
| int      :           1315 ms |            633 ms |             388 ms |
| long     :           2540 ms |           1363 ms |            1044 ms |
| char     :           2528 ms |            411 ms |             189 ms |
| wchar    :           2600 ms |            506 ms |             149 ms |
| dchar    :           2697 ms |            452 ms |             167 ms |
|-----------------------------------------------------------------------|

So I think there are quite a lot of savings to be had... actually I notice that the char version of text(...) is really slow when it should be quite simple... oh its being converted using to! to a GC array and the copied in... I'll submit a PR... [1]

That's better (DMD Master + my patch):

|------------------------------|
| <concat>           text(...) |
|------------------------------|
| char     :            753 ms |
| wchar    :            928 ms |
| dchar    :           1028 ms |
|------------------------------|

Thanks,
David

[0] https://github.com/complistic-gaff/newconv
[1] https://github.com/dlang/phobos/pull/6615
1 2
Next ›   Last »