Thread overview
[phobos] Optimizing std.conv.parse for float conversions
Feb 17, 2010
David Simcha
Mar 10, 2010
Sean Kelly
February 16, 2010
I looked into creating D implementations of functions equivalent to strtold, etc., to address bug 3758 (http://d.puremagic.com/issues/show_bug.cgi?id=3758).  It seems that converting strings to floats is actually pretty hard to do right, i.e. without any loss of precision.  However, the inefficiency of requiring several heap allocations per conversion hurts when reading in large files.  From some measurements I did, the heap allocation and associated garbage collection seem to be a bigger source of overhead than the string copying.  Is everyone OK with me optimizing std.conv.parse to place the zero-terminated copy of the string on the stack if it's reasonably small, as a temporary fix?
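
To make the proposal concrete, here is a rough sketch of the stack-buffer idea, written in C because strtold is a C library routine. It is not the std.conv.parse code: the helper name parse_slice, the 128-byte threshold, and the return convention are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed cut-off for "reasonably small"; the real threshold is a tuning choice. */
enum { STACK_BUF_SIZE = 128 };

/* Hypothetical helper: parse a floating-point value from a slice that is
 * NOT zero-terminated. Returns 0 on success and stores the value in *out
 * and the number of characters consumed in *consumed. Returns -1 if no
 * conversion could be performed or if the heap fallback fails. */
static int parse_slice(const char *s, size_t len,
                       long double *out, size_t *consumed)
{
    assert(s != NULL && out != NULL && consumed != NULL);

    char stack_buf[STACK_BUF_SIZE];
    char *buf = stack_buf;

    /* Fall back to the heap only when the slice plus its terminator
     * does not fit in the stack buffer. */
    if (len >= STACK_BUF_SIZE) {
        buf = malloc(len + 1);
        if (buf == NULL)
            return -1;
    }

    memcpy(buf, s, len);
    buf[len] = '\0';            /* strtold requires a zero-terminated string */

    char *end;
    long double value = strtold(buf, &end);
    size_t used = (size_t)(end - buf);   /* computed before buf may be freed */

    if (buf != stack_buf)
        free(buf);

    if (used == 0)
        return -1;              /* strtold consumed nothing: not a number */

    *out = value;
    *consumed = used;
    return 0;
}
```

Calling parse_slice("3.25abc", 4, &v, &n) only looks at the first four characters, so short inputs never touch the heap and the caller's string does not need to be zero-terminated.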
February 16, 2010
Heh, incidentally I was looking over the same issue recently.

I think we need a much more radical approach. We need to be able to parse any stream of characters (that means lookahead == 1) and stop as soon as the meaningful input has ended.
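
As a rough illustration of what "lookahead == 1" means in practice, here is a small C sketch. It is not Walter's dmc code and not a precision-safe conversion: it handles only an optional sign, digits, and an optional fraction (no exponent, infinity, or NaN), and the CharStream type and the function names are hypothetical. The point is only that the parser touches the input through "peek at the next character" and "advance by one", and leaves the stream positioned at the first character it does not need.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>   /* EOF */

/* Hypothetical in-memory character source. Any source that can offer
 * these two operations (a file, a socket wrapper, ...) would do. */
typedef struct {
    const char *p;
    const char *end;
} CharStream;

/* Look at the next character without consuming it; EOF when exhausted. */
static int cs_peek(const CharStream *s)
{
    return s->p < s->end ? (unsigned char)*s->p : EOF;
}

/* Consume exactly one character. */
static void cs_advance(CharStream *s)
{
    if (s->p < s->end)
        s->p++;
}

/* Parse [+-]digits[.digits] using only cs_peek and cs_advance.
 * Returns false if no digit was found. With a single character of
 * lookahead, a sign or '.' consumed before the failure cannot be put back. */
static bool parse_simple_double(CharStream *s, double *out)
{
    assert(s != NULL && out != NULL);

    bool negative = false;
    int c = cs_peek(s);
    if (c == '+' || c == '-') {
        negative = (c == '-');
        cs_advance(s);
    }

    double mantissa = 0.0;   /* all digits read so far, as one integer value */
    double scale = 1.0;      /* 10^(number of fractional digits) */
    bool any_digits = false;

    while (isdigit(c = cs_peek(s))) {
        mantissa = mantissa * 10.0 + (c - '0');
        any_digits = true;
        cs_advance(s);
    }

    if (cs_peek(s) == '.') {
        cs_advance(s);
        while (isdigit(c = cs_peek(s))) {
            mantissa = mantissa * 10.0 + (c - '0');
            scale *= 10.0;
            any_digits = true;
            cs_advance(s);
        }
    }

    if (!any_digits)
        return false;

    /* Naive scaling: fine for illustration, not correctly rounded in general. */
    double value = mantissa / scale;
    *out = negative ? -value : value;
    return true;
}
```

Given the input "-12.5,7", the sketch yields -12.5 and stops with the comma still unread, so whatever parses the next field can pick up from there without any copying or zero-termination.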

Walter wrote a while back some solid floating point parsing routines for dmc. I think we should start with those and adapt them to work with input streams instead of strings.

This applies to std.format too, which is in dire need of an overhaul to make it work with all streams.

Andrei

David Simcha wrote:
> I looked into creating D implementations of equivalent functions to
> strtold, etc. to address bug 3758
> (http://d.puremagic.com/issues/show_bug.cgi?id=3758).  It seems like
> converting strings to floats is actually pretty hard to do right, i.e.
> without any loss of precision.  However, the inefficiency of requiring
> several heap allocations per conversion hurts when reading in large
> files.  From some measurements I did, it seems like the heap allocation
> and associated garbage collection is a bigger source of overhead than
> the string copying.  Is everyone ok with me optimizing std.conv.parse to
> place the zero-terminated version of the string on the stack if it's
> reasonably small as a temporary fix?
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
March 10, 2010
Oops, I guess this means readf, unformat, etc. are incomplete?

On Feb 16, 2010, at 7:12 PM, Andrei Alexandrescu wrote:

> Heh, incidentally I was looking over the same issue recently.
> 
> I think we need a much more radical approach. We need to be able to parse any stream of characters (that means lookahead == 1) and stop as soon as the meaningful input has ended.
> 
> Walter wrote a while back some solid floating point parsing routines for dmc. I think we should start with those and adapt them to work with input streams instead of strings.
> 
> This applies to std.format too, which is in dire need of an overhaul to make it work with all streams.
> 
> Andrei

March 10, 2010
Yes, they need significant work.

Andrei

On 03/10/2010 12:12 PM, Sean Kelly wrote:
> Oops, I guess this means readf, unformat, etc. are incomplete?