UTF8 + SIMD = win (page 2) - D Programming Language Discussion Forum

July 31, 2012

Re: UTF8 + SIMD = win

Posted by Tobias Pankrath
in reply to Walter Bright

On Tuesday, 31 July 2012 at 19:28:03 UTC, Walter Bright wrote:
> On 7/31/2012 5:24 AM, Jakob Ovrum wrote:
>> On Tuesday, 31 July 2012 at 12:11:25 UTC, bearophile wrote:
>>> Bernard Helyer:
>>>
>>>> Where is UTF-32 actually used?
>>>
>>> I think all std.algorithm and std.range yield UTF-32 dchars, when you give
>>> them a string in input.
>>>
>>> Bye,
>>> bearophile
>>
>> In addition, foreach over a string with a dchar loop variable does implicit
>> UTF-8 decoding.
>>
>
> SIMD isn't going to speed things up at all for decoding one character. It is for transcoding a large array.

You could decode them in advance.
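
A minimal sketch of the two approaches mentioned above, assuming nothing beyond `std.conv.to`: on-demand decoding through a `dchar` loop variable, and decoding "in advance" by transcoding the whole string to UTF-32 once. The helper names `countCodePoints` and `decodeAll` are hypothetical and exist only for illustration.

```d
import std.conv : to;

// On-demand decoding: a dchar loop variable makes foreach decode UTF-8 implicitly.
size_t countCodePoints(string s)
{
    size_t n;
    foreach (dchar c; s)
        ++n;
    return n;
}

// Decoding "in advance": transcode the whole array to UTF-32 in one pass.
dstring decodeAll(string s)
{
    return s.to!dstring;
}

void main()
{
    string s = "héllo";               // 'é' occupies two UTF-8 code units
    assert(s.length == 6);            // length counts code units, not code points
    assert(countCodePoints(s) == 5);

    dstring d = decodeAll(s);
    assert(d.length == 5);            // one dchar per code point
    assert(d[1] == 'é');
}
```

The second form is the bulk transcoding case where SIMD could plausibly help, at the cost of allocating four bytes per code point up front.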

July 31, 2012

Re: UTF8 + SIMD = win

Posted by bearophile
in reply to Walter Bright

Walter Bright:

> SIMD isn't going to speed things up at all for decoding one character. It is for transcoding a large array.

Right.
Maybe you remember my two or three posts about vectorized laziness and related matters (which was later partly implemented in the half-eager map of std.parallelism). Introducing some vectorized laziness in std.algorithm when the iterable is a UTF-8 (or, rarely, UTF-16) string would allow the use of SIMD and would probably lead to higher performance.
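
To make the idea concrete, here is a rough sketch of what such a block-wise lazy range could look like. `ChunkedDecoder` is a hypothetical name, not anything in Phobos, and the refill loop below is plain scalar code using `std.utf.decode`; a SIMD transcoder would be substituted at that point.

```d
import std.algorithm : equal;
import std.utf : decode;

// Hypothetical input range: decodes up to N code points per refill, so the
// decoding work is batched while consumption stays lazy.
struct ChunkedDecoder(size_t N = 16)
{
    private string src;       // UTF-8 input not yet decoded
    private dchar[N] buf;     // most recently decoded block
    private size_t len, pos;  // valid entries in buf, and the read cursor

    this(string s)
    {
        src = s;
        refill();
    }

    private void refill()
    {
        len = 0;
        pos = 0;
        size_t i = 0;
        // Scalar stand-in: a SIMD transcoder would fill buf here instead.
        while (len < N && i < src.length)
            buf[len++] = decode(src, i);
        src = src[i .. $];
    }

    @property bool empty() const { return pos == len; }
    @property dchar front() const { return buf[pos]; }

    void popFront()
    {
        if (++pos == len)
            refill();
    }
}

void main()
{
    // Yields the same dchars as full decoding, but in blocks of at most 4.
    assert(equal(ChunkedDecoder!4("héllo wörld"), "héllo wörld"d));
}
```

At most N code points are decoded beyond what the consumer has asked for, so the wasted work on an early exit is bounded by the block size.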

Bye,
bearophile

July 31, 2012

Re: UTF8 + SIMD = win

Posted by jerro
in reply to Tobias Pankrath

On Tuesday, 31 July 2012 at 19:41:02 UTC, Tobias Pankrath wrote:
> On Tuesday, 31 July 2012 at 19:28:03 UTC, Walter Bright wrote:
>> On 7/31/2012 5:24 AM, Jakob Ovrum wrote:
>>> On Tuesday, 31 July 2012 at 12:11:25 UTC, bearophile wrote:
>>>> Bernard Helyer:
>>>>
>>>>> Where is UTF-32 actually used?
>>>>
>>>> I think all std.algorithm and std.range yield UTF-32 dchars, when you give
>>>> them a string in input.
>>>>
>>>> Bye,
>>>> bearophile
>>>
>>> In addition, foreach over a string with a dchar loop variable does implicit
>>> UTF-8 decoding.
>>>
>>
>> SIMD isn't going to speed things up at all for decoding one character. It is for transcoding a large array.
>
> You could decode them in advance.

The problem is you don't know how much you are going to need.
This would actually hurt performance in some cases.
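
A small sketch of the kind of case I mean, with a hypothetical helper `lazyDecodedUntilSpace`: a scan that stops at the first space decodes only a handful of code points, while decoding in advance transcodes (and allocates for) the entire input before the scan even starts.

```d
import std.array : replicate;
import std.conv : to;

// Returns how many code points a lazy scan decodes before it stops at the
// first space (the space itself is included in the count).
size_t lazyDecodedUntilSpace(string s)
{
    size_t n;
    foreach (dchar c; s)
    {
        ++n;
        if (c == ' ')
            break;
    }
    return n;
}

void main()
{
    string s = "héllo " ~ replicate("x", 1_000_000);

    // Lazy: only the first six code points are ever decoded.
    assert(lazyDecodedUntilSpace(s) == 6);

    // Eager: every code point is transcoded before any of them is inspected.
    dstring all = s.to!dstring;
    assert(all.length == 1_000_006);
}
```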
