Round VII. COW in the city, myths. (page 5)

July 21, 2005

Re: Round VII. COW in the city, myths.

Posted by Andrew Fedoniouk
in reply to Walter

"Walter" <newshound@digitalmars.com> wrote in message news:dbnmh1$1qu3$1@digitaldaemon.com...
>
> "Andrew Fedoniouk" <news@terrainformatica.com> wrote in message news:dbeli8$16e6$1@digitaldaemon.com...
>> Conclusion:
>> In given pretty realistic example
>
> It's an interesting benchmark, but I don't agree it is realistic. The
> strings presented by the test program to the tolower_cow() function will
> need to .dup the string essentially 100% of the time (as the string length
> is evenly distributed between 1 and 2048, and each character will be
> uppercase 50% of the time). Duping the string with a test 100% of the time
> will, inevitably, be slower than simply duping 100% of the time and
> skipping
> the test. COW is an effective optimization only when the data doesn't need
> to change a significant percentage of the time.

Agreed.
This is just a demonstration of the fact that
COW cannot be used as a "silver bullet".
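
A minimal C++ sketch of the trade-off being discussed, assuming the string is shared through std::shared_ptr. The names SharedStr and tolower_cow are illustrative stand-ins and this is not the Phobos code. It scans first and copies only when an uppercase character is found, so all-lowercase input costs one scan and no allocation, while input that almost always needs changing pays for the scan on top of the copy.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical alias: an immutable string that several owners may share.
using SharedStr = std::shared_ptr<const std::string>;

// Copy-on-write lowercase: returns the same shared string when nothing
// needs to change, and a fresh lowered copy otherwise.
SharedStr tolower_cow(const SharedStr& s) {
    for (std::size_t i = 0; i < s->size(); ++i) {
        if (std::isupper(static_cast<unsigned char>((*s)[i]))) {
            // First uppercase character found: duplicate once, then
            // lower everything from this position onward.
            auto copy = std::make_shared<std::string>(*s);
            for (std::size_t j = i; j < copy->size(); ++j) {
                (*copy)[j] = static_cast<char>(
                    std::tolower(static_cast<unsigned char>((*copy)[j])));
            }
            return copy;
        }
    }
    return s;  // already lowercase: share the original, no allocation
}

// Usage: the all-lowercase input is returned as-is, the mixed one is copied.
inline void tolower_cow_example() {
    SharedStr lower = std::make_shared<const std::string>("alice");
    SharedStr mixed = std::make_shared<const std::string>("Alice");
    assert(tolower_cow(lower) == lower);     // same object, nothing copied
    assert(*tolower_cow(mixed) == "alice");  // new object
    assert(*mixed == "Alice");               // original untouched
}
```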

>
> Real text tends to be 100% lower case most of the time (just look at this
> post, for example). So, I suggest feeding the program some real text, such
> as splitting into words the complete text of "Alice in Wonderland" which
> is
> linked to from www.digitalmars.com/d/cppstrings.html. It'll be fun to see
> what the relative timings are for that.
>
> Of course, as always, when tuning for speed one must tune to the actual
> data
> expected. std.string.tolower(), being a library function, doesn't have the
> luxury of knowing this in advance, so we just make the best guess we can
> and
> go from there.
>
>

In those C RTLs that provide strlwr, it is implemented as an in-place function.

In C++ libraries it is usually implemented as:

void strlwr(string &s);
void strupr(string &s);
string strlwr(const string &s);
string strupr(const string &s);

And for me this makes much more sense than the Phobos "dcowed" version: it gives me more choices for optimisation.
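
A rough sketch of how the two strlwr overloads could look with std::string. The namespace demo is an assumption added here so the names do not clash with a C RTL strlwr; the bodies are one possible implementation, not taken from any particular library.

```cpp
#include <cassert>
#include <cctype>
#include <string>

namespace demo {  // hypothetical namespace, not from any real library

// In-place version: rewrites the caller's buffer, no allocation.
void strlwr(std::string& s) {
    for (char& c : s) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
}

// Copying version: leaves the argument untouched and returns a lowered copy.
std::string strlwr(const std::string& s) {
    std::string result(s);
    strlwr(result);  // non-const lvalue, so the in-place overload is chosen
    return result;
}

}  // namespace demo

// Usage: the caller decides whether to pay for a copy.
inline void strlwr_example() {
    std::string mine = "Hello";
    demo::strlwr(mine);  // in-place
    assert(mine == "hello");

    const std::string shared = "WORLD";
    std::string lowered = demo::strlwr(shared);  // copy
    assert(lowered == "world");
    assert(shared == "WORLD");
}
```

Overload resolution picks the in-place version for a non-const lvalue and the copying version for a const string, which is the choice being argued for here.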

In article <dbp0i4$2rph$1@digitaldaemon.com>, Andrew Fedoniouk says...
>
>In those C RTLs which have strlwr functions it is implemented as in-place function.
>
>In C++ libraries it usually implemented as:
>
>void strlwr(string &s);
>void strupr(string &s);
>string strlwr(const string &s);
>string strupr(const string &s);

For C++, why not just use:

std::transform(str.begin(),str.end(),str.begin(),tolower);

Is there really any reason to wrap this in a function?

Sean
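
A self-contained sketch of the std::transform approach. One assumption worth flagging: passing tolower directly works with the C function, but the name can become ambiguous once the <locale> overload of std::tolower is in scope, and calling it with a negative char value is undefined behaviour, so this version goes through a lambda that takes unsigned char. The name lower_in_place is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Lowercase a std::string in place using std::transform.
void lower_in_place(std::string& str) {
    std::transform(str.begin(), str.end(), str.begin(),
                   // Taking unsigned char keeps the argument to std::tolower
                   // in the range the C library requires.
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
}

// Usage: non-letters pass through unchanged.
inline void lower_in_place_example() {
    std::string s = "COW in the City, 2005";
    lower_in_place(s);
    assert(s == "cow in the city, 2005");
}
```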

"Sean Kelly" <sean@f4.ca> wrote in message news:dbp4ou$2usq$1@digitaldaemon.com...
> In article <dbp0i4$2rph$1@digitaldaemon.com>, Andrew Fedoniouk says...
>>
>>In those C RTLs which have strlwr functions it is implemented as in-place function.
>>
>>In C++ libraries it usually implemented as:
>>
>>void strlwr(string &s);
>>void strupr(string &s);
>>string strlwr(const string &s);
>>string strupr(const string &s);
>
> For C++, why not just use:
>
> std::transform(str.begin(),str.end(),str.begin(),tolower);
>
> Is there really any reason to wrap this in a function?
>

The reason is to have them both:

void strlwr(string &s);
string strlwr(const string &s);

"Uwe Salomon" <post@uwesalomon.de> wrote in message news:op.st9qmlt06yjbe6@sandmann.maerchenwald.net...
>> the entire operation of checking the count and reading/writing needs to be atomic.
>
> And there are machine instructions for this problem. The Qt library uses this "atomic reference counting" for its containers. These are a few lines of assembler for every target platform.
>
> Ciao
> uwe

There are machine instructions to test and set (or increment) a memory location but I was referring to testing the count and setting some array content which would require using two memory locations.
Glancing over the QString it looks like RAII is used to avoid the problem - though I haven't looked at the details.

"Ben Hinkle" <ben.hinkle@gmail.com> wrote in message news:dbp9bj$fc$1@digitaldaemon.com...
>
> "Uwe Salomon" <post@uwesalomon.de> wrote in message news:op.st9qmlt06yjbe6@sandmann.maerchenwald.net...
>>> the entire operation of checking the count and reading/writing needs to be atomic.
>>
>> And there are machine instructions for this problem. The Qt library uses this "atomic reference counting" for its containers. These are a few lines of assembler for every target platform.
>>
>> Ciao
>> uwe
>
> There are machine instructions to test and set (or increment) a memory
> location but I was referring to testing the count and setting some array
> content which would require using two memory locations.
> Glancing over the QString it looks like RAII is used to avoid the
> problem - though I haven't looked at the details.

Actually looking more closely an expression like str[i]='a' isn't thread-safe. The thread safety only applies to the memory management.

> Actually looking more closely an expression like str[i]='a' isn't
> thread-safe. The thread safety only applies to the memory management.

Yes. The refcounting itself is thread-safe. But that's enough if you don't modify strings that are referenced more than once.

Ciao
uwe

On Fri, 22 Jul 2005 09:55:40 +0200, Uwe Salomon <post@uwesalomon.de> wrote:
>> Actually looking more closely an expression like str[i]='a' isn't
>> thread-safe. The thread safety only applies to the memory management.
>
> Yes. The refcounting itself is thread-safe. But that's enough if you don't modify strings that are referenced more than once.

IMO that's all we need in D to do COW with surety, an indication that we are the owner of the string, the only reference holder, etc.

Regan

"Uwe Salomon" <post@uwesalomon.de> wrote in message news:op.sua1m2y56yjbe6@sandmann.maerchenwald.net...
>> Actually looking more closely an expression like str[i]='a' isn't thread-safe. The thread safety only applies to the memory management.
>
> Yes. The refcounting itself is thread-safe. But that's enough if you don't modify strings that are referenced more than once.
>
> Ciao
> uwe

Agreed. I was thinking of code like

QString s;
thread 1: s[0] = 'a';
thread 2: QString y(s); y[0]; y[0];

which can result in the two y[0] reads returning different values in thread 2. Granted the code has a race condition anyway, so the best one could hope for is that the y[0] reads return either their initial value or 'a' consistently. The code

QString s;
thread 1: QString x(s); x[0] = 'a';
thread 2: QString y(s); y[0]; y[0];

is fine since the copy-on-write would detect the x reference and dup before changing the value. I guess my (nit-picking) beef is that the implicit sharing is visible to multi-threaded users so that it isn't completely implicit.
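
A single-threaded sketch of the detach-before-write idea from the second example, using std::shared_ptr as the reference count. CowString and its members are hypothetical names, and this is not how QString is implemented. It also makes no attempt to be thread-safe: checking the count and then writing are two separate steps, which is exactly the gap discussed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical implicitly shared string; copies share one buffer until
// one of them writes.
class CowString {
public:
    explicit CowString(std::string text)
        : data_(std::make_shared<std::string>(std::move(text))) {}

    char at(std::size_t i) const { return (*data_)[i]; }

    // Write access: duplicate the buffer first if anyone else holds it.
    void set(std::size_t i, char c) {
        detach();
        (*data_)[i] = c;
    }

    bool shares_with(const CowString& other) const {
        return data_ == other.data_;
    }

private:
    void detach() {
        // Not atomic with the write that follows; single-threaded use only.
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
    }

    std::shared_ptr<std::string> data_;
};

// Usage: mirrors "QString x(s); x[0] = 'a';" from the post above.
inline void cow_string_example() {
    CowString s("hello");
    CowString x(s);            // shares s's buffer
    assert(x.shares_with(s));
    x.set(0, 'a');             // detaches, then writes
    assert(!x.shares_with(s));
    assert(s.at(0) == 'h');    // s is unaffected
    assert(x.at(0) == 'a');
}
```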

> Agreed. I was thinking of code like
> QString s;
> thread 1: s[0] = 'a';
> thread 2: QString y(s); y[0]; y[0];
> which can result in the y[0] returning different values in thread 2.

Yes, of course. As a char[] is not implicitly shared, you cannot expect thread-safety here. You have to work with QStrings throughout to play safe. But, personally, I don't like this programming paradigm. I normally think in terms of "ownership", and dup when the owner changes.

Ciao
uwe

"Uwe Salomon" <post@uwesalomon.de> wrote in message news:op.subinjm26yjbe6@sandmann.maerchenwald.net...
>> Agreed. I was thinking of code like
>> QString s;
>> thread 1: s[0] = 'a';
>> thread 2: QString y(s); y[0]; y[0];
>> which can result in the y[0] returning different values in thread 2.
>
> Yes, of course. As a char[] is not implicitly shared, you cannot expect thread-safety here. You have to work with QStrings throughout to play safe. But, personally, i don't like this programming paradigm. I normally think in terms of "ownership", and dup when the owner changes.

These are QStrings everywhere - no char[]'s. Use "at" or any modifying operator if operator[] looks too much like char[].
