October 07, 2020
On Wednesday, 7 October 2020 at 04:17:59 UTC, H. S. Teoh wrote:
> On Tue, Oct 06, 2020 at 08:47:33PM -0700, Walter Bright via Digitalmars-d wrote:
>> On 10/6/2020 11:09 AM, Adam D. Ruppe wrote:
>> > The Phobos implementation started life with a very simple implementation too. It became what it is because it *had to*, specifically for performance reasons.
>> 
>> Professional C Standard library implementations tend to be hideous code to perform objectively simple operations. The reason is speed is so desirable in foundational code that it drives out all other considerations. (memcpy() is a standout example.)
>
> A little tangential aside: one time as a little coffee break challenge, I decided to see how D compares to GNU wc in terms of the speed of counting lines in a text file.  The core of wc, of course, is in memchr -- because it's scanning for newline characters in a buffer.  In the course of ferreting out what made wc so fast, I studied how GNU libc implemented memchr.  Basically, in order to speed up scanning large buffers, it uses a series of fancy bit operations on 64-bit words in order to scan 8 bytes at a time, I suppose the goal being to achieve close to 8x speedup, and also to reduce the number of branches per iteration to capitalize on the CPU's pipeline.
>
> In order to do this, however, some not-so-cheap setup was necessary at the beginning and end of the buffer, which generally are not 8-byte aligned.  When a particular call to memchr scanned many bytes, of course, the speed of scanning 8 bytes at a time outweighed this setup / teardown cost, and wc generally outperformed my D code.
>
> However, when the lines being scanned were on the short side, the overhead of setting up the 8-byte-at-a-time scanning added up significantly -- the shorter lines meant less time was spent scanning 8 bytes at a time, and more time spent in the complicated code dealing with non-aligned start/end of the buffer. In this case, I discovered that a simplistic for-loop in D outperformed wc.
>
> This led me to think, realistically speaking, how long do lines in a text file tend to be?  My wild guess is that they tend to be on the shortish side -- maybe about 60-70 characters on average? For code, a lot less, since code generally has more whitespace for readability reasons.  Files that have very long lines tend to be things like XML or compressed Javascript, which generally you don't really use wc on anyway, so in my mind, they seem to be more the exception than the norm when it comes to the applicability of wc.  Furthermore, I venture to propose that your average string length in C is probably closer to the 80-120 character ballpark, than to large buffers of 1K or 8K which is where the 8-byte-at-a-time scanning would perform much better. Sure, large text buffers do get handled in C code routinely; but how many of them realistically would you find being handed to strchr to scan for some given byte?  The kind of C strings you'd want to search for a particular byte in, IME, are usually the shorter kind.
>
> What this means, is that yes memchr may be "optimized" to next week and back -- but that optimization came with some implicit assumptions about how long it takes to find a character in a string buffer. These assumptions unfortunately pessimized the common case with the overhead of a fancy 8-byte-at-a-time algorithm: one that does better in the *less* common case of looking for a particular byte in a very large buffer.
>
> My point behind all this, is that what's considered optimal sometimes changes depending on what kind of use case you anticipate; and with different usage patterns, one man's optimal algorithm may be another man's suboptimal algorithm, and one man's slow, humble for-loop may be another man's speed demon.  Optimizing for the general case in a standard library is generally a very tough problem, because how do you make any assumptions about the general case?  Whichever way you choose to optimize a primitive like memchr, you're imposing additional assumptions that may make it worse for some people, even if you think that you're improving it for your chosen target metric.
>

There's a counterargument to your example (which doesn't invalidate your point).

The pessimization for short lines is barely noticeable and is drowned in the noise of process launching. Even if you can measure it consistently and easily, it doesn't matter much, even if it happens many times (in a batch job, for example). The optimization for long lines, though, is noticeable immediately because of the inherent processing time of the operation. If an operation that took 5 ms now takes 10 ms, it doesn't make a difference. If an operation that took 1 minute now takes 30 s, it is a big deal.
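
For illustration, the two approaches being compared can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the function names are hypothetical, the bit trick is a generic zero-byte test rather than glibc's actual memchr code, and it counts newlines instead of locating the first match. It does show where the extra head/tail work around the aligned 8-byte loop comes from.

```d
import core.bitop : popcnt;

/// Plain loop: one byte per iteration, no setup cost.
size_t countNewlinesSimple(const(ubyte)[] buf)
{
    size_t count = 0;
    foreach (b; buf)
        if (b == '\n')
            ++count;
    return count;
}

/// Word-at-a-time loop (illustrative, not glibc's code): eight bytes per
/// iteration in the aligned middle, byte-wise handling of head and tail.
size_t countNewlinesWordwise(const(ubyte)[] buf)
{
    enum ulong low7 = 0x7F7F_7F7F_7F7F_7F7FUL;
    enum ulong newlines = 0x0A0A_0A0A_0A0A_0A0AUL; // '\n' in every byte

    size_t count = 0;
    size_t i = 0;

    // Head: advance byte by byte until the address is 8-byte aligned.
    while (i < buf.length && (cast(size_t)(buf.ptr + i)) % 8 != 0)
    {
        if (buf[i] == '\n')
            ++count;
        ++i;
    }

    // Middle: XOR turns every '\n' byte into zero. The expression below
    // then leaves 0x80 in exactly the zero bytes, so popcnt counts them.
    for (; i + 8 <= buf.length; i += 8)
    {
        immutable ulong word = *cast(const(ulong)*)(buf.ptr + i);
        immutable ulong x = word ^ newlines;
        immutable ulong zeroBytes = ~(((x & low7) + low7) | x | low7);
        count += popcnt(zeroBytes);
    }

    // Tail: fewer than eight bytes remain.
    for (; i < buf.length; ++i)
        if (buf[i] == '\n')
            ++count;

    return count;
}

unittest
{
    auto text = cast(const(ubyte)[]) "a\nbb\n\nlonger line here\nno newline at end";
    assert(countNewlinesSimple(text) == 4);
    // Try every start offset so that head, middle and tail are all exercised.
    foreach (start; 0 .. text.length)
        assert(countNewlinesWordwise(text[start .. $])
            == countNewlinesSimple(text[start .. $]));
}
```

Something like `dmd -unittest -main` should build and run the check. On short inputs the head and tail loops do most of the work, so the word-wise version gains little over the plain loop; whether it actually loses depends on the compiler and the data, and would have to be measured.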


>
>
>> I remember back in the 80's when Borland came out with Turbo C. The compiler didn't generate very good code, but applications built with it were reasonably fast. How was this done?
>> 
>> Borland carefully implemented the C library in hand-optimized assembler by some very good programmers. Even printf was coded this way. The speedups gained there sped up every Turbo C program. At the time, nobody else had done that.
>> 
>> Much as I hate to admit it, Borland made the smart move there.
>
> So what does this imply in terms of Phobos code in D? ;-)  Should we uglify Phobos for the sake of performance?  Keeping in mind, of course, what I said about the difficulty of optimizing for the general case without pessimizing some legitimate use cases -- or maybe even the *common* case, if we lose sight of the forest of common use cases while trying to optimize our chosen benchmark trees.


That's my point. Some optimizations are more about mitigating speed degradation in pathological cases than about the normal, general case; otherwise the whole O() notation would be of no importance. (I'm sure the symbol name compression scheme in the D compiler pessimized the common case, but it saved the language from the pathological cases of recursive symbols.)

October 07, 2020
On 10/7/20 4:49 AM, Patrick Schluter wrote:
> On Wednesday, 7 October 2020 at 02:33:21 UTC, Andrei Alexandrescu wrote:
>> On 10/6/20 9:07 PM, claptrap wrote:
>>> On Tuesday, 6 October 2020 at 23:39:24 UTC, H. S. Teoh wrote:
>>>> On Tue, Oct 06, 2020 at 11:16:47PM +0000, claptrap via Digitalmars-d wrote: [...]
>>>>>
>>>> I would write it like this:
>>>>
>>>>     int[] vals = [4,7,28,23,585,73,12];
>>>>
>>>>     int[] getMultiplesOf(int i)
>>>>     {
>>>>         return vals.filter!(v => (v % i) == 0).array;
>>>>     }
>>>>
>>>> One line vs. 4, even more concise. ;-)
>>>
>>> The point is to show language not library.
>>
>> That's a made-up restriction, and it's odd that it is being discussed here as a virtue.
> 
> No, it's not. It's central to the argument.

Then the argument is specious.

I've been also tempted to do this on occasion to tilt a comparison one way or another - take C++ without STL or Boost, or Haskell without Prelude. The reality is these need to be considered together. (Make a hashtable in C++, no standard library allowed...)
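
To make the two sides concrete, here is a minimal sketch that puts the quoted library one-liner next to a language-only loop. The loop version and its name `getMultiplesOfLoop` are assumptions made for illustration; it is not the code from the elided post. The imports are the ones the quoted snippet needs in order to compile.

```d
import std.algorithm : filter;
import std.array : array;

int[] vals = [4, 7, 28, 23, 585, 73, 12];

/// Library version, as quoted above, with its Phobos imports.
int[] getMultiplesOf(int i)
{
    return vals.filter!(v => (v % i) == 0).array;
}

/// Language-only version (hypothetical): no Phobos, just a loop and
/// the built-in array append operator.
int[] getMultiplesOfLoop(int i)
{
    int[] result;
    foreach (v; vals)
        if (v % i == 0)
            result ~= v;
    return result;
}

unittest
{
    assert(getMultiplesOf(7) == [7, 28]);
    assert(getMultiplesOfLoop(7) == [7, 28]);
    assert(getMultiplesOf(4) == getMultiplesOfLoop(4));
}
```

Both functions return the same result; the difference is only whether the filtering is spelled out with core language constructs or delegated to the standard library.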

>> Beginners are attracted to large languages that have everything built in. A good language is focused on general primitives that allow writing a great deal in libraries.
> 
> Then do lisp or forth but not D or C++.

Much of the size of C++ and D caters to library writers, and their standard libraries grew way faster than the core language - as they should.
October 07, 2020
On Wednesday, 7 October 2020 at 09:04:59 UTC, Ola Fosheim Grøstad wrote:
> On Wednesday, 7 October 2020 at 08:49:21 UTC, Patrick Schluter wrote:
>> On Wednesday, 7 October 2020 at 02:33:21 UTC, Andrei Alexandrescu wrote:
>
>
> Right again. Functional programming is only pleasant in a dedicated FP language, and even then you need to memorize a large set of library constructs to be productive.

How many pure functional languages are in the TIOBE top 10? None, or maybe one, I'm not sure. There's a reason for that, I reckon.
October 07, 2020
On Wednesday, 7 October 2020 at 11:10:15 UTC, Andrei Alexandrescu wrote:
> I've been also tempted to do this on occasion to tilt a comparison one way or another - take C++ without STL or Boost, or Haskell without Prelude. The reality is these need to be considered together. (Make a hashtable in C++, no standard

Your argument would work for Go, but not for C++.

Lots of C++ codebases rely only on the most basic stuff like malloc and memcpy.

Boost is only useful for prototyping; you can find better dedicated libraries. The same goes for much of the STL.

CppCon has a 2 HOUR presentation explaining just type traits... That tells me that something went wrong somewhere... Not generic enough, too large a vocabulary.

Standard libs should be small and composable.

Over 80% of the C++ stdlib is useless or outdated. Of course they have something like a 20+ year deprecation cycle...
October 07, 2020
On Wednesday, 7 October 2020 at 11:19:24 UTC, claptrap wrote:
> On Wednesday, 7 October 2020 at 09:04:59 UTC, Ola Fosheim Grøstad wrote:
>> On Wednesday, 7 October 2020 at 08:49:21 UTC, Patrick Schluter wrote:
>>> On Wednesday, 7 October 2020 at 02:33:21 UTC, Andrei Alexandrescu wrote:
>>
>>
>> Right again. Functional programming is only pleasant in a dedicated FP language, and even then you need to memorize a large set of library constructs to be productive.
>
> How many pure functional languages are in the TIOBE top 10? None, or maybe one, I'm not sure. There's a reason for that, I reckon.

We are using a pure functional language for web frontend development, and it's a joy ...
October 07, 2020
On Wednesday, 7 October 2020 at 11:10:15 UTC, Andrei Alexandrescu wrote:
> On 10/7/20 4:49 AM, Patrick Schluter wrote:
>> On Wednesday, 7 October 2020 at 02:33:21 UTC, Andrei Alexandrescu wrote:
>>> On 10/6/20 9:07 PM, claptrap wrote:
>>>> On Tuesday, 6 October 2020 at 23:39:24 UTC, H. S. Teoh wrote:
>>>>> On Tue, Oct 06, 2020 at 11:16:47PM +0000, claptrap via Digitalmars-d wrote: [...]
>>>>>>
>>>>> I would write it like this:
>>>>>
>>>>>     int[] vals = [4,7,28,23,585,73,12];
>>>>>
>>>>>     int[] getMultiplesOf(int i)
>>>>>     {
>>>>>         return vals.filter!(v => (v % i) == 0).array;
>>>>>     }
>>>>>
>>>>> One line vs. 4, even more concise. ;-)
>>>>
>>>> The point is to show language not library.
>>>
>>> That's a made-up restriction, and it's odd that it is being discussed here as a virtue.
>> 
>> No, it's not. It's central to the argument.
>
> Then the argument is specious.
>
> I've been also tempted to do this on occasion to tilt a comparison one way or another - take C++ without STL or Boost, or Haskell without Prelude. The reality is these need to be considered together. (Make a hashtable in C++, no standard library allowed...)

If you're just asking whether this is easier in this language or that one, then it is unhelpful to say no stdlib. But that's not what's going on here. You're comparing two language features within one language. The incumbent feature has 20 years of library support behind it. The other does not.

Let's reverse the roles: say TF were invented first, and somebody was arguing for templates to be added now. The TF version would be a couple of stdlib calls and the template version a whole page.

It's nonsense to make a comparison like that.

Maybe your "temptation to tilt" is at play here but you haven't realised it yet?

October 07, 2020
On 10/7/20 7:58 AM, claptrap wrote:
> Let's reverse the roles: say TF were invented first, and somebody was arguing for templates to be added now. The TF version would be a couple of stdlib calls and the template version a whole page.

Incumbency is a huge matter in programming language design. Of course I would not propose another way of doing the same thing.
October 07, 2020
On Wednesday, 7 October 2020 at 11:55:00 UTC, Paolo Invernizzi wrote:
> On Wednesday, 7 October 2020 at 11:19:24 UTC, claptrap wrote:
>> On Wednesday, 7 October 2020 at 09:04:59 UTC, Ola Fosheim Grøstad wrote:
>>> On Wednesday, 7 October 2020 at 08:49:21 UTC, Patrick Schluter wrote:
>>>> On Wednesday, 7 October 2020 at 02:33:21 UTC, Andrei Alexandrescu wrote:
>>>
>>>
>>> Right again. Functional programming is only pleasant in a dedicated FP language, and even then you need to memorize a large set of library constructs to be productive.
>>
>> How many pure functional languages are in the TIOBE top 10? None, or maybe one, I'm not sure. There's a reason for that, I reckon.
>
> We are using a pure functional language for web frontend development, and it's a joy ...

Maybe they're great in specific circumstances? I don't know what the reason is, but if FP were so intuitive then you'd expect it to be the norm, or at least on a par, or even in the same ballpark as imperative?
October 07, 2020
On Wednesday, 7 October 2020 at 12:30:15 UTC, Andrei Alexandrescu wrote:
> On 10/7/20 7:58 AM, claptrap wrote:
>> Let's reverse the roles: say TF were invented first, and somebody was arguing for templates to be added now. The TF version would be a couple of stdlib calls and the template version a whole page.
>
> Incumbency is a huge matter in programming language design. Of course I would not propose another way of doing the same thing.

I just looked up Incumbency, but the German translation was 'Time someone has been in office' ...
I am not seeing how these things relate.
Perhaps you can say this in different words?

October 07, 2020
On Wednesday, 7 October 2020 at 12:30:15 UTC, Andrei Alexandrescu wrote:
> On 10/7/20 7:58 AM, claptrap wrote:
>> Let's reverse the roles: say TF were invented first, and somebody was arguing for templates to be added now. The TF version would be a couple of stdlib calls and the template version a whole page.
>
> Incumbency is a huge matter in programming language design. Of course I would not propose another way of doing the same thing.

C++ concepts are another way of doing the same thing.
So you are basically saying they should not have done it?

C++ modules are another way of doing the same thing.
Etc.

C++ concepts are a prettier and more useful version with compile-time trait qualities.
Probably the most important improvement since RAII...

Nobody who values their time wants to juggle types in C++. D isn't on a different plane either.

Improving usability is a survival strategy for any language...

Haskell is failing for that reason, it is not going to improve. It is primarily a research test bed, just like ML and descendants. Academic languages evolve by death and rebirth, not by additions. So if you want D3, ok. But if you want D2, then you cannot pick the path that academics follow.