December 07, 2013
07-Dec-2013 11:15, Dmitry Olshansky wrote:
> 07-Dec-2013 03:55, H. S. Teoh wrote:
>> On Fri, Dec 06, 2013 at 03:19:24PM -0800, Walter Bright wrote:
>>> On 12/6/2013 3:02 PM, Maxim Fomin wrote:
[snip]
> import std.regex.traits;
>
> auto dirEntries(C, RegEx)(in C[] path, RegEx re)
>      if(isSomeChar!C && isRegexFor!(Regex, C))

s/Regex/RegEx/

> {
>      import std.regex; //full package
>      ...
> }
>
>


-- 
Dmitry Olshansky
December 07, 2013
On Friday, 6 December 2013 at 23:19:22 UTC, Walter Bright wrote:
> You can write D code in "C style" and you'll get C results. To get performance advantages from D code, you'll need to write in a structurally different way (as Andrei pointed out).
>
> Looking through Phobos, there is a lot of code that is not written to take advantage of D's strengths. An apt one discussed here recently is the std.path.buildPath, which is written in "C style", as in allocating memory for its result.
>
> A structural D style version would accept a range for its output, and the range need not allocate memory. This would be fundamentally faster than the typical C approach.
>
> This pattern is repeated a lot in Phobos code.
>
>
>> I believe that most of your points are either insignificant (like array length -
>> it is carried together with pointer almost everywhere in C)
>
> I see a lot of C code that does strlen() over and over. I think Tango's XML parser showed what can be done in D versus any known C implementation. It took maximal advantage of D's slicing abilities to avoid copying.
>
> Dmitry's regex also showed huge gains over C regex implementations.

This C code is easy to fix. In D, by contrast, there is no way to fix constant GC allocations, and if the GC is disabled, you say goodbye to classes, interfaces, exceptions, dynamic arrays, delegates, lambdas, associative arrays, etc.

By the way, since you mentioned strlen(), let's compare printf() and writeln().

>> or provide some marginal advantage.
>
> Even a marginal advantage is a counter example to the claim "there is no way proper C code can be slower than those languages."

But taken together, these issues mean D code cannot compete with C code.

>> Such advantages are offset by:
>>
>> - huge runtime library
>
> C has a huge runtime library, too, it's just that you normally don't notice it because it's not statically linked in. Be that as it may, 2.064 substantially reduced the size of "hello world" programs.
>
>> - constant runtime lib invocation and allocation stuff on heap
>
> This is, as I mentioned, a problem with writing C style code in Phobos.

Is this C style?

T[] data = [T, T, T];

or this:

T data;
auto dg = { return data; };
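
To make the two cases above concrete, here is a minimal sketch (not from the original post). It assumes a compiler with the `@nogc` attribute (DMD 2.066 or later, which postdates this thread); the function names are hypothetical. A dynamic array literal and an escaping delegate both allocate through the GC, while a static array and a delegate passed to a `scope` parameter do not.

```d
// Static array: lives on the stack, so @nogc accepts it.
@nogc int sumStack()
{
    int[3] data = [1, 2, 3];
    int total = 0;
    foreach (x; data)
        total += x;
    return total;
}

// Dynamic array literal: allocated through the GC, so this cannot be @nogc.
int sumHeap()
{
    int[] data = [1, 2, 3];
    int total = 0;
    foreach (x; data)
        total += x;
    return total;
}

// The delegate escapes the function, so its context (holding `data`)
// is allocated on the GC heap.
int delegate() makeGetter()
{
    int data = 42;
    return () => data;
}

// A `scope` delegate parameter promises not to escape, so the caller's
// context can stay on the stack and no closure is allocated.
@nogc int callNow(scope int delegate() @nogc dg)
{
    return dg();
}

@nogc int useScoped()
{
    int data = 42;
    return callNow(() => data);
}

void main()
{
    assert(sumStack() == 6);
    assert(sumHeap() == 6);
    assert(makeGetter()() == 42);
    assert(useScoped() == 42);
}
```

Adding `@nogc` to `sumHeap` or `makeGetter` should be rejected by the compiler, which is one way to locate hidden allocations.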

>> - horrible mangling (see
>> http://forum.dlang.org/thread/mailman.207.1369611513.13711.digitalmars-d@puremagic.com
>> examples from hall of D mangling, mangling is so big, that forum software goes
>> astray)
>
> Long mangling is not an inherent language characteristic, as that thread suggests improvements.

But this is a flaw in the implementation. A language and its advantages are dead without an implementation.

And again, notice that you are speaking about 'hypothetical advantages' (language advantages), which implies two things:
1) current efficiency is worse when compared with some benchmark
2) despite many years of development, the community has failed to realize these advantages.

This makes me think that there is probably another reason why the code is less efficient; for example, fundamental characteristics of the language may make it hard for it to be fast. This is not bad per se, but saying that code in the language can be faster than C, given the many problems with D, looks like advertisement rather than a technical comparison.
December 07, 2013
On Friday, 6 December 2013 at 23:30:45 UTC, Walter Bright wrote:
> On 12/6/2013 3:06 PM, Maxim Fomin wrote:
>> and what about holes in immutable, pure and rest type system?
>
> If there are bugs in the type system, then that optimization breaks.

Bad news: there are many bugs in the type system.

>
>> C doesn't have virtual functions.
>
> Right, but you can (and people do) fake virtual functions with tables of function pointers. No, C doesn't devirtualize those.
>

Neither does D.

>> By the way, does D devirtualize them?
>
> It does for classes/methods marked 'final'

This tells us essentially nothing, because those functions are not virtual. By that logic, C 'devirtualizes' every direct call.

> and also in cases where it can statically tell that a class instance is the most derived type.

I haven't noticed that.
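
For readers following the exchange, a minimal sketch of the two cases under discussion (not from the original posts; the class and method names are hypothetical). It shows only what the language permits: a `final` method cannot be overridden, and a `final` class cannot be subclassed, so in both cases the call target is known at compile time. Whether a particular compiler actually emits a direct call is not demonstrated here.

```d
class Shape
{
    // Virtual by default: may be overridden, so a call through a
    // Shape reference normally goes through the vtable.
    double area() const { return 0; }

    // final: cannot be overridden, so the target is known statically.
    final string kind() const { return "shape"; }
}

// final class: nothing can derive from Square, so a Square reference
// always refers to the most derived type.
final class Square : Shape
{
    double side;

    this(double side) { this.side = side; }

    override double area() const { return side * side; }
}

void main()
{
    Shape s = new Square(2);
    assert(s.area() == 4);        // dynamic dispatch through Shape
    assert(s.kind() == "shape");  // final method, no override possible

    auto q = new Square(3);
    assert(q.area() == 9);        // static type is a final class
}
```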
December 07, 2013
On Saturday, 7 December 2013 at 00:40:52 UTC, Manu wrote:
> Assuming a comparison to C++, you know perfectly well that D has a severe
> disadvantage. Unless people micro-manage final (I've never seen anyone do
> this to date), then classes will have significantly inferior performance to
> C++.
> C++ coders don't write virtual on everything. Especially not trivial
> accessors which must be inlined.

Yes, but this change will resolve that problem, and I believe it has been approved, correct?

Issue 11616 - Introduce virtual keyword and remove virtual-by-default
https://d.puremagic.com/issues/show_bug.cgi?id=11616

--rt
December 07, 2013
On Saturday, 7 December 2013 at 00:26:34 UTC, H. S. Teoh wrote:
> On Sat, Dec 07, 2013 at 01:09:00AM +0100, John Colvin wrote:
>> On Friday, 6 December 2013 at 23:56:39 UTC, H. S. Teoh wrote:
>> >
>> >It would be nice to decouple Phobos modules more. A *lot* more.
>> 
>> Why? I've seen this point made several times and I can't understand
>> why this is an important concern.
>> 
>> I see the interplay between phobos modules as good, it saves
>> reinventing the wheel all over the place, making for a smaller,
>> cleaner standard library.
>> 
>> Am I missing something fundamental here?
>
> It's not that it's bad to reuse code. The problem is the dependency is
> too coarse-grained, so that if you want to, say, print "hello world", it
> pulls in all sorts of stuff, like algorithms for sorting arrays (just an
> example, not the actual case), or floating-point format parsers (may
> actually be the case), which aren't *needed* to perform that particular
> task. If printing "hello world" requires pulling in file locking code,
> then by all means, pull that in. But it shouldn't pull in, say,
> std.complex just because some obscure corner of writeln's implementation
> makes a reference to std.complex.

What is the actual problem? Compile times? Binary size? Surely not performance or efficiency.

I remember someone from the Go team (maybe Pike) saying that they deliberately duplicate code in the standard library to decouple it. I did not understand the reasoning there, either.

December 07, 2013
On Friday, 6 December 2013 at 22:20:19 UTC, Walter Bright wrote:
> "there is no way proper C code can be slower than those languages."
> http://www.reddit.com/r/programming/comments/1s5ze3/benchmarking_d_vs_go_vs_erlang_vs_c_for_mqtt/cduwwoy
>
> comes up now and then. I think it's incorrect, D has many inherent advantages in generating code over C:

Good choice of words. A competent C programmer is able to perform comparable optimizations, at least in the hot spots. D gives you a few of those for free (garbage collection, slices) and makes some things significantly easier (templates).

However, I think the original statement is also true in the technical sense. The same argument can be made with assembly. It is impossible to beat "proper" hand-written asm, where proper means "only theoretically possible". In practice I agree with you that optimizing a D program should be easier than optimizing a C program.
December 07, 2013
On 12/6/2013 11:34 PM, Maxim Fomin wrote:
> On Friday, 6 December 2013 at 23:19:22 UTC, Walter Bright wrote:
>> I see a lot of C code that does strlen() over and over. I think Tango's XML
>> parser showed what can be done in D versus any known C implementation. It took
>> maximal advantage of D's slicing abilities to avoid copying.
>>
>> Dmitry's regex also showed huge gains over C regex implementations.
>
> This C code is easy to fix.

No it isn't. Have you tried? I have, it's very hard to retrofit a non-trivial C program with carrying around all the string lengths as a separate value. Call just about any C library, and you're back again to strlen(). It's even harder to get away from strlen() in C++ because it inextricably tied std::string to it.
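
A minimal sketch of the slicing point (not from the original post; `firstWord` is a hypothetical helper). A D slice carries its pointer and length together, so taking a substring copies nothing and asking for the length never scans for a terminator.

```d
// Returns the text up to the first space as a slice of the input.
string firstWord(string s)
{
    size_t i = 0;
    while (i < s.length && s[i] != ' ')
        ++i;
    return s[0 .. i];   // pointer + length into the same memory, no copy
}

void main()
{
    string line = "hello world";
    auto word = firstWord(line);

    assert(word == "hello");
    assert(word.ptr is line.ptr);   // same underlying memory
    assert(word.length == 5);       // stored with the slice, O(1)
}
```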

A lot of capable people worked for decades on C regexen, yet Dmitry blew them away with his D version.


> In D, by contrast, there is no way to fix constant GC
> allocations, and if the GC is disabled, you say goodbye to classes, interfaces,
> exceptions, dynamic arrays, delegates, lambdas, associative arrays, etc.

I think you're way exaggerating. (Note that C has none of those features.)


> By the way, since you mentioned strlen(), let's compare printf() and writeln().

Sure. Feel free.


> But taken together, these issues mean D code cannot compete with C code.

This is simply not true. You might want to look at Don's presentation at DConf 2013, where he explains that Sociomantic uses D not for romantic reasons but because its performance gives them a competitive edge.


>>> examples from hall of D mangling, mangling is so big, that forum software goes
>>> astray)
>>
>> Long mangling is not an inherent language characteristic, as that thread
>> suggests improvements.
>
> But this is a flaw in the implementation. A language and its advantages are dead
> without an implementation.

I've worked with a lot of C compilers that had lousy, buggy implementations and lousy code generation. I'm being careful here to deal with characteristics of the language, not the implementation.


> And again, notice that you are speaking about 'hypothetical advantages'
> (language advantages), which implies two things:
> 1) current efficiency is worse when compared with some benchmark
> 2) despite many years of development, the community has failed to realize these advantages.
>
> This makes me think that there is probably another reason why the code is less
> efficient; for example, fundamental characteristics of the language may make it hard
> for it to be fast. This is not bad per se, but saying that code in the language can be
> faster than C, given the many problems with D, looks like advertisement
> rather than a technical comparison.

There are several D projects which show faster runs than C. If your goal is to pragmatically write faster D code than in C, you can do it without too much effort. If your goal is to find problem(s) with D, you can certainly do that, too.

December 07, 2013
On Fri, 06 Dec 2013 15:48:27 -0800
Walter Bright <newshound2@digitalmars.com> wrote:

> On 12/6/2013 3:40 PM, bearophile wrote:
> > Recently I have seen this through Reddit (with a comment by Anon):
> >
> > http://eli.thegreenplace.net/2013/12/05/the-cost-of-dynamic-virtual-calls-vs-static-crtp-dispatch-in-c/
> >
> > The JavaVM is often able to de-virtualize virtual calls.
> 
> I know. It is an advantage that JITing has. It's also an advantage if you can do whole-program analysis, which can easily be done in Java.

How is that easier in Java? When whole-program analysis finds that there is no class extending C, it could devirtualize all methods of C, but(!) you can load and unload new derived classes at runtime, too.

Also the JVM doesn't load all classes at program startup,
because it would create too much of a delay. This goes so
far that there is even a special class for splash screens with
minimal dependencies, to avoid loading most of the runtime and
GUI library first.

I think whole-program analysis in such an environment is outright impossible.

-- 
Marco

December 07, 2013
On 12/7/2013 12:13 AM, qznc wrote:
> However, I think the original statement is also true in the technical sense. The
> same argument can be made with assembly. It is impossible to beat "proper"
> hand-written asm, where proper means "only theoretically possible". In practice
> I agree with you that optimizing a D program should be easier than optimizing a
> C program.

"there is no way proper C code can be slower than those languages."

It's the qualifier "proper". You say that means theoretically possible, I disagree. I suggest that proper C code means code that is presentable, maintainable and professionally written using commonly accepted best practices.

For example, I've seen metaprogramming done in C using the preprocessor. It works, but I consider the result to be not presentable, not maintainable, and unprofessional and not a best practice.

For another, Maxim suggested that it was easy in C to use D style length-delimited strings instead of 0 terminated ones. It's certainly theoretically possible to write such a string type in C, but it ain't easy and your result will be completely out of step with anyone else's C code and C libraries, which is why people don't do it.

For another, how many times have you seen bubble sort reimplemented in C code? How about the obvious implementation of string searching? etc.? I've seen that stuff a lot. But in D, using a best-of-breed implementation of quicksort is easy as pie, same with searching, etc. These kinds of things also make D faster. I've translated C code into D before and gotten it to run faster by doing these sorts of plug-in algorithm replacements.
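
A minimal sketch of the "plug-in algorithm" idea (not from the original post; the helper names are hypothetical). It uses `sort` and `find` from Phobos' `std.algorithm`; which sorting algorithm the library uses internally is an implementation detail and is not assumed here.

```d
import std.algorithm : find, sort;

// Sorts in place with the library routine, then uses the sorted
// range it returns for a binary-search membership test.
bool sortAndCheck(int[] values, int needle)
{
    auto sorted = sort(values);
    return sorted.contains(needle);
}

// Library substring search; returns the rest of the haystack
// starting at the first match (empty if there is none).
string tailFrom(string haystack, string needle)
{
    return find(haystack, needle);
}

void main()
{
    int[] values = [5, 3, 9, 1, 7];
    assert(sortAndCheck(values, 7));
    assert(values == [1, 3, 5, 7, 9]);

    assert(tailFrom("hello world", "wor") == "world");
    assert(tailFrom("hello world", "xyz") == "");
}
```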
December 07, 2013
On 12/7/2013 1:30 AM, Marco Leise wrote:
> How is that easier in Java? When whole-program analysis finds
> that there is no class extending C, it could devirtualize all
> methods of C, but(!) you can load and unload new derived
> classes at runtime, too.

This can be done by noting what new derived classes are introduced by runtime loading, and re-JITing any functions that devirtualized base classes of it.

I don't know if this is actually done, but I don't see an obvious problem with it.