June 02, 2013
Joseph,

IIRC, -inline is a DMD-specific switch. Adding it to the GDC command line produces this:

gdc.exe: error: unrecognized command line option '-inline'

Besides, the improvements mainly come from unrolling short loops, not from inlining.
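
For instance (a hypothetical D snippet, not code from the ray tracer), the kind of loop I mean is a short, fixed-count one like this:

float dot(ref const float[3] a, ref const float[3] b)
{
    float sum = 0;
    // The trip count is a compile-time constant, so at -O3 the optimiser
    // typically unrolls it completely; no inlining decision is involved.
    foreach (i; 0 .. 3)
        sum += a[i] * b[i];
    return sum;
}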

Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> writes:

> On 06/02/2013 11:32 AM, finalpatch wrote:
>> The flags I used
>> OSX LDC: -O3 -release
>> WIN GDC: -O3 -fno-bounds-check -frelease
>
> Does adding -inline make a difference to initial performance (i.e. before your manual interventions)?  I guess it's already covered by -O3 in both cases, but a while back I did notice some differences in "default" LDC and GDC performance that seemed to relate to inlining.

-- 
finalpatch
June 02, 2013
Am 02.06.2013 13:08, schrieb John Colvin:
> On Sunday, 2 June 2013 at 07:32:10 UTC, Manu wrote:
>> On 2 June 2013 01:19, Paulo Pinto <pjmlp@progtools.org> wrote:
>>
>>> Am 01.06.2013 16:24, schrieb Benjamin Thaut:
>>>
>>>  Am 01.06.2013 01:30, schrieb Manu:
>>>>
>>>>> On 1 June 2013 09:15, bearophile <bearophileHUGS@lycos.com> wrote:
>>>>>
>>>>>     Manu:
>>>>>
>>>>>         On 1 June 2013 01:12, bearophile <bearophileHUGS@lycos.com> wrote:
>>>>>
>>>>>             Manu:
>>>>>
>>>>>
>>>>>             Frankly, this is a textbook example of why STL is the
>>>>>             spawn of satan. For some reason people are TAUGHT that
>>>>>             it's reasonable to write code like this.
>>>>>
>>>>>
>>>>>             There are many kinds of D code, not everything is a high
>>>>>             performance
>>>>>             ray-tracer or 3D game. So I'm sure there are many many
>>>>>             situations where
>>>>>             using the C++ STL is more than enough. As most tools, you
>>>>>             need to know
>>>>>             where and when to use them. So it's not a Satan-spawn :-)
>>>>>
>>>>>
>>>>>         So why are we having this conversation at all then if faster
>>>>>         isn't better in this instance?
>>>>>
>>>>>
>>>>>     Faster is better in this instance.
>>>>>     What's wrong is your thinking of the STL as the spawn of
>>>>>     Satan in general.
>>>>>
>>>>>
>>>>> Ah, but that's because it is ;)
>>>>> Rule of thumb: never use STL in tight loops. problem solved (well,
>>>>> mostly)...
>>>>>
>>>>
>>>> I have to agree here. Whenever you have a codebase that has to work
>>>> on 9 platforms and 6 compilers, the S in STL vanishes. Also the
>>>> implementations are so varying in quality that you might get really
>>>> good performance on one platform but really bad on another. It seems
>>>> like everyone in the games industry avoids STL like the plague.
>>>>
>>>> Kind Regards
>>>> Benjamin Thaut
>>>>
>>>
>>> I used to have that experience even with C, when I started using it
>>> around 1994. C++ was even worse between CFront, ARM and ongoing
>>> standardization work.
>>>
>>> As for STL, I can assure that HPC guys are huge fans of STL and Boost.
>>>
>>
>> The funny thing about HPC guys though, at least in my experience (a bunch
>> of researchers from Cambridge who I often give _basic_ optimisation tips),
>> is they don't write/run 'high performance software', they're actually
>> pretty terrible programmers and have a tendency to write really low
>> performing software, but run it on super high performance computers, and
>> then call the experience high performance computing...
>> It bends my mind to see them demand an order of magnitude more computing
>> power to run an algorithm that's hamstrung by poor choices of containers
>> or algorithms that probably cost them an order of magnitude in
>> performance ;)
>> And then the Universities take their demands seriously and deliver them
>> more hardware! O_O
>>
>>> At least when I did my traineeship at CERN (2003-2004) that was the case.
>>
>> I hope CERN has better software engineers than Cambridge University ;)
>> Most of these guys are mathematicians and physicists first, and
>> programmers second.
>
> In my experience, physicists are terrible programmers. I should know, I
> am one! As soon as you step outside the realm of simple, < 10kloc, pure
> procedural code, the supposed "HPC" guys don't generally have the first
> clue how to write something fast.
>
> CERN is responsible for the abomination that is ROOT, but to be fair to
> them there is a lot of good code from there too.

There was an office there that had the sentence "You can program Fortran in any language" on the door. :)

--
Paulo
June 02, 2013
> But avoiding heap allocations for array literals is a change that needs to be discussed.

In the meantime I have written a small ER regarding escape analysis for dynamic arrays:
http://d.puremagic.com/issues/show_bug.cgi?id=10242
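
A minimal sketch of the kind of code it is about (hypothetical example, not taken from the report):

int sum3(int x, int y, int z)
{
    // Today this array literal is heap-allocated even though the slice
    // never escapes sum3; escape analysis could let the compiler put it
    // on the stack instead.
    int[] tmp = [x, y, z];
    int total = 0;
    foreach (v; tmp)
        total += v;
    return total;
}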

Bye,
bearophile
June 02, 2013
On 2 June 2013 19:53, Jacob Carlborg <doob@me.com> wrote:

> On 2013-06-01 23:08, Jonathan M Davis wrote:
>
>> If you don't need polymorphism, then in general, you shouldn't use a class
>> (though sometimes it might make sense simply because it's an easy way to
>> get a reference type). Where it becomes more of a problem is when you need
>> a few polymorphic functions and a lot of non-polymorphic functions (e.g.
>> when a class has a few methods which get overridden and then a lot of
>> properties which it makes no sense to override). In that case, you have to
>> use a class, and then you have to mark a lot of functions as final. This
>> is what folks like Manu and Don really don't like, particularly when
>> they're in environments where the extra cost of the virtual function calls
>> actually matters.
>>
>
> If a reference type is needed but not a polymorphic type, then a final class can be used.


I've never said that virtuals are bad. The key function of a class is
polymorphism.
But the reality is that in non-tool or container/foundational classes
(which are typically write-once, use-lots; you don't tend to write these
daily), a typical class will have a couple of virtuals, and a whole bunch
of properties.
The majority of functions in OO code (in the sorts of classes that you tend
to write daily, ie, the common case) are trivial accessors or properties.
The cost of forgetting to type 'final' is severe, particularly so on a
property, and there is absolutely nothing the compiler can do to help you.
There's no reasonable warning it can offer either; it must presume you
intended to do that.
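
To illustrate (a hypothetical class, current D semantics, not code from work):

class Entity
{
    // The two methods that actually need to be polymorphic.
    void update(float dt) { }
    void draw() { }

    // Trivial accessors: virtual by default unless each one is marked
    // final (or everything below a 'final:' label), and the compiler
    // can't warn you when you forget one.
    final @property float x() const { return _x; }
    final @property float y() const { return _y; }
    @property float z() const { return _z; }  // oops, forgot 'final' -> virtual call

private:
    float _x, _y, _z;
}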

Coders from at least C++ and C# are trained by habit to type virtual
explicitly, so they will forget to write 'final' all the time.
I can tell you from hard experience that, despite training programmers that
they need to write 'final', they have actually done so precisely ZERO TIMES
EVER.
People don't just forget the odd final here or there; in practice, they've
never written it yet.

The problem really goes pear-shaped when faced with the task of
opportunistic de-virtualisation - that is, de-virtualising functions that
should never have been virtual in the first place, but are; perhaps because
code has changed/evolved, but more likely, because uni tends to output
programmers that are obsessed with the possibility of overriding everything
;)
It becomes even worse than what we already have with C++, because now in D,
I have to consider every single method and manually attempt to determine
whether it should actually be virtual or not. A time consuming and
certainly dangerous/error-prone task when I didn't author the code! In C++
I can at least identify the methods I need to consider by searching for
'virtual', saving maybe 80% of that horrible task by contrast.

But there are other compelling reasons too; for instance, during
conversations with Daniel Murphy and others, it was noted that it will
enhance interoperation with C++ (a key target for improvement, flagged
recently), and further enable the D DFE.
I also think that with explicit 'override' a requirement, it's rather
non-orthogonal to not also require explicit 'virtual'.

So, consider this reasonably. Don and I, at least, have both made strong
claims to this end... and we're keen to pay for it by fixing broken base
classes.
Is it REALLY that much of an inconvenience to people to be explicit with
'virtual' (as they already are with 'override')?
Is catering to that inconvenience worth the demonstrable cost? I'm not
talking about a minor nuisance; I'm talking about time and money.


June 02, 2013
On 2 June 2013 20:16, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Sunday, June 02, 2013 11:53:26 Jacob Carlborg wrote:
> > On 2013-06-01 23:08, Jonathan M Davis wrote:
> > > If you don't need polymorphism, then in general, you shouldn't use a
> > > class (though sometimes it might make sense simply because it's an easy
> > > way to get a reference type). Where it becomes more of a problem is
> > > when you need a few polymorphic functions and a lot of non-polymorphic
> > > functions (e.g. when a class has a few methods which get overridden and
> > > then a lot of properties which it makes no sense to override). In that
> > > case, you have to use a class, and then you have to mark a lot of
> > > functions as final. This is what folks like Manu and Don really don't
> > > like, particularly when they're in environments where the extra cost of
> > > the virtual function calls actually matters.
> > If a reference type is needed but not a polymorphic type, then a final
> > class can be used.
>
> Yes. The main problem is when you have a class with a few methods which
> should be virtual and a lot that shouldn't be. You're forced to mark a
> large number of functions as final. That burden can be lessened by using
> final with a colon rather than marking them individually, but what seems
> to inevitably happen instead is that programmers forget to mark any of
> them as final (Manu can rant quite a bit about that, as he's had to deal
> with it at work, and it's cost him quite a bit of time, as he has to go
> through every function which wasn't marked as final and determine whether
> it's actually supposed to be virtual or not). Having non-virtual be the
> default makes functions efficient by default.
>

Aye. This, and maybe an even more important result: it makes it _explicit_
by default, ie, it was intended by the author.
You can tell just by looking how the interface is intended to be used, and
the intent of the programmer who wrote it.

No more guessing, or lengthy searches through all derived classes to find out if it actually is overridden. And what if your interface is public...?

Making a function virtual is a one-way trip; it can never be revoked.
Virtual-by-default grants that permission to override implicitly, and
removes any possibility of revoking it in the future.
The converse isn't true: a non-virtual function can safely become virtual at
any later time if it becomes a requirement, or is requested by a customer.
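
A tiny sketch of that asymmetry (hypothetical class, current D syntax):

class Widget
{
    // v1 ships this as final; dropping 'final' in v2 lets clients start
    // overriding it without breaking anyone.
    final int width() { return _w; }

    // v1 ships this as virtual (the default); once clients override it,
    // adding 'final' in v2 is a breaking change ("cannot override final
    // function").
    int height() { return _h; }

    private int _w, _h;
}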

There are some correctness advantages too. A function that's not marked
virtual was obviously not intended to be overridden by design; the author
of the class may have never considered the possibility, and the class might
not even work if someone unexpectedly overrides it somewhere.
If virtual is requested/added at a later time, the author, when considering
if it's a safe change, will likely take the time to validate that the new
usage (which he obviously didn't consider when designing the class
initially) is sound against the rest of the class... this is a good thing
if you ask me.


June 02, 2013
On 2 June 2013 21:46, Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> wrote:

> On 06/02/2013 08:33 AM, Manu wrote:
> > Most of these guys are mathematicians and physicists first, and
> > programmers second.
>
> You've hit the nail on the head, but it's also a question of priorities.
> It's _essential_ that the maths or physics be understood and done right.


Well this is another classic point actually. I've been asked by my friends
at Cambridge to give their code a once-over for them on many occasions, and
while I may not understand exactly what their code does, I can often spot
boat-loads of simple functional errors. Like basic programming bugs;
out-by-ones, pointer logic fails, clear lack of understanding of floating
point, or logical structure that will clearly lead to incorrect/unexpected
edge cases.
And it blows my mind that they then run this code on their big sets of
data, write some big analysis/conclusions, and present this statistical
data in some journal somewhere, and are generally accepted as an authority
and taken seriously!

*brain asplode*

I can tell you I usually offer more in the way of fixing basic logical
errors than actually making it run faster ;)
And they don't come to me with everything, just the occasional thing that
they have a hunch should probably be faster than it is.

I hope my experience there isn't too common, but I actually get the feeling
it's more common than you'd like to think!
This is a crowd I'd actually love to promote D to! But the tools they need
aren't all there yet...


> It's essential that the programs correctly reflect that maths or physics.
> It's merely _desirable_ that the programs run as fast as possible, or be
> well designed from a maintenance point of view, or any of the other things
> that matter to trained software developers.  (In my day job I have to
> continually force myself to _not_ refactor or optimize my code, even
> though I'd get a lot of pleasure out of doing so, because it's working
> adequately and my work priority is to get results out of it.)
>
> That in turn leads to a hiring situation where the preference is to have
> mathematicians or physicists who can program, rather than programmers who
> can learn the maths.  It doesn't help that because of the way academic
> funding is made available, the pay scales mean that it's not really
> possible to attract top-level developers (unless they have some kind of
> keen moral desire to work on academic research); in addition, you usually
> have to hire them as PhD students or postdocs or so on (I've also seen
> masters' students roped in to this end), which obviously constrains the
> range of people that you can hire and the range of skills that will be
> available, and also the degree of commitment these people can put into
> long-term vision and maintenance of the codebase.
>
> There's also a training problem -- in my experience, most physics
> undergraduates are given a crash course in C++ in their first year and
> not much in the way of real computer science or development training.  In
> my case as a maths undergraduate, the first opportunity to learn
> programming was in the 3rd year of my degree course, and it was a crash
> course in a very narrow subset of C dedicated towards numerical
> programming.  And if (like me) you then go on into research, you largely
> have to self-teach, which can lead to some very idiosyncratic approaches.
>

Yeah, this is an interesting point. These friends of mine all write C code,
not even C++.
Why is that?
I guess it's promoted because they're supposed to be into the whole 'HPC'
thing, but C is really not a good language for doing maths!

I see stuff like this:
float ***cubicMatrix = (float***)malloc(sizeof(float**) * depth);
for (int z = 0; z < depth; z++)
{
  cubicMatrix[z] = (float**)malloc(sizeof(float*) * height);
  for (int y = 0; y < height; y++)
  {
    cubicMatrix[z][y] = (float*)malloc(sizeof(float) * width);
  }
}

Seriously, float***. Each 1d row is an individual allocation!
And then somewhere later on they want to iterate a column rather than a
row, and get confused about the pointer arithmetic (well, maybe not
precisely that, but you get the idea).
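
For contrast, a minimal sketch of the same volume in D (hypothetical helper type, one contiguous allocation; not something they'd actually written):

struct Cube
{
    float[] data;
    size_t depth, height, width;

    this(size_t d, size_t h, size_t w)
    {
        depth = d; height = h; width = w;
        data = new float[](d * h * w);  // one allocation for the whole volume
    }

    // One index computation; walking a column is just a different stride,
    // with no pointer arithmetic to get confused by.
    ref float opIndex(size_t z, size_t y, size_t x)
    {
        return data[(z * height + y) * width + x];
    }
}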


> I hope that this will change, because programming is now an absolutely
> essential part of just about every avenue of scientific research.  But as
> it stands, it's a serious problem.
>


June 02, 2013
On Sunday, 2 June 2013 at 14:34:43 UTC, Manu wrote:
> On 2 June 2013 21:46, Joseph Rushton Wakeling

> Well this is another classic point actually. I've been asked by my friends
> at Cambridge to give their code a once-over for them on many occasions, and
> while I may not understand exactly what their code does, I can often spot
> boat-loads of simple functional errors. Like basic programming bugs;
> out-by-ones, pointer logic fails, clear lack of understanding of floating
> point, or logical structure that will clearly lead to incorrect/unexpected
> edge cases.
> And it blows my mind that they then run this code on their big sets of
> data, write some big analysis/conclusions, and present this statistical
> data in some journal somewhere, and are generally accepted as an authority
> and taken seriously!

You're making this up. I'm sure they do a lot of data-driven
tests or simulations that make most errors detectable. They may
not be savvy programmers, and their programs may not be
error-free, but boat-loads of errors? C'mon.

June 02, 2013
On 06/02/2013 02:26 PM, finalpatch wrote:
> IIRC -inline is a DMD specific switch. Adding this to gdc command line produces this:
> 
> gdc.exe: error: unrecognized command line option '-inline'

GDC and LDC both have their own equivalent flags -- for GDC it's -finline-functions (which as Iain says is already covered by -O3), for LDC it's -enable-inlining.

It probably won't make any difference for LDC either with -O3 enabled, but might be worth checking.

June 02, 2013
On Sunday, 2 June 2013 at 15:59:38 UTC, Joseph Rushton Wakeling wrote:
> GDC and LDC both have their own equivalent flags -- for GDC it's
> -finline-functions (which as Iain says is already covered by -O3), for LDC it's
> -enable-inlining.
>
> It probably won't make any difference for LDC either with -O3 enabled, but might
> be worth checking.

It doesn't affect behavior on -O2 and above.

David
June 02, 2013
On Sunday, 2 June 2013 at 15:53:58 UTC, Roy Obena wrote:
> On Sunday, 2 June 2013 at 14:34:43 UTC, Manu wrote:
>> On 2 June 2013 21:46, Joseph Rushton Wakeling
>
>> Well this is another classic point actually. I've been asked by my friends
>> at Cambridge to give their code a once-over for them on many occasions, and
>> while I may not understand exactly what their code does, I can often spot
>> boat-loads of simple functional errors. Like basic programming bugs;
>> out-by-ones, pointer logic fails, clear lack of understanding of floating
>> point, or logical structure that will clearly lead to incorrect/unexpected
>> edge cases.
>> And it blows my mind that they then run this code on their big sets of
>> data, write some big analysis/conclusions, and present this statistical
>> data in some journal somewhere, and are generally accepted as an authority
>> and taken seriously!
>
> You're making this up. I'm sure they do a lot of data-driven
> tests or simulations that make most errors detectable. They may
> not be savvy programmers, and their programs may not be
> error-free, but boat-loads of errors? C'mon.

I really wish he were making it up. Sadly, he's not.

A lot of HPC scientific code is, at best, horribly fragile.