February 02, 2005 Re: initializer syntax | ||||
---|---|---|---|---|
| ||||
Posted in reply to Vathix | "Vathix" <vathix@dprogramming.com> wrote in message news:opslk5o5vckcck4r@esi...
> > P.P.S. it's been suggested that the special initializer syntax:
> > = void;
> > mean "I know what I'm doing, don't initialize the variable" and I've been
> > considering implementing it.
>
> I like it, but will it work with 'new'?
No.
> When newing arrays and value types
> one might also not want to initialize.
True, but I don't think that's a good idea. The cases where initialization of an array *might* make a difference (the critical path in a program tends to be only in a small part of it) are so unusual it is not worth upsetting new. And frankly, uninitialized garbage in gc allocated data can cause problems with the mark/sweep algorithm, and would pull the rug out from doing a future type-aware gc. Use std.c.malloc for allocating uninitialized arrays; if it must be new'd, instead use a wrapper class that malloc's/free's an internal private array.
|
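For readers who want to see the malloc-based wrapper idea spelled out, here is a minimal sketch in plain C rather than D. The names `RawBuffer`, `raw_buffer_init` and `raw_buffer_free` are made up for illustration and are not part of any library; only `malloc` and `free` are real calls. The buffer contents are indeterminate after `raw_buffer_init`, so the caller must write every element before reading it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper that owns a privately malloc'd array. */
typedef struct {
    unsigned char *data;   /* uninitialized storage, owned by the wrapper */
    size_t length;         /* number of bytes in data */
} RawBuffer;

/* Allocate n bytes WITHOUT initializing them. Returns 0 on success, -1 on failure. */
int raw_buffer_init(RawBuffer *b, size_t n) {
    assert(b != NULL && n > 0);
    b->data = malloc(n);   /* malloc leaves the contents indeterminate */
    b->length = (b->data != NULL) ? n : 0;
    return (b->data != NULL) ? 0 : -1;
}

/* Release the storage and reset the wrapper so it cannot be used by accident. */
void raw_buffer_free(RawBuffer *b) {
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->length = 0;
}
```

Because the memory never passes through the garbage collector, stale bit patterns in it cannot be mistaken for live pointers, which is the concern raised above.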
February 02, 2005 Re: [performance]PreInitializing is an annoyance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter | "Walter" wrote:
[...]
> I honestly think that in a non-trivial program, you'd be very, very hard pressed to see a measurable difference in program performance from this.
The measure to be taken in the case I put in this discussion is the steadiness of the run achieved by lazy initializing. This is costly in total.
-manfred
|
February 02, 2005 Re: [performance]PreInitializing is an annoyance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Walter | Walter wrote:
> Currently, the only cases where C fits better are:
>
> 1) you need to work with existing C code
> 2) there isn't a D compiler for the target
> 3) you're working with a tool that generates C code
> 4) your staff is content using C and will not try anything else
5) You want your code to look pretty:
http://www.de.ioccc.org/2004/anonymous.c
:-)
|
February 02, 2005 Re: [performance]PreInitializing is an annoyance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Norbert Nemec | "Norbert Nemec" <Norbert@Nemec-online.de> wrote in message news:ctrj9i$146d$1@digitaldaemon.com...
> Walter wrote:
> > Currently, the only cases where C fits better are:
> >
> > 1) you need to work with existing C code
> > 2) there isn't a D compiler for the target
> > 3) you're working with a tool that generates C code
> > 4) your staff is content using C and will not try anything else
>
> 5) You want your code to look pretty:
> http://www.de.ioccc.org/2004/anonymous.c
> :-)
>
You might want to check out: http://fly.srk.fer.hr/ioccc/years.html#1986_bright
<g>
|
February 03, 2005 Re: [performance]PreInitializing is an annoyance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Norbert Nemec | On 2005-02-02 02:18:42 -0600, Norbert Nemec <Norbert@Nemec-online.de> said:
> Writing assembler to get performance is about as outdated as counting
> cycles.
I disagree with this statement so very much I wouldn't even know where to begin. Granted, it all depends on the situation, and I'm not talking about writing generalized code. It's pointless to continue this because the argument is as old as a PDP-11 rotting away in an MIT basement. We could debate this till we're blue in the face; I won't convince you to break out an assembler and you won't convince me that a compiler can (or should) do it better. We're just going to have to agree to disagree, even though I don't even know what the point of this thread is supposed to be anymore.
> As I said before: if the code is simple enough to write it in
> assembler, it is also simple enough for a reasonable compiler to optimize
> it to the same extent.
>
> To exploit the full power of a modern processor, you have to do the right
> amount of loop unrolling, loop fusing, command interlacing and so on. You
> have to play with the data layout in memory, perhaps chunking arrays into
> smaller pieces. There are several more techniques to use, when you want to
> make full use of pipelining, branch prediction, cache lines and so on.
Maybe. Or if you had a copy of the CPU's programmer's manual you could just inline a nice slick column of opcodes and do exactly what you want, instead of crossing your fingers when you type make or spending all day with a profiler trying various C idioms to various results. I'd rather take the compiler's assembly output, grumble once, rewrite it properly and inline it back in.
> Languages like Fortran 95 would in principle allow the compiler to do all of
> this automatically (Some good implementations begin to emerge.)
Well you're most certainly never going to convince me to code in Fortran.
> Doing all of it by hand in C results in complete spaghetti code, but it is
> possible if you know exactly what you are doing. (In a course we did, we
> eventually transformed one single loop into an equivalent of ~500 lines of
> highly optimized spaghetti. The result was ten times faster than the
> original and somewhere around 80% of the absolute theoretical limit of the
> processor.)
500-line Duff devices don't impress me. They make me want to do everybody a big favor and promptly delete the last copy of the offending source file on the spot.
> The result was still pure C and therefore completely portable. The
> performance was, of course, tuned to one specific architecture, but there
> were basically constants to adjust for tuning it for about any modern
> processor.
>
> Doing the same thing in assembler would probably not be much faster. (After
> you went from 8% to 80%, the remaining factor of 1.25 probably isn't worth
> the effort. 80% peak performance is already well beyond what people usually
> go for.)
>
> Furthermore, writing that kind of spaghetti code in C without getting an
> error in already needs a lot of discipline. Doing the same thing in
> assembler will probably land you in the next psychiatry...
Heh, you obviously don't know machine. If that's the kind of stuff you want to write, you go for it man. I'd rather just put some inline SIMD code in an asm block. I don't care how good you think your Fortran compiler or disciplined spaghetti code is, it's never gonna know how to fill up all vector pipelines to normalize 16 vectors for the price of 4 or fire off a DMA chain to blit gigabytes of data at max utilization. But it's a free world. You can loop unroll, fuse, and chunk arrays if you want.
|
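To make the terms "loop fusing" and "loop unrolling" from the quoted text concrete, here is a small sketch in portable C. It is not the course code mentioned in the post; the function names `stats_simple` and `stats_fused_unrolled` are invented for this illustration. The second version merges two passes into one (fusion) and processes four elements per iteration with independent accumulators (unrolling). Whether it actually runs faster depends on the compiler and the processor, so treat it as an illustration of the transformation and not as a benchmark.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward version: two separate passes over the data. */
void stats_simple(const double *x, size_t n, double *sum, double *sumsq) {
    assert(x != NULL && sum != NULL && sumsq != NULL);
    double s = 0.0, q = 0.0;
    for (size_t i = 0; i < n; ++i) s += x[i];
    for (size_t i = 0; i < n; ++i) q += x[i] * x[i];
    *sum = s;
    *sumsq = q;
}

/* Fused and unrolled by 4: one pass, four independent accumulator pairs. */
void stats_fused_unrolled(const double *x, size_t n, double *sum, double *sumsq) {
    assert(x != NULL && sum != NULL && sumsq != NULL);
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    double q0 = 0.0, q1 = 0.0, q2 = 0.0, q3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += x[i];     q0 += x[i]     * x[i];
        s1 += x[i + 1]; q1 += x[i + 1] * x[i + 1];
        s2 += x[i + 2]; q2 += x[i + 2] * x[i + 2];
        s3 += x[i + 3]; q3 += x[i + 3] * x[i + 3];
    }
    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; ++i) {
        s0 += x[i];
        q0 += x[i] * x[i];
    }
    *sum = (s0 + s1) + (s2 + s3);
    *sumsq = (q0 + q1) + (q2 + q3);
}
```

Note that the unrolled version adds the terms in a different order, so for general floating-point input the two results can differ in the last bits. That reordering is also why a C compiler will not normally perform this transformation on its own without relaxed floating-point flags.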
February 03, 2005 Re: [performance]PreInitializing is an annoyance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ben Hinkle | On 2005-02-01 20:59:16 -0600, "Ben Hinkle" <ben.hinkle@gmail.com> said:
> D is aiming to support bare-metal programming, from what I gather from the Major Goals section of http://www.digitalmars.com/d/overview.html:
> "Provide low level bare metal access as required"
> D and Java are lightyears apart in terms of bare-metal access.
>
> The existing way to get gobs of uninitialized memory is to call std.c.stdlib.malloc and manage the memory by hand. That's fine and dandy but we want more. What was that old Queen song? "I want it all and I want it now"? Sounds good to me. :-)
Sorry, Ben. I was out of line with the Java comment. A little moment of blunt humor got the better of me. ;-) D most certainly can't be compared with Java in that (and many other) respects. That wasn't the intent of my point.
I'm just saying anyone who thinks they should be able to do a "ubyte[1<<30] videoData;" and thinks it *should* be optimal or else "we need to fix the compiler" deserves the headache they're going to get. ;-)
But I'm afraid malloc isn't the answer either. More like direct memory mapping, i.e. mmap/mlock.
|
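As an illustration of the mmap/mlock suggestion, here is a minimal sketch in C for POSIX-like systems. The helper names `map_buffer` and `unmap_buffer` are hypothetical; `mmap`, `mlock` and `munmap` are the real system calls. `MAP_ANONYMOUS` is assumed to be available, as it is on Linux and BSD-derived systems. An anonymous mapping is zero-filled by the kernel, typically page by page on first touch, so this does not hand back garbage; it defers the cost of initialization instead of removing it. `mlock` can fail when the process is over its locked-memory limit, so the sketch reports that instead of treating it as fatal.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* asks glibc to expose MAP_ANONYMOUS in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON   /* older BSD spelling */
#endif

/* Map n bytes of private anonymous memory. If locked is non-NULL, it is set
   to 1 when mlock succeeded and 0 when it did not. Returns NULL on failure. */
void *map_buffer(size_t n, int *locked) {
    assert(n > 0);
    void *p = mmap(NULL, n, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (locked != NULL)
        *locked = (mlock(p, n) == 0);   /* pin pages in RAM if allowed */
    return p;
}

/* Unmap a buffer obtained from map_buffer; unmapping also drops any lock. */
void unmap_buffer(void *p, size_t n) {
    if (p != NULL)
        munmap(p, n);
}
```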
February 03, 2005 Re: [performance]PreInitializing is an annoyance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brian Chapman |
Brian Chapman wrote:
> ...it's never gonna know how to
> fill up all vector pipelines to normalize 16 vectors for the price of 4 or fire off a DMA chain to blit gigabytes of data at max utilization.
Why?
|
February 04, 2005 Re: [performance]PreInitializing is an annoyance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Georg Wrede | On 2005-02-03 09:59:01 -0600, Georg Wrede <georg.wrede@nospam.org> said:
> Brian Chapman wrote:
>
>> ...it's never gonna know how to
>> fill up all vector pipelines to normalize 16 vectors for the price of 4 or fire off a DMA chain to blit gigabytes of data at max utilization.
>
> Why?
Excellent question. All enlightenment begins with asking a good "why?"
If you really want to understand, I would invite you to start by reading some of the great information available at arstechnica.com.
|
February 04, 2005 Re: [performance]PreInitializing is an annoyance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brian Chapman | Brian Chapman wrote:
> On 2005-02-03 09:59:01 -0600, Georg Wrede <georg.wrede@nospam.org> said:
>
>> Brian Chapman wrote:
>>
>>> ...it's never gonna know how to
>>> fill up all vector pipelines to normalize 16 vectors for the price of 4 or fire off a DMA chain to blit gigabytes of data at max utilization.
>>
>> Why?
>
> Excellent question. All enlightenment begins with asking a good "why?"
Thank you!
> If you really want to understand, I would invite you to start by reading some of the great information available at arstechnica.com.
Well, I at least hope the answer would be of general interest in this forum. Also, you seem to have a good idea of "why", based on the above quote. So, essentially a short(ish) answer would be appreciated.
|
February 04, 2005 Re: [performance]PreInitializing is an annoyance | ||||
---|---|---|---|---|
| ||||
Posted in reply to Brian Chapman | Brian Chapman wrote:
> On 2005-02-02 02:18:42 -0600, Norbert Nemec <Norbert@Nemec-online.de> said:
>
>> Writing assembler to get performance is about as outdated as counting cycles.
>
> I disagree with this statement so very much I wouldn't even know where to begin. Granted, it all depends on the situation, and I'm not talking about writing generalized code. It's pointless to continue this because the argument is as old as a PDP-11 rotting away in an MIT basement. We could debate this till we're blue in the face; I won't convince you to break out an assembler and you won't convince me that a compiler can (or should) do it better. We're just going to have to agree to disagree, even though I don't even know what the point of this thread is supposed to be anymore.
I agree - I've discussed this with several people and hardly ever came to a conclusion. High-performance numerics experts would certainly agree on it. Old-school assemblists would never...
>> To exploit the full power of a modern processor, you have to do the right amount of loop unrolling, loop fusing, command interlacing and so on. You have to play with the data layout in memory, perhaps chunking arrays into smaller pieces. There are several more techniques to use, when you want to make full use of pipelining, branch prediction, cache lines and so on.
>
> Maybe. Or if you had a copy of the CPU's programmer's manual you could just inline a nice slick column of opcodes and do exactly what you want, instead of crossing your fingers when you type make or spending all day with a profiler trying various C idioms to various results. I'd rather take the compiler's assembly output, grumble once, rewrite it properly and inline it back in.
Well - do so, if you like to, just to realize that once you've spent hours optimizing your code
>> Languages like Fortran 95 would in principle allow the compiler to do all of this automatically (Some good implementations begin to emerge.)
>
> Well you're most certainly never going to convince me to code in Fortran.
Me neither, that's why I would like to see the same features in D - so far, Fortran 95 is the only widespread language with that kind of performance.
>> Doing all of it by hand in C results in complete spaghetti code, but it is possible if you know exactly what you are doing. (In a course we did, we eventually transformed one single loop into an equivalent of ~500 lines of highly optimized spaghetti. The result was ten times faster than the original and somewhere around 80% of the absolute theoretical limit of the processor.)
>
> 500-line Duff devices don't impress me. They make me want to do everybody a big favor and promptly delete the last copy of the offending source file on the spot.
The algorithm was simple but nontrivial: solving partial differential equations. The original code was not stupidly coded, but just straightforward, as anyone would write it at the first shot unless they think of tricky issues of modern processor architecture. Back in the cycle-counting times, the latter version would have been even slower, since it did many integer operations that the original did not need.
> Heh, you obviously don't know machine. If that's the kind of stuff you want to write, you go for it man. I'd rather just put some inline SIMD code in an asm block. I don't care how good you think your Fortran compiler or disciplined spaghetti code is, it's never gonna know how to fill up all vector pipelines to normalize 16 vectors for the price of 4 or fire off a DMA chain to blit gigabytes of data at max utilization.
Why shouldn't it? As long as the compiler has the chance to reorder the instructions within certain constraints and has enough intelligence built in to search for the optimum order, it may do a pretty good job at crunching the numbers and find something quite efficient. The behaviour of the pipeline follows very strict rules that are different for each architecture. You put all the rules into a file and the compiler will optimize for a given architecture. Of course, this can only be done if the language gives the necessary flexibility. This is exactly the point why I believe that vectorized expressions in D are essential for high-performance computing.
> But it's a free world. You can loop unroll, fuse, and chunk arrays if you want.
I don't care about doing that myself. I would like to teach it to a compiler.
|
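The "chunking arrays into smaller pieces" technique, with a constant that is adjusted per architecture, can be sketched in C roughly as follows. This is an illustrative example and not the PDE code from the course described above: it uses a matrix transpose because the effect of blocking is easy to see there. `BLOCK`, `transpose_simple` and `transpose_blocked` are names invented for this sketch, and the value 32 is an arbitrary assumption that would need tuning on a real machine.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK 32   /* hypothetical tuning constant, adjusted per architecture */

/* Straightforward transpose of a rows x cols row-major matrix. */
void transpose_simple(const double *src, double *dst, size_t rows, size_t cols) {
    assert(src != dst);
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            dst[j * rows + i] = src[i * cols + j];
}

/* Same result, but the matrix is walked in BLOCK x BLOCK tiles so that the
   source rows and destination rows being touched stay in cache together. */
void transpose_blocked(const double *src, double *dst, size_t rows, size_t cols) {
    assert(src != dst);
    for (size_t ib = 0; ib < rows; ib += BLOCK) {
        size_t imax = (ib + BLOCK < rows) ? ib + BLOCK : rows;
        for (size_t jb = 0; jb < cols; jb += BLOCK) {
            size_t jmax = (jb + BLOCK < cols) ? jb + BLOCK : cols;
            for (size_t i = ib; i < imax; ++i)
                for (size_t j = jb; j < jmax; ++j)
                    dst[j * rows + i] = src[i * cols + j];
        }
    }
}
```

Both functions compute exactly the same values; only the order of memory accesses changes. That is the kind of reordering a compiler could do on its own if the language expressed the operation as a whole-array expression instead of explicit nested loops.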
Copyright © 1999-2021 by the D Language Foundation