Thread overview | |||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
January 09, 2007 suggested improvements to D | ||||
---|---|---|---|---|
| ||||
Here are some suggested improvements for D. a <--> b; swaps a and b, same as making a tempvar t and then t=a; a=b; b=t; As was pointed out by Robert W. Floyd ages ago, in his Turing award lecture (but nobody listened) this comes up so often it is worth having special syntax for it. labeled break and continue statements: that was a fine idea to eliminate a lot of gotos. HOWEVER, the most common way in which gotos happen that's not covered by D's break and continue is this: for(some loop condition){ if(something){ goto A; } } fallthru_code; A: break_out_code; is there a nice way to handle these without gotos? One possibility is to add a "default" block at ends of loops: for(some loop condition){ if(something){ break; } }default{ fallthru_code; } break_out_code; popcount and first1: Often one represents a set by means of an array of booleans, or equivalently by the bits of a machine word. Now, the cray used to have a wonderful "popcount" machine-instruction a = popcount(b); which would cause the number of 1-bits in the uint b, to be placed in the integer a. E.g. popcount(21)=3. Also, some machines had a "first1" instruction which would cause, e.g, c=0; printf("here are the locations of the 1bits in b: "); for(a = first1(b); b!=0; b >>= a){ c += a; printf("%d,", c); } printf("\n"); to work as a way to loop thru the elements of the set. This was all very fine. It led to TREMENDOUS speedup and simplification of the right kinds of programs. BUT, later hardware got rid of the instruction and languages did not have these as features. Vicious circle: Hardware designers: no language has popcount and first1, so we see no reason to support it in hardware. Language designers: hardware does not support these, so why should our language? Break the cycle - put these in D! This'll speed up chess programs by a factor of 2. [Aside: Another excellent hardware instruction that as far as I know was never implemented is the "dovetail" and "undovetail" instructions which place the even-indexed bits of a word into a halfword, and the odd-index ones into another (undovetail) and dovetail is the reverse.] Undenying access to arithmetic: hardware goes to a great amount of trouble to provide add-with-carry, multiply-two-singlewords-to-get-a-doubleword, long-division-with-quotient&remainder (if a is a doubleword divide a by b to get quotient q and remainder r, all singlewords), etc. to us. Another good thing available in a lot of hardware is shift-with-carry, and circular shift with and without carry. Why the hell does the language then tell the user to go screw themselves - we intend to deny you access to that power??!!! Hello? This makes it very painful or difficult or slow to build a bignum package. It is just gratuitously asinine. If D delivered this it'd (a) be veyr easy, (b) instantly attract a lot of converts and you could build a portable bignum code that ran about 4 times faster. Warren D. Smith http://RangeVoting.org <-- add your endorsement (by clicking "endorse" as 1st step) and math.temple.edu/~wds/homepage/works.html |
January 09, 2007 Re: suggested improvements to D | ||||
---|---|---|---|---|
| ||||
Posted in reply to Warren D Smith | Warren D Smith wrote: > Here are some suggested improvements for D. [snipped suggestions] 'swap' can easily be done with a function template: void swap(T) (inout T a, inout T b) { T t = a; a = b; b = t; } The second example seems like a perfectly reasonable place to just use goto. The other suggestions seem like precisely the kind of thing we have an inline assembler for. -- Kirk McDonald Pyd: Wrapping Python with D http://pyd.dsource.org |
January 09, 2007 Re: suggested improvements to D | ||||
---|---|---|---|---|
| ||||
Posted in reply to Warren D Smith | Reply to Warren, > popcount and first1: available using intrinsics > to work as a way to loop thru the elements of the set. > This was all very fine. It led to TREMENDOUS speedup and > simplification of the right kinds of programs. > BUT, later hardware got rid of the instruction and languages > did not have these as features. Vicious circle: > Hardware designers: no language has popcount and first1, so we see > no > reason to support it in hardware. > Language designers: hardware does not support these, so why should > our language? > Break the cycle - put these in D! This'll speed up chess programs by > a factor of 2. > IIRC thay can be emulated with a bit not and an add or subtract (I forget exactly how) > Undenying access to arithmetic: > hardware goes to a great amount of trouble to provide add-with-carry, > multiply-two-singlewords-to-get-a-doubleword, > long-division-with-quotient&remainder > (if a is a doubleword divide a by b to get quotient q and remainder r, > all singlewords), etc. to us. [...] > to build a bignum package. It is just gratuitously asinine. If D > delivered this it'd (a) be veyr easy, (b) instantly attract a lot of > converts > and you could build a portable bignum code that ran about 4 > times faster. > Warren D. Smith take a look at this: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D.announce&article_id=6217 Not fantastically clean, but language support for these without using inline ASM would be "difficult". |
January 09, 2007 generalised slicing | ||||
---|---|---|---|---|
| ||||
Posted in reply to Warren D Smith | On Tue, 09 Jan 2007 01:40:25 +0000, Warren D Smith wrote: > Here are some suggested improvements for D. You should post them piecewise on http://all-technology.com/eigenpolls/dwishlist/ > > popcount and first1: > Often one represents a set by means of an array of booleans, or > equivalently by the bits of a machine word. > Now, the cray used to have a wonderful "popcount" machine-instruction > a = popcount(b); > which would cause the number of 1-bits in the uint b, to be placed in > the integer a. > E.g. popcount(21)=3. > Also, some machines had a "first1" instruction which would cause, e.g, > c=0; > printf("here are the locations of the 1bits in b: "); > for(a = first1(b); b!=0; b >>= a){ > c += a; > printf("%d,", c); > } > printf("\n"); > to work as a way to loop thru the elements of the set. > > This was all very fine. It led to TREMENDOUS speedup and > simplification of the right kinds of programs. > BUT, later hardware got rid of the instruction and languages > did not have these as features. Vicious circle: > Hardware designers: no language has popcount and first1, so we see no > reason to support it in hardware. > Language designers: hardware does not support these, so why should > our language? > Break the cycle - put these in D! This'll speed up chess programs by > a factor of 2. > Well if we generalise the concept of array slicing to index generator functions. Then if onebits(b) returned the index for 1-bits in b then you example could be written: foreach(c in onebits(b)) { printf("%d,", c); } and popcount(b) as onebits(b).length expanding the generator functions to more dimensions would be very helpful in numerical calculations. Ex. the generator function symmetric(3) generated the 3 dimensional symmetric permutations e.g.: (0,1,2) (2,0,1) (1,2,0) and antisymmetic(3): (0,2,1) (2,1,0) (1,0,2) then using the vectorzation syntax from http://all-technology.com/eigenpolls/dwishlist/index.php?it=10 the determinant of the 3x3 matrix "a" can be written by. det=sum([i in symmetric(3)](a[0,i[0]]*a[1,i[1]]*a[2,i[2])) -sum([i in antisymmetric(3)](a[0,i[0]]*a[1,i[1]]*a[2,i[2])); and the general nxn case can be written as det=sum([i in symmetric(n)](multiply([j in 0..n](a[j,i[j]]))) -sum([i in antisymmetric(n)](multiply([j in 0..n](a[j,i[j]]))); If the generator function is defined by a constant like symmetric(3) the loop should be unfolded at compiler time for optimal speed. D could use a interface for defining index generator functions such that this is possible. |
January 09, 2007 Re: suggested improvements to D (Swap Operator) | ||||
---|---|---|---|---|
| ||||
Posted in reply to Warren D Smith | Warren D Smith Wrote: > a <--> b; > swaps a and b, same as making a tempvar t and then t=a; a=b; b=t; > As was pointed out by Robert W. Floyd ages ago, in his Turing award > lecture (but nobody listened) this comes up so > often it is worth having special syntax for it. A few simpler symbols: <-> I think this would work best >< Simpler, but the developer could possibly confuse this for another symbol :: I think this is a good symbol, except it can be confused with C's scope operator AFAIK, D makes no use of any of these. |
January 09, 2007 Re: suggested improvements to D | ||||
---|---|---|---|---|
| ||||
Posted in reply to Warren D Smith | == Quote from Warren D Smith (wds@math.temple.edu)'s article
> is there a nice way to handle these without gotos?
Using lazy evaluation seems D-ish here:
bool otherwise( lazy void dg)
{
dg;
return false;
}
void main()
{
while( false || otherwise( printf("Falling through.\n")) ){
// if( something) break
}
printf( "breaking out.\n");
}
|
January 09, 2007 Re: suggested improvements to D | ||||
---|---|---|---|---|
| ||||
Posted in reply to Warren D Smith | Warren D Smith wrote: > Here are some suggested improvements for D. > > a <--> b; > swaps a and b, same as making a tempvar t and then t=a; a=b; b=t; > As was pointed out by Robert W. Floyd ages ago, in his Turing award > lecture (but nobody listened) this comes up so > often it is worth having special syntax for it. As has been pointed out, you can write a swap function for this. Actually, what I'd really like to see is this being made possible: a,b = b,a; Yet another cool thing we could steal from Python :3 > labeled break and continue statements: > that was a fine idea to eliminate a lot of gotos. > HOWEVER, the most common way in which gotos happen that's not covered by D's > break and continue is this: > > for(some loop condition){ > if(something){ goto A; } > } > fallthru_code; > A: > break_out_code; > > is there a nice way to handle these without gotos? One possibility > is to add a "default" > block at ends of loops: > for(some loop condition){ > if(something){ break; } > }default{ > fallthru_code; > } > break_out_code; What I've always wanted was this: for( blah ) { ... } else { ... } The same thing for all loops, actually. And while I'm here, these would be nice, too: foreach( foo ; bar ) { first { // Do something with first element }; else { // Do something with non-first elements }; } > popcount and first1: > ... I'm sure every architecture could point to a cool opcode that isn't directly supported in any language and go "why not?" I won't say anything either way except: I don't really see the need. > [Aside: > Another excellent hardware instruction that as far as I know was never > implemented is > the "dovetail" and "undovetail" instructions > which place the even-indexed bits of a word into a halfword, and the > odd-index ones > into another (undovetail) and dovetail is the reverse.] I think there's some similar stuff in MMX/SSE, might be worth checking out. > Undenying access to arithmetic: I do agree that this is a cardinal sin of high level languages these days, but I wouldn't say they "deny" us access. It's more a case of "haven't worked out a good way of doing it". I mean, I'm not sure how to take > c = a + b; and come up with any half-decent syntax for adding the carry bit into the mix. Actually, I think the way most languages deal with numbers is a bit strange: "+" in D isn't *really* addition: it just looks like it 999 times out of 1000. So I *do* think we should be given access to things like the carry bits, and other features like it without having to resort to assembler, but I'm yet to see a good proposal on how to actually do so. (Libraries aren't a great solution since one of the reasons to use the carry bit is speed: and you just know that all those function calls aren't helping. That reminds me: can we have expression templates? ;P) -- Daniel |
January 09, 2007 Re: suggested improvements to D | ||||
---|---|---|---|---|
| ||||
Posted in reply to Warren D Smith | Warren D Smith wrote: > popcount and first1: > Often one represents a set by means of an array of booleans, or > equivalently by the bits of a machine word. > Now, the cray used to have a wonderful "popcount" machine-instruction > a = popcount(b); > which would cause the number of 1-bits in the uint b, to be placed in > the integer a. > E.g. popcount(21)=3. > Also, some machines had a "first1" instruction which would cause, e.g, > c=0; > printf("here are the locations of the 1bits in b: "); > for(a = first1(b); b!=0; b >>= a){ > c += a; > printf("%d,", c); > } > printf("\n"); > to work as a way to loop thru the elements of the set. > > This was all very fine. It led to TREMENDOUS speedup and > simplification of the right kinds of programs. > BUT, later hardware got rid of the instruction and languages > did not have these as features. We have first1 in the form of the bsf and bsr intrinsics. popcount is being added to X86 together with the next SSE revision, so it's coming back into hardware. I would like to see a library implementation of popcount added. Here's one I wrote some time ago: /** * Calculates the number of set bits in a 32-bit integer. * */ int bitcount(uint x) { // Avoid branches, and the potential for cache misses which // could be incurred with a table lookup. // We need to mask alternate bits to prevent the // sum from overflowing. // add neighbouring bits. Each bit is 0 or 1. x = x - ((x>>1) & 0x5555_5555); // now each two bits of x is a number 00,01 or 10. // now add neighbouring pairs x = ((x&0xCCCC_CCCC)>>2) + (x&0x3333_3333); // now each nibble holds 0000-0100. Adding them won't // overflow any more, so we don't need to mask any more // Now add the nibbles, then the bytes, then the words // We still need to mask to prevent double-counting. // Note that if we used a rotate instead of a shift, we // wouldn't need the masks, and could just divide the sum // by 8 to account for the double-counting. // On some CPUs, it may be faster to perform a multiply. x += (x>>4); x &= 0x0F0F_0F0F; x += (x>>8); x &= 0x00FF_00FF; x += (x>>16); x &= 0xFFFF; return x; } unittest { assert(bitcount(0)==0); assert(bitcount(7)==3); assert(bitcount(0xAA)==4); assert(bitcount(0x8421_1248)==8); assert(bitcount(0xFFFF_FFFF)==32); assert(bitcount(0xCCCC_CCCC)==16); assert(bitcount(0x7777_7777)==24); } > [Aside: > Another excellent hardware instruction that as far as I know was never > implemented is > the "dovetail" and "undovetail" instructions > which place the even-indexed bits of a word into a halfword, and the > odd-index ones > into another (undovetail) and dovetail is the reverse.] Great! I never had a name for those. > Undenying access to arithmetic: > hardware goes to a great amount of trouble to provide add-with-carry, > multiply-two-singlewords-to-get-a-doubleword, > long-division-with-quotient&remainder > (if a is a doubleword divide a by b to get quotient q and remainder r, > all singlewords), etc. to us. > Another good thing available in a lot of hardware is shift-with-carry, > and circular shift with and without carry. > Why the hell does the language then tell the user to go screw > themselves - we intend to deny > you access to that power??!!! > Hello? This makes it very painful or > difficult or slow > to build a bignum package. It is just gratuitously asinine. If D > delivered this it'd (a) be veyr easy, (b) instantly attract a lot of converts > and you could build a portable bignum code that ran about 4 > times faster. Rotations without carry would be very nice. But for the others, are there many applications of these OTHER than bignum arithmetic? My feeling is that you'd always want to do bignum in asm anyway. Note that D is a great language for writing asm in; you can have a single source code shared between Windows, *nix, and the new x86 Macs. |
January 09, 2007 Re: suggested improvements to D | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kirk McDonald | Kirk McDonald escribió: > Warren D Smith wrote: >> Here are some suggested improvements for D. > [snipped suggestions] > > 'swap' can easily be done with a function template: > > void swap(T) (inout T a, inout T b) { > T t = a; > a = b; > b = t; > } This should be in phobos, if it's not allready. -- Leandro Lucarella Integratech S.A. 4571-5252 |
January 09, 2007 Re: suggested improvements to D | ||||
---|---|---|---|---|
| ||||
Posted in reply to Warren D Smith | Reply to Warren, > Here are some suggested improvements for D. > [...] > labeled break and continue statements: > that was a fine idea to eliminate a lot of gotos. > HOWEVER, the most common way in which gotos happen that's not covered > by D's > break and continue is this: > for(some loop condition){ > if(something){ goto A; } > } > fallthru_code; > A: > break_out_code; loop: while(false) { for(some loop condition) { if(something) break loop; } fallthru_code; } break_out_code; the dummy labeled loop body might be just useful enough to make this legal bod: { // labeled compound statement if(cond) break bod; else continue bod; } |
Copyright © 1999-2021 by the D Language Foundation