July 29, 2012
Era Scarecrow:

As you guess I have had to use many arrays of bits in my programs :-)

> 4124 - Set/reset all bits ie:
>   BitArray ba = BitArray(100);
>   ba[] = true;
> Others haven't been done yet.

Among those 4124 suggestions the most important, and very simple to implement, are:
- set n-th bit (this can be a little more efficient than bs[n]=1;)
- reset n-th bit (this can be a little more efficient than bs[n]=1;)

Another commonly needed operation is a very fast bit count. There are very refined algorithms to do this.


> 7487 - Done. When prepending it extends the array to size_t extra and slowly backtracks until it needs to do it again.

Leaving a bit of "capacity" at the beginning is a good idea.


> 7488 - Done, union used and is 'compact' version (by default or resized and can fit)

Good, this makes BitArray usable in many other situations :-)


>> Related:
>> http://d.puremagic.com/issues/show_bug.cgi?id=6697
>
>  Not so sure. Could make a multi-dimentional one that returns slices to various sections, but  that's iffy. I'd need an example of how you'd use it with BitArray before I could build a solution.

I have needed many times a fast BitMatrix, but I see it as a data structure distinct from BitArray, so don't worry about making them inter-operate.

For me a BitMatrix must be as fast as possible, even if this causes some waste of memory (see my implementation in the attach of issue 6697) and I use it for medium and large bit matrices. Generally I don't to count the set bits in a bit matrix. So the two data structures need different capabilities and they are better implemented in different ways (this means a FastBitMatrix is not an array of a BitArray!).

Taking a row of a FastBitMatrix and turning it into a a BitArray sounds cute, but generally I don't need to do this operation, so I don't care about this feature.

Bye,
bearophile
July 29, 2012
On Sunday, 29 July 2012 at 14:41:48 UTC, bearophile wrote:
> As you guess I have had to use many arrays of bits in my programs :-)

 I'll do what I can :)

>> 4124 - Set/reset all bits ie:
>>  BitArray ba = BitArray(100);
>>  ba[] = true;
>> Others haven't been done yet.
>
> Among those 4124 suggestions the most important, and very simple to implement, are:
> - set n-th bit (this can be a little more efficient than bs[n]=1;)
> - reset n-th bit (this can be a little more efficient than bs[n]=1;)

 If it has to determine between compact and non-compact, then I really don't see how any speed improvement can be made, and if it's optimized it may be inlined which would be about as fast as you could get.

> Another commonly needed operation is a very fast bit count. There are very refined algorithms to do this.

 Likely similar to the hamming weight table mentioned in TDPL. Combined with the canUseBulk I think I could make it fairly fast.

>> 7487 - Done. When prepending it extends the array to size_t extra and slowly backtracks until it needs to do it again.
>
> Leaving a bit of "capacity" at the beginning is a good idea.

 With how it's made for speed and slices, it had to :P

>> 7488 - Done, union used and is 'compact' version (by default or resized and can fit)
>
> Good, this makes BitArray usable in many other situations :-)

 Yes, currently a debate of it's partial ref/value issue is coming up. As it is it's usable, and if you want to be sure it's unique you can always dup it, otherwise as a previous suggestion I'll either make two array types for either full reference vs value, or take his separate slices (reference) idea.

>>> Related:
>>> http://d.puremagic.com/issues/show_bug.cgi?id=6697
>>
>> Not so sure. Could make a multi-dimentional one that returns slices to various sections, but  that's iffy. I'd need an example of how you'd use it with BitArray before I could build a solution.
>
> I have needed many times a fast BitMatrix, but I see it as a data structure distinct from BitArray, so don't worry about making them inter-operate.

 A Matrix is just a multidimensional array right? I'll have to read up on it a bit; doing simple division (Likely whole rows of 32bit at a time) would do that job, but expanding width several times (compared to height) would be quite a bit slower; Although not too horrible.

> For me a BitMatrix must be as fast as possible, even if this causes some waste of memory (see my implementation in the attach of issue 6697) and I use it for medium and large bit matrices. Generally I don't to count the set bits in a bit matrix. So the two data structures need different capabilities and they are better implemented in different ways (this means a FastBitMatrix is not an array of a BitArray!).

 more likely bool array/matrix.. Although if you heavily use specific lines it could convert to bool[], and then back again, but if you go all over the place it would be worse than just using bool[].

> Taking a row of a FastBitMatrix and turning it into a a BitArray sounds cute, but generally I don't need to do this operation, so I don't care about this feature.

July 29, 2012
Era Scarecrow:

>  If it has to determine between compact and non-compact, then I really don't see how any speed improvement can be made,

Maybe it's better to consider two use cases: dynamic length allocated on the heap (or allocated with a given allocator, often a stack-styled allocator), and fixed-size allocated in place, so there's no need for that run-time test.


> and if it's optimized it may be inlined which would be about as fast as you could get.

I think that  "a[x] = 1;"  is slower than  "a.set(x);"  even if the array a is allocated in place and everything is inlined. This is probably the most important operation an array of bits has to support. If this is missing, I have to use my own implementation still.


>  Likely similar to the hamming weight table mentioned in TDPL. Combined with the canUseBulk I think I could make it fairly fast.

There is lot of literature about implementing this operation efficiently. For the first implementation a moderately fast (and short) code is probably enough. Later faster versions of this operation will go in Phobos, coming from papers.


>  Yes, currently a debate of it's partial ref/value issue is coming up. As it is it's usable, and if you want to be sure it's unique you can always dup it, otherwise as a previous suggestion I'll either make two array types for either full reference vs value, or take his separate slices (reference) idea.

The idea of making two different types (for the dynamic and static versions) sounds nice.

There is a third use case, that is bit arrays that start small and can grow, and rarely grow bigger than one or two words. This asks for a dynamic-length bit array type that allocates locally only when it's small (so it needs to perform run-time test of the tag. Too bad the D GC doesn't support tagged pointers!). But probably this use case is not common enough to require a third type of array. So I suggest to not worry about this case now, and to focus on the other two more common ones.


>  A Matrix is just a multidimensional array right?

Well, yes, but there are many ways to implement arrays and nD arrays. Sometimes the implementations differ.


> I'll have to read up on it a bit; doing simple division (Likely whole rows of 32bit at a time) would do that job,

Take a look at my implementation in Bugzilla :-) It's another type, the third. The 1D and 2D cases are the most common.


> but expanding width several times (compared to height) would be quite a bit slower; Although not too horrible.

A 2D array of bits probably doesn't need to change its size after creation, this is not a common use case.


>  more likely bool array/matrix.. Although if you heavily use specific lines it could convert to bool[], and then back again, but if you go all over the place it would be worse than just using bool[].

It's not so simple :-) A bool[] causes many more cache misses, if the matrix is not tiny. A bool[] or bool[][] is a good idea only if the matrix is very small, or if you don't have a bit matrix library and you don't want to write lot of code. Converting to bool[] the current line is not a good idea.

Bye,
bearophile
July 30, 2012
On Sunday, 29 July 2012 at 12:39:13 UTC, Era Scarecrow wrote:
>  But having them statically separated by name/type seems much more likely to be safer in the long run with reliable results.

 A question regarding templates. A template with different parameters is completely incompatible correct? So...

struct X(T) {
}

alias X!bool XB;
alias X!int XI;

void func(XB xb) {

}

func(XB()); //works
func(XI()); //should fail to compile

Same if it was built differently? so...

template X(bool something) //'bool something' should be statically checkable
  struct XY {
    static if (something) {
      //conditional functions or code
    }
    XY makeX(){
      //XY type is only currently formed template version correct?
      return XY();
    }
  }
}

//against above func, with these two...
alias X!(false).XY XB;
alias X!(true).XY XI;

Course if there's a way to avoid asking about the inner XY, I wonder how I would do that.


 Now if all that is correct, say I want to make two functions that both use X, but are not compatible, but template functions will allow it. So...

void tempFunc(T, U)(T t, U u)
if(
  //What do i enter to check they are both template X or struct XY? As signatures will be different...
)
body{
  //now acts as a middle-man to be compatible for said operation
}


 Finally, I came across an anomaly and may just be a bug. What about a postblits?


T func2(T)(T x) @safe pure {
  return T();
}

struct XY {
  this(this) @safe pure {} //safe pure added so func can call it
  void func(XY x) @safe pure {
    XY y = x; //postblits called on all three lines
    func2(x);
    func2(y);
  }
}

template X(bool something) {
struct XY {
    this(this) @safe pure {}
    void func(XY x) @safe pure {
      XY y = x; //Error: see below
      func2(x);
      func2(y);
    }
  }
}

alias X!(true).XY Xtrue;

pure function 'func' cannot call impure function '__cpctor'
safe function 'func' cannot call system function '__cpctor'
July 30, 2012
On 30-Jul-12 23:50, Era Scarecrow wrote:
> On Sunday, 29 July 2012 at 12:39:13 UTC, Era Scarecrow wrote:
>>  But having them statically separated by name/type seems much more
>> likely to be safer in the long run with reliable results.
>
>   A question regarding templates. A template with different parameters
> is completely incompatible correct? So...
>

Yeah, like twin brothers the have different IDs :)

> struct X(T) {
> }
>
> alias X!bool XB;
> alias X!int XI;
>
> void func(XB xb) {
>
> }
>
> func(XB()); //works
> func(XI()); //should fail to compile
>
> Same if it was built differently? so...
>
> template X(bool something) //'bool something' should be statically
> checkable
>    struct XY {
>      static if (something) {
>        //conditional functions or code
>      }
>      XY makeX(){
>        //XY type is only currently formed template version correct?
>        return XY();
>      }
>    }
> }
>
> //against above func, with these two...
> alias X!(false).XY XB;
> alias X!(true).XY XI;
>
> Course if there's a way to avoid asking about the inner XY, I wonder how
> I would do that.
>
>
>   Now if all that is correct, say I want to make two functions that both
> use X, but are not compatible, but template functions will allow it. So...
>
> void tempFunc(T, U)(T t, U u)
> if(
>    //What do i enter to check they are both template X or struct XY? As
> signatures will be different...
> )
> body{
>    //now acts as a middle-man to be compatible for said operation
> }
>

Not sure what you would like to accomplish here.

> alias X!(true).XY Xtrue;
>
> pure function 'func' cannot call impure function '__cpctor'
> safe function 'func' cannot call system function '__cpctor'

Yeah, it's a bug. File it please.

-- 
Dmitry Olshansky
July 30, 2012
On Mon, Jul 30, 2012 at 9:50 PM, Era Scarecrow <rtcvb32@yahoo.com> wrote:

>  A question regarding templates. A template with different parameters is
> completely incompatible correct?

Correct. They have no reason, in general, too even generate the same code:

template Chameleon(T)
{
    static if (is(T == struct))
        enum Chameleon = "I'm a string!"; // Chamelon!(SomeStruct) is
a string value
    else static if (is(T == class))
        void Chameleon(int i) { return i+1;} // Chameleon!(SomeClass)
is a function
    else
        alias T[int] Chameleon; // Here Chameleon is a type
}

The same for you struct S(T) { ... }. Inside the braces, anything can
happen (mwwaahhha hhaa!), depending on T.

> So...
>
> struct X(T) {
> }
>
> alias X!bool XB;
> alias X!int XI;
>
> void func(XB xb) {
>
> }
>
> func(XB()); //works
> func(XI()); //should fail to compile

Yes.

>  Now if all that is correct, say I want to make two functions that both use
> X, but are not compatible, but template functions will allow it. So...

I'm not sure I understand what you're trying to do. Do you mean you want a function that accept X!(T)s for any T, and not any other type?

struct X(T) {}

void func(T)(X!T x)
{}

void main()
{
    X!bool b;
    X!int i;
    func(b);
    func(i);
}

> void tempFunc(T, U)(T t, U u)
> if(
>   //What do i enter to check they are both template X or struct XY? As
> signatures will be different...
> )
> body{
>   //now acts as a middle-man to be compatible for said operation
> }

Here you want a constraint that checks that U and T are both
X!(SomeType) or an XY?

void tempFunc(T,U)(T t, U u) if (is(T a == X!(SomeType), SomeType) &&
is(U a == X!(SomeType), SomeType)
                                                || is(T == XY) && is(U == XY))
{
...
}


Is that what you need?
July 30, 2012
On Monday, 30 July 2012 at 20:19:51 UTC, Dmitry Olshansky wrote:
> Not sure what you would like to accomplish here.

Than an example...

struct BitArray { //assume template...
  ref BitArray opSliceAssign(T)(const T ba, int start, int end)
  if (
   //if T is type bitArray but only a different change regarding parameters
   //we can use canUseBulk and other speedups. Otherwise a range is required.
   //checking all possibilities seems silly, but possible
  )
  body {
    //slice specific speedup and code
    return this;
  }
  int opEquals(T)(const T ba) const
  //if constraint as above
  body {}
}

Code wise; If BitSet is always value semantics and BitArray is always reference (a previous post) and slices/groups work basically the same; then...

BitSet bs = BitSet(100);
BitArray ba = BitArray(100);

 bs[0 .. 10] = ba[0 .. 10]; //should work
 assert(bs[0 .. 10] == ba[0 .. 10]); //should also be possible to work.

>> alias X!(true).XY Xtrue;
>>
>> pure function 'func' cannot call impure function '__cpctor'
>> safe function 'func' cannot call system function '__cpctor'
>
> Yeah, it's a bug. File it please.

Alright... Considered a major (Maybe even blocking).

 Reminds me. Due to how rvalue/lvalues work, assuming I need to const reference something but I don't care if it's a temporary or not, would I then do...

struct X {
  int opCmp(const X x) const { //catch temporary
    return opCmp(x); //becomes reference, as it's now named 'x'
  }
  int opCmp(const ref X x) const {
    //do work here
    return 0;
  }
}

X x;
assert(x == x);   //ref
assert(x == X()); //temporary

 course if the 'in' keyword is the key i will have to test that to make sure.
July 30, 2012
On Monday, 30 July 2012 at 21:03:39 UTC, Era Scarecrow wrote:
> Alright... Considered a major (Maybe even blocking).

http://d.puremagic.com/issues/show_bug.cgi?id=8475
July 30, 2012
On Monday, 30 July 2012 at 20:48:26 UTC, Philippe Sigaud wrote:
>> Now if all that is correct, say I want to make two functions that both use X, but are not compatible, but template functions will allow it. So...
>
> I'm not sure I understand what you're trying to do. Do you mean you want a function that accept X!(T)s for any T, and not any other type?

 Anything of template X! previous post of mine shows sort of an example.

> struct X(T) {}
>
> void func(T)(X!T x)
> {}
>
> void main()
> {
>     X!bool b;
>     X!int i;
>     func(b);
>     func(i);
> }

 Hmmm i do think that seems right... but if it contains multiple parameters, then...?

template X(x1, x2, x3) {
 struct XT {}
}

> void func(T)(X!T x) {}
Will this still work?

> Here you want a constraint that checks that U and T are both
> X!(SomeType) or an XY?

 As before, of Template X, and struct XY, but T (or sometype) doesn't matter.

> void tempFunc(T,U)(T t, U u) if (is(T a == X!(SomeType), SomeType) &&
> is(U a == X!(SomeType), SomeType)
>                                                 || is(T == XY) && is(U == XY))
> {
> ...
> }
> Is that what you need?

 I want to say no... But I haven't actually tested it for my use cases. Leaving it unconstrained (without checking) to make it work is a disaster waiting to happen: I want T and U to both be of the same template (X in this case) but not the same instantiation arguments.

 I hope I'm not repeating myself too much.
July 30, 2012
On 31-Jul-12 01:03, Era Scarecrow wrote:
> On Monday, 30 July 2012 at 20:19:51 UTC, Dmitry Olshansky wrote:
>> Not sure what you would like to accomplish here.
>
> Than an example...
>

You can go for simpler separation:

struct BitArray{
//() - is an empty template spec
	ref BitArrayopSliceAssign()(const BitArray ba, int start, int end)
{
//two bit array can try balk mode etc.

>   Reminds me. Due to how rvalue/lvalues work, assuming I need to const
> reference something but I don't care if it's a temporary or not, would I
> then do...
>
> struct X {
>    int opCmp(const X x) const { //catch temporary
>      return opCmp(x); //becomes reference, as it's now named 'x'
>    }
>    int opCmp(const ref X x) const {
>      //do work here
>      return 0;
>    }
> }
>
> X x;
> assert(x == x);   //ref
> assert(x == X()); //temporary
>
>   course if the 'in' keyword is the key i will have to test that to make
> sure.
in == scope const not sure what scope buys here but couldn't hurt.

-- 
Dmitry Olshansky