August 02, 2012
Following from issue list:

bearophile_hugs 2010-10-04 18:16:44 PDT
> Another possible feature is to make std.bitmanip.bitfields generate two
> versions of the code, that get compiled conditionally according to the CPU
> endianness:

> static if (std.system.endian == std.system.Endian.BigEndian) {
>    ...
> } else {
>     ...
> }

Having it endian-specific brings up a couple of questions.

 Which endian is 'correct'? Resolutions could be...
  a) Both are correct (Leave as is)
  b) big/little is correct and should always convert to the other (which one?)
  c) Specify correct type as a flag, then swaps for the other in generated code

 Second, which would be the best approach?
  a) swap bytes in a temporary, modify the temporary, replace storage? (in every function)
  b) use the current endianness, then have saving/loading swap the bytes? (very unlikely)
  c) use a ubyte array and use shifting on each byte (little endian; also allows for odd byte sizes, not very likely but possible)

 I know the x86 bswap instruction (486+) will swap the appropriate bytes as a conversion in a single cycle; it could be incorporated via inline assembler, assuming the code is set up for it (though more likely the endian-conversion template would be updated, not the bitfield template itself).
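 For illustration, a hedged sketch of how such a conversion helper might look using core.bitop.bswap (the name toHostOrder is made up; real generated code would live inside the bitfields template):

```d
import core.bitop : bswap;

// Convert a 32-bit value between big-endian storage order and host order.
// bswap compiles down to the single-instruction byte swap on x86.
uint toHostOrder(uint storageValue)
{
    version (BigEndian)
        return storageValue;        // already in host order
    else
        return bswap(storageValue); // swap on little-endian hosts
}
```

 Since the conversion is its own inverse, applying it twice yields the original value on any host.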
August 02, 2012
repost from issue list:

Austin Hastings 2010-09-24 18:06:53 PDT
> Also, I think I'm going to request that repeated bitfield definitions be allowed if they are identical - I'd like to redeclare "opcode" rather than "".

How would you tell them apart? If I know how you'd want to call them, I may
be able to make something work. I can understand the use with registers, but I
still need some way to work with them. Perhaps as a set, then?

> So I would like the bitmanip code to permit redeclaration of bitfields that are identical in all respects.

> That is, obviously the names are the same, but the field width, offset, and type representation has to be the same as well.

Maybe....?

struct S {
    mixin(bitfields!(
        uint, "opcode", 4,
        uint, "register", 4,
        uint, "register", 4,
        uint, "register", 4
    ));
}

and using the registers would have function signatures like...

struct Register {
  uint register_1;
  uint register_2;
  uint register_3;

}

//setters, likely can't be @property
void register(uint reg1, uint reg2, uint reg3);
void register(uint[] register ...); //maybe?
void register(Register register);

//getter ??
Register register() const;

Or perhaps...

struct S {
    mixin(bitfields!(
        uint, "opcode", 4,
        uint, "reg1", 4,
        uint, "reg2", 4,
        uint, "reg3", 4
    ));
    mixin(sharedNameSet(
        "nameForGetterAndSetter",
        "struct name for returning/passing",
        "reg1", "reg2", "reg3" //named variables as a set
    ));

    //nameForGetterAndSetter's would be added here, perhaps as above.
}
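 For concreteness, here is a hand-written sketch of what such a sharedNameSet mixin might generate (the Register struct and the registers property name are just illustrative assumptions):

```d
import std.bitmanip : bitfields;

struct Register { uint reg1, reg2, reg3; }

struct S
{
    mixin(bitfields!(
        uint, "opcode", 4,
        uint, "reg1",   4,
        uint, "reg2",   4,
        uint, "reg3",   4));

    // What the hypothetical sharedNameSet("registers", ...) could expand to:
    @property Register registers() const
    {
        return Register(reg1, reg2, reg3);
    }

    @property void registers(Register r)
    {
        reg1 = r.reg1;
        reg2 = r.reg2;
        reg3 = r.reg3;
    }
}
```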
August 03, 2012
On Thursday, 2 August 2012 at 11:26:42 UTC, David Nadlinger wrote:
> Hm? You can just close pull requests (button at the bottom of the Page), and reopen them later.

 So I've removed the pull, but the auto tester is still pulling both code sources, so it's failing automatically because of it. Do I close the bitfields pull too? Or do I open a new pull for it?
December 31, 2012
 As BitArray is coming back up and I've resumed work on it, I'll raise a few questions and suggestions and go from there. I'll polish up the code and try to resubmit it.

On Saturday, 28 July 2012 at 21:07:31 UTC, Jonathan M Davis wrote:
> I would point out that while hasSlicing doesn't currently check for it, if opSlice doesn't return a type which is assignable to the original range, it won't work in a lot of Phobos functions. I keep meaning to bring that up for discussion in the main newsgroup. I'd argue that hasSlicing really be changed from

 Hmmm... Well, since the 'sometimes ref' discussed before won't seem to work well, I'm considering putting the following three rules into effect.
1) If it's an allocated block, it stays allocated.
2) If you dup a BitArray and it's small enough to be compact, it converts.
3) If it's a small/compact array and you give out a slice, it converts the original block to dynamic before returning the new slice.

 With 1 & 2, reserve and length work a little differently and are more ref-friendly.
 With 3, the only issue comes up if it's const, or a local variable.

 const BitArray x = BitArray(32);
 func(x);

 Would the function get passed the original compact buffer? Copy of x or a dup?
 May not be so much an issue. However...

 BitArray x = BitArray(32);
 const y = x[]; //slice

 Now if y points to the actual compact buffer and we do the following...

 x.length = 256; //allocates if it was compact

 Although y will have const access, the data is no longer a compact array but a pointer to an array, and it's broken. If x reallocated during that call then y would work fine no matter what. If x can't reallocate it, then the issue remains but is much smaller than before; most likely a dup would be safest.
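 For comparison, built-in slices already have this hazard when their owner reallocates; a plain-array sketch of the same situation:

```d
void demo()
{
    int[] x = new int[4]; // the "owner"
    int[] y = x;          // a view of the same data, like the const slice above
    x.length = 1000;      // far too big to grow in place, so x reallocates
    x[0] = 42;            // writes go to the new block...
    assert(y[0] == 0);    // ...while y still views the old, stale data
}
```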

 Thoughts? Questions? Ideas?
January 02, 2013
12/31/2012 9:35 PM, Era Scarecrow wrote:
>   As BitArray is coming back up and my resumed work I'll comment a few
> questions and suggestions and go from there. I'll polish up the code and
> try to resubmit it.

Yay!

> On Saturday, 28 July 2012 at 21:07:31 UTC, Jonathan M Davis wrote:
>> I would point out that while hasSlicing doesn't currently check for
>> it, if opSlice doesn't return a type which is assignable to the
>> original range, it won't work in a lot of Phobos functions. I keep
>> meaning to bring that up for discussion in the main newsgroup. I'd
>> argue that hasSlicing really be changed from
>
>   Hmmm.. Well Since the 'sometimes ref' as discussed won't seem to work
> well from before I'm considering to put the following two rules into
> effect.
> 1) If it's an allocated block, it stays allocated.
> 2) If you dup a BitArray and it's small enough to be compact, it converts.
> 3) If it's a small/compact array and you give out a slice, it converts
> the original block to dynamic before returning the new slice.
>
>   With 1 & 2, reserved and length work a little differently, more ref
> friendly.
>   With 3 the only issue comes up with if it's const, or a local variable.
>

Personally I believe we should introduce a slice of a BitArray as a separate range type that represents a view of the array (it can't be appended to, etc.; it's just a RA range with slicing).

>   const BitArray x = BitArray(32);
>   func(x);
>

Then here x is passed either by ref or a copy (depending on func).

>   Would the function get passed the original compact buffer? Copy of x
> or a dup?
>   May not be so much an issue. However...
>
>   BitArray x = BitArray(32);
>   const y = x[]; //slice
>
Then y has type BitArraySlice and references x.

>   Now if y points to the actual compact buffer and we do the following...
>
>   x.length = 256; //allocates if it was compact
>
>   Although y will have const access the data instead of a compact array
> it's a pointer to an array and broken.

Then IMO that should be a slice invalidation (akin to C++ iterator invalidation). Depending on how the slice is implemented it could avoid being invalidated in this case, but in some others it surely will (e.g. shrinking).

> If during the x call it

You mean x[] call i.e. opSlice() call?
Seems like an overkill and semantically bad as I totally expect that the following to be true for any Array-like container:
auto cont = ...;
auto range = cont[];
range[0] = 1;
assert(cont[0] == 1);

Also:
foreach(v; x) --> foreach(v; x[]) if x defines opSlice

however since BitArray defines opApply it's called instead so this one is probably OK.


> reallocated then y would work fine no matter what in that case. If x
> can't reallocate it, then the issue remains but is much smaller than
> before, most likely a dup would be safest.
>
>   Thoughts? Questions? Ideas?

The link to the latest code would be cool ;)

I'm thinking that the main problem is trying to mimic built-in arrays.

They are not ordinary containers in that they easily combine slice and container that is (in fact) partially hidden in run-time & GC implementation.

The fact that the combination of language features and run-time support makes it easy shouldn't encourage others to go this path in the library.

For one thing, this way a small string optimization is not achievable, because of the many slices that have to reference the data packed in a _small_ chunk. Then whenever the original array changes (and becomes not small) any other slice to it has to somehow reflect this change. And that's hard if slices internally are exactly the same as the original (the same type).

...

Now that I've tested the current behavior, it seems we shouldn't break a lot of it. Clearly the current BitArray is an old and rusty D1 design.

    BitArray a, b, c;
    //the interesting thing is that BitArray(32) doesn't construct a
    //BitArray with length 32 but some explosive garbage!!

    c.length = 512;
    a.length = 32;
    assert(a.length == 32);
    b = a;
    b[0] = true;
    //this one is bad as it means both a & b have to reference the same data
    assert(a[0] == 1);
    b ~= c;
    b[1] = 1;
    //the next fails for me (it also may not fail depending on c's length)
    assert(a[1] == 1);


Currently I think that the D container design has to be extended to include 2 kinds of containers:

 - small containers: value types a-la C++ STL; these all use the small-container optimization (as they cleanly can)
 - big heavy-duty ones: these have reference semantics and are tuned for large sets of values

Then the proposed way out is to make 2 containers:
- current BitArray to always have reference semantics (both asserts in the above always pass);
- and BitString or BitVector to be a small value-semantics array that is always copied/duped on assign; these should be passed by ref or by range/slice.
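The value-semantics half could be sketched with a postblit that dups on copy (BitString is the proposed name; the exact layout here is an assumption):

```d
struct BitString
{
    size_t[] data;   // bit storage
    size_t length;   // length in bits

    // Value semantics: every copy/assign duplicates the storage,
    // so no two BitStrings ever share data.
    this(this) { data = data.dup; }
}
```

Copying one and mutating the copy then leaves the original untouched.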

That was lengthy, but what do you think? And it seems like we'd need to bring this last topic to the general D NG.

-- 
Dmitry Olshansky
January 03, 2013
On Wednesday, 2 January 2013 at 21:00:38 UTC, Dmitry Olshansky wrote:
> 12/31/2012 9:35 PM, Era Scarecrow wrote:

> Personally I believe that if we introduce a slice of a BitArray as a separate range type that represents a view of array (but can't be appended to, etc. it's just a RA range with slicing).

 Could do that I suppose, but then either exact length or append/alignment optimizations may go out the window if we remove some of those as options. To support those, making slices part of the main type was easy enough. But the other issues mentioned below still have to be addressed.

>
>>  const BitArray x = BitArray(32);
>>  func(x);
>>
>
> Then here x is passed either by ref or a copy (depending on func).

 yeah... assuming func returned a struct of...

 struct S {
   BitArray x;
   //other stuff
 }

 const(BitArray) func(const BitArray s);

 In that case, with the small string optimization, the original either gets allocated OR gets passed outside the original function that made the BitArray. So: small string if possible; otherwise convert to ref and slice it, or dup it. Not 100% ref in all cases, but pretty close.

>>  Would the function get passed the original compact buffer? Copy of x or a dup?  May not be so much an issue. However...
>>
>>  BitArray x = BitArray(32);
>>  const y = x[]; //slice
>>
> Then y has type BitArraySlice and references x.

 Agreed, although I'm still iffy about a separate slice as a range. Could do the range just to give it limitations, but it seems like a headache: doing less, yet with extra code to manage.

 Current code I'm writing converts (if possible) the BitArray to an allocated type, then passes that slice to y.

>>  Now if y points to the actual compact buffer and we do the following...
>>
>>  x.length = 256; //allocates if it was compact
>>
>>  Although y will have const access the data instead of a compact array it's a pointer to an array and broken.
>
> Then IMO that should be a slice invalidation (akin to c++ iterator invalidation). Depending on how slice is implemented it could avoid being invalidated in this case but in some others it surely will (e.g. shrinking).

 Maybe; with the small string optimization (no allocation) it isn't such an easy option. That's what this is about. Say it uses the compact (fixed) storage, and in opSlice...

struct BitArray {
  bool isCompact;
  bool canExpand; //represents owner for slices or compact.
  union {
    size_t[] normal;
    size_t[2] compact;
  }
  BitArray opSlice() {
    BitArray ba = this;
    ba.canExpand = false;
    if (isCompact) {
      ba.isCompact = false;
      ba.normal = cast(size_t[]) this.compact[0 .. $];
    }

    return ba;
  }
}

 This is where the allocation could cause a problem, as the slice could be referring to stack memory in which all the data could be garbage. Worse, if you overwrite that data, suddenly the allocated memory could point to something else! I think it's safe to say that's a bad idea.

>> If during the x call it
>
> You mean x[] call i.e. opSlice() call?

 Well, both this(this) and opSlice() would be called at different times, so they would have different meanings (copying vs slicing).

> Seems like an overkill and semantically bad as I totally expect that the following to be true for any Array-like container:
> auto cont = ...;
> auto range = cont[];
> range[0] = 1;
> assert(cont[0] == 1);
>
> Also:
> foreach(v; x) --> foreach(v; x[]) if x defines opSlice
>
> however since BitArray defines opApply it's called instead so this one is probably OK.

 Does that mean we might be able to drop opApply? Hmmm... It depends on whether we have enough implemented that it's accepted as a range. I'll glance over std.range.

>> reallocated then y would work fine no matter what in that case. If x can't reallocate it, then the issue remains but is much smaller than before, most likely a dup would be safest.
>>
>>  Thoughts? Questions? Ideas?
>
> The link to the latest code would be cool ;)

 https://github.com/rtcvb32/phobos/blob/BitArray-Updates/std/bitmanip.d

 There's a merge issue due to having tried to keep the bitfields and BitArray changes separated; that seems to have caused more problems than it solved. I'm working on it, but it may have to wait until I've read my Git book, when it arrives, before I understand all this.

> I'm thinking that the main problem is trying to mimic built-in arrays.

 Not really; it's trying to do small string optimization. I suppose if this were done as a class instead, most of these problems could/would go away.

> They are not ordinary containers in that they easily combine slice and container that is (in fact) partially hidden in run-time & GC implementation.
>
> The fact that combination of language features and run-time support makes it easy shouldn't encourage other to go this path in the library.
>
> For one thing this way a small string optimization is not achievable because of the many slices that have to reference the data packed in _small_ chunk. Then whenever original array changes (and becomes not small) any other slice to it have to somehow reflect this change. And it's hard if slices internally are exactly the same as original (if the same type).

 Really depends on the use case. I'm guessing there's enough call for the 64-bit small packed bit array that justifies it; otherwise it would be better to throw it out. As you mention later, separating them does seem like a better idea.

> ...
>
> Now that I tested the current behavior that we shouldn't break a lot. Clearly current BitArray is old and rusty D1 design.
>
>     BitArray a, b, c;
>     //the interesting thing is that BitArray(32) doesn't construct BitArray with length 32 but some explosive garbage !!
>
>     c.length = 512;
>     a.length = 32;
>     assert(a.length == 32);
>     b = a;
>     b[0] = true;
> //this one is bad as it means both a & b has to reference the same data
>     assert(a[0] == 1);
>     b ~= c;
>     b[1] = 1;
>     //the next fails for me (it also may not fail depending on c's length)
>     assert(a[1] == 1);

 My code tests that the last two asserts fail; of course, a, being under 64 bits, stays with the small string optimization. Making it a slice... it seems I'll add this to my list of tests and get it fixed. Even if I reserved the space it still failed the last one. Mmm...

> Currently I think that D container design have to be extended to include 2 kinds of containers:
>
>  -small containers, values type a-la C++ STL these all use small container optimization (as they can cleanly)
>  -big heavy-duty ones, these have reference semantics and tuned for large sets of values
>
> Then the proposed way out is to make 2 containers:
> - current BitArray to always have reference semantics (both asserts in the above always pass);
> - and BitString or BitVector to be small value semantic array that is always copied/duped on assign, these should be passed by ref or by range/slice.
>
> That's was lengthy but what do you think? And it seems like we'd need to bring this last topic to general D NG.

 Hmmm... Either would need some code duplication, or a template parameter to turn off/swap the small string optimization. But separating them does seem like a good idea.
January 03, 2013
1/3/2013 6:05 AM, Era Scarecrow wrote:
> On Wednesday, 2 January 2013 at 21:00:38 UTC, Dmitry Olshansky wrote:
>> 12/31/2012 9:35 PM, Era Scarecrow wrote:
>
>> Personally I believe that if we introduce a slice of a BitArray as a
>> separate range type that represents a view of array (but can't be
>> appended to, etc. it's just a RA range with slicing).
>
>   Could do that I suppose, but then either exact length or
> append/alignment optimizations may go out the window if we remove some
> of those as options. To support those making slices part of the main
> type was easy enough. But the other issues still have to be addressed
> (that's mentioned below)
>
>>
>>>  const BitArray x = BitArray(32);
>>>  func(x);
>>>
>>
>> Then here x is passed either by ref or a copy (depending on func).
>
>   yeah... assuming func returned a struct of...
>
>   struct S {
>     BitArray x;
>     //other stuff
>   }
>
>   const(BitArray) func(const BitArray s);
>
>   in the case it's small string optimization and the original string
> gets allocated OR gets passed outside the original function that made
> the BitArray. So small string if possible, convert to ref and slice it,
> or dup it. not 100% ref in all cases but pretty close.
>
>>>  Would the function get passed the original compact buffer? Copy of x
>>> or a dup?  May not be so much an issue. However...
>>>
>>>  BitArray x = BitArray(32);
>>>  const y = x[]; //slice
>>>
>> Then y has type BitArraySlice and references x.
>
>   Agreed, although I'm still iffy about a separate slice as a range.
> Could do the range just to give it limitations, but seems like a
> headache to do less and extra code to manage.
>
>   Current code I'm writing converts (if possible) the BitArray to a
> allocated type, then passes that slice to y.
>
>>>  Now if y points to the actual compact buffer and we do the following...
>>>
>>>  x.length = 256; //allocates if it was compact
>>>
>>>  Although y will have const access the data instead of a compact
>>> array it's a pointer to an array and broken.
>>
>> Then IMO that should be a slice invalidation (akin to c++ iterator
>> invalidation). Depending on how slice is implemented it could avoid
>> being invalidated in this case but in some others it surely will (e.g.
>> shrinking).
>
>   Maybe, with the small string optimization (no allocation) it isn't so
> much an easy option. That's what this is about. Say it uses the compact
> (fixed string) and in opSlice...
>
> BitArray {
>    bool isCompact;
>    bool canExpand; //represents owner for slices or compact.
>    union {
>      size_t[] normal;
>      size_t[2] compact;
>    }
>    BitArray opSlice() {
>      BitArray ba = this;
>      ba.canExpand = false;
>      if (isCompact) {
>        ba.isCompact = false;
>        ba.normal = cast(size_t[]) this.compact[0 .. $];
>      }
>
>      return ba;
>    }
> }
>
>   This is where the allocation could cause a problem as the slice could
> be referring to stack memory at which all the data could be duds. Worse
> if you overwrite that data suddenly the allocated memory could point to
> something else! I think it's safe to say that's a bad idea.
>

Hm, I'd think that having the Slice type be:

struct BitArraySlice
{
    BitArray* bp;
    size_t start, end;
    // all else forwards to the pointed-to array
}

should work, avoiding most of the code duplication. With any luck the inliner will expand all of this wrapping. To enable bulk mode operations:

void opOpAssign(string op)(BitArraySlice rhs)
{ //has all the data to do it
   bulkOp!op(bp, start, end, rhs.bp, rhs.start, rhs.end);
}

then it's a question of a bit of refactoring & tweaking of the array bulk ops.

Then the only problem left is slice invalidation and original array going out of scope.
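Filling that in, a self-contained sketch of the forwarding idea (the bool[] backing store here stands in for the real packed size_t words, and the bulk op is inlined rather than calling a hypothetical bulkOp free function):

```d
// Simplified container; the real BitArray packs bits into size_t words.
struct BitArray
{
    bool[] data;
    size_t length() const { return data.length; }
    bool opIndex(size_t i) const { return data[i]; }
    void opIndexAssign(bool v, size_t i) { data[i] = v; }
}

// A view over [start, end) of a BitArray; it owns no data,
// so writes through it are visible in the original array.
struct BitArraySlice
{
    BitArray* bp;
    size_t start, end;

    size_t length() const { return end - start; }
    bool opIndex(size_t i) const { return (*bp)[start + i]; }
    void opIndexAssign(bool v, size_t i) { (*bp)[start + i] = v; }

    // Bulk op: slice op= slice; a shared free function could
    // replace this loop to work a whole word at a time.
    void opOpAssign(string op)(BitArraySlice rhs)
        if (op == "&" || op == "|" || op == "^")
    {
        assert(length == rhs.length);
        foreach (i; 0 .. length)
            this[i] = cast(bool) mixin("this[i] " ~ op ~ " rhs[i]");
    }
}
```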

>>> If during the x call it
>>
>> You mean x[] call i.e. opSlice() call?
>
>   well both this(this) and opSlice() would be called at different times
> so they would have different meanings (copying vs slicing).
>
>> Seems like an overkill and semantically bad as I totally expect that
>> the following to be true for any Array-like container:
>> auto cont = ...;
>> auto range = cont[];
>> range[0] = 1;
>> assert(cont[0] == 1);
>>
>> Also:
>> foreach(v; x) --> foreach(v; x[]) if x defines opSlice
>>
>> however since BitArray defines opApply it's called instead so this one
>> is probably OK.
>
>   Does that mean we might be able to drop opApply? Hmmm... Depends on if
> we have enough implemented so the range accepts it. I'll glance over
> std.range.

Yes, it's got to be a range to work like this and then opApply is useless.
>
>>> reallocated then y would work fine no matter what in that case. If x
>>> can't reallocate it, then the issue remains but is much smaller than
>>> before, most likely a dup would be safest.
>>>
>>>  Thoughts? Questions? Ideas?
>>
>> The link to the latest code would be cool ;)
>
>
> https://github.com/rtcvb32/phobos/blob/BitArray-Updates/std/bitmanip.d
>
>   There's a merge issue due to having tried to keep the bitfields and
> bitarray separated; That seems to have made more problems than
> solutions. Working on it but may have to wait until I've read my Git
> book when it arrives before I understand this all.
>
>> I'm thinking that the main problem is trying to mimic built-in arrays.
>
>   Not really, it's trying to do small string optimization. I suppose if
> this was done as a class instead most of these problems could/would go
> away.
>
>> They are not ordinary containers in that they easily combine slice and
>> container that is (in fact) partially hidden in run-time & GC
>> implementation.
>>
>> The fact that combination of language features and run-time support
>> makes it easy shouldn't encourage other to go this path in the library.
>>
>> For one thing this way a small string optimization is not achievable
>> because of the many slices that have to reference the data packed in
>> _small_ chunk. Then whenever original array changes (and becomes not
>> small) any other slice to it have to somehow reflect this change. And
>> it's hard if slices internally are exactly the same as original (if
>> the same type).
>
>   Really depends on the use case. I'm guessing there's enough call for
> the 64bit small packed bitarray that justifies it, otherwise it would be
> better to throw it out. As you mention later, separating them does seem
> like a better idea.
>

I meant the fact that it has to preserve current std lib behavior *and* have small string optimization, which makes it overly painful and partially defeats the optimization (by requiring pointless compact->big conversions on slice etc.).

>> ...
>>
>> Now that I tested the current behavior that we shouldn't break a lot.
>> Clearly current BitArray is old and rusty D1 design.
>>
>>     BitArray a, b, c;
>>     //the interesting thing is that BitArray(32) doesn't construct
>> BitArray with length 32 but some explosive garbage !!
>>
>>     c.length = 512;
>>     a.length = 32;
>>     assert(a.length == 32);
>>     b = a;
>>     b[0] = true;
>> //this one is bad as it means both a & b has to reference the same data
>>     assert(a[0] == 1);
>>     b ~= c;
>>     b[1] = 1;
>>     //the next fails for me (it also may not fail depending on c's
>> length)
>>     assert(a[1] == 1);
>
>   My code tests that the last two asserts fail, course, a being under
> 64bits remains with small string optimization. Making it slice... seems
> I'll add this to my list of tests and get it fixed. Even if i reserved
> the space it still failed the last one. Mmm...

Right, since it's _a_ that's tested, and in the current design it shares data with b unless that got relocated (b ~= c can relocate b).
>
>> Currently I think that D container design have to be extended to
>> include 2 kinds of containers:
>>
>>  -small containers, values type a-la C++ STL these all use small
>> container optimization (as they can cleanly)
>>  -big heavy-duty ones, these have reference semantics and tuned for
>> large sets of values
>>
>> Then the proposed way out is to make 2 containers:
>> - current BitArray to always have reference semantics (both asserts in
>> the above always pass);
>> - and BitString or BitVector to be small value semantic array that is
>> always copied/duped on assign, these should be passed by ref or by
>> range/slice.
>>
>> That's was lengthy but what do you think? And it seems like we'd need
>> to bring this last topic to general D NG.
>
>   hmmm.. Either would need some code duplication, or template to turn
> off/swap small string optimization. But separating them does seem like a
> good idea.

I'd expect a fair chunk of code to be split off into free functions, then both containers would make use of them.

Now note that there is also std.container's Array!bool, which has reference semantics and has got to be doing bit packing... it could potentially take the place of the big BitArray mentioned.
-- 
Dmitry Olshansky
January 03, 2013
On Thursday, 3 January 2013 at 07:57:46 UTC, Dmitry Olshansky wrote:
> 1/3/2013 6:05 AM, Era Scarecrow wrote:

> Hm, I'd think that having Slice type be:
>
> BitArraySlice{
> BitArray* bp;
> size_t start, end;
> // all else forwards to the pointed array
> }
> should work avoiding the most of code duplication. With any luck inliner will expand all of this wrapping. To enable bulk mode operations:
>
> void opOpAssign(string op)(BitArraySlice rhs)
> {//has all the data to do it
>    bulkOp!op(bp, start, end, rhs.bp, rhs.start, rhs.end);
> }
>
> then it's a question of bit of refactoring & tweaking of array bulks ops.
>
> Then the only problem left is slice invalidation and original array going out of scope.

 I know we've had this discussion before. So let's assume you do slicing this way. So what happens when...

 BitArray ba = [0,1,1,0];

 ba = ba[1 .. $-1]; // fails: the slice is a BitArraySlice, not a BitArray

 Suddenly it won't work, and slicing is only a range that can only be used in foreach. Correct? You could do appending, but likely it would require two functions in the original code (one for a BitArray and one for a slice). If it supports an opAssign to handle that, now you have to take care of all the opAssigns as well (like C++: once you handle one portion of it and opCast, you handle them all).

>> Does that mean we might be able to drop opApply? Hmmm... Depends on if we have enough implemented so the range accepts it. I'll glance over std.range.
>
> Yes, it's got to be a range to work like this and then opApply is useless.

 Maybe. foreach, if I remember right, works three different ways and selects whichever one applies: opApply, front/popFront/empty, and array-style (opIndex access). If enough is implemented that it recognizes it as an array, then opApply can be removed.

 A quick test shows that currently isn't the case.

>> Really depends on the use case. I'm guessing there's enough call for the 64bit small packed bitarray that justifies it, otherwise it would be better to throw it out. As you mention later, separating them does seem like a better idea.
>
> I've meant the fact that it has to preserve current std lib behavior *and* has small string optimization that makes it overly painful and partially defeats the optimization (by requiring pointless compact->big conversions on slice etc.).

 We could always 'dup' on slice instead, but then you can't use references on it, at which point it's a value type. Hmmm. If it's a pure value type, then a separate range makes more sense to use.

>> hmmm.. Either would need some code duplication, or template to turn off/swap small string optimization. But separating them does seem like a good idea.
>
> I'd expect a fair chunk of code to be split off into free functions, then both containers would make use of them.
>
> Now note that there is also std.container Array!bool that has reference semantics and got to be doing bit packing... it could potentially take place of big BitArray mentioned.

 I'm aware of the container version, and overall if it can just replace BitArray as it is, then we might consider only worrying about a compact version, at which point it would become a template... We might start seeing BitArray!1024 in places (or BitString!1024?).
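 A fixed-size BitString!1024 might look something like this (purely a sketch; the name and layout are guesses):

```d
struct BitString(size_t bits)
{
    private enum wordBits = 8 * size_t.sizeof;

    // Storage lives inline, no allocation: compact by construction.
    size_t[(bits + wordBits - 1) / wordBits] data;

    enum length = bits;

    bool opIndex(size_t i) const
    {
        assert(i < bits);
        return ((data[i / wordBits] >> (i % wordBits)) & 1) != 0;
    }

    void opIndexAssign(bool v, size_t i)
    {
        assert(i < bits);
        if (v) data[i / wordBits] |=  size_t(1) << (i % wordBits);
        else   data[i / wordBits] &= ~(size_t(1) << (i % wordBits));
    }
}
```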
January 03, 2013
1/3/2013 2:20 PM, Era Scarecrow wrote:
> On Thursday, 3 January 2013 at 07:57:46 UTC, Dmitry Olshansky wrote:
>> 1/3/2013 6:05 AM, Era Scarecrow wrote:
>
>> Hm, I'd think that having Slice type be:
>>
>> BitArraySlice{
>> BitArray* bp;
>> size_t start, end;
>> // all else forwards to the pointed array
>> }
>> should work avoiding the most of code duplication. With any luck
>> inliner will expand all of this wrapping. To enable bulk mode operations:
>>
>> void opOpAssign(string op)(BitArraySlice rhs)
>> {//has all the data to do it
>>    bulkOp!op(bp, start, end, rhs.bp, rhs.start, rhs.end);
>> }
>>
>> then it's a question of bit of refactoring & tweaking of array bulks ops.
>>
>> Then the only problem left is slice invalidation and original array
>> going out of scope.
>
>   I know we've had this discussion before. So let's assume you do
> slicing this way. So what happens when...
>
>   BitArray ba = [0,1,1,0];
>
>   ba = ba[1 .. $-1]; // ba is not a BitArraySlice
>
>   Suddenly it won't work and slicing is only a range and can only be
> used in foreach.

No surprise here. Basically container != range over it. It all flows from there. A range doesn't have append, nor does it own any data.
About foreach, see below. Yes, the BitArray container won't pass the hasSlicing trait, but it shouldn't, as that trait is meant for ranges.
(So in a sense my comment about trying to mimic built-in arrays applies.)

And the slice can have opIndex and opSlice, being a random access range of bool.

To help people convert one to the other (exactly where they need it) we can add .dup(?) for the slice that allocates a BitArray and assigns values from range.

> Correct? You could do appending but likely it would
> require two functions in the original code (one for a BitArray and one
> for a slice).

Again, I don't think there is a need to append to a slice. Create an array out of the slice (or in fact any range of bool), then append.

>It it supports a opAssign to handle that now you have to
> take care of all opAssign's as well (like C++, once you handle one
> portion of it and opCast, you handle them all).

Yes, I'm afraid that for optimal speed they both have to be implemented.
The version for BitArray can just forward to the one for the slice over the whole of it. Or there could be a common template function that handles all of the bit ops over [a, b) portions of BitArrays. Everything else then forwards to it.

>>> Does that mean we might be able to drop opApply? Hmmm... Depends on
>>> if we have enough implemented so the range accepts it. I'll glance
>>> over std.range.
>>
>> Yes, it's got to be a range to work like this and then opApply is
>> useless.
>
>   Maybe. Foreach if I remember right worked three different ways, and
> selected whichever one worked. opApply, front/pop/empty, and array
> (opIndex access). If enough is implemented that it recognizes it as an
> array then opApply can be removed.

There are 2 ways: range or opApply. How built-in arrays are handled is up to the compiler; there is no way to tap into it from user level.

There is an extra rule, however: if the object in question is not a range and doesn't have opApply but does have opSlice(), then foreach slices it first and then checks the same 2 ways on the slice - as a range or via opApply.
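That lookup order can be shown with a type that is neither a range nor has opApply, but defines a no-argument opSlice() returning something iterable (the type here is illustrative):

```d
// Not a range itself and no opApply, but it defines a no-argument
// opSlice() returning a slice that foreach can iterate.
struct Bag
{
    int[] items;

    int[] opSlice() { return items; }
}

int sumBag(Bag bag)
{
    int sum;
    foreach (x; bag)   // lowered to: foreach (x; bag.opSlice())
        sum += x;
    return sum;
}
```

Remove the opSlice() and the foreach no longer compiles, since Bag offers neither range primitives nor opApply.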
>
>   A quick test shows that this currently isn't the case.
>
>>> Really depends on the use case. I'm guessing there's enough call for
>>> the 64bit small packed bitarray that justifies it, otherwise it would
>>> be better to throw it out. As you mention later, separating them does
>>> seem like a better idea.
>>
>> I've meant the fact that it has to preserve current std lib behavior
>> *and* has small string optimization that makes it overly painful and
>> partially defeats the optimization (by requiring pointless
>> compact->big conversions on slice etc.).
>
>   We could always 'dup' on slice instead, but then you can't use
> references on it, at which point it's a value type.

Dup on slice is very bad. A slice in the container world (I'm looking at std.container, and this is not likely to change) is a view of the same data; thus it's *not* a new container of its own.
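Built-in D slices already show the view semantics std.container follows: a slice aliases the original data, and only an explicit .dup allocates an independent copy.

```d
// A slice is a view: writes through it are visible in the original.
// Only .dup creates a new, independent array.
int[] demo()
{
    int[] data = [1, 2, 3, 4];
    int[] view = data[1 .. 3]; // a view into data, not a new container
    view[0] = 42;              // the write is visible through data
    int[] copy = view.dup;     // .dup allocates a fresh copy
    copy[0] = 7;               // this one does not touch data
    return data;
}
```

If slicing itself dup'ed, the first write would silently stop aliasing the original, which is exactly the behavior std.container avoids.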

>Hmmm. Being a pure
> value type then a separate range makes more sense to use.
>
>>> hmmm.. Either would need some code duplication, or template to turn
>>> off/swap small string optimization. But separating them does seem
>>> like a good idea.
>>
>> I'd expect a fair chunk of code to be split off into free functions,
>> then both containers would make use of them.
>>
>> Now note that there is also std.container Array!bool that has
>> reference semantics and got to be doing bit packing... it could
>> potentially take place of big BitArray mentioned.
>
>   I'm aware of the container version, and overall if it can just replace
> BitArray as it is, then we might consider only worrying about a compact
> version, at which point it would become a template... Might start seeing
> BitArray!1024  in places (or BitString!1024 ?).

I'm not sure a templated fixed-size bit array gains that much over one that just has the small-size optimization. Could be a lot, but somehow I don't expect much; that needs some measurement.

-- 
Dmitry Olshansky
January 03, 2013
On Thursday, 3 January 2013 at 15:48:50 UTC, Dmitry Olshansky wrote:
> 1/3/2013 2:20 PM, Era Scarecrow wrote:
>> Suddenly it won't work and slicing is only a range and can only be used in foreach.
>
> No surprise here. Basically a container != a range over it; everything flows from there. A range doesn't have append, nor does it own any data.
> About foreach, see below. Yes, the BitArray container won't pass the hasSlicing trait, but it shouldn't, as that trait is meant for ranges. (so in a sense my comment about trying to mimic built-in arrays applies)
>
> And slice can have opIndex and opSlice being a random access range of bool.

 Hmmm... so I shouldn't try to make it as close to an actual array as possible. I feel like I'm missing something very, very simple that would make it all correct.

> To help people convert one to the other (exactly where they need it) we can add .dup(?) for the slice that allocates a BitArray and assigns values from range.
>
>> Correct? You could do appending but likely it would require two functions in the original code (one for a BitArray and one for a slice).
>
> Again I don't think there is a need to append to a slice. Create an array out of slice (or in fact any range of bool) then append.

 I want to say you're wrong; I'm thinking heavily of fields like compression or encryption, where random access and appending could be needed at any step. I can see building a simple table of compressed sequences (say, Huffman) and then appending the slices, which greatly simplifies the interface. I can also see cases where appending to a slice automatically allocating a new block of memory would be useful, since you know you're going to start fresh. I can make a short test set of functions and post them if you like; doing it with only ranges & slices otherwise might not allow such diversity (simplicity?) in the code.
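Era's compressed-sequence case could look roughly like this with a hypothetical append-any-range-of-bool primitive on the container (all names illustrative, not std.bitmanip):

```d
// Hypothetical: the container accepts any range of bool for appending,
// so code-table slices can be appended directly.
struct BitArray
{
    bool[] bits;

    void append(R)(R r) // R: any iterable with bool elements
    {
        foreach (bool b; r)
            bits ~= b;
    }
}

// Encode a string through a toy Huffman-like code table.
bool[] encode(string s, bool[][char] codes)
{
    BitArray output;
    foreach (char ch; s)
        output.append(codes[ch]); // append that symbol's bit pattern
    return output.bits;
}
```

Under Dmitry's model the same thing works with append living only on the container, since each code's slice is just a range of bool being fed to it.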

>> If it supports a opAssign to handle that now you have to take care of all opAssign's as well (like C++, once you handle one portion of it and opCast, you handle them all).
>
> Yes, I'm afraid that for optimal speed they both have to be implemented. The version for BitArray can just forward to the one for slice over the whole of it. Or there could be a common template function that handles all of bit-ops over [a, b) portions of BitArrays. Everything else then forwards to it.
>
>>>> Does that mean we might be able to drop opApply? Hmmm... Depends on if we have enough implemented so the range accepts it. I'll glance over std.range.
>>>
>>> Yes, it's got to be a range to work like this and then opApply is useless.
>>
>> Maybe. Foreach if I remember right worked three different ways, and selected whichever one worked. opApply, front/pop/empty, and array (opIndex access). If enough is implemented that it recognizes it as an array then opApply can be removed.
>
> There are 2 ways: range or opApply. How built-in arrays are done is up to compiler, there is no way to tap into it from user level.
>
> There is an extra rule however, it is the following. If the object in question is not a range and doesn't have opApply but has opSlice() then it slices it first, then again it checks the 2 ways on this slice - as range or opApply.

 I didn't know that; I knew it attempted a slice, but not that the slice then has to be a range (or have opApply). That should probably be emphasized in the documentation.

>>  A quick test shows that this currently isn't the case.
>>
>>>> Really depends on the use case. I'm guessing there's enough call for the 64bit small packed bitarray that justifies it, otherwise it would be better to throw it out. As you mention later, separating them does seem like a better idea.
>>>
>>> I've meant the fact that it has to preserve current std lib behavior *and* has small string optimization that makes it overly painful and partially defeats the optimization (by requiring pointless compact->big conversions on slice etc.).
>>
>> We could always 'dup' on slice instead, but then you can't use references on it, at which point it's a value type.
>
> Dup on slice is very bad. A slice in the container world (I'm looking at std.container, and this is not likely to change) is a view of the same data; thus it's *not* a new container of its own.
>
>> Hmmm. Being a pure value type then a separate range makes more sense to use.
>>
>>>> hmmm.. Either would need some code duplication, or template to turn off/swap small string optimization. But separating them does seem like a good idea.
>>>
>>> I'd expect a fair chunk of code to be split off into free functions, then both containers would make use of them.
>>>
>>> Now note that there is also std.container Array!bool that has reference semantics and got to be doing bit packing... it could potentially take place of big BitArray mentioned.
>>
>> I'm aware of the container version, and overall if it can just replace BitArray as it is, then we might consider only worrying about a compact version, at which point it would become a template... Might start seeing BitArray!1024  in places (or BitString!1024 ?).
>
> I'm not sure if it will gain that much to have templated fixed size bit array over one that just has small optimization. Could be a lot but somehow I don't expect much, that needs some measurement.

 I agree, but if it's going to be a value type that leans heavily on small-string optimization with as little referencing as possible, then it has to know what size to depend on, since resizing beyond the max may not be an option if we take that part out.
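A fixed-capacity value type along those lines might look like this. BitString and its layout are purely hypothetical; the point is that the capacity is a compile-time parameter, so there is no heap storage and no referencing, and growing past N bits is simply not an option:

```d
// Hypothetical value-type, fixed-capacity bit array: capacity N is a
// template parameter, storage is a fixed-size word array on the stack.
struct BitString(size_t N)
{
    private enum bitsPerWord = size_t.sizeof * 8;
    private size_t[(N + bitsPerWord - 1) / bitsPerWord] words;
    private size_t len; // bits in use, always <= N

    @property size_t length() const { return len; }

    bool opIndex(size_t i) const
    {
        assert(i < len);
        return ((words[i / bitsPerWord] >> (i % bitsPerWord)) & 1) != 0;
    }

    void opIndexAssign(bool b, size_t i)
    {
        assert(i < len);
        enum size_t one = 1;
        if (b) words[i / bitsPerWord] |= one << (i % bitsPerWord);
        else   words[i / bitsPerWord] &= ~(one << (i % bitsPerWord));
    }

    // Appending past the compile-time capacity asserts instead of growing.
    void opOpAssign(string op : "~")(bool b)
    {
        assert(len < N);
        ++len;
        this[len - 1] = b;
    }
}
```

Copying a BitString!N copies the whole value, so there is nothing to reference and nothing to invalidate - at the cost of the max size being baked in at compile time.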