Thread overview | |||||||
---|---|---|---|---|---|---|---|
|
December 13, 2014 repack ubyte[] to use only 7 bits | ||||
---|---|---|---|---|
| ||||
Is there a standard way to do this? The code below is untested, as I haven't yet written the x7to8 routine, and came up with a better way to do what this was to accomplish, but it feels as if this should be somewhere in the standard library, if I could only find it. /** Repack the data from an array of ubytes into an array of ubytes of * which only the last 7 are significant. The high bit will be set only * if the byte would otherwise be zero. */ byte[] x8to7 (ubyte[] bin) { ubyte[] bout; // bit masks: 0 => 0xfe = 11111110, 0x00 = 00000000 // 1 => 0x7f = 01111111, 0x00 = 00000000 // 2 => 0x3f = 00111111, 0x80 = 10000000 // 3 => 0x1f = 00011111, 0xc0 = 11000000 // 4 => 0x0f = 00001111, 0xe0 = 11100000 // 5 => 0x07 = 00000111, 0xf0 = 11110000 // 6 => 0x03 = 00000011, 0xf8 = 11111000 // 7 => 0x01 = 00000001, 0xfc = 11111100 if (bin.length < 1) return bout; int fByte, fBit; while (fByte < bin.length) { if (fByte + 1 == bin.length && fBit > 1) break; ubyte b; switch (fBit) { case 0: b = bin[fByte] / 2; break; case 1: b = bin[fByte] & 0x7f; break; case 2: ubyte b1 = (bin[fByte] & 0x3f) << 1; ubyte b2 = (bin[fByte + 1] & 0x80) >>> 7; b ~= (b1 | b2); break; case 3: ubyte b1 = (bin[fByte] & 0x1f) << 2; ubyte b2 = (bin[fByte + 1] & 0xc0) >>> 6; b ~= (b1 | b2); break; case 4: ubyte b1 = (bin[fByte] & 0x0f) << 3; ubyte b2 = (bin[fByte + 1] & 0xe0) >>> 5; b ~= (b1 | b2); break; case 5: ubyte b1 = (bin[fByte] & 0x07) << 4; ubyte b2 = (bin[fByte + 1] & 0xf0) >>> 4; b ~= (b1 | b2); break; case 6: ubyte b1 = (bin[fByte] & 0x03) << 5; ubyte b2 = (bin[fByte + 1] & 0xf8) >>> 3; b ~= (b1 | b2); break; case 7: ubyte b1 = (bin[fByte] & 0x01) << 6; ubyte b2 = (bin[fByte + 1] & 0xfc) >>> 2; b ~= (b1 | b2); break; default: assert (false, "This path should never be taken"); } // switch (fBit) if (b == 0) bout ~= 0x80; else bout ~= b; fBit = fBit + 7; if (fBit > 7) { fByte++; fBit -= 7; } } } |
December 13, 2014 Re: repack ubyte[] to use only 7 bits | ||||
---|---|---|---|---|
| ||||
Posted in reply to Charles Hixson | On Saturday, 13 December 2014 at 10:09:27 UTC, Charles Hixson via Digitalmars-d-learn wrote: > Is there a standard way to do this? The code below is untested, as I haven't yet written the x7to8 routine, and came up with a better way to do what this was to accomplish, but it feels as if this should be somewhere in the standard library, if I could only find it. > > /** Repack the data from an array of ubytes into an array of ubytes of > * which only the last 7 are significant. The high bit will be set only > * if the byte would otherwise be zero. */ > byte[] x8to7 (ubyte[] bin) > { > ubyte[] bout; > // bit masks: 0 => 0xfe = 11111110, 0x00 = 00000000 > // 1 => 0x7f = 01111111, 0x00 = 00000000 > // 2 => 0x3f = 00111111, 0x80 = 10000000 > // 3 => 0x1f = 00011111, 0xc0 = 11000000 > // 4 => 0x0f = 00001111, 0xe0 = 11100000 > // 5 => 0x07 = 00000111, 0xf0 = 11110000 > // 6 => 0x03 = 00000011, 0xf8 = 11111000 > // 7 => 0x01 = 00000001, 0xfc = 11111100 > if (bin.length < 1) return bout; > int fByte, fBit; > while (fByte < bin.length) > { if (fByte + 1 == bin.length && fBit > 1) break; > ubyte b; > switch (fBit) > { case 0: > b = bin[fByte] / 2; > break; > case 1: > b = bin[fByte] & 0x7f; > break; > case 2: > ubyte b1 = (bin[fByte] & 0x3f) << 1; > ubyte b2 = (bin[fByte + 1] & 0x80) >>> 7; > b ~= (b1 | b2); > break; > case 3: > ubyte b1 = (bin[fByte] & 0x1f) << 2; > ubyte b2 = (bin[fByte + 1] & 0xc0) >>> 6; > b ~= (b1 | b2); > break; > case 4: > ubyte b1 = (bin[fByte] & 0x0f) << 3; > ubyte b2 = (bin[fByte + 1] & 0xe0) >>> 5; > b ~= (b1 | b2); > break; > case 5: > ubyte b1 = (bin[fByte] & 0x07) << 4; > ubyte b2 = (bin[fByte + 1] & 0xf0) >>> 4; > b ~= (b1 | b2); > break; > case 6: > ubyte b1 = (bin[fByte] & 0x03) << 5; > ubyte b2 = (bin[fByte + 1] & 0xf8) >>> 3; > b ~= (b1 | b2); > break; > case 7: > ubyte b1 = (bin[fByte] & 0x01) << 6; > ubyte b2 = (bin[fByte + 1] & 0xfc) >>> 2; > b ~= (b1 | b2); > break; > default: > assert (false, "This path should never be taken"); > } // switch (fBit) > if (b == 0) bout ~= 0x80; > else bout ~= b; > fBit = fBit + 7; > if (fBit > 7) > { fByte++; > fBit -= 7; > } > } > } Are you trying to make a "kind-of" Variable-Length quantity encoder ? eg: 0b10101110: last bit not set, integrate 0b10101110 and stop reading. 0b10011001: last bit set, integrate 0b10011000 and continue to next byte. http://en.wikipedia.org/wiki/Variable-length_quantity except that this algo limits the length to 24 bits. It was used a lot with MIDI, at a time when hardware memory was costly (eg inside hardware synthesizer or workstations). |
December 13, 2014 Re: repack ubyte[] to use only 7 bits | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manolo | On Saturday, 13 December 2014 at 11:20:21 UTC, Manolo wrote:
> On Saturday, 13 December 2014 at 10:09:27 UTC, Charles Hixson via Digitalmars-d-learn wrote:
>> Is there a standard way to do this? The code below is untested, as I haven't yet written the x7to8 routine, and came up with a better way to do what this was to accomplish, but it feels as if this should be somewhere in the standard library, if I could only find it.
>>
>> /** Repack the data from an array of ubytes into an array of ubytes of
>> * which only the last 7 are significant. The high bit will be set only
>> * if the byte would otherwise be zero. */
>> byte[] x8to7 (ubyte[] bin)
>> {
>> ubyte[] bout;
>> // bit masks: 0 => 0xfe = 11111110, 0x00 = 00000000
>> // 1 => 0x7f = 01111111, 0x00 = 00000000
>> // 2 => 0x3f = 00111111, 0x80 = 10000000
>> // 3 => 0x1f = 00011111, 0xc0 = 11000000
>> // 4 => 0x0f = 00001111, 0xe0 = 11100000
>> // 5 => 0x07 = 00000111, 0xf0 = 11110000
>> // 6 => 0x03 = 00000011, 0xf8 = 11111000
>> // 7 => 0x01 = 00000001, 0xfc = 11111100
>> if (bin.length < 1) return bout;
>> int fByte, fBit;
>> while (fByte < bin.length)
>> { if (fByte + 1 == bin.length && fBit > 1) break;
>> ubyte b;
>> switch (fBit)
>> { case 0:
>> b = bin[fByte] / 2;
>> break;
>> case 1:
>> b = bin[fByte] & 0x7f;
>> break;
>> case 2:
>> ubyte b1 = (bin[fByte] & 0x3f) << 1;
>> ubyte b2 = (bin[fByte + 1] & 0x80) >>> 7;
>> b ~= (b1 | b2);
>> break;
>> case 3:
>> ubyte b1 = (bin[fByte] & 0x1f) << 2;
>> ubyte b2 = (bin[fByte + 1] & 0xc0) >>> 6;
>> b ~= (b1 | b2);
>> break;
>> case 4:
>> ubyte b1 = (bin[fByte] & 0x0f) << 3;
>> ubyte b2 = (bin[fByte + 1] & 0xe0) >>> 5;
>> b ~= (b1 | b2);
>> break;
>> case 5:
>> ubyte b1 = (bin[fByte] & 0x07) << 4;
>> ubyte b2 = (bin[fByte + 1] & 0xf0) >>> 4;
>> b ~= (b1 | b2);
>> break;
>> case 6:
>> ubyte b1 = (bin[fByte] & 0x03) << 5;
>> ubyte b2 = (bin[fByte + 1] & 0xf8) >>> 3;
>> b ~= (b1 | b2);
>> break;
>> case 7:
>> ubyte b1 = (bin[fByte] & 0x01) << 6;
>> ubyte b2 = (bin[fByte + 1] & 0xfc) >>> 2;
>> b ~= (b1 | b2);
>> break;
>> default:
>> assert (false, "This path should never be taken");
>> } // switch (fBit)
>> if (b == 0) bout ~= 0x80;
>> else bout ~= b;
>> fBit = fBit + 7;
>> if (fBit > 7)
>> { fByte++;
>> fBit -= 7;
>> }
>> }
>> }
>
> Are you trying to make a "kind-of" Variable-Length quantity encoder ?
>
> eg:
> 0b10101110: last bit not set, integrate 0b10101110 and stop reading.
> 0b10011001: last bit set, integrate 0b10011000 and continue to next byte.
>
> http://en.wikipedia.org/wiki/Variable-length_quantity
>
> except that this algo limits the length to 24 bits. It was used a lot with
> MIDI, at a time when hardware memory was costly (eg inside hardware synthesizer or workstations).
Sorry, lack or accuraccy: the maximum value represented was a 24 bit unsigned integer, but the data length was 32 bit for this value.
The thing is that the format included various fields, but because of the memory price the algo saved space when values where less than 0X7F, because only one byte was needed. Nowadays such as format would allow for example always 4 bytes to describes the data length:
nowadays:
data len "L" | data
0 1 2 3 | 4 ... L-1
so "nowadays", we can afford 4 bytes to say that a field length is 1
olddays:
variable len VL | data
1 to ? | VL ... VL-1
"olddays", they used only one byte to say that a field length is 1.
|
December 13, 2014 Re: repack ubyte[] to use only 7 bits | ||||
---|---|---|---|---|
| ||||
Posted in reply to Manolo |
On 12/13/2014 03:20 AM, Manolo via Digitalmars-d-learn wrote:
> On Saturday, 13 December 2014 at 10:09:27 UTC, Charles Hixson via Digitalmars-d-learn wrote:
>> Is there a standard way to do this? The code below is untested, as I haven't yet written the x7to8 routine, and came up with a better way to do what this was to accomplish, but it feels as if this should be somewhere in the standard library, if I could only find it.
>>
>> /** Repack the data from an array of ubytes into an array of ubytes of
>> * which only the last 7 are significant. The high bit will be set only
>> * if the byte would otherwise be zero. */
>> byte[] x8to7 (ubyte[] bin)
>> {
>> ubyte[] bout;
>> // bit masks: 0 => 0xfe = 11111110, 0x00 = 00000000
>> // 1 => 0x7f = 01111111, 0x00 = 00000000
>> // 2 => 0x3f = 00111111, 0x80 = 10000000
>> // 3 => 0x1f = 00011111, 0xc0 = 11000000
>> // 4 => 0x0f = 00001111, 0xe0 = 11100000
>> // 5 => 0x07 = 00000111, 0xf0 = 11110000
>> // 6 => 0x03 = 00000011, 0xf8 = 11111000
>> // 7 => 0x01 = 00000001, 0xfc = 11111100
>> if (bin.length < 1) return bout;
>> int fByte, fBit;
>> while (fByte < bin.length)
>> { if (fByte + 1 == bin.length && fBit > 1) break;
>> ubyte b;
>> switch (fBit)
>> { case 0:
>> b = bin[fByte] / 2;
>> break;
>> case 1:
>> b = bin[fByte] & 0x7f;
>> break;
>> case 2:
>> ubyte b1 = (bin[fByte] & 0x3f) << 1;
>> ubyte b2 = (bin[fByte + 1] & 0x80) >>> 7;
>> b ~= (b1 | b2);
>> break;
>> case 3:
>> ubyte b1 = (bin[fByte] & 0x1f) << 2;
>> ubyte b2 = (bin[fByte + 1] & 0xc0) >>> 6;
>> b ~= (b1 | b2);
>> break;
>> case 4:
>> ubyte b1 = (bin[fByte] & 0x0f) << 3;
>> ubyte b2 = (bin[fByte + 1] & 0xe0) >>> 5;
>> b ~= (b1 | b2);
>> break;
>> case 5:
>> ubyte b1 = (bin[fByte] & 0x07) << 4;
>> ubyte b2 = (bin[fByte + 1] & 0xf0) >>> 4;
>> b ~= (b1 | b2);
>> break;
>> case 6:
>> ubyte b1 = (bin[fByte] & 0x03) << 5;
>> ubyte b2 = (bin[fByte + 1] & 0xf8) >>> 3;
>> b ~= (b1 | b2);
>> break;
>> case 7:
>> ubyte b1 = (bin[fByte] & 0x01) << 6;
>> ubyte b2 = (bin[fByte + 1] & 0xfc) >>> 2;
>> b ~= (b1 | b2);
>> break;
>> default:
>> assert (false, "This path should never be taken");
>> } // switch (fBit)
>> if (b == 0) bout ~= 0x80;
>> else bout ~= b;
>> fBit = fBit + 7;
>> if (fBit > 7)
>> { fByte++;
>> fBit -= 7;
>> }
>> }
>> }
>
> Are you trying to make a "kind-of" Variable-Length quantity encoder ?
>
> eg:
> 0b10101110: last bit not set, integrate 0b10101110 and stop reading.
> 0b10011001: last bit set, integrate 0b10011000 and continue to next byte.
>
> http://en.wikipedia.org/wiki/Variable-length_quantity
>
> except that this algo limits the length to 24 bits. It was used a lot with
> MIDI, at a time when hardware memory was costly (eg inside hardware synthesizer or workstations).
>
What I was trying to do was pack things into 7 bits so I could recode 0's as 128. I finally thought clearly about it and realized that I only needed to use one particular byte value (I chose 127) to duplicate so I could repack things with a string of 0's replaced by 127 followed by the length (up to 126) of zeros, and for 127 itself I'd just emit 127 twice. This was to pack binary data into a string that C routines wouldn't think had ended partway through. (If I get more than 127 zeros in a row, I just have more than one packing code.)
|
December 13, 2014 Re: repack ubyte[] to use only 7 bits | ||||
---|---|---|---|---|
| ||||
Posted in reply to Charles Hixson | On Saturday, 13 December 2014 at 19:52:33 UTC, Charles Hixson via Digitalmars-d-learn wrote:
>
> On 12/13/2014 03:20 AM, Manolo via Digitalmars-d-learn wrote:
>> On Saturday, 13 December 2014 at 10:09:27 UTC, Charles Hixson via Digitalmars-d-learn wrote:
>>> Is there a standard way to do this? The code below is untested, as I haven't yet written the x7to8 routine, and came up with a better way to do what this was to accomplish, but it feels as if this should be somewhere in the standard library, if I could only find it.
>>>
>>> /** Repack the data from an array of ubytes into an array of ubytes of
>>> * which only the last 7 are significant. The high bit will be set only
>>> * if the byte would otherwise be zero. */
>>> byte[] x8to7 (ubyte[] bin)
>>> {
>>> ubyte[] bout;
>>> // bit masks: 0 => 0xfe = 11111110, 0x00 = 00000000
>>> // 1 => 0x7f = 01111111, 0x00 = 00000000
>>> // 2 => 0x3f = 00111111, 0x80 = 10000000
>>> // 3 => 0x1f = 00011111, 0xc0 = 11000000
>>> // 4 => 0x0f = 00001111, 0xe0 = 11100000
>>> // 5 => 0x07 = 00000111, 0xf0 = 11110000
>>> // 6 => 0x03 = 00000011, 0xf8 = 11111000
>>> // 7 => 0x01 = 00000001, 0xfc = 11111100
>>> if (bin.length < 1) return bout;
>>> int fByte, fBit;
>>> while (fByte < bin.length)
>>> { if (fByte + 1 == bin.length && fBit > 1) break;
>>> ubyte b;
>>> switch (fBit)
>>> { case 0:
>>> b = bin[fByte] / 2;
>>> break;
>>> case 1:
>>> b = bin[fByte] & 0x7f;
>>> break;
>>> case 2:
>>> ubyte b1 = (bin[fByte] & 0x3f) << 1;
>>> ubyte b2 = (bin[fByte + 1] & 0x80)
>>> >>> 7;
>>> b ~= (b1 | b2);
>>> break;
>>> case 3:
>>> ubyte b1 = (bin[fByte] & 0x1f) << 2;
>>> ubyte b2 = (bin[fByte + 1] & 0xc0)
>>> >>> 6;
>>> b ~= (b1 | b2);
>>> break;
>>> case 4:
>>> ubyte b1 = (bin[fByte] & 0x0f) << 3;
>>> ubyte b2 = (bin[fByte + 1] & 0xe0)
>>> >>> 5;
>>> b ~= (b1 | b2);
>>> break;
>>> case 5:
>>> ubyte b1 = (bin[fByte] & 0x07) << 4;
>>> ubyte b2 = (bin[fByte + 1] & 0xf0)
>>> >>> 4;
>>> b ~= (b1 | b2);
>>> break;
>>> case 6:
>>> ubyte b1 = (bin[fByte] & 0x03) << 5;
>>> ubyte b2 = (bin[fByte + 1] & 0xf8)
>>> >>> 3;
>>> b ~= (b1 | b2);
>>> break;
>>> case 7:
>>> ubyte b1 = (bin[fByte] & 0x01) << 6;
>>> ubyte b2 = (bin[fByte + 1] & 0xfc)
>>> >>> 2;
>>> b ~= (b1 | b2);
>>> break;
>>> default:
>>> assert (false, "This path should never be taken");
>>> } // switch (fBit)
>>> if (b == 0) bout ~= 0x80;
>>> else bout ~= b;
>>> fBit = fBit + 7;
>>> if (fBit > 7)
>>> { fByte++;
>>> fBit -= 7;
>>> }
>>> }
>>> }
>>
>> Are you trying to make a "kind-of" Variable-Length quantity encoder ?
>>
>> eg:
>> 0b10101110: last bit not set, integrate 0b10101110 and stop reading.
>> 0b10011001: last bit set, integrate 0b10011000 and continue to next byte.
>>
>> http://en.wikipedia.org/wiki/Variable-length_quantity
>>
>> except that this algo limits the length to 24 bits. It was used a lot with
>> MIDI, at a time when hardware memory was costly (eg inside hardware synthesizer or workstations).
>>
> What I was trying to do was pack things into 7 bits so I could recode 0's as 128. I finally thought clearly about it and realized that I only needed to use one particular byte value (I chose 127) to duplicate so I could repack things with a string of 0's replaced by 127 followed by the length (up to 126) of zeros, and for 127 itself I'd just emit 127 twice. This was to pack binary data into a string that C routines wouldn't think had ended partway through. (If I get more than 127 zeros in a row, I just have more than one packing code.)
Sorry, I misunderstood the thing.
|
Copyright © 1999-2021 by the D Language Foundation