Thread overview | ||||||||
---|---|---|---|---|---|---|---|---|
|
November 12, 2016 Variable-Length Bit-Level Encoding | ||||
---|---|---|---|---|
| ||||
I'm looking for libraries/snippets (either in D or similar languages) that perform variable-length encoding of unsigned integers onto a bit-stream. Requirement is that smaller inputs (integer values) should be encoded with equal or fewer bits. This 0 => [0] 1 => [1,0] 2 => [1,1,0] is easy but assumes a too extreme input value distribution. Does anybody have a suggestion for an encoder that is more suitable for real-world values that are, for instance, normally distributed? |
November 12, 2016 Re: Variable-Length Bit-Level Encoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Nordlöw | On Saturday, 12 November 2016 at 19:13:13 UTC, Nordlöw wrote:
> Does anybody have a suggestion for an encoder that is more suitable for real-world values that are, for instance, normally distributed?
Doh, forget the normal distribution thing here. It, of course, doesn't work with unsigned integers.
|
November 12, 2016 Re: Variable-Length Bit-Level Encoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Nordlöw | On Saturday, 12 November 2016 at 19:13:13 UTC, Nordlöw wrote:
> Does anybody have a suggestion for an encoder that is more suitable for real-world values that are, for instance, normally distributed?
I don't recall the name, but there is an algorithm for encoding data of an arbitrary number of bytes/bits into a stream of octets. It reserves the MSB of each octet for use as a marker. If it is set to 1, then there are yet more bits to read. If it is set to 0, then this is the last group of bits. This enables each octet to carry 7 bits at a time, while allowing you to encode data of any bit size into an octet stream. You just need to break your data down into groups of 7 bits.
|
November 13, 2016 Re: Variable-Length Bit-Level Encoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Nordlöw | On Saturday, 12 November 2016 at 19:13:13 UTC, Nordlöw wrote: > 0 => [0] > 1 => [1,0] > 2 => [1,1,0] > > is easy but assumes a too extreme input value distribution. > > Does anybody have a suggestion for an encoder that is more suitable for real-world values that are, for instance, normally distributed? Hmm off hand I'm recalling there's a different encoding that uses 0's to determine how long the encoding is, and then the first 1 breaks it off to the actual encoding. Great for short ints. 1 = 0 010 = 1 011 = 2 00101 = 4 000011111 = 30 https://en.wikipedia.org/wiki/Exponential-Golomb_coding If you know the range of the data, I'd almost be more inclined to do range encoding mixed with arithmetic coding. |
November 13, 2016 Re: Variable-Length Bit-Level Encoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to James Buren | On Saturday, 12 November 2016 at 19:19:58 UTC, James Buren wrote: > On Saturday, 12 November 2016 at 19:13:13 UTC, Nordlöw wrote: >> Does anybody have a suggestion for an encoder that is more suitable for real-world values that are, for instance, normally distributed? > > I don't recall the name, but there is an algorithm for encoding data of an arbitrary number of bytes/bits into a stream of octets. It reserves the MSB of each octet for use as a marker. If it is set to 1, then there are yet more bits to read. If it is set to 0, then this is the last group of bits. This enables each octet to carry 7 bits at a time, while allowing you to encode data of any bit size into an octet stream. You just need to break your data down into groups of 7 bits. Maybe the LEB128 encoding [1] ? [1] https://en.wikipedia.org/wiki/LEB128 |
November 14, 2016 Re: Variable-Length Bit-Level Encoding | ||||
---|---|---|---|---|
| ||||
Posted in reply to Nordlöw | On Saturday, 12 November 2016 at 19:13:13 UTC, Nordlöw wrote: > I'm looking for libraries/snippets (either in D or similar languages) that perform variable-length encoding of unsigned integers onto a bit-stream. Requirement is that smaller inputs (integer values) should be encoded with equal or fewer bits. > > This > > 0 => [0] > 1 => [1,0] > 2 => [1,1,0] > > is easy but assumes a too extreme input value distribution. > > Does anybody have a suggestion for an encoder that is more suitable for real-world values that are, for instance, normally distributed? If you have a sample of your data, perhaps Huffman codes (https://en.wikipedia.org/wiki/Huffman_coding) might be an option? |
Copyright © 1999-2021 by the D Language Foundation