Thread overview
Ecoji-d v1.0.0 is released - Base1024 using emojis 😂👌
Mar 14, 2018
Anton Fediushin
Mar 15, 2018
bauss
Mar 15, 2018
Anton Fediushin
Mar 16, 2018
bauss
Mar 18, 2018
Anton Fediushin
Mar 18, 2018
bauss
Mar 16, 2018
Rainer Schuetze
Mar 18, 2018
Manu
Mar 18, 2018
Cym13
Mar 18, 2018
Anton Fediushin
Mar 17, 2018
Faux Amis
Mar 18, 2018
Abdulhaq
March 14, 2018
🖖, I'm glad to announce that ecoji-d, a pure D implementation of the ecoji encoding, version 1️⃣.0️⃣.0️⃣ is finally released❗

What is ecoji?

Ecoji encodes data as base1024 with an emoji character set. It can be used instead of boring and old base64 🤮🤮🤮.
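
For readers unfamiliar with the idea, base1024 means each output symbol carries 10 bits, so 5 input bytes (40 bits) become 4 symbols. Below is a minimal Python sketch of that bit packing. It is not ecoji-d's code: `ALPHABET` is a hypothetical table of 1024 consecutive code points rather than the real ecoji emoji set, the function names are made up, and padding for inputs that are not a multiple of 5 bytes is left out.

```python
# Illustrative only: NOT the real ecoji alphabet or padding scheme.
# ALPHABET is a hypothetical table of 1024 consecutive code points.
ALPHABET = [chr(0x1F300 + i) for i in range(1024)]
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode_base1024(data: bytes) -> str:
    """Encode bytes whose length is a multiple of 5 (padding is not handled)."""
    if len(data) % 5 != 0:
        raise ValueError("this sketch only handles multiples of 5 bytes")
    out = []
    for i in range(0, len(data), 5):
        block = int.from_bytes(data[i:i + 5], "big")  # 40 bits
        # Split the 40 bits into four 10-bit indices, most significant first.
        for shift in (30, 20, 10, 0):
            out.append(ALPHABET[(block >> shift) & 0x3FF])
    return "".join(out)


def decode_base1024(text: str) -> bytes:
    """Inverse of encode_base1024: every 4 symbols yield 5 bytes."""
    if len(text) % 4 != 0:
        raise ValueError("this sketch only handles multiples of 4 symbols")
    out = bytearray()
    for i in range(0, len(text), 4):
        block = 0
        for ch in text[i:i + 4]:
            block = (block << 10) | INDEX[ch]
        out += block.to_bytes(5, "big")
    return bytes(out)
```

With this sketch, `encode_base1024(b"abcde")` yields 4 symbols, and `decode_base1024` turns them back into the original 5 bytes.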

Encoding example:

---
$ echo "Base64 is so 1999, isn't there something better?" | ecoji-d
🏗📩🎦🐇🎛📘🔯🚜💞😽🆖🐊🎱🥁🚄🌱💞😭💮🇵💢🕥🐭🔸🍉🚲🦑🐶💢🕥🔮🔺🍉📸🐮🌼👦🚟🥴📑
---

And decoding:

---
$ echo -n "🏗📩🎦🐇🎛📘🔯🚜💞😽🆖🐊🎱🥁🚄🌱💞😭💮🇵💢🕥🐭🔸🍉🚲🦑🐶💢🕥🔮🔺🍉📸🐮🌼👦🚟🥴📑" | ecoji-d -d
Base64 is so 1999, isn't there something better?
---


Ecoji-d's features:

    ✔️ Range interface
    ✔️ Lazy encoding/decoding
    ✔️ Low memory usage
    ✔️ @safe and pure when possible
    ✔️ Many tests
    ✔️ Can be used as a library and as a CLI utility


The API consists of just 2️⃣ functions:

    👉 `encode`, which does encoding
    👉 `decode`, which does decoding


Links:

    📦 DUB package page: http://code.dlang.org/packages/ecoji-d
    👍 GitHub repository: https://github.com/ohdatboi/ecoji-d
    🤟 GitHub repository of the reference Go implementation: https://github.com/keith-turner/ecoji


March 15, 2018
On Wednesday, 14 March 2018 at 17:30:18 UTC, Anton Fediushin wrote:
> 🖖, I'm glad to announce that ecoji-d, a pure D implementation of the ecoji encoding, version 1️⃣.0️⃣.0️⃣ is finally released❗
> [...]

Fun, but seems pretty useless in practice.
March 15, 2018
On Thursday, 15 March 2018 at 09:32:50 UTC, bauss wrote:
> Fun, but seems pretty useless in practice.

I disagree. Ecoji (base1024) has a bigger character set, meaning that it can encode more information per emoji than base64 can encode per character.

For example, ecoji-encoded "abcde" looks like this: "👖📸🎦🌭"
And the base64-encoded version looks like this: "YWJjZGU=".
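
Note that character count and byte count are different things for these two strings. A small Python check (not part of ecoji-d; the emoji string is simply copied from the example above) prints both:

```python
import base64

plain = b"abcde"
b64 = base64.b64encode(plain).decode("ascii")
eco = "👖📸🎦🌭"  # ecoji output copied from the example above

# Each line prints: number of characters, number of UTF-8 bytes.
print(len(b64), len(b64.encode("utf-8")))  # 8 8
print(len(eco), len(eco.encode("utf-8")))  # 4 16
```

So the ecoji form is half as many characters, but twice as many bytes once stored as UTF-8.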

Even though each emoji is 4 bytes long, there is a noticeable difference in size when we are talking about larger chunks of data:

---
$ dd if=/dev/urandom bs=4K count=16K of=test.raw
16384+0 records in
16384+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 1.90423 s, 35.2 MB/s
$ dd if=test.raw | ./ecoji-d |  wc -c
67108864 bytes (67 MB, 64 MiB) copied, 6.7699 s, 9.9 MB/s
71591534 # Size increased just by 6%
$ dd if=test.raw | base64 |  wc -c
67108864 bytes (67 MB, 64 MiB) copied, 0.750174 s, 89.5 MB/s
90655837 # 35%(!) increase in size
---

And if we move to real-world scenarios, where web pages are gzip'ped most of the time:

---
$ dd if=test.raw | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 5.49022 s, 12.2 MB/s
67119122 # Raw files are terrible for compression
$ dd if=test.raw | ./ecoji-d | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 27.9972 s, 2.4 MB/s
32178275 # 48% improvement
$ dd if=test.raw | base64 | gzip -c | wc -c
67108864 bytes (67 MB, 64 MiB) copied, 10.3381 s, 6.5 MB/s
68892893 # Pretty bad, yeah
---

So yeah, ecoji is better than base64 in everything but speed. Speed will be improved. Later.

March 16, 2018
On Thursday, 15 March 2018 at 18:45:51 UTC, Anton Fediushin wrote:
> [...]
>
> So yeah, ecoji is better than base64 in everything but speed. Speed will be improved. Later.

If you care about the size of the data, then you're not going to encode it anyway.
Same goes for speed.

Besides, your encoding isn't going to work with actual web pages anyway, because your encoder doesn't have browser support.

Sure you can encode your data and gzip it, but once it reaches the browser and it unzips it, then what? The browser doesn't know what to do with the data. You can't even use base64 for http headers.

At most it could be used for email clients, since they do support "Content-Transfer-Encoding" but browsers don't. They only support "Content-Encoding" which at most can be compressions such as gzip.
March 16, 2018

On 15/03/2018 19:45, Anton Fediushin wrote:
> $ dd if=test.raw | ./ecoji-d | gzip -c | wc -c
> 67108864 bytes (67 MB, 64 MiB) copied, 27.9972 s, 2.4 MB/s
> 32178275 # 48% improvement

If you can compress random data to 52% of the original data, you should repeat this step until there is a single byte left.
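
A quick way to see this is the Python sketch below. It uses `zlib`, which implements the same DEFLATE algorithm as gzip, and a 1 MiB random buffer standing in for test.raw; the helper name `ratio` is made up for illustration.

```python
import base64
import os
import zlib

raw = os.urandom(1 << 20)  # 1 MiB of random bytes, standing in for test.raw


def ratio(payload: bytes) -> float:
    """Compressed size of payload relative to the size of the original random data."""
    return len(zlib.compress(payload, 6)) / len(raw)


print(f"raw    -> {ratio(raw):.3f}")
print(f"base64 -> {ratio(base64.b64encode(raw)):.3f}")
```

Both ratios should come out at or slightly above 1.0. Re-encoding random data and then compressing it cannot end up meaningfully smaller than the input, so a result near 0.48 points to lost data rather than real compression.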
March 18, 2018
On 2018-03-14 18:30, Anton Fediushin wrote:
> 🖖, I'm glad to announce that ecoji-d, a pure D implementation of the ecoji encoding, version 1️⃣.0️⃣.0️⃣ is finally released❗
>
> What is ecoji?
>
> Ecoji encodes data as base1024 with an emoji character set. It can be used instead of boring and old base64 🤮🤮🤮.
>
> Encoding example:
>
> ---
> $ echo "Base64 is so 1999, isn't there something better?" | ecoji-d
> 🏗📩🎦🐇🎛📘🔯🚜💞😽🆖🐊🎱🥁🚄🌱💞😭💮🇵💢🕥🐭🔸🍉🚲🦑🐶💢🕥🔮🔺🍉📸🐮🌼👦🚟🥴📑
> 

Useful feature: Easy manual verification.
March 17, 2018
On 15 March 2018 at 11:45, Anton Fediushin via Digitalmars-d-announce < digitalmars-d-announce@puremagic.com> wrote:

>
> Even though each emoji is 4 bytes long, there is a noticeable difference in size when we are talking about larger chunks of data:
>

This doesn't make sense. For every 10 bits, you're emitting 32 bits...
you're more than tripling the size of the data.
Base64 takes 6 bits and emits 8 bits, which is a third larger. 1.333x is
smaller than 3.2x. O_o
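
To put numbers on this for the 64 MiB test file, here is a rough size estimate in Python. It assumes 4 UTF-8 bytes per emoji as stated above, ignores ecoji's padding and any line wrapping, and assumes base64 wraps its output at 76 characters (the GNU coreutils default). The helper names are made up for illustration.

```python
def base64_size(n: int, wrap: int = 76) -> int:
    """Bytes written by base64 for n input bytes, with a newline after every `wrap` characters."""
    chars = 4 * ((n + 2) // 3)  # 4 output characters per 3 input bytes, rounded up
    newlines = (chars + wrap - 1) // wrap if wrap else 0
    return chars + newlines


def base1024_size(n: int, bytes_per_symbol: int = 4) -> int:
    """Approximate bytes for n input bytes: 4 symbols per 5 input bytes, rounded up."""
    return 4 * ((n + 4) // 5) * bytes_per_symbol


n = 67108864  # size of test.raw from the benchmark
print(base64_size(n), round(base64_size(n) / n, 3))      # 90655837 1.351
print(base1024_size(n), round(base1024_size(n) / n, 3))  # 214748368 3.2
```

The base64 figure reproduces the 90655837 bytes reported in the benchmark, while the base1024 estimate is about 3.2 times the input, far above the 71591534 bytes that ecoji-d printed.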


March 18, 2018
On Thursday, 15 March 2018 at 18:45:51 UTC, Anton Fediushin wrote:
> $ dd if=test.raw | gzip -c | wc -c
> 67108864 bytes (67 MB, 64 MiB) copied, 5.49022 s, 12.2 MB/s
> 67119122 # Raw files are terrible for compression
> $ dd if=test.raw | ./ecoji-d | gzip -c | wc -c
> 67108864 bytes (67 MB, 64 MiB) copied, 27.9972 s, 2.4 MB/s
> 32178275 # 48% improvement
> $ dd if=test.raw | base64 | gzip -c | wc -c
> 67108864 bytes (67 MB, 64 MiB) copied, 10.3381 s, 6.5 MB/s
> 68892893 # Pretty bad, yeah

Randomness isn't compressible. The fact that ecoji-d compresses anything above 1% shows only that there is a bug in your library:

```
$ dd if=/dev/urandom bs=4K count=16K of=test.raw
16384+0 records in
16384+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 0.373423 s, 180 MB/s

$ dd if=test.raw | ./ecoji-d | gzip -c | gzip -cd | ./ecoji-d -d > test2.raw
131072+0 records in
131072+0 records out
67108864 bytes (67 MB, 64 MiB) copied, 24.9523 s, 2.7 MB/s

$ wc -c test.raw test2.raw
67108864 test.raw
11185155 test2.raw
```

So definitely not the same files before and after compression/decompression. However the beginning is the same:

```
$ xxd test.raw | head
00000010: a05f c801 bf01 13c1 04a2 556a 6d79 a09c  ._........Ujmy..
00000020: 8032 523e 851d 419a b0d3 0c4f e7ba 93e1  .2R>..A....O....
00000030: 9fdc 7c55 2645 f6e7 3f9e f5db bc92 1e29  ..|U&E..?......)
00000040: 457a a3b9 c274 3b08 6bde 486a 1798 f281  Ez...t;.k.Hj....
00000050: 9d91 e97a f13f db8b 5d0c 114a 27be 2154  ...z.?..]..J'.!T
00000060: a9a2 3a17 36e4 9181 64f2 35b6 aa91 064d  ..:.6...d.5....M
00000070: 863a ddbd 8776 f87d 3eb2 634f 12dc 6e7f  .:...v.}>.cO..n.
00000080: 46c9 bc95 2620 b315 e84d 9ee4 8651 d172  F...& ...M...Q.r
00000090: 836d 7bf8 9e1c 09c3 0e10 b787 7e06 bc39  .m{.........~..9

$ xxd test2.raw | head
00000010: a05f c801 bf01 13c1 04a2 556a 6d79 a09c  ._........Ujmy..
00000020: 8032 523e 851d 419a b0d3 0c4f e7ba 93e1  .2R>..A....O....
00000030: 9fdc 7c55 2645 f6e7 3f9e f5db bc92 1e29  ..|U&E..?......)
00000040: 457a a3b9 c274 3b08 6bde 486a 1798 f281  Ez...t;.k.Hj....
00000050: 9d91 e97a f13f db8b 5d0c 114a 27be 2154  ...z.?..]..J'.!T
00000060: a9a2 3a17 36e4 9181 64f2 35b6 aa91 064d  ..:.6...d.5....M
00000070: 863a ddbd 8776 f87d 3eb2 634f 12dc 6e7f  .:...v.}>.cO..n.
00000080: 46c9 bc95 2620 b315 e84d 9ee4 8651 d172  F...& ...M...Q.r
00000090: 836d 7bf8 9e1c 09c3 0e10 b787 7e06 bc39  .m{.........~..9
```

So I think ecoji-d just truncates its input at some point.
March 18, 2018
On Friday, 16 March 2018 at 08:25:30 UTC, bauss wrote:
> Besides your encoding isn't going to work with actual web-pages anyway, because your encoder doesn't have browser support.

Well, the encoding is not *mine*, only the D implementation is. What do you mean by "browser support"? Indeed, ecoji-d cannot be used on the client side, but since the algorithm is simple and the code is publicly available, anyone can implement decoding in JavaScript or any other language.

> Sure you can encode your data and gzip it, but once it reaches the browser and it unzips it, then what? The browser doesn't know what to do with the data. You can't even use base64 for http headers.

Then you use a client-side decoder, of course!

March 18, 2018
On Sunday, 18 March 2018 at 11:25:45 UTC, Cym13 wrote:
> So I think ecoji-d just truncates its input at some point.

Indeed, there's an error somewhere. For some reason it stops after 7457792 bytes. I'll create an issue for that and will look into this later.
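
A round-trip check of the kind that exposes this sort of truncation can be sketched in Python as follows. This is only an illustration: `base64` stands in for the codec under test, and `first_mismatch` and `check_round_trip` are hypothetical helper names, not part of ecoji-d.

```python
import base64
import os
from typing import Callable, Optional


def first_mismatch(a: bytes, b: bytes) -> Optional[int]:
    """Offset of the first difference between a and b, or None if they are equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # Common prefix matches; a length difference means one side was cut short.
    return None if len(a) == len(b) else min(len(a), len(b))


def check_round_trip(encode: Callable, decode: Callable, size: int) -> Optional[int]:
    """Encode and decode `size` random bytes and report where the result diverges."""
    data = os.urandom(size)
    return first_mismatch(data, decode(encode(data)))


# A correct codec round-trips cleanly.
print(check_round_trip(base64.b64encode, base64.b64decode, 1 << 16))  # None

# A decoder that truncates its output is reported at the truncation offset.
truncating = lambda s: base64.b64decode(s)[:1000]
print(check_round_trip(base64.b64encode, truncating, 1 << 16))  # 1000
```

Running such a check with inputs larger than the point where the output stops would report the offset directly.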