Thread overview
Swapping nibbles range style
Dec 10, 2013
Volcz
Dec 10, 2013
MrSmith
Dec 10, 2013
bearophile
Dec 11, 2013
Volcz
Dec 11, 2013
Marco Leise
Dec 11, 2013
bearophile
Dec 10, 2013
Justin Whear
December 10, 2013
Hi!
As a small exercise I've been trying to implement a nibble swapping method which utilizes D's ranges.

I received some help from eco on #d but still couldn't complete it.
his contribution: http://dpaste.dzfl.pl/f228d0a8

Example:
Input  1234
Output 2143

This is going to be used to build a small utility program which can be used "echo 1234 | nibbleswap"

Help is appreciated!
December 10, 2013
On Tuesday, 10 December 2013 at 22:08:17 UTC, Volcz wrote:
> Hi!
> As a small exercise I've been trying to implement a nibble swapping method which utilizes D's ranges.
>
> I received some help from eco on #d but still couldn't complete it.
> his contribution: http://dpaste.dzfl.pl/f228d0a8
>
> Example:
> Input  1234
> Output 2143
>
> This is going to be used to build a small utility program which can be used "echo 1234 | nibbleswap"
>
> Help is appreciated!

Can you explain a bit more. It's unclear what is the exact problem.
Is it intended to work with unicode strings of any length? Simply swap pairs of chars? What if string length is odd?
December 10, 2013
On Tue, 10 Dec 2013 23:08:16 +0100, Volcz wrote:

> Hi!
> As a small exercise I've been trying to implement a nibble swapping
> method which utilizes D's ranges.
> 
> I received some help from eco on #d but still couldn't complete it. his contribution: http://dpaste.dzfl.pl/f228d0a8
> 
> Example:
> Input  1234 Output 2143
> 
> This is going to be used to build a small utility program which can be used "echo 1234 | nibbleswap"
> 
> Help is appreciated!

How's this?
http://dpaste.dzfl.pl/135285f4
December 10, 2013
A nibble is 4 bits. Working with strings like this is not so efficient.

Two more versions of the OP code, the second is a little more efficient:


import std.stdio, std.algorithm, std.range, std.string;

string swapAdjacent(in string s) pure
in {
    assert(s.length % 2 == 0);
    assert(s.all!(c => c < 128));
} out(result) {
    assert(result.length == s.length);
} body {
    return s
           .chunks(2)
           .map!(c => [cast(char)c.dropOne.front, cast(char)c.front])
           .join;
}

string swapAdjacent2(in string s) pure
in {
    assert(s.length % 2 == 0);
    assert(s.all!(c => c < 128));
} out(result) {
    assert(result.length == s.length);
} body {
    auto result = new char[s.length];
    foreach (immutable i, ref c; result)
        c = s[i + (i & 1 ? -1 : 1)];
    return result;
}

void main() {
    "0123456789ABCDEF".swapAdjacent.writeln;
    "0123456789ABCDEF".swapAdjacent2.writeln;
}


Bye,
bearophile
December 11, 2013
On Tuesday, 10 December 2013 at 22:20:54 UTC, MrSmith wrote:
> On Tuesday, 10 December 2013 at 22:08:17 UTC, Volcz wrote:
>
> Can you explain a bit more. It's unclear what is the exact problem.
> Is it intended to work with unicode strings of any length? Simply swap pairs of chars? What if string length is odd?

From my experience this is a common exercise in the telecom business.
My data is luckily always stored in a hex format, so there is no need to consider unicode(?).
I've never seen data with a missing nibble. I usually checks that the data is a even number chars. In real world cases F usually indicates unused nibble.
These things are all over the 3gpp specifications.

On Tuesday, 10 December 2013 at 23:26:01 UTC, Justin Whear wrote:
> On Tue, 10 Dec 2013 23:08:16 +0100, Volcz wrote:
>
> How's this?
> http://dpaste.dzfl.pl/135285f4

On Tuesday, 10 December 2013 at 23:39:04 UTC, bearophile wrote:
> A nibble is 4 bits. Working with strings like this is not so efficient.
> string swapAdjacent(in string s) pure
> in {
>     assert(s.length % 2 == 0);
>     assert(s.all!(c => c < 128));
> } out(result) {
>     assert(result.length == s.length);
> } body {
>     return s
>            .chunks(2)
>            .map!(c => [cast(char)c.dropOne.front, cast(char)c.front])
>            .join;
> Bye,
> bearophile

Are there any obvious difference between the three solutions I have received? They all to work the same to me.
I like very much the contract programming, it's a area still unexplored to me.
Readability wise I think
chunk => chunk.cycle.dropOne.takeExactly(2)
is more readable than
c => [cast(char)c.dropOne.front, cast(char)c.front]

My Dlang-fu has much improved thanks to the masters help,
Volcz
December 11, 2013
Am Wed, 11 Dec 2013 07:19:41 +0100
schrieb "Volcz" <volcz@kth.se>:

> Are there any obvious difference between the three solutions I have received? They all to work the same to me.

I hope they all work the same! Depending on your background you just might prefer one style over the other.

As an example take me. I like high efficiency so I look at the version with

  src.chunks(2).map!((a) {
    auto a1 = a.front();
    a.popFront();
    return format("%s%s", a.front(), a1);
  } )

and notice the format() function, which looks overkill to me for a task of swapping two bytes. Then there is chunks(), which will decode the UTF-8 string in src, which I don't need, since there are no multi-byte characters in src. The same is true for

  return range.chunks(2).map!(
    chunk => chunk.cycle.dropOne.takeExactly(2)
  ).joiner;

and

  return s.chunks(2).map!(
    c => [cast(char)c.dropOne.front, cast(char)c.front]
  ).join;

So I would use the second version offered by bearophile:

  auto result = new char[s.length];
  foreach (immutable i, ref c; result)
    c = s[i + (i & 1 ? -1 : 1)];
  return result;

I can easily reason about it from looking at it. It makes one allocation "new char[s.length]" and does no UTF-8 decoding.

-- 
Marco

December 11, 2013
Marco Leise:

>   return s.chunks(2).map!(
>     c => [cast(char)c.dropOne.front, cast(char)c.front]
>   ).join;
>
> So I would use the second version offered by bearophile:
>
>   auto result = new char[s.length];
>   foreach (immutable i, ref c; result)
>     c = s[i + (i & 1 ? -1 : 1)];
>   return result;
>
> I can easily reason about it from looking at it. It makes one
> allocation "new char[s.length]" and does no UTF-8 decoding.

But the first version with dropOne and two fronts is for me simper to see as correct, while in the second code I need a bit of thinking to prove there are no out of bounds accesses.

Perhaps for/foreach are becoming a code smell, they are becoming an optimization over the more common and safer functional code (or an alternative solution where the functional code is a problem).

To avoid the conditional (I have not benchmarked them):

c = s[i + (2 * !(i & 1) - 1)];

Bye,
bearophile