View mode: basic / threaded / horizontal-split · Log in · Help
August 07, 2010
palindrome function
Hello!
I'm learning D and I was wondering on what can be optimized of my
implementation of a palindrome function:

bool isPalindrome(int num) {
   string s = text(num);
   int len = s.length;
   for (int i = 0; i < len / 2; ++i) {
       if (s[i] != s[len - 1 - i]) {
           return false;
       }
   }
   return true;
}

1) Is std.conv.text() the best function to call here to convert a number to
string?
2) Does using a variable to store the string length is more efficient than
calling the length method?
3) This example use numeric characters but if the string were UTF8 the []
operator would be OK for multibyte characters?

Not related:

4) Is some way to declare the nested functions at the end, this way I could
create nice "routines" of a function.
5) I could not find in the documentation if when I declare a variable inside a
block (a loop for example) that variable creation is optimized by the compiler.
August 07, 2010
Re: palindrome function
Sorry I forg to paste my last version"

bool isPalindrome(int num) {
   string s = text(num);
   int length = s.length;
   int limit = length / 2;
   for (int i = 0; i < limit; ++i) {
       if (s[i] != s[$ - 1 - i]) {
           return false;
       }
   }
   return true;
}
August 08, 2010
Re: palindrome function
On Saturday 07 August 2010 16:56:08 ex wrote:
> Hello!
> I'm learning D and I was wondering on what can be optimized of my
> implementation of a palindrome function:
> 
> bool isPalindrome(int num) {
>     string s = text(num);
>     int len = s.length;
>     for (int i = 0; i < len / 2; ++i) {
>         if (s[i] != s[len - 1 - i]) {
>             return false;
>         }
>     }
>     return true;
> }
> 
> 1) Is std.conv.text() the best function to call here to convert a number to
> string?

Use to!string(num) instead of text(num).

> 2) Does using a variable to store the string length is more efficient than
> calling the length method?

length is a property, not a method. I find it highly unlikely that storing length 
in a variable is going to make it any faster.

> 3) This example use numeric characters but if the string were UTF8 the []
> operator would be OK for multibyte characters?

Using [] with strings is a _bad_ idea. [] gives you the chars (or wchars in the 
case of wstring). chars are UTF-8 code units, and wchars are UTF-16 code units. 
If you want to use [] with a string, use a dstring, since dchars are UTF-32 code 
units, and unlike UTF-8 and UTF-16, a code unit and a code point are the same 
size. A code point represents what you would usually consider a character, while 
a code unit is one portion of a character's encoding (which could be the whole 
thing if the character is only one code unit which ASCII characters would be).

For instance, if you're looping over a string or wstring, you need to use dchar 
even if though the array itself is char or wchar:

foreach(dchar c; str)
{
 //...
}

The compiler understands the conversion and loops over the code points rather 
than the code units. [] isn't going to work with strings and wstrings if you 
want code points (characters) rather than code units. So, in your function, 
you'd want to use dstring rather than string. So, you'd do

auto s = to!dstring(num)

to get your string.


> 
> Not related:
> 
> 4) Is some way to declare the nested functions at the end, this way I could
> create nice "routines" of a function.

You can declare nested functions, but I think that they're like local variables 
in that they have to be declared before you can use them. Otherwise, the symbol 
isn't in scope yet when you try and use it.

> 5) I could not find in the documentation if when I declare a variable
> inside a block (a loop for example) that variable creation is optimized by
> the compiler.

Whether the compiler optimizes stuff or not is not likely to be in the 
documentation. That's not part of the D spec and is entirely implementation-
dependent. You're going to have to look at the generated assembly code if you 
want to know what it actually, does.


As for a better way to do what you're doing, I'd do this:

bool isPalindrome(int num)
{
   auto s = to!dstring(num);

   while(!s.empty)
   {
       if(s.length == 1)
           return true;

       if(s.front != s.back)
           return false;

       s.popFront();
       s.popBack();
   }

   return true;
}

It would be more typical to use strings as ranges in this manner. Also, in this 
case, it doesn't matter whether you use string, wstring, or dstring, since 
front, back, popFront(), and popBack() all deal with the unicode properly. front 
and back return dchar in all cases, and popFront() and popBack() always pop a 
code point rather than a code unit. It's probably better to use a dstring in 
this case because then no conversions have to take place from UTF-8 or UTF-16 to 
UTF-32, but it probably doesn't matter unless you're calling the function a lot.

In addition, this version of the code is easier to read than having to deal with 
array indicies.

- Jonathan M Davis
August 08, 2010
Re: palindrome function
As a side note, I'd suggest making your function take a string of some kind, if 
not outright making it a template to deal with multiple string types. That way, 
you can check for a palindrome regardless of whether you're dealing with 
numbers, and if you need to do it with a number, you just convert it to a string 
when you call isPalidrome. So, you're function (without changing anything other 
than the signature) would become something like

bool isPalindrome(S)(in S s) {
   int length = s.length;
   int limit = length / 2;
   for (int i = 0; i < limit; ++i) {
       if (s[i] != s[$ - 1 - i]) {
           return false;
       }
   }
   return true;
}


and my version would look like this:

bool isPalindrome(S)(in S s)
{
   while(!s.empty)
   {
       if(s.length == 1)
           return true;

       if(s.front != s.back)
           return false;

       s.popFront();
       s.popBack();
   }

   return true;
}


So, calls to it with a number would look like

isPalindrome(to!string(num))

- Jonathan M Davis
August 08, 2010
Re: palindrome function
ex:
Welcome to D :-) Few more comments beside the ones written by Jonathan M Davis.

> I'm learning D and I was wondering on what can be optimized of my
> implementation of a palindrome function:

See below.


> 2) Does using a variable to store the string length is more efficient than
> calling the length method?

Nope. When you have similar questions you can write a small benchmark and you can take a look at the produced assembly code.


> 4) Is some way to declare the nested functions at the end, this way I could
> create nice "routines" of a function.

I don't understand this question. But inside a function the names need to be written in their natural visibility order, so you have to define nested functions before you use them.


> 5) I could not find in the documentation if when I declare a variable inside a
> block (a loop for example) that variable creation is optimized by the compiler.

With DMD sometimes the compiler is able to pull it out of the loop, and some other times it's not able to do it. Generally in the simple cases DMD is able to do it. But I have seen were updating an object field inside a method was not pulled out of the loop and was not replaced by a temporary variable by the DMD compiler. LDC (D1) compiler is usually better in this regard. So if you have a performance critical routine it's better to manually move all things out of the loop :-)


> Sorry I forg to paste my last version"
> 
> bool isPalindrome(int num) {
>     string s = text(num);
>     int length = s.length;
>     int limit = length / 2;
>     for (int i = 0; i < limit; ++i) {
>         if (s[i] != s[$ - 1 - i]) {
>             return false;
>         }
>     }
>     return true;
> }

D is a multi-level language, so usually there are always ways to further optimize code. At the end you can even translate your routine in inlined assembly. Often you don't need to write the faster code, a more natural implementation is enough. Idiomatic D style is often not the most efficient way to write your code, so if you look for max efficiency, you end with code that is often not idiomatic.


This is written in C-style, and seems faster:

import std.c.stdio: sprintf;
import std.c.stdlib: exit, EXIT_FAILURE;

size_t isPalindrome(int num) {
   if (num == int.min) // because you can't negate int.min
       return 0;
   if (num < 0)
       num = -num;

   char[20] s = void;
   char* inf = s.ptr;
   int size = sprintf(inf, "%d", num); // atoi is faster
   if (size < 0)
       exit(EXIT_FAILURE); // or you can return 2 as error code
   char* sup = &(s[size - 1]);

   while (inf < sup)
       // you can add a loop invariant test here
       if (*inf++ != *sup--)
           return 0;
   return 1;
}

void main() {
   assert(isPalindrome(0));
   assert(!isPalindrome(10));
   assert(isPalindrome(101));
   assert(!isPalindrome(1010));
   assert(isPalindrome(-1));
   assert(isPalindrome(-101));
   assert(isPalindrome(1234554321));
   assert(isPalindrome(123454321));
}


This function allocates the string on the stack to avoid heap activity, returns a word because it's sometimes more efficient than fiddling with a byte type (the bool).

But it's not the fastest possible you can write, even if you don't touch assembly, because sprintf() is not the most efficient here, you can write your own faster atoi in D:


size_t isPalindrome(int num) {
   if (num == int.min) // because you can't negate int.min
       return 0;
   if (num < 0)
       num = -num;

   int[20] snum = void;
   int* inf = snum.ptr;
   int* sup = inf;

   if (!num) goto END; *sup++ = num % 10; num /= 10;
   if (!num) goto END; *sup++ = num % 10; num /= 10;
   if (!num) goto END; *sup++ = num % 10; num /= 10;
   if (!num) goto END; *sup++ = num % 10; num /= 10;
   if (!num) goto END; *sup++ = num % 10; num /= 10;
   if (!num) goto END; *sup++ = num % 10; num /= 10;
   if (!num) goto END; *sup++ = num % 10; num /= 10;
   if (!num) goto END; *sup++ = num % 10; num /= 10;
   if (!num) goto END; *sup++ = num % 10; num /= 10;
   if (!num) goto END; *sup++ = num % 10; num /= 10;
 END:
   sup--;

   while (inf < sup)
       if (*inf++ != *sup--)
           return 0;
   return 1;
}

void main() {
   assert(isPalindrome(0));
   assert(isPalindrome(5));
   assert(!isPalindrome(10));
   assert(isPalindrome(101));
   assert(!isPalindrome(1010));
   assert(isPalindrome(-1));
   assert(isPalindrome(-101));
   assert(isPalindrome(1234554321));
   assert(isPalindrome(123454321));
   assert(!isPalindrome(int.max));
}


But DMD is not good at coverting %10 and /10 in shifts and the like as better compilers are able to do, so such code fiddly code ends slower than the version with sprintf(). If compiled with LDC this latest version can be faster than the version with sprintf.

In normal D programs you never write code like that unless your profiler has shown you finding the palindrome is time-critical in your program, and even then you may start thinking about 1/2 GB of RAM to pre-compute all palindromes of 32-bit inputs (keeping one bit for each answer), but this is not nice for the CPU caches, so tests and timings are needed.

Bye,
bearophile
Top | Discussion index | About this forum | D home