Thread overview
need help convert c code
Jul 17, 2009
D. Reeds
Jul 17, 2009
Robert Fraser
Jul 17, 2009
D. Reeds
Jul 17, 2009
D. Reeds
Jul 17, 2009
D. Reeds
Jul 17, 2009
bearophile
July 17, 2009
can anybody help me translate this c code into d, im using D1+tango combo. i'm new to D and got stucked on multi-dimension array part.

int levenshtein_distance(char *s,char*t)
//Compute levenshtein distance between s and t
{
  //Step 1
  int k,i,j,n,m,cost,*d,distance;
  n=strlen(s);
  m=strlen(t);
  if(n!=0&&m!=0)
  {
    d=malloc((sizeof(int))*(m+1)*(n+1));
    m++;
    n++;
    //Step 2
    for(k=0;k<n;k++)
	d[k]=k;
    for(k=0;k<m;k++)
      d[k*n]=k;
    //Step 3 and 4
    for(i=1;i<n;i++)
      for(j=1;j<m;j++)
	{
        //Step 5
        if(s[i-1]==t[j-1])
          cost=0;
        else
          cost=1;
        //Step 6
        d[j*n+i]=minimum(d[(j-1)*n+i]+1,d[j*n+i-1]+1,d[(j-1)*n+i-1]+cost);
      }
    distance=d[n*m-1];
    free(d);
    return distance;
  }
  else
    return -1; //a negative return value means that one or both strings are empty.
}

int minimum(int a,int b,int c)
//Gets the minimum of three values
{
  int min=a;
  if(b<min)
    min=b;
  if(c<min)
    min=c;
  return min;
}

it is a levenshtein distance algorithm.
July 17, 2009
D. Reeds wrote:
> can anybody help me translate this c code into d, im using D1+tango combo. i'm new to D and got stucked on multi-dimension array part. 
> 
> int levenshtein_distance(char *s,char*t)
> //Compute levenshtein distance between s and t
> {
>   //Step 1
>   int k,i,j,n,m,cost,*d,distance;
>   n=strlen(s);   m=strlen(t);
>   if(n!=0&&m!=0)
>   {
>     d=malloc((sizeof(int))*(m+1)*(n+1));
>     m++;
>     n++;
>     //Step 2	
>     for(k=0;k<n;k++)
> 	d[k]=k;
>     for(k=0;k<m;k++)
>       d[k*n]=k;
>     //Step 3 and 4	
>     for(i=1;i<n;i++)
>       for(j=1;j<m;j++)
> 	{
>         //Step 5
>         if(s[i-1]==t[j-1])
>           cost=0;
>         else
>           cost=1;
>         //Step 6			         d[j*n+i]=minimum(d[(j-1)*n+i]+1,d[j*n+i-1]+1,d[(j-1)*n+i-1]+cost);
>       }
>     distance=d[n*m-1];
>     free(d);
>     return distance;
>   }
>   else     return -1; //a negative return value means that one or both strings are empty.
> }
> 
> int minimum(int a,int b,int c)
> //Gets the minimum of three values
> {
>   int min=a;
>   if(b<min)
>     min=b;
>   if(c<min)
>     min=c;
>   return min;
> }
> 
> it is a levenshtein distance algorithm.

Ummm... import tango.stdc.stdlib then copy & paste; that code should work the same in D as in C. The only changes you should need are:

int k,i,j,n,m,cost,*d,distance;

Which should be changed to:

int k,i,j,n,m,cost,distance;
int* d;

And sizeof(int) -> int.sizeof
July 17, 2009
Robert Fraser Wrote:

> D. Reeds wrote:
> > can anybody help me translate this c code into d, im using D1+tango combo. i'm new to D and got stucked on multi-dimension array part.
> > 
> > int levenshtein_distance(char *s,char*t)
> > //Compute levenshtein distance between s and t
> > {
> >   //Step 1
> >   int k,i,j,n,m,cost,*d,distance;
> >   n=strlen(s);
> >   m=strlen(t);
> >   if(n!=0&&m!=0)
> >   {
> >     d=malloc((sizeof(int))*(m+1)*(n+1));
> >     m++;
> >     n++;
> >     //Step 2
> >     for(k=0;k<n;k++)
> > 	d[k]=k;
> >     for(k=0;k<m;k++)
> >       d[k*n]=k;
> >     //Step 3 and 4
> >     for(i=1;i<n;i++)
> >       for(j=1;j<m;j++)
> > 	{
> >         //Step 5
> >         if(s[i-1]==t[j-1])
> >           cost=0;
> >         else
> >           cost=1;
> >         //Step 6
> >         d[j*n+i]=minimum(d[(j-1)*n+i]+1,d[j*n+i-1]+1,d[(j-1)*n+i-1]+cost);
> >       }
> >     distance=d[n*m-1];
> >     free(d);
> >     return distance;
> >   }
> >   else
> >     return -1; //a negative return value means that one or both strings are empty.
> > }
> > 
> > int minimum(int a,int b,int c)
> > //Gets the minimum of three values
> > {
> >   int min=a;
> >   if(b<min)
> >     min=b;
> >   if(c<min)
> >     min=c;
> >   return min;
> > }
> > 
> > it is a levenshtein distance algorithm.
> 
> Ummm... import tango.stdc.stdlib then copy & paste; that code should work the same in D as in C. The only changes you should need are:
> 
> int k,i,j,n,m,cost,*d,distance;
> 
> Which should be changed to:
> 
> int k,i,j,n,m,cost,distance;
> int* d;
> 
> And sizeof(int) -> int.sizeof

thank you very much.
July 17, 2009
Robert Fraser Wrote:

> D. Reeds wrote:
> > can anybody help me translate this c code into d, im using D1+tango combo. i'm new to D and got stucked on multi-dimension array part.
> > 
> > int levenshtein_distance(char *s,char*t)
> > //Compute levenshtein distance between s and t
> > {
> >   //Step 1
> >   int k,i,j,n,m,cost,*d,distance;
> >   n=strlen(s);
> >   m=strlen(t);
> >   if(n!=0&&m!=0)
> >   {
> >     d=malloc((sizeof(int))*(m+1)*(n+1));
> >     m++;
> >     n++;
> >     //Step 2
> >     for(k=0;k<n;k++)
> > 	d[k]=k;
> >     for(k=0;k<m;k++)
> >       d[k*n]=k;
> >     //Step 3 and 4
> >     for(i=1;i<n;i++)
> >       for(j=1;j<m;j++)
> > 	{
> >         //Step 5
> >         if(s[i-1]==t[j-1])
> >           cost=0;
> >         else
> >           cost=1;
> >         //Step 6
> >         d[j*n+i]=minimum(d[(j-1)*n+i]+1,d[j*n+i-1]+1,d[(j-1)*n+i-1]+cost);
> >       }
> >     distance=d[n*m-1];
> >     free(d);
> >     return distance;
> >   }
> >   else
> >     return -1; //a negative return value means that one or both strings are empty.
> > }
> > 
> > int minimum(int a,int b,int c)
> > //Gets the minimum of three values
> > {
> >   int min=a;
> >   if(b<min)
> >     min=b;
> >   if(c<min)
> >     min=c;
> >   return min;
> > }
> > 
> > it is a levenshtein distance algorithm.
> 
> Ummm... import tango.stdc.stdlib then copy & paste; that code should work the same in D as in C. The only changes you should need are:
> 
> int k,i,j,n,m,cost,*d,distance;
> 
> Which should be changed to:
> 
> int k,i,j,n,m,cost,distance;
> int* d;
> 
> And sizeof(int) -> int.sizeof

i just realized, this code uses c string and c string function therefor its not utf safe. i do need to convert it to D.
July 17, 2009
D. Reeds wrote:
> Robert Fraser Wrote:
> 
>> D. Reeds wrote:
>>> can anybody help me translate this c code into d, im using D1+tango combo. i'm new to D and got stucked on multi-dimension array part. 
>>>
>>> int levenshtein_distance(char *s,char*t)
>>> //Compute levenshtein distance between s and t
>>> {
>>>   //Step 1
>>>   int k,i,j,n,m,cost,*d,distance;
>>>   n=strlen(s);   m=strlen(t);
>>>   if(n!=0&&m!=0)
>>>   {
>>>     d=malloc((sizeof(int))*(m+1)*(n+1));
>>>     m++;
>>>     n++;
>>>     //Step 2	
>>>     for(k=0;k<n;k++)
>>> 	d[k]=k;
>>>     for(k=0;k<m;k++)
>>>       d[k*n]=k;
>>>     //Step 3 and 4	
>>>     for(i=1;i<n;i++)
>>>       for(j=1;j<m;j++)
>>> 	{
>>>         //Step 5
>>>         if(s[i-1]==t[j-1])
>>>           cost=0;
>>>         else
>>>           cost=1;
>>>         //Step 6			         d[j*n+i]=minimum(d[(j-1)*n+i]+1,d[j*n+i-1]+1,d[(j-1)*n+i-1]+cost);
>>>       }
>>>     distance=d[n*m-1];
>>>     free(d);
>>>     return distance;
>>>   }
>>>   else     return -1; //a negative return value means that one or both strings are empty.
>>> }
>>>
>>> int minimum(int a,int b,int c)
>>> //Gets the minimum of three values
>>> {
>>>   int min=a;
>>>   if(b<min)
>>>     min=b;
>>>   if(c<min)
>>>     min=c;
>>>   return min;
>>> }
>>>
>>> it is a levenshtein distance algorithm.
>> Ummm... import tango.stdc.stdlib then copy & paste; that code should work the same in D as in C. The only changes you should need are:
>>
>> int k,i,j,n,m,cost,*d,distance;
>>
>> Which should be changed to:
>>
>> int k,i,j,n,m,cost,distance;
>> int* d;
>>
>> And sizeof(int) -> int.sizeof
> 
> i just realized, this code uses c string and c string function therefor its not utf safe. i do need to convert it to D.


Fun fact: There is already a Levenshtein distance algorithm in Phobos for D2:
http://www.digitalmars.com/d/2.0/phobos/std_algorithm#levenshteinDistance

I know you said you use D1+Tango, I just thought I should mention it. The source code is available, if you want to check it out:
http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/algorithm.d

-Lars
July 17, 2009
Lars T. Kyllingstad Wrote:
> Fun fact: There is already a Levenshtein distance algorithm in Phobos for D2: http://www.digitalmars.com/d/2.0/phobos/std_algorithm#levenshteinDistance
> 
> I know you said you use D1+Tango, I just thought I should mention it.
> The source code is available, if you want to check it out:
> http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/algorithm.d
> 
> -Lars

awesome, exactly what im looking for. thanks alot.
July 17, 2009
D. Reeds wrote:
> Lars T. Kyllingstad Wrote:
>> Fun fact: There is already a Levenshtein distance algorithm in Phobos for D2:
>> http://www.digitalmars.com/d/2.0/phobos/std_algorithm#levenshteinDistance
>>
>> I know you said you use D1+Tango, I just thought I should mention it. The source code is available, if you want to check it out:
>> http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/algorithm.d
>>
>> -Lars
> 
> awesome, exactly what im looking for. thanks alot.

No problem. :)

Here's a small warning, though: You should be aware that D2 is the experimental branch of the D language. It is very different from D1, so it is unlikely that you can just copy and paste the code into your program.

-Lars
July 17, 2009
Lars T. Kyllingstad:
> Fun fact: There is already a Levenshtein distance algorithm in Phobos for D2: http://www.digitalmars.com/d/2.0/phobos/std_algorithm#levenshteinDistance

For more than a year there's a Levenshtein distance for D1 in my dlibs too, and it's quite faster than the one in Phobos2 (but take a look at the software licence):
http://www.fantascienza.net/leonardo/so/libs_d.zip
See functions "editDistance" and "editDistanceFast" in the "func.d" module.

Bye,
bearophile