February 02, 2012
Alex R. Petersen:

(Sorry for my last blank answer.)

> Because D is a strongly typed language. Casting a string to an int doesn't make sense from a type system perspective.

I think that D being strongly typed is not significant here. When you cast a string to char* you are casting a two-word struct to a single pointer; when you cast a char* to long on a 32-bit system you are changing both type and size; and so on. The purpose of cast() is precisely to break the strongly typed nature of the D type system.

So I think a better answer is that D designers have decided to give different purposes to to!X(y) and cast(X)y. The cast() is meant to be a light and very quick conversion, usually done at compile-time (unless it's a dynamic cast), to throw no exceptions, and generally unsafe. to!() is meant to be safer, to throw exceptions if the conversion fails, and it uses library code, so it's more flexible, and often performs some work at run-time too. Given such a design, a string->int conversion is better left to to!().
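
A minimal sketch of that difference, assuming a reasonably current DMD and Phobos. The helper parseOrDefault is a hypothetical name used only for illustration; it is not part of the library.

```d
import std.conv, std.stdio;

// Hypothetical helper: converts text to int, or returns `fallback`
// when std.conv.to rejects the text with a ConvException.
int parseOrDefault(string text, int fallback) {
    try {
        return to!int(text);
    } catch (ConvException e) {
        return fallback;
    }
}

void main() {
    double d = 3.9;
    // cast: cheap and unchecked, the fractional part is simply dropped.
    int truncated = cast(int) d;
    writeln(truncated);

    string s = "42";
    // The compiler rejects cast(int) on a string outright.
    static assert(!__traits(compiles, cast(int) s));

    // to!int parses the text and throws ConvException on bad input.
    writeln(to!int(s));
    writeln(parseOrDefault("forty-two", -1));
}
```

The cast never fails at run time but silently loses information, while to!int either produces a valid int or reports the problem through an exception.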

Bye,
bearophile
February 02, 2012
xancorreu:

> I get "segment violation" error with  ./factorial 400000 How can I resolve it?

You are having a stack overflow. DMD currently doesn't print a good message because of this regression that is being worked on:
http://d.puremagic.com/issues/show_bug.cgi?id=6088

On Windows with DMD you increase the stack like this:
dmd -L/STACK:100000000 -run test2.d 400000 > result.txt

If it still overflows, increase the stack some more. But it will take a long time to compute the result, even with the latest 2.058head and its improved GC, because the algorithm you have used to compute the factorial is very bad.


I have rewritten your code like this:

import std.stdio, std.bigint, std.conv, std.exception;

BigInt recFactorial(in int n) {
    if (n == 0)
        return BigInt(1);
    else
        return BigInt(n) * recFactorial(n - 1);
}

void main(string[] args) {
    if (args.length != 2) {
        writeln("Factorial requires a number.");
    } else {
        try {
           writeln(recFactorial(to!int(args[1])));
        } catch (ConvException e) {
           writeln("Error");
        }
    }
}


Note the usage of ConvException: it's very good practice to never use a generic "gotta catch them all" expression, because it hides other bugs in your code, and that is a source of trouble.

Bye,
bearophile
February 02, 2012
On 02/02/2012 10:18 AM, bearophile wrote:

> The cast() is meant to be a light and very quick conversion,
> usually done at compile-time (unless it's a dynamic cast),

I first read it as if you were saying that dynamic cast is the only one that is done at runtime. Actually many casts are done at runtime even for fundamental types.

> to throw no exceptions, and generally unsafe.

Just to be complete: You mean it for fundamental types. Of course user types' opCast operators may throw:

import std.exception;

class C
{
    int opCast(T : int)() const
    {
        enforce(false, "Not good.");
        return 42;
    }
}

void main()
{
    auto c = new C;
    auto i = cast(int)c;
}

> to!() is meant to be safer, to throw exceptions if the
> conversion fails, and it uses library code, so it's more
> flexible, and often performs some work at run-time too. Given
> such design a string->int conversion is better left to to!().

Agreed.

>
> Bye,
> bearophile

Ali

February 02, 2012
On Thursday, February 02, 2012 13:18:17 bearophile wrote:
> Alex R. Petersen:
> 
> (Sorry for my last blank answer.)
> 
> > Because D is a strongly typed language. Casting a string to an int doesn't make sense from a type system perspective.
> 
> I think that D being strongly typed is not significant here. When you cast a string to char* you are casting a two-word struct to a single pointer; when you cast a char* to long on a 32-bit system you are changing both type and size; and so on. The purpose of cast() is precisely to break the strongly typed nature of the D type system.
> 
> So I think a better answer is that D designers have decided to give
> different purposes to to!X(y) and cast(X)x. The cast() is meant to be a
> light and very quick conversion, usually done at compile-time (unless it's
> a dynamic cast), to throw no exceptions, and generally unsafe. to!() is
> meant to be safer, to throw exceptions if the conversion fails, and it uses
> library code, so it's more flexible, and often performs some work at
> run-time too. Given such design a string->int conversion is better left to
> to!().

Casts generally reinterpret what they're converting on some level (though not necessarily as exactly as C++'s reinterpret_cast does). The types involved are always similar and any converting that takes place is incredibly straightforward. They don't really convert in the sense of creating one type from another. It's closer to treating a type like another type than actually converting it. Strings are not at all similar to ints. One is an array. The other is a single integral value. So, it makes no sense to treat one like the other.

std.conv.to, on the other hand, outright converts. It does what makes the most sense when trying to take a value of one type and create a value of another type from it. The two types may have absolutely nothing to do with each other. It's closer to constructing a value of one type from a value of another type than treating one like the other.

So, as Bearophile says, the purposes of casting and std.conv.to are very different.
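
A small sketch of "treating as" versus "constructing", assuming a 4-byte int. The function viewAsBytes is a hypothetical name introduced only for this example.

```d
import std.conv, std.stdio;

// Hypothetical helper: views the memory of an int[] as raw bytes.
// No new data is created; only the element type and length change.
ubyte[] viewAsBytes(int[] values) {
    return cast(ubyte[]) values;
}

void main() {
    int[] numbers = [1, 2];

    // cast: same memory, different interpretation.
    auto bytes = viewAsBytes(numbers);
    writeln(bytes.length == numbers.length * int.sizeof);
    writeln(cast(void*) bytes.ptr is cast(void*) numbers.ptr);

    // to: a brand-new string is built from the integer's value.
    string text = to!string(numbers[1]);
    writeln(text);
}
```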

- Jonathan M Davis
February 02, 2012
On Thursday, February 02, 2012 10:36:06 Ali Çehreli wrote:
> Just to be complete: You mean it for fundamental types. Of course user types' opCast operators may throw:
> 
> import std.exception;
> 
> class C
> {
>     int opCast(T : int)() const
>     {
>         enforce(false, "Not good.");
>         return 42;
>     }
> }
> 
> void main()
> {
>     auto c = new C;
>     auto i = cast(int)c;
> }

Very true. However, std.conv.to will use a user-defined opCast if there is one, and so it's generally better to use std.conv.to with user-defined opCasts than to cast explicitly. The risk of screwing it up is lower too, since then you don't have to worry about the built-in cast accidentally being used instead of your user-defined opCast if you screwed up - e.g. by declaring

int opcast(T : int)() const { .. }

Though in many cases, the compiler will still catch that for you (not all though).
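
A minimal sketch of std.conv.to picking up a user-defined opCast. The struct Meters is a made-up type for illustration, and the behavior shown assumes a current Phobos.

```d
import std.conv, std.stdio;

// Hypothetical type with a user-defined conversion to int.
struct Meters {
    double value;

    // Same signature style as in the example above.
    int opCast(T : int)() const {
        return cast(int) value; // drops the fractional part
    }
}

void main() {
    auto m = Meters(3.7);
    writeln(to!int(m));   // goes through Meters.opCast!int
    writeln(cast(int) m); // also calls opCast when it is declared correctly
}
```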

- Jonathan M Davis
February 02, 2012
On 02/02/12 19:18, bearophile wrote:
> Alex R. Petersen:
>
> (Sorry for my last blank answer.)
>
>> Because D is a strongly typed language. Casting a string to an int
>> doesn't make sense from a type system perspective.
> I think that D being strongly typed is not significant here. When you cast a string to char* you are casting a two-word struct to a single pointer; when you cast a char* to long on a 32-bit system you are changing both type and size; and so on. The purpose of cast() is precisely to break the strongly typed nature of the D type system.
>
> So I think a better answer is that D designers have decided to give different purposes to to!X(y) and cast(X)x. The cast() is meant to be a light and very quick conversion, usually done at compile-time (unless it's a dynamic cast), to throw no exceptions, and generally unsafe. to!() is meant to be safer, to throw exceptions if the conversion fails, and it uses library code, so it's more flexible, and often performs some work at run-time too. Given such design a string->int conversion is better left to to!().
>
> Bye,
> bearophile
Okay, very useful answer!
Can I say "serialize the first, second and third arguments as Class Person"?

I mean, if you define a class Person like:

class Person {
    string name;
    uint age;
    bool dead;
}

could you serialize the input from console, like Std.in.serialize(Person, args(0), args(1), args(2))?

You could do that "manually" by checking each parameter, but it's a tedious task.
Thanks,
Xan.
February 02, 2012
On 02/02/12 19:30, bearophile wrote:
> xancorreu:
>
>> I get "segment violation" error with  ./factorial 400000
>> How can I resolve it?
> You are having a stack overflow. DMD currently doesn't print a good message because of this regression that is being worked on:
> http://d.puremagic.com/issues/show_bug.cgi?id=6088
>
> On Windows with DMD you increase the stack like this:
> dmd -L/STACK:100000000 -run test2.d 400000>  result.txt
>
> If it goes in overflow still, increase the stack some more. But it will take a long time to compute the result even with the latest 2.058head with improved GC because the algorithm you have used to compute the factorial is very bad.
>
>
> I have rewritten your code like this:
>
> import std.stdio, std.bigint, std.conv, std.exception;
>
> BigInt recFactorial(in int n) {
>      if (n == 0)
>          return BigInt(1);
>      else
>          return BigInt(n) * recFactorial(n - 1);
> }
>
> void main(string[] args) {
>      if (args.length != 2) {
>          writeln("Factorial requires a number.");
>      } else {
>          try {
>             writeln(recFactorial(to!int(args[1])));
>          } catch (ConvException e) {
>             writeln("Error");
>          }
>      }
> }
>
>
> Note the usage of ConvException, it's a very good practice to never use a generic "gotta catch them all" expression, because it leads to hiding other bugs in your code, and this is a source for troubles.
>
> Bye,
> bearophile
Thank you very much for recoding this. But you "only" put an "in" on the recFactorial function argument. What does this mean? **Why** is this more efficient than mine?

On the other hand, how can I increase the stack size on Linux?


Thanks,
Xan.
February 02, 2012
On 02/02/2012 11:00 AM, xancorreu wrote:
> On 02/02/12 19:18, bearophile wrote:

> Can I say "serialize the first, second and third arguments as Class
> Person"?
>
> I mean, if you define a class Person like:
>
> class Person {
>     string name;
>     uint age;
>     bool dead;
> }
>
> could you serialize the input from console, like
> Std.in.serialize(Person, args(0), args(1), args(2))?

I haven't used it but there is Orange:

  https://github.com/jacob-carlborg/orange

I think it will be included in Phobos.

> You could do that "manually" by checking each parameter, but it's a tedious task.

If the input is exactly in the format that a library like Orange expects, then it's easy.

To me, constructing an object from user input is conceptually outside of OO, because there is no object at that point yet. It makes sense to me to read the input and then make an object from the input.

Depending on the design, the input may be rejected by the function that reads the input, by the constructor of the type, or by both.
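
A rough sketch of that split, with the reading function rejecting malformed input and the constructor rejecting invalid state. The function readPerson and the particular validation rule are assumptions made for illustration only.

```d
import std.conv, std.exception, std.stdio;

class Person {
    string name;
    uint age;
    bool dead;

    // The constructor guards the object's own invariants.
    this(string name, uint age, bool dead) {
        enforce(name.length > 0, "name must not be empty");
        this.name = name;
        this.age = age;
        this.dead = dead;
    }
}

// Hypothetical reader: checks the shape of the input, converts each
// field with std.conv.to, and only then constructs the object.
Person readPerson(string[] fields) {
    enforce(fields.length == 3, "expected: name age dead");
    return new Person(fields[0], to!uint(fields[1]), to!bool(fields[2]));
}

void main(string[] args) {
    try {
        auto p = readPerson(args[1 .. $]);
        writefln("%s, %s, %s", p.name, p.age, p.dead);
    } catch (ConvException e) {
        stderr.writeln("Bad value: ", e.msg);
    } catch (Exception e) {
        stderr.writeln("Bad input: ", e.msg);
    }
}
```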

> Thanks,
> Xan.

Ali

February 02, 2012
On Thursday, February 02, 2012 11:11:28 Ali Çehreli wrote:
> On 02/02/2012 11:00 AM, xancorreu wrote:
> > On 02/02/12 19:18, bearophile wrote:
> > 
> > Can I say "serialize the first, second and third arguments as Class Person"?
> > 
> > I mean, if you define a class Person like:
> > 
> > class Person {
> >     string name;
> >     uint age;
> >     bool dead;
> > }
> > 
> > could you serialize the input from console, like
> > Std.in.serialize(Person, args(0), args(1), args(2))?
> 
> I haven't used it but there is Orange:
> 
> https://github.com/jacob-carlborg/orange
> 
> I think it will be included in Phobos.
> 
> > You could do that "manually" by checking each parameter, but it's a tedious task.
> 
> If the input is exactly in the format that a library like Orange expects, then it's easy.
> 
> To me, constructing an object from user input is conceptually outside of OO, because there is no object at that point yet. It makes sense to me to read the input and then make an object from the input.
> 
> Depending on the design, the input may be rejected by the function that reads the input, by the constructor of the type, or by both.

I'd be very surprised if Orange could help here (though I've never used it, so I don't know exactly what it can do). Normally, when you talk about serialization, you talk about serializing an object to another format (generally a binary format of some kind) and restoring it later, not creating an object.

It's pretty much expected that if you're going to create an object from user input, you're going to have to do it yourself. That's perfectly normal. What makes the most sense for each application varies too much to really standardize it - especially when you consider error handling. What happens when not enough arguments were passed to the application? What if the values given can't be converted to the desired types? etc.

If you have

class Person
{
    string name;
    uint age;
    bool dead;
}

I'd expect something like:

import std.conv, std.stdio;

int main(string[] args)
{
    if(args.length != 4)
    {
        stderr.writeln("Wrong number of arguments.");
        return -1;
    }

    auto name = args[1];
    uint age;

    try
        age = to!uint(args[2]);
    catch(ConvException ce)
    {
        stderr.writefln("[%s] is not a valid age.", args[2]);
        return -1;
    }

    bool dead;

    try
        dead = to!bool(args[3]);
    catch(ConvException ce)
    {
        stderr.writefln("[%s] is not a valid boolean value. It must be true or false.", args[3]);
        return -1;
    }

    // Person is a class without a constructor, so allocate it with new
    // and fill in the fields.
    auto person = new Person;
    person.name = name;
    person.age = age;
    person.dead = dead;

    //...

    return 0;
}

And whether that's the best way to handle it depends on what you're trying to do in terms of user input and error messages. How on earth is all of that going to be handled generically? It all depends on what the programmer is trying to do. Switching to use command-line switches and getopt would help some, but you still have to deal with the error messages yourself. Creating the Person is the easy part.
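
For comparison, a sketch of what the getopt variant might look like. The option names, the PersonArgs struct, and parsePersonArgs are all assumptions made for this example; only getopt, GetOptException, and ConvException come from Phobos.

```d
import std.conv, std.getopt, std.stdio;

// Hypothetical holder for the parsed command-line values.
struct PersonArgs {
    string name;
    uint age;
    bool dead;
}

// Hypothetical helper: lets std.getopt do the conversions.
// Expects e.g. --name=Ann --age=30 --dead
PersonArgs parsePersonArgs(string[] args) {
    PersonArgs p;
    getopt(args,
           "name", &p.name,
           "age",  &p.age,
           "dead", &p.dead); // a bare --dead sets the flag to true
    return p;
}

void main(string[] args) {
    try {
        auto p = parsePersonArgs(args);
        writefln("%s, %s, %s", p.name, p.age, p.dead);
    } catch (GetOptException e) {
        stderr.writeln("Bad option: ", e.msg);
    } catch (ConvException e) {
        stderr.writeln("Bad value: ", e.msg);
    }
}
```

The conversions become shorter, but the wording of the error messages and the decision about which arguments are mandatory still rest with the programmer.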

- Jonathan M Davis
February 02, 2012
On 02/02/2012 08:04 PM, xancorreu wrote:
> On 02/02/12 19:30, bearophile wrote:
>> xancorreu:
>>
>>> I get "segment violation" error with ./factorial 400000
>>> How can I resolve it?
>> You are having a stack overflow. DMD currently doesn't print a good
>> message because of this regression that is being worked on:
>> http://d.puremagic.com/issues/show_bug.cgi?id=6088
>>
>> On Windows with DMD you increase the stack like this:
>> dmd -L/STACK:100000000 -run test2.d 400000> result.txt
>>
>> If it goes in overflow still, increase the stack some more. But it
>> will take a long time to compute the result even with the latest
>> 2.058head with improved GC because the algorithm you have used to
>> compute the factorial is very bad.
>>
>>
>> I have rewritten your code like this:
>>
>> import std.stdio, std.bigint, std.conv, std.exception;
>>
>> BigInt recFactorial(in int n) {
>>     if (n == 0)
>>         return BigInt(1);
>>     else
>>         return BigInt(n) * recFactorial(n - 1);
>> }
>>
>> void main(string[] args) {
>>     if (args.length != 2) {
>>         writeln("Factorial requires a number.");
>>     } else {
>>         try {
>>             writeln(recFactorial(to!int(args[1])));
>>         } catch (ConvException e) {
>>             writeln("Error");
>>         }
>>     }
>> }
>>
>>
>> Note the usage of ConvException, it's a very good practice to never
>> use a generic "gotta catch them all" expression, because it leads to
>> hiding other bugs in your code, and this is a source for troubles.
>>
>> Bye,
>> bearophile
> Thank you very much for recoding this. But you "only" put an "in" on the
> recFactorial function argument. What does this mean? **Why** is this more
> efficient than mine?

It is not. He just added some stylistic changes that don't change the code's semantics in any way.

>
> On the other hand, how can I increase the stack size on Linux?
>
>
> Thanks,
> Xan.

I don't know, but it is best to just rewrite the code so that it does not use recursion.
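
A minimal sketch of such a rewrite, keeping the same main as the version quoted above. The name iterFactorial is made up here, and treating negative input as 1 is an assumption of this sketch rather than something decided in the thread.

```d
import std.bigint, std.conv, std.stdio;

// Iterative factorial: constant stack depth regardless of n.
// Assumption: for n < 2 (including negative n) the result is 1.
BigInt iterFactorial(in int n) {
    auto result = BigInt(1);
    foreach (i; 2 .. n + 1)
        result *= i;
    return result;
}

void main(string[] args) {
    if (args.length != 2) {
        writeln("Factorial requires a number.");
    } else {
        try {
            writeln(iterFactorial(to!int(args[1])));
        } catch (ConvException e) {
            writeln("Error");
        }
    }
}
```

This removes the stack overflow, but multiplying 400000 ever-larger BigInts one factor at a time is still slow.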

(This kind of problem is exactly the reason why any language standard should mandate tail call optimization.)