July 08, 2012
On Reddit they are again discussing the Rust language and the browser prototype written in Rust, named "Servo" (https://github.com/mozilla/servo ):
http://www.reddit.com/r/programming/comments/w6h7x/the_state_of_servo_a_mozilla_experiment_in/


So I've taken another look at the Rust tutorial:
http://dl.rust-lang.org/doc/tutorial.html

and I've seen that Rust is much better defined than the last two times I read about it. So below I put more extracts from the tutorial, with a few comments of mine (but most of the text you find below is from the tutorial).

By default, types in Rust are immutable. If you want a mutable type you need to annotate it with "mut" in some way.

Rust designers seem to love really short keywords; this is in my opinion a bit silly. On the other hand, in D you have keywords like "immutable" that are rather long to type. So I prefer a middle way between those two.

Rust has type classes from Haskell (with some simplifications for higher kinds), uniqueness typing, and typestates.

In Haskell typeclasses are very easy to use.

From my limited study, the Rust implementation of uniqueness typing doesn't look hard to understand and use. It is statically enforced, it doesn't require a lot of annotations, and I think its compiler implementation is not too hard, because it's a pure type system test. Maybe D designers should take a look, maybe for D3.

Macros are planned, but I think they are not fully implemented.

I think in Rust the function stack is segmented and growable, as in Go. This saves RAM if you need a small stack, and avoids stack overflows where a lot of stack is needed.

-------------------------

Instead of the 3 char types of D, Rust has 1 char type:

char  A character is a 32-bit Unicode code point.

-------------------------

And only one string type:

str  String type. A string contains a UTF-8 encoded sequence of characters.

For algorithms that do really need to index by character, there's the option to convert your string to a character vector (using str::chars).
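
A small sketch of the char/str distinction, written in present-day Rust rather than the 2012 dialect quoted here (the helper name nth_char is mine, not from the tutorial). It indexes by character by first collecting the string into a character vector:

```rust
use std::mem::size_of;

/// Return the `n`-th character of `s` by first building a character vector.
/// (Hypothetical helper, only for illustration.)
fn nth_char(s: &str, n: usize) -> Option<char> {
    let chars: Vec<char> = s.chars().collect();
    chars.get(n).copied()
}

fn main() {
    // "\u{ef}" is a single code point that takes two bytes in UTF-8.
    let s = "na\u{ef}ve";
    // `len` counts UTF-8 bytes, while `chars` yields code points.
    println!("bytes: {}, chars: {}", s.len(), s.chars().count());
    println!("a char is {} bytes wide", size_of::<char>());
    println!("third char: {:?}", nth_char(s, 2));
}
```

Byte length and character count differ as soon as the string leaves ASCII, which is why indexing by character needs the extra conversion step.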

-------------------------

Tuples are rightly built-in. Tuple singletons are not supported (empty tuples are kind of supported with ()):


(T1, T2)  Tuple type. Any arity above 1 is supported.

-------------------------

Although Walter said that having more than one type of pointer is bad, both Ada and Rust have several pointer types. Rust has three of them (plus their mutable variants).


Rust supports several types of pointers. The simplest is the unsafe pointer, written *T, which is a completely unchecked pointer type only used in unsafe code (and thus, in typical Rust code, very rarely). The safe pointer types are @T for shared, reference-counted boxes, and ~T, for uniquely-owned pointers.

All pointer types can be dereferenced with the * unary operator.

Shared boxes never cross task boundaries.

-------------------------

This seems a bit overkill to me:

It's also possible to avoid any type ambiguity by writing integer literals with a suffix. The suffixes i and u are for the types int and uint, respectively: the literal -3i has type int, while 127u has type uint. For the fixed-size integer types, just suffix the literal with the type name: 255u8, 50i64, etc.

-------------------------

This is very strict, maybe too strict:

No implicit conversion between integer types happens. If you are adding one to a variable of type uint, saying += 1u8 will give you a type error.
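
To make the strictness concrete, here is a minimal sketch in present-day Rust. As far as I know the uint type of the 2012 tutorial is now spelled usize and the bare i/u suffixes are gone, so those parts are adapted; the function name add_small is just for illustration:

```rust
/// Add a `u8` to a `usize`; the widening has to be spelled out.
/// (Hypothetical helper, only for illustration.)
fn add_small(total: usize, small: u8) -> usize {
    // Writing `total + small` directly is a type error:
    // there is no implicit conversion between integer types.
    total + usize::from(small)
}

fn main() {
    let a = 255u8; // the suffix names the fixed-size type
    let b = 50i64;
    let mut n: usize = 41;
    // `n += 1u8;` would not compile; the conversion must be explicit.
    n += usize::from(1u8);
    println!("{} {} {} {}", a, b, n, add_small(n, a));
}
```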

-------------------------

Even more than Go:

++ and -- are missing


And fixes a C problem:

the logical bitwise operators have higher precedence. In C, x & 2 > 0 comes out as x & (2 > 0), in Rust, it means (x & 2) > 0, which is more likely to be what you expect (unless you are a C veteran).
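
A tiny sketch of the precedence difference, in present-day Rust (the function name is hypothetical). The inputs 1 and 2 are the interesting ones, because the C reading x & (2 > 0) would give the opposite answer for both:

```rust
/// True when bit 1 of `x` is set; no parentheses are needed in Rust.
fn bit_one_set(x: u32) -> bool {
    x & 2 > 0 // parsed as (x & 2) > 0
}

fn main() {
    for x in 0..4u32 {
        println!("{} -> {}", x, bit_one_set(x));
    }
}
```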

-------------------------

Enums are datatypes that have several different representations. For example, the type shown earlier:

enum shape {
    circle(point, float),
    rectangle(point, point)
}

A value of this type is either a circle, in which case it contains a point record and a float, or a rectangle, in which case it contains two point records. The run-time representation of such a value includes an identifier of the actual form that it holds, much like the 'tagged union' pattern in C, but with better ergonomics.

The above declaration will define a type shape that can be used to refer to such shapes, and two functions, circle and rectangle, which can be used to construct values of the type (taking arguments of the specified types). So circle({x: 0f, y: 0f}, 10f) is the way to create a new circle.

Enum variants do not have to have parameters. This, for example, is equivalent to a C enum:

enum direction {
    north,
    east,
    south,
    west
}

-------------------------

This is probably quite handy:

A powerful application of pattern matching is destructuring, where you use the matching to get at the contents of data types. Remember that (float, float) is a tuple of two floats:

fn angle(vec: (float, float)) -> float {
    alt vec {
      (0f, y) if y < 0f { 1.5 * float::consts::pi }
      (0f, y) { 0.5 * float::consts::pi }
      (x, y) { float::atan(y / x) }
    }
}
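
For comparison, a sketch of the same function in present-day Rust, where alt is spelled match and float is f64. I have moved the comparisons against zero into guards instead of using float literals as patterns; that is my choice for the sketch, not something the tutorial prescribes:

```rust
use std::f64::consts::PI;

/// Same logic as the tutorial's `angle`, in modern syntax.
fn angle(vec: (f64, f64)) -> f64 {
    match vec {
        (x, y) if x == 0.0 && y < 0.0 => 1.5 * PI,
        (x, _) if x == 0.0 => 0.5 * PI,
        (x, y) => (y / x).atan(),
    }
}

fn main() {
    println!("{}", angle((0.0, -1.0)));
    println!("{}", angle((0.0, 2.0)));
    println!("{}", angle((1.0, 1.0)));
}
```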

- - - - - - - -

Records can be destructured in alt patterns. The basic syntax is {fieldname: pattern, ...}, but the pattern for a field can be omitted as a shorthand for simply binding the variable with the same name as the field.

alt mypoint {
    {x: 0f, y: y_name} { /* Provide sub-patterns for fields */ }
    {x, y}             { /* Simply bind the fields */ }
}

The field names of a record do not have to appear in a pattern in the same order they appear in the type. When you are not interested in all the fields of a record, a record pattern may end with , _ (as in {field1, _}) to indicate that you're ignoring all other fields.

- - - - - - - -

For enum types with multiple variants, destructuring is the only way to get at their contents. All variant constructors can be used as patterns, as in this definition of area:

fn area(sh: shape) -> float {
    alt sh {
        circle(_, size) { float::consts::pi * size * size }
        rectangle({x, y}, {x: x2, y: y2}) { (x2 - x) * (y2 - y) }
    }
}
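
The shape example carries over almost unchanged to present-day Rust. In this sketch the point record becomes a named struct (records with anonymous types no longer exist, as far as I know), and the type names are capitalized to follow current conventions:

```rust
/// Stand-in for the tutorial's `point` record.
struct Point {
    x: f64,
    y: f64,
}

enum Shape {
    Circle(Point, f64),
    Rectangle(Point, Point),
}

/// Destructure each variant to get at its contents.
fn area(sh: &Shape) -> f64 {
    match sh {
        Shape::Circle(_, size) => std::f64::consts::PI * size * size,
        Shape::Rectangle(Point { x, y }, Point { x: x2, y: y2 }) => (x2 - x) * (y2 - y),
    }
}

fn main() {
    let c = Shape::Circle(Point { x: 0.0, y: 0.0 }, 10.0);
    let r = Shape::Rectangle(Point { x: 0.0, y: 0.0 }, Point { x: 2.0, y: 3.0 });
    println!("{} {}", area(&c), area(&r));
}
```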

-------------------------

This is quite desirable in D too:

To a limited extent, it is possible to use destructuring patterns when declaring a variable with let. For example, you can say this to extract the fields from a tuple:

let (a, b) = get_tuple_of_two_ints();
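
A self-contained version of that line in present-day Rust. The body of get_tuple_of_two_ints and the sum_pair wrapper are made up for the sketch:

```rust
/// Placeholder implementation; the tutorial does not show one.
fn get_tuple_of_two_ints() -> (i32, i32) {
    (3, 4)
}

/// Destructure the returned tuple directly in the `let`.
fn sum_pair() -> i32 {
    let (a, b) = get_tuple_of_two_ints();
    a + b
}

fn main() {
    println!("{}", sum_pair());
}
```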

-------------------------

Stack-allocated closures:

There are several forms of closure, each with its own role. The most common, called a stack closure, has type fn& and can directly access local variables in the enclosing scope.

let mut max = 0;
[1, 2, 3].map(|x| if x > max { max = x });

Stack closures are very efficient because their environment is allocated on the call stack and refers by pointer to captured locals. To ensure that stack closures never outlive the local variables to which they refer, they can only be used in argument position and cannot be stored in structures nor returned from functions. Despite the limitations stack closures are used pervasively in Rust code.
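
A sketch of the same idea in present-day Rust, where there is no separate fn& type but a closure passed as an argument can still mutate a captured local through a borrow. The function name max_of is mine:

```rust
/// Largest element of `items`, or 0 for an empty slice,
/// found by a closure that updates a local in the enclosing frame.
fn max_of(items: &[i32]) -> i32 {
    let mut max = 0;
    // The closure borrows `max` mutably; nothing is heap-allocated for it.
    items.iter().for_each(|&x| {
        if x > max {
            max = x;
        }
    });
    max
}

fn main() {
    println!("{}", max_of(&[1, 2, 3]));
}
```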

-------------------------

Unique closures:

Unique closures, written fn~ in analogy to the ~ pointer type (see next section), hold on to things that can safely be sent between processes. They copy the values they close over, much like boxed closures, but they also 'own' them—meaning no other code can access them. Unique closures are used in concurrent code, particularly for spawning tasks.


There are also heap-allocated closures (so there are 3 kinds of closures).

- - - - - - - -

In contrast to shared boxes, unique boxes are not reference counted. Instead, it is statically guaranteed that only a single owner of the box exists at any time.

let x = ~10;
let y <- x;

This is where the 'move' (<-) operator comes in. It is similar to =, but it de-initializes its source. Thus, the unique box can move from x to y, without violating the constraint that it only has a single owner (if you used assignment instead of the move operator, the box would, in principle, be copied).

Unique boxes, when they do not contain any shared boxes, can be sent to other tasks. The sending task will give up ownership of the box, and won't be able to access it afterwards. The receiving task will become the sole owner of the box.
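
A sketch of the same ownership hand-off in present-day Rust, where (as far as I know) ~T is written Box<T>, a plain = moves instead of copying, and tasks are OS threads. The function name send_box is hypothetical:

```rust
use std::thread;

/// Move a uniquely-owned box into another thread and get a result back.
fn send_box(value: i32) -> i32 {
    let x = Box::new(value);
    let y = x; // ownership moves; `x` cannot be used after this line
    // `move` hands the box to the new thread, which becomes its sole owner.
    let handle = thread::spawn(move || *y + 1);
    handle.join().unwrap()
}

fn main() {
    println!("{}", send_box(10));
}
```

Using x after the move, or y after the spawn, is rejected at compile time, which is the static single-owner guarantee described above.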

-------------------------

In D you control this adding "private" before names, but I think a centralized control point at the top of the module is safer and cleaner:

By default, a module exports everything that it defines. This can be restricted with export directives at the top of the module or file.

mod enc {
    export encrypt, decrypt;
    const super_secret_number: int = 10;
    fn encrypt(n: int) -> int { n + super_secret_number }
    fn decrypt(n: int) -> int { n - super_secret_number }
}
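
For comparison, present-day Rust ended up with the opposite default: items are private to their module unless marked pub, which is closer to the D style of annotating individual names. A sketch of the same module in modern syntax:

```rust
mod enc {
    // Private by default: not visible outside `enc`.
    const SUPER_SECRET_NUMBER: i32 = 10;

    pub fn encrypt(n: i32) -> i32 {
        n + SUPER_SECRET_NUMBER
    }

    pub fn decrypt(n: i32) -> i32 {
        n - SUPER_SECRET_NUMBER
    }
}

fn main() {
    let secret = enc::encrypt(5);
    println!("{} {}", secret, enc::decrypt(secret));
    // `enc::SUPER_SECRET_NUMBER` would be a compile error here.
}
```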

-------------------------

This is needed by the uniqueness typing:

Evaluating a swap expression neither changes reference counts nor deeply copies any unique structure pointed to by the moved rval. Instead, the swap expression represents an indivisible exchange of ownership between the right-hand-side and the left-hand-side of the expression. No allocation or destruction is entailed.

An example of three different swap expressions:

x <-> a;
x[i] <-> a[i];
y.z <-> b.c;
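
The <-> operator does not exist in present-day Rust; my understanding is that std::mem::swap fills the same role. A sketch with two uniquely-owned boxes (the function name swap_demo is made up):

```rust
use std::mem;

/// Exchange ownership of two boxes without allocating or copying the pointees.
fn swap_demo() -> (Box<i32>, Box<i32>) {
    let mut x = Box::new(1);
    let mut a = Box::new(2);
    mem::swap(&mut x, &mut a); // only the two pointers change places
    (x, a)
}

fn main() {
    let (x, a) = swap_demo();
    println!("{} {}", x, a);

    // Elements of one vector can be exchanged with `swap` on the slice.
    let mut v = vec![10, 20, 30];
    v.swap(0, 2);
    println!("{:?}", v);
}
```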

-------------------------

For some info on the typestate system, from the Rust manual:

http://dl.rust-lang.org/doc/rust.html#typestate-system

This description is simpler than I had thought. It seems possible to create an experimental D compiler with just a similar typestate system; it looks like a purely additive change (but maybe it's not a small change). It seems to not even require new syntax, besides an assert-like check() that can't be disabled and that uses a pure expression/predicate.

Bye,
bearophile
July 08, 2012
Thanks for keeping us informed about Rust. I don't like the syntax, but it is definitely an interesting language and something we should look at as D people.



July 08, 2012
"bearophile" <bearophileHUGS@lycos.com> wrote:

> By default, types in Rust are immutable. If you want a mutable type you need to annotate it with "mut" in some way.
> 
> Rust designers seem to love really short keywords; this is in my opinion a bit silly. On the other hand, in D you have keywords like "immutable" that are rather long to type. So I prefer a middle way between those two.

Short keywords are only important with barebones editors like a default vi. Nobody would use this for real development.
July 08, 2012
On Sunday, 8 July 2012 at 18:13:49 UTC, Stefan Scholl wrote:
> "bearophile" <bearophileHUGS@lycos.com> wrote:
>
>> By default, types in Rust are immutable. If you want a mutable type you
>> need to annotate it with "mut" in some way.
>> 
>> Rust designers seem to love really short keywords; this is in my opinion
>> a bit silly. On the other hand, in D you have keywords like "immutable"
>> that are rather long to type. So I prefer a middle way between those two.
>
> Short keywords are only important with barebones editors like a default vi.
> Nobody would use this for real development.

I started a long discussion on Reddit, because I complained that the goal of 5-letter keywords is primitive, and brings back memories of the time when compilers were memory constrained.

For example, I remember in Turbo C 2.0, the identifiers could not be longer than 32 bytes, and it was even possible to specify a lower default limit to get a bit more memory!

As someone that values readable code, I don't understand this desire to turn every programming language into APL.

--
Paulo
July 08, 2012
On 7/8/2012 6:49 AM, bearophile wrote:
> I think in Rust the function stack is segmented and growable, as in Go. This saves
> RAM if you need a small stack, and avoids stack overflows where a lot of stack is
> needed.

The troubles with segmented stacks are:

1. they have a significant runtime penalty

2. interfacing to C code becomes problematic


Also, they do not save RAM, they save address space. RAM is not committed until a stack memory page is actually used.

Segmented stacks are useful for 32 bit address space. However, they are not useful for 64 bit address spaces. Heck, you can allocate 4 billion stacks of 4 billion bytes each! (Remember, allocating address space is not allocating actual memory.)

Given that the programming world is moving rapidly to 64 bit exclusively, I think segmented stacks are a dead end technology. They would have been much more interesting 15 years ago.
July 08, 2012
On Sunday, 8 July 2012 at 13:49:50 UTC, bearophile wrote:
> This seems a bit overkill to me:
>
> It's also possible to avoid any type ambiguity by writing integer literals with a suffix. The suffixes i and u are for the types int and uint, respectively: the literal -3i has type int, while 127u has type uint. For the fixed-size integer types, just suffix the literal with the type name: 255u8, 50i64, etc.
>

Many good ideas... am just singling out this one, as you seem to be of a different opinion in this particular case... I on the contrary wish D would have taken this route as well, because of the ubiquitous 'auto' and 'implicit template instantiation' features... furthermore vector simd types could also benefit.

July 08, 2012
> As someone that values readable code, I don't understand this desire to turn every programming language into APL.

I would expect the abbreviations that Rust uses to be perfectly
readable once you know the language.
July 08, 2012
Walter Bright:

Thank you for your answers, Walter; as you guessed, I am ignorant about segmented stacks.

> The troubles with segmented stacks are:
>
> 1. they have a significant runtime penalty

> Also, they do not save RAM, they save address space. RAM is not committed until a stack memory page is actually used.

Regarding performance and memory used they say:
http://golang.org/doc/go_faq.html#goroutines

>The result, which we call goroutines, can be very cheap: unless they spend a lot of time in long-running system calls, they cost little more than the memory for the stack, which is just a few kilobytes. To make the stacks small, Go's run-time uses segmented stacks. A newly minted goroutine is given a few kilobytes, which is almost always enough. When it isn't, the run-time allocates (and frees) extension segments automatically. The overhead averages about three cheap instructions per function call. It is practical to create hundreds of thousands of goroutines in the same address space. If goroutines were just threads, system resources would run out at a much smaller number.<


> Segmented stacks are useful for 32 bit address space. However, they are not useful for 64 bit address spaces.

I think Go is meant to be used mostly on 64 bit servers.
Both the designers of Go and Rust are experienced people, and they plan to use their languages on 64 bit systems.


Here they say Go avoids many stack overflows, because stacks are limited by the available virtual memory:
http://stackoverflow.com/questions/4226964/how-come-go-doesnt-have-stackoverflows


I think LLVM supports segmented stacks, the example given is on x86-64:
http://llvm.org/releases/3.0/docs/SegmentedStacks.html

Bye,
bearophile
July 09, 2012
On 7/8/2012 2:32 PM, bearophile wrote:
>> Segmented stacks are useful for 32 bit address space. However, they are not
>> useful for 64 bit address spaces.
>
> I think Go is meant to be used mostly on 64 bit servers.
> Both the designers of Go and Rust are experienced people, and they plan to use
> their languages on 64 bit systems.

I think you misunderstood. I meant there is no point to segmented stacks on a 64 bit system.
July 09, 2012
"jerro" <a@a.com> wrote:
> I would expect the abbreviations that Rust uses to be perfectly readable once you know the language.

There is a lot of noise (lots of special characters) in Rust code. Together with short keywords like "fn" for function definition.

It's hard to see a structure in it. You can read JAPHs, too, if you know Perl. But your brain parses it character for character. Rust is a bit better, though.