July 08, 2012
Rust updates
On Reddit they are again discussing the Rust language, and the 
browser prototype written in Rust, named "Servo" 
(https://github.com/mozilla/servo ):
http://www.reddit.com/r/programming/comments/w6h7x/the_state_of_servo_a_mozilla_experiment_in/


So I've taken another look at the Rust tutorial:
http://dl.rust-lang.org/doc/tutorial.html

and I've seen that Rust is considerably more defined than it was the 
last two times I read about it. So below I put more extracts from the 
tutorial, with a few comments of mine (but most of the text you find 
below is from the tutorial).

By default, types in Rust are immutable. If you want a mutable 
type you need to annotate it with "mut" in some way.

Rust designers seem to love really short keywords; in my opinion 
this is a bit silly. On the other hand, in D you have keywords 
like "immutable" that are rather long to type. So I prefer a 
middle way between those two.

Rust has type classes from Haskell (with some simplifications for 
higher kinds), uniqueness typing, and typestates.

In Haskell typeclasses are very easy to use.

From my limited study, the Rust implementation of uniqueness 
typing doesn't look hard to understand and use. It is statically 
enforced, it doesn't require a lot of annotations, and I think its 
compiler implementation is not too hard, because it's a pure 
type system test. Maybe D designers should take a look, maybe for 
D3.

Macros are planned, but I think they are not fully implemented.

I think in Rust the function stack is segmented and growable as in 
Go. This saves RAM if you need a small stack, and avoids stack 
overflows where a lot of stack is needed.

-------------------------

Instead of the 3 char types of D, Rust has 1 char type:

char  A character is a 32-bit Unicode code point.

-------------------------

And only one string type:

str  String type. A string contains a UTF-8 encoded sequence of 
characters.

For algorithms that do really need to index by character, there's 
the option to convert your string to a character vector (using 
str::chars).
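
As a rough illustration in current Rust syntax (not the 2012 syntax the tutorial uses), converting a string to a character vector before indexing could look like the sketch below. The helper name char_at is made up for this example.

```rust
/// Returns the character at position `i`, counting characters, not bytes.
/// `char_at` is a hypothetical helper, not something from the tutorial.
fn char_at(s: &str, i: usize) -> Option<char> {
    // A str is UTF-8, so it is first expanded into one 32-bit char per slot.
    let chars: Vec<char> = s.chars().collect();
    chars.get(i).copied()
}
```

The conversion costs an allocation and a full pass over the string, which is probably why it is presented as an option for the algorithms that really need it.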

-------------------------

Tuples are rightly built-in. Tuple singletons are not supported 
(empty tuples are kind of supported with ()):


(T1, T2)  Tuple type. Any arity above 1 is supported.

-------------------------

Although Walter said that having more than one type of pointer is 
bad, both Ada and Rust have several pointer types. Rust has three 
of them (plus their mutable variants).


Rust supports several types of pointers. The simplest is the 
unsafe pointer, written *T, which is a completely unchecked 
pointer type only used in unsafe code (and thus, in typical Rust 
code, very rarely). The safe pointer types are @T for shared, 
reference-counted boxes, and ~T, for uniquely-owned pointers.

All pointer types can be dereferenced with the * unary operator.

Shared boxes never cross task boundaries.

-------------------------

This seems a bit overkill to me:

It's also possible to avoid any type ambiguity by writing integer 
literals with a suffix. The suffixes i and u are for the types 
int and uint, respectively: the literal -3i has type int, while 
127u has type uint. For the fixed-size integer types, just suffix 
the literal with the type name: 255u8, 50i64, etc.

-------------------------

This is very strict, maybe too strict:

No implicit conversion between integer types happens. If you are 
adding one to a variable of type uint, saying += 1u8 will give 
you a type error.
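
A small sketch of both points in current Rust syntax, where literals are suffixed with the full type name and a widening must be written out. The function names bump and literals are invented for this example.

```rust
/// Adds a u8 step to a usize counter. The conversion has to be explicit.
fn bump(counter: usize, step: u8) -> usize {
    // `counter + step` would be rejected: usize and u8 are different types.
    counter + usize::from(step)
}

/// Integer literals carrying a type suffix, so no inference is needed.
fn literals() -> (i32, u8, i64) {
    (-3i32, 255u8, 50i64)
}
```

For example, bump(41, 1u8) gives 42, while passing the u8 directly into the addition is a compile-time error.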

-------------------------

Even more than Go:

++ and -- are missing


And fixes a C problem:

the logical bitwise operators have higher precedence. In C, x & 2 
> 0 comes out as x & (2 > 0), in Rust, it means (x & 2) > 0, 
which is more likely to be what you expect (unless you are a C 
veteran).
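
The precedence rule can be checked with a tiny function in current Rust syntax (bit1_set is a name made up for this sketch):

```rust
/// True when bit 1 of `x` is set.
/// Rust parses the body as `(x & 2) > 0`, with no parentheses required.
fn bit1_set(x: u32) -> bool {
    x & 2 > 0
}
```

With the C grouping x & (2 > 0) the expression would instead test bit 0, so x = 2 would come out false and x = 1 true.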

-------------------------

Enums are datatypes that have several different representations. 
For example, the type shown earlier:

enum shape {
    circle(point, float),
    rectangle(point, point)
}

A value of this type is either a circle, in which case it 
contains a point record and a float, or a rectangle, in which 
case it contains two point records. The run-time representation 
of such a value includes an identifier of the actual form that it 
holds, much like the 'tagged union' pattern in C, but with better 
ergonomics.

The above declaration will define a type shape that can be used 
to refer to such shapes, and two functions, circle and rectangle, 
which can be used to construct values of the type (taking 
arguments of the specified types). So circle({x: 0f, y: 0f}, 10f) 
is the way to create a new circle.

Enum variants do not have to have parameters. This, for example, 
is equivalent to a C enum:

enum direction {
    north,
    east,
    south,
    west
}

-------------------------

This is probably quite handy:

A powerful application of pattern matching is destructuring, 
where you use the matching to get at the contents of data types. 
Remember that (float, float) is a tuple of two floats:

fn angle(vec: (float, float)) -> float {
    alt vec {
      (0f, y) if y < 0f { 1.5 * float::consts::pi }
      (0f, y) { 0.5 * float::consts::pi }
      (x, y) { float::atan(y / x) }
    }
}
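
For comparison, a hedged rendering of the same function in current Rust syntax, where alt became match and float became f64. The zero tests are moved into guards here, which is a choice made for this sketch and not something the tutorial prescribes.

```rust
use std::f64::consts::PI;

/// Angle of a 2D vector, destructuring the tuple in the match arms.
fn angle(vec: (f64, f64)) -> f64 {
    match vec {
        // Vertical vector pointing down.
        (x, y) if x == 0.0 && y < 0.0 => 1.5 * PI,
        // Any other vertical vector.
        (x, _) if x == 0.0 => 0.5 * PI,
        // General case: both components are bound by the pattern.
        (x, y) => (y / x).atan(),
    }
}
```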

- - - - - - - -

Records can be destructured in alt patterns. The basic syntax is 
{fieldname: pattern, ...}, but the pattern for a field can be 
omitted as a shorthand for simply binding the variable with the 
same name as the field.

alt mypoint {
    {x: 0f, y: y_name} { /* Provide sub-patterns for fields */ }
    {x, y}             { /* Simply bind the fields */ }
}

The field names of a record do not have to appear in a pattern in 
the same order they appear in the type. When you are not 
interested in all the fields of a record, a record pattern may 
end with , _ (as in {field1, _}) to indicate that you're ignoring 
all other fields.

- - - - - - - -

For enum types with multiple variants, destructuring is the only 
way to get at their contents. All variant constructors can be 
used as patterns, as in this definition of area:

fn area(sh: shape) -> float {
    alt sh {
        circle(_, size) { float::consts::pi * size * size }
        rectangle({x, y}, {x: x2, y: y2}) { (x2 - x) * (y2 - y) }
    }
}
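
A sketch of the shape example in current Rust syntax, assuming the point record becomes a struct with two f64 fields (the type and field names follow the tutorial, the capitalization follows today's conventions):

```rust
struct Point {
    x: f64,
    y: f64,
}

enum Shape {
    Circle(Point, f64),
    Rectangle(Point, Point),
}

/// Area of a shape, reaching the variant contents by destructuring.
fn area(sh: &Shape) -> f64 {
    match sh {
        // The centre is ignored, only the radius is bound.
        Shape::Circle(_, radius) => std::f64::consts::PI * radius * radius,
        // Struct patterns bind the fields of both corner points.
        Shape::Rectangle(Point { x, y }, Point { x: x2, y: y2 }) => (x2 - x) * (y2 - y),
    }
}
```

A rectangle from (0, 0) to (2, 3) is built with Shape::Rectangle(Point { x: 0.0, y: 0.0 }, Point { x: 2.0, y: 3.0 }) and its area comes out as 6.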

-------------------------

This is quite desirable in D too:

To a limited extent, it is possible to use destructuring patterns 
when declaring a variable with let. For example, you can say this 
to extract the fields from a tuple:

let (a, b) = get_tuple_of_two_ints();
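
A minimal self-contained version in current Rust syntax. The body of get_tuple_of_two_ints is invented here, since the tutorial only names the function.

```rust
/// Stand-in for the function named in the tutorial. The values are arbitrary.
fn get_tuple_of_two_ints() -> (i32, i32) {
    (3, 4)
}

/// Unpacks the tuple directly in the `let` and uses both parts.
fn sum_of_pair() -> i32 {
    let (a, b) = get_tuple_of_two_ints();
    a + b
}
```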

-------------------------

Stack-allocated closures:

There are several forms of closure, each with its own role. The 
most common, called a stack closure, has type fn& and can 
directly access local variables in the enclosing scope.

let mut max = 0;
[1, 2, 3].map(|x| if x > max { max = x });

Stack closures are very efficient because their environment is 
allocated on the call stack and refers by pointer to captured 
locals. To ensure that stack closures never outlive the local 
variables to which they refer, they can only be used in argument 
position and cannot be stored in structures nor returned from 
functions. Despite the limitations stack closures are used 
pervasively in Rust code.
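
The same idea sketched in current Rust syntax, where there is no separate fn& type but a closure can still borrow and modify a local of the enclosing function. The name largest and the use of for_each are choices made for this example.

```rust
/// Largest value in `values`, or 0 when no element is greater than 0.
fn largest(values: &[i32]) -> i32 {
    let mut max = 0;
    // The closure refers to `max` by reference. Nothing is copied or
    // heap-allocated for the captured environment.
    values.iter().for_each(|&x| {
        if x > max {
            max = x;
        }
    });
    max
}
```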

-------------------------

Unique closures:

Unique closures, written fn~ in analogy to the ~ pointer type 
(see next section), hold on to things that can safely be sent 
between processes. They copy the values they close over, much 
like boxed closures, but they also 'own' them—meaning no other 
code can access them. Unique closures are used in concurrent 
code, particularly for spawning tasks.


There are also heap-allocated closures (so there are 3 kinds of 
closures).

- - - - - - - -

In contrast to shared boxes, unique boxes are not reference 
counted. Instead, it is statically guaranteed that only a single 
owner of the box exists at any time.

let x = ~10;
let y <- x;

This is where the 'move' (<-) operator comes in. It is similar to 
=, but it de-initializes its source. Thus, the unique box can 
move from x to y, without violating the constraint that it only 
has a single owner (if you used assignment instead of the move 
operator, the box would, in principle, be copied).

Unique boxes, when they do not contain any shared boxes, can be 
sent to other tasks. The sending task will give up ownership of 
the box, and won't be able to access it afterwards. The receiving 
task will become the sole owner of the box.
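
In current Rust syntax the unique box is written Box<T> and a plain assignment already moves. The sketch below assumes std::thread as the stand-in for the tasks of the tutorial, and the function name is made up.

```rust
use std::thread;

/// Moves a uniquely-owned box into another thread and returns its result.
fn send_box_to_thread() -> i32 {
    let x = Box::new(10);
    // Ownership moves from `x` to `y`. Using `x` after this line is a
    // compile-time error, so only one owner exists at any time.
    let y = x;
    // `move` hands the box over to the new thread, which becomes its owner.
    let handle = thread::spawn(move || *y + 1);
    handle.join().unwrap()
}
```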

-------------------------

In D you control this by adding "private" before names, but I think 
a centralized control point at the top of the module is safer and 
cleaner:

By default, a module exports everything that it defines. This can 
be restricted with export directives at the top of the module or 
file.

mod enc {
    export encrypt, decrypt;
    const super_secret_number: int = 10;
    fn encrypt(n: int) -> int { n + super_secret_number }
    fn decrypt(n: int) -> int { n - super_secret_number }
}
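
For readers comparing with current Rust: the export directive is gone there, items are private by default and each exported item is marked pub, which is closer to the per-name style of D. A sketch of the same module under that assumption:

```rust
mod enc {
    // Not marked `pub`, so it cannot be named from outside the module.
    const SUPER_SECRET_NUMBER: i32 = 10;

    pub fn encrypt(n: i32) -> i32 {
        n + SUPER_SECRET_NUMBER
    }

    pub fn decrypt(n: i32) -> i32 {
        n - SUPER_SECRET_NUMBER
    }
}
```

Outside the module, enc::encrypt(5) compiles, while enc::SUPER_SECRET_NUMBER is rejected as private.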

-------------------------

This is needed by the uniqueness typing:

Evaluating a swap expression neither changes reference counts nor 
deeply copies any unique structure pointed to by the moved rval. 
Instead, the swap expression represents an indivisible exchange 
of ownership between the right-hand-side and the left-hand-side 
of the expression. No allocation or destruction is entailed.

An example of three different swap expressions:

x <-> a;
x[i] <-> a[i];
y.z <-> b.c;
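
Current Rust has no <-> operator. A sketch of the closest equivalents, assuming std::mem::swap for two owned values and the slice swap method for elements (the function names are invented here):

```rust
use std::mem;

/// Exchanges ownership of two boxes. Neither heap allocation is copied.
fn swap_boxes() -> (i32, i32) {
    let mut x = Box::new(1);
    let mut a = Box::new(2);
    mem::swap(&mut x, &mut a);
    (*x, *a)
}

/// Exchanges two elements of the same slice in place.
fn swap_elements(v: &mut [i32], i: usize, j: usize) {
    v.swap(i, j);
}
```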

-------------------------

For some info on the typestate system, from the Rust manual:

http://dl.rust-lang.org/doc/rust.html#typestate-system

This description is simpler than I thought. It seems possible to 
create an experimental D compiler with just a similar typestate 
system; it looks like a purely additive change (but maybe it's not 
a small change). It doesn't even seem to require new syntax, 
besides an assert-like check() that can't be disabled and that 
uses a pure expression/predicate.

Bye,
bearophile
July 08, 2012
Re: Rust updates
Thanks for keeping us informed about Rust. I don't like the syntax, but 
it is definitely an interesting language and something we should look 
at as D people.


On 08/07/2012 15:49, bearophile wrote:
> [...]
July 08, 2012
Re: Rust updates
"bearophile" <bearophileHUGS@lycos.com> wrote:

> By default, types in Rust are immutable. If you want a mutable type you
> need to annotate it with "mut" in some way.
> 
> Rust designers seem to love really short keywords; in my opinion this is
> a bit silly. On the other hand, in D you have keywords like "immutable"
> that are rather long to type. So I prefer a middle way between those two.

Short keywords are only important with barebones editors like a default vi.
Nobody would use this for real development.
July 08, 2012
Re: Rust updates
On Sunday, 8 July 2012 at 18:13:49 UTC, Stefan Scholl wrote:
> "bearophile" <bearophileHUGS@lycos.com> wrote:
>
>> By default, types in Rust are immutable. If you want a mutable type you
>> need to annotate it with "mut" in some way.
>> 
>> Rust designers seem to love really short keywords; in my opinion this is
>> a bit silly. On the other hand, in D you have keywords like "immutable"
>> that are rather long to type. So I prefer a middle way between those two.
>
> Short keywords are only important with barebones editors like a 
> default vi.
> Nobody would use this for real development.

I started a long discussion on Reddit, because I complained that 
the goal of 5-letter keywords is primitive, and brings back 
memories of the time when compilers were memory constrained.

For example, I remember in Turbo C 2.0, the identifiers could not 
be longer than 32 bytes, and it was even possible to specify a 
lower default limit to get a bit more memory!

As someone that values readable code, I don't understand this 
desire to turn every programming language into APL.

--
Paulo
July 08, 2012
Re: Rust updates
On 7/8/2012 6:49 AM, bearophile wrote:
> I think in Rust the function stack is segmented and growable as in Go. This saves
> RAM if you need a small stack, and avoids stack overflows where a lot of stack is
> needed.

The troubles with segmented stacks are:

1. they have a significant runtime penalty

2. interfacing to C code becomes problematic


Also, they do not save RAM, they save address space. RAM is not committed until 
a stack memory page is actually used.

Segmented stacks are useful for 32 bit address space. However, they are not 
useful for 64 bit address spaces. Heck, you can allocate 4 billion stacks of 4 
billion bytes each! (Remember, allocating address space is not allocating actual 
memory.)

Given that the programming world is moving rapidly to 64 bit exclusively, I 
think segmented stacks are a dead end technology. They would have been much more 
interesting 15 years ago.
July 08, 2012
Re: Rust updates
On Sunday, 8 July 2012 at 13:49:50 UTC, bearophile wrote:
> This seems a bit overkill to me:
>
> It's also possible to avoid any type ambiguity by writing 
> integer literals with a suffix. The suffixes i and u are for 
> the types int and uint, respectively: the literal -3i has type 
> int, while 127u has type uint. For the fixed-size integer 
> types, just suffix the literal with the type name: 255u8, 
> 50i64, etc.
>

Many good ideas... am just singling out this one, as you seem to 
be of a different opinion in this particular case... I on the 
contrary wish D would have taken this route as well, because of 
the ubiquitous 'auto' and 'implicit template instantiation' 
features... furthermore vector simd types could also benefit.
July 08, 2012
Re: Rust updates
> As someone that values readable code, I don't understand this 
> desire to turn every programming language into APL.

I would expect the abbreviations that rust uses to be perfectly
readable once you know the language.
July 08, 2012
Re: Rust updates
Walter Bright:

Thank you for your answers, Walter. As you guessed, I am ignorant 
about segmented stacks.

> The trouble with segmented stacks are:
>
> 1. they have a significant runtime penalty

> Also, they do not save RAM, they save address space. RAM is not 
> committed until a stack memory page is actually used.

Regarding performance and memory used they say:
http://golang.org/doc/go_faq.html#goroutines

>The result, which we call goroutines, can be very cheap: unless 
>they spend a lot of time in long-running system calls, they cost 
>little more than the memory for the stack, which is just a few 
>kilobytes. To make the stacks small, Go's run-time uses 
>segmented stacks. A newly minted goroutine is given a few 
>kilobytes, which is almost always enough. When it isn't, the 
>run-time allocates (and frees) extension segments automatically. 
>The overhead averages about three cheap instructions per 
>function call. It is practical to create hundreds of thousands 
>of goroutines in the same address space. If goroutines were just 
>threads, system resources would run out at a much smaller 
>number.<


> Segmented stacks are useful for 32 bit address space. However, 
> they are not useful for 64 bit address spaces.

I think Go is meant to be used mostly on 64 bit servers.
Both the designers of Go and Rust are experienced people, and 
they plan to use their languages on 64 bit systems.


Here they say Go avoids many stack overflows, because stacks are 
limited only by the available virtual memory:
http://stackoverflow.com/questions/4226964/how-come-go-doesnt-have-stackoverflows


I think LLVM supports segmented stacks, the example given is on 
x86-64:
http://llvm.org/releases/3.0/docs/SegmentedStacks.html

Bye,
bearophile
July 09, 2012
Re: Rust updates
On 7/8/2012 2:32 PM, bearophile wrote:
>> Segmented stacks are useful for 32 bit address space. However, they are not
>> useful for 64 bit address spaces.
>
> I think Go is meant to be used mostly on 64 bit servers.
> Both the designers of Go and Rust are experienced people, and they plan to use
> their languages on 64 bit systems.

I think you misunderstood. I meant there is no point to segmented stacks on a 64 
bit system.
July 09, 2012
Re: Rust updates
"jerro" <a@a.com> wrote:
> I would expect the abbreviations that rust uses to be perfectly
> readable once you know the language.

There is a lot of noise (lots of special characters) in Rust code, together
with short keywords like "fn" for function definition.

It's hard to see a structure in it. You can read JAPHs, too, if you know
Perl. But your brain parses it character for character. Rust is a bit
better, though.