November 04, 2009
On Wed, 04 Nov 2009 14:03:42 -0300, Leandro Lucarella wrote:

> I think safe should be the default, as it should be the most used flavor in user code, right? What about:
> 
> module s;             // interface: safe     impl.: safe
> module (trusted) t;   // interface: safe     impl.: unsafe
> module (unsafe) u;    // interface: unsafe   impl.: unsafe
> 
> * s can import other safe or trusted modules (no unsafe for s).
> * t can import any kind of module, but it guarantees not to corrupt your
>   memory if you use it (that's why s can import it).
> * u can import any kind of module and makes no guarantees (C bindings
>   use this).
> 
>> That's a pretty clean design. How would it interact with a -safe command-line flag?
> 
> I'd use safe by default. If you want to use broken stuff (everything
> should be correctly marked as safe (the default), trusted or unsafe) and let
> it compile anyway, add a compiler flag -no-safe (or whatever).
> 
> But people should never use it, unless they are using some broken library or they are too lazy to mark their modules correctly.
> 
> 
> Is this too crazy?

I have no problem with safe as the default; most of my code is safe. I also like module (trusted) - it pictures its meaning much better than "system".

But I think there is no reason to have a -no-safe compiler flag ... why would one want to force a safer program to compile as something less safe :)

As I'm thinking more about it, I don't see any reason to have any compiler flag for safety at all.
November 04, 2009
Michal Minich wrote:
> On Wed, 04 Nov 2009 14:03:42 -0300, Leandro Lucarella wrote:
> 
>> I think safe should be the default, as it should be the most used flavor
>> in user code, right? What about:
>>
>> module s;             // interface: safe     impl.: safe
>> module (trusted) t;   // interface: safe     impl.: unsafe
>> module (unsafe) u;    // interface: unsafe   impl.: unsafe
>>
>> * s can import other safe or trusted modules (no unsafe for s).
>> * t can import any kind of module, but it guarantees not to corrupt your
>>   memory if you use it (that's why s can import it).
>> * u can import any kind of module and makes no guarantees (C bindings
>>   use this).
>>
>>> That's a pretty clean design. How would it interact with a -safe
>>> command-line flag?
>> I'd use safe by default. If you want to use broken stuff (everything
>> should be correctly marked as safe (the default), trusted or unsafe) and let
>> it compile anyway, add a compiler flag -no-safe (or whatever).
>>
>> But people should never use it, unless they are using some broken library
>> or they are too lazy to mark their modules correctly.
>>
>>
>> Is this too crazy?
> 
> I have no problem with safe as the default; most of my code is safe. I also like module (trusted) - it pictures its meaning much better than "system".
> 
> But I think there is no reason to have a -no-safe compiler flag ... why would one want to force a safer program to compile as something less safe :)

Efficiency (e.g. remove array bounds checks).

> As I'm thinking more about it, I don't see any reason to have any compiler flag for safety at all.

That would be a great turn of events!!!


Andrei
November 04, 2009
Michal Minich, on November 4 at 18:58, wrote to me:
> As I'm thinking more about it, I don't see any reason to have any compiler flag for safety at all.

That was exactly my point.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Be nice to nerds
Chances are you'll end up working for one
November 04, 2009
On Wed, 04 Nov 2009 13:12:54 -0600, Andrei Alexandrescu wrote:

>> But I think there is no reason to have a -no-safe compiler flag ... why would one want to force a safer program to compile as something less safe :)
> 
> Efficiency (e.g. remove array bounds checks).
> 
>> As I'm thinking more about it, I don't see any reason to have any compiler flag for safety at all.
> 
> That would be a great turn of events!!!
> 
> 
> Andrei

Memory safety is a pretty specific thing: if you want it, you want all of it, not just some part of it - otherwise you cannot call it memory safety. The idea of a safe module which, under some compiler switch, is not safe does not appeal to me. But efficiency is also important, and if you want it, why not move the code subject to bounds checks into a trusted/system module - I hope those are not checked for bounds in release mode. Moving parts of the code into trusted modules is more semantically descriptive than the crude tool of an ad-hoc compiler switch.

One thing I'm concerned with, whether there is a compiler switch or not, is that the number of modules will increase, as you will probably want to split some modules in two because one part may be safe and another not. I'm wondering why safety is not discussed at the function level, similar to how pure and nothrow currently exist. I'm not sure this would be good, just wondering. Was this topic already discussed?
November 04, 2009
Michal Minich wrote:
> On Wed, 04 Nov 2009 13:12:54 -0600, Andrei Alexandrescu wrote:
> 
>>> But I think there is no reason to have a -no-safe compiler flag ... why
>>> would one want to force a safer program to compile as something less
>>> safe :)
>> Efficiency (e.g. remove array bounds checks).
>>
>>> As I'm thinking more about it, I don't see any reason to have any
>>> compiler flag for safety at all.
>> That would be a great turn of events!!!
>>
>>
>> Andrei
> 
> Memory safety is a pretty specific thing: if you want it, you want all of it, not just some part of it - otherwise you cannot call it memory safety.

I agree and always did.

> The idea of a safe module which, under some compiler switch, is not safe does not appeal to me.

Absolutely. Notice that if you thought I proposed that, there was a misunderstanding.

> But efficiency is also important, and if you want it, why not move the code subject to bounds checks into a trusted/system module - I hope those are not checked for bounds in release mode. Moving parts of the code into trusted modules is more semantically descriptive than the crude tool of an ad-hoc compiler switch.

Well it's not as simple as that. Trusted code is not unchecked code - it's code that may drop redundant checks here and there, leaving code correct, even though the compiler cannot prove it. So no, there's no complete removal of bounds checking. But a trusted module is allowed to replace this:

foreach (i; 0 .. a.length) ++a[i];

with

foreach (i; 0 .. a.length) ++a.ptr[i];

The latter effectively escapes checks because it uses unchecked pointer arithmetic. The code is still correct, but this time it's the human vouching for it, not the compiler.
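For instance (a minimal sketch with a hypothetical module name), the whole loop could sit behind an ordinary function whose correctness the trusted module's author vouches for:

module (trusted) fastops;   // hypothetical module name

// Increments every element. The index never exceeds a.length, so the
// unchecked .ptr access is correct - but it's the author, not the
// compiler, who guarantees that.
void incrementAll(int[] a)
{
    foreach (i; 0 .. a.length)
        ++a.ptr[i];
}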

> One thing I'm concerned with, whether there is compiler switch or not, is that module numbers will increase, as you will probably want to split some modules in two, because some part may be safe, and some not. I'm wondering why the safety is not discussed on function level, similarly as pure and nothrow currently exists. I'm not sure this would be good, just wondering. Was this topic already discussed?

This is a relatively new topic, and you pointed out some legit kinks. One possibility I discussed with Walter is to have version(safe) vs. version(system) or so. That would allow a module to expose different interfaces depending on the command-line switches.


Andrei
November 04, 2009
Andrei Alexandrescu wrote:
>> module name;                  // interface: unsafe   impl.: unsafe
>> module (system) name;         // interface: safe     impl.: unsafe
>> module (safe) name;           // interface: safe     impl.: safe
>>
>> so you can call system modules (io, network...) from safe code.
> 
> That's a pretty clean design. How would it interact with a -safe command-line flag?

'-safe' turns on runtime safety checks, which can be and should be mostly orthogonal to the module safety level.


-- 
Rainer Deyke - rainerd@eldwood.com
November 04, 2009
Rainer Deyke wrote:
> Andrei Alexandrescu wrote:
>>> module name;                  // interface: unsafe   impl.: unsafe
>>> module (system) name;         // interface: safe     impl.: unsafe
>>> module (safe) name;           // interface: safe     impl.: safe
>>>
>>> so you can call system modules (io, network...) from safe code.
>> That's a pretty clean design. How would it interact with a -safe
>> command-line flag?
> 
> '-safe' turns on runtime safety checks, which can be and should be
> mostly orthogonal to the module safety level.

Runtime vs. compile-time is immaterial. There's one goal - no undefined behavior - that can be achieved through a mix of compile- and run-time checks.

My understanding of a good model suggested by this discussion:

module name;         // does whatever, just like now
module(safe) name;   // submits to extra checks
module(system) name; // encapsulates unsafe stuff in a safe interface

No dedicated compile-time switches.
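For instance (a rough sketch, with hypothetical module and function names, shown as two modules in one listing), under this model a safe module can freely import a system module because the latter wraps its unsafe internals behind a safe interface:

// hypothetical module: unsafe inside, safe interface outside
module (system) lowlevel;

extern (C) ptrdiff_t read(int fd, void* buf, size_t len);  // raw C binding

// Safe wrapper: buffer bounds are respected by construction.
ubyte[] readSome(int fd)
{
    auto buf = new ubyte[4096];
    auto n = read(fd, buf.ptr, buf.length);
    return n > 0 ? buf[0 .. n] : null;
}

// hypothetical client module that submits to the extra checks
module (safe) app;

import lowlevel;   // fine: lowlevel's interface is safe

void main()
{
    auto data = readSome(0);   // no unchecked operations visible here
}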


Andrei
November 04, 2009
On Wed, 04 Nov 2009 14:24:47 -0600, Andrei Alexandrescu wrote:

>> But efficiency is also important, and if you want it, why not move the code subject to bounds checks into a trusted/system module - I hope those are not checked for bounds in release mode. Moving parts of the code into trusted modules is more semantically descriptive than the crude tool of an ad-hoc compiler switch.
> 
> Well it's not as simple as that. Trusted code is not unchecked code - it's code that may drop redundant checks here and there, leaving code correct, even though the compiler cannot prove it. So no, there's no complete removal of bounds checking. But a trusted module is allowed to replace this:
> 
> foreach (i; 0 .. a.length) ++a[i];
> 
> with
> 
> foreach (i; 0 .. a.length) ++a.ptr[i];
> 
> The latter effectively escapes checks because it uses unchecked pointer arithmetic. The code is still correct, but this time it's the human vouching for it, not the compiler.
> 
>> One thing I'm concerned with, whether there is a compiler switch or not, is that the number of modules will increase, as you will probably want to split some modules in two because one part may be safe and another not. I'm wondering why safety is not discussed at the function level, similar to how pure and nothrow currently exist. I'm not sure this would be good, just wondering. Was this topic already discussed?
> 
>> This is a relatively new topic, and you pointed out some legit kinks. One possibility I discussed with Walter is to have version(safe) vs. version(system) or so. That would allow a module to expose different interfaces depending on the command-line switches.
> 
> 
> Andrei

Sorry for the long post, but it should explain how safety specification should work (and how not).

Consider these 3 ways of specifying memory safety:

safety specification at module level (M)
safety specification at function level (F)
safety specification using version switching (V)

I see a very big difference between these things:
while M and F are "interface" specifications, V is an implementation
detail.

This difference applies only to library/module users; it makes no difference to the library/module writer - he must always decide whether he is writing safe, unsafe or trusted code.

Imagine a scenario with M safety for a library user:
The library user wants to make a memory-safe application. He marks his main
module as safe, and can be sure (and/or trust) that his application is
safe from this point on; because safety is explicit in the "interface", he
cannot import and use unsafe code.

A scenario with V safety:
The library user wants to make a memory-safe application. He can import any
module. He can use the -safe compiler switch so the compiler will use the safe
version of the code - if available! The user can never be sure whether his
application is safe or not. Safety is an implementation detail!
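Just to illustrate (a rough sketch of what such version-switched code might look like), the very same function could silently change its nature depending on the compiler switch, without any hint in its signature:

version (safe)
{
    // checked version: an out-of-bounds index fails at run time
    int element(int[] a, size_t i) { return a[i]; }
}
else
{
    // unchecked version: nothing in the interface tells the user
    int element(int[] a, size_t i) { return a.ptr[i]; }
}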

For this reason, I think V safety is a very unsuitable option. Absolutely useless.

But there are also problems with M safety.
Imagine a module for string manipulation with 10 independent functions. The
module is marked safe. The library writer then decides to add another function
whose implementation is unsafe. He can now do the following:

Option 1: He can mark the module trusted and implement the function in an unsafe way. Compatibility with safe clients using this module will remain. Bad thing: there are 10 provably safe functions which are no longer checked by the compiler. Also, the trust level of the module is lower in the eyes of the user. The library may end up with all modules trusted (none safe).

Option 2: He will implement it in a separate unsafe module. This has a negative impact on the library structure.

Option 3: He will implement it in a separate trusted module and publicly import that trusted module in the original safe module.

The third option is transparent to the module user and is probably the best solution, but I have a feeling that many existing modules will end up having an unsafe twin. I see this pattern emerging:

module(safe) std.string
module(trusted) std.string_trusted   // do not import, already exposed by std.string
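Spelled out (two files, with hypothetical module and function names, just to illustrate Option 3):

module (trusted) std.string_trusted;

// The one function that needs an unsafe implementation lives here.
size_t rawLength(const(char)* p)
{
    size_t n;
    while (p[n]) ++n;   // unchecked pointer walk, vouched for by the author
    return n;
}

module (safe) std.string;

public import std.string_trusted;   // re-exported, so users keep importing only std.string
// ...the 10 provably safe functions stay here and remain compiler-checked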

Therefore I propose to use F safety.

It is in fact the same beast as pure and nothrow - they also guarantee some kind of safety, and they are also part of the function interface (signature). The compiler also needs to perform stricter checks than usual.

Just imagine marking an entire module pure or nothrow. While certainly possible, is it practical? You would find yourself splitting your functions into separate modules by the specific check, or not using pure and nothrow at all.

This way, if you mark your main function safe, you can be sure (and/or trust) that your application is safe. More commonly, you can use safe only for some functions, and this requirement will propagate to all called functions, the same way as for pure or nothrow.
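With a hypothetical function-level notation (spelled here the way pure and nothrow are spelled, purely for illustration), it could look like this:

int sum(int[] a) safe              // provably safe, fully compiler-checked
{
    int s;
    foreach (x; a) s += x;
    return s;
}

int fastSum(int[] a) trusted       // unsafe inside, but callable from safe code
{
    int s;
    foreach (i; 0 .. a.length) s += a.ptr[i];
    return s;
}

void main() safe
{
    auto a = [1, 2, 3];
    assert(sum(a) == fastSum(a));  // safe code may call safe and trusted functions
}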

One thing that remains to figure out is how to turn off runtime bounds checking for trusted code (and probably safe too). This is a legitimate requirement, because probably the whole standard library will be safe or trusted, and users who are not concerned with safety and want speed need such a compiler switch.
November 04, 2009
Michal Minich wrote:
[snip]
> Therefore I propose to use F safety. 
[snip]

I think you've made an excellent case.

Andrei
November 05, 2009
Andrei Alexandrescu wrote:
> Rainer Deyke wrote:
>> '-safe' turns on runtime safety checks, which can be and should be mostly orthogonal to the module safety level.
> 
> Runtime vs. compile-time is immaterial.

The price of compile-time checks is that you are restricted to a subset of the language, which may or may not allow you to do what you need to do.

The price of runtime checks is runtime performance.

Safety is always good.  To me, the question is never if I want safety, but if I can afford it.  If I can't afford to pay the price of runtime checks, I may still want the compile-time checks.  If I can't afford to pay the price of compile-time checks, I may still want the runtime checks.  Thus, to me, the concepts of runtime and compile-time checks are orthogonal.

A module either passes the compile-time checks or it does not.  It makes no sense to make the compile-time checks optional for some modules.  If the module is written to pass the compile-time checks (i.e. uses the safe subset of the language), then the compile-time checks should always be performed for that module.


-- 
Rainer Deyke - rainerd@eldwood.com