August 31, 2016 Re: DIP1000 | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Tuesday, 30 August 2016 at 16:12:19 UTC, Andrei Alexandrescu wrote: > I'd like to initiate collaboration on an effort to do DIP1000 rigorously. > > First we need to reduce D to a bare subset that only has integers, structs, pointers, and functions. That's a working subset of actual D code. The grammar I have in mind is at http://erdani.com/d/DIP1000.html. > > There is no type deduction, member functions, classes, arrays, constructors, loops, etc. etc. etc. Only the ability to create arbitrarily complex graphs containing integers and pointers to other nodes. > > On this language we need to define typing rules and evaluation semantics. Once we have those, we need to prove what we want: for scope variables the accessibility never outlives lifetime. As a consequence we're good to deallocate them early etc. > > The model for typing, evaluation, and proofs is at https://www.cis.upenn.edu/~bcpierce/papers/fj-toplas.pdf. It would be great if those interested in helping could give the paper a close read. > > Once we get this done we'll have a fantastic model for any other language changes. > > > Andrei BTW, the Rust Mid-level IR (MIR) has a similar purpose: it lowers the complex AST to a sufficiently simpler one so that one can more easily do things like lifetime analysis / borrow checking on it. Similarly, Swift have a similar step in their compiler pipeline for ARC. I don't know if such intermediate representation in the DMD FE is worth pursuing (obviously would be big development effort), but it may be worth having a look to see how such IR analysis is done in practice (on actual non-toy, non-purely academic languages), even if only for checking that your theoretical model captures sufficient information. Here are some links, which I hope you would find helpful: https://blog.rust-lang.org/2016/04/19/MIR.html (high-level non-technical introduction) https://github.com/rust-lang/rfcs/blob/master/text/1211-mir.md (Rust RFC ~ equivalent of DIP) https://news.ycombinator.com/item?id=10487502 (Presentation slides about Swift's compiler pipeline) |
September 12, 2016 Re: DIP1000 | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kagamin | On 08/31/2016 08:05 AM, Kagamin wrote:
> On Tuesday, 30 August 2016 at 16:12:19 UTC, Andrei Alexandrescu wrote:
>> http://erdani.com/d/DIP1000.html
>
> Return values can't have `scope` annotation? If the container has
> trusted opAssign, use after free in this case is not accounted for?
I don't think scope returns are supported, but I'll think of it. Thanks. -- Andrei
|
September 12, 2016 Re: DIP1000 | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Monday, 12 September 2016 at 14:00:53 UTC, Andrei Alexandrescu wrote:
> On 08/31/2016 08:05 AM, Kagamin wrote:
>> On Tuesday, 30 August 2016 at 16:12:19 UTC, Andrei Alexandrescu wrote:
>>> http://erdani.com/d/DIP1000.html
That link leads to to a grammar.
Isn't DIP1000 the scope and @safe DIP ?
|
September 14, 2016 Re: DIP1000 | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | Some progress with the DIP1000 semantics. So I made a few changes to the grammar of the mini-language (call it MiniD1000), which is at http://erdani.com/d/DIP1000-grammar.html. Notable aspects: * Programs are a sequence of function definitions, struct definitions, and global variable definitions. A pass is supposed to go through all and create the initial typing environment. * Functions now only have one parameter. This is to reduce the size of the language (two parameters increase the numbers of typing and evaluation rules). Aliasing can be studied with globals. I'm unclear whether we need the second parameter for investigating things like cross-parameter aliasing; if so, I'll add it back. * All functions are required to end with a return statement. This avoids any need for flow control. * There are no rvalues in MinbiD1000, which I thought is quite clever. That means `null` and integral literals are not expressions; they are explicitly present in assignment statements. This simplifies MiniD1000 a great deal, but again will require us to be careful when we "port" whatever conclusions we draw back to D. * The `if` statement has no `else` clause. It only increases the size of the language without an increase in expressiveness. Next, I defined a baseline set of typing rules at http://erdani.com/d/DIP1000-typing-baseline.html for "bad" MiniD1000: all ref, scope and return attributes are ignored, so there's no special protection against bad escapes. Based on these rules we'll later define better rules that enforce safety. A few notes: * For a general understanding on how to read these rules, see the Featherweight Java (FJ) paper or see e.g. http://www.hedonisticlearning.com/posts/understanding-typing-judgments.html. * Like in FJ, sequences of zero or more things are shown with an overbar. * In order to identify the current function being compiled, I introduce the symbol f_{crt}. For example, in rule S-VarDeclaration, a prerequisite of the judgment is that f_{crt} exists, i.e. the rule S-VarDeclaration only applies inside a function, not outside. For the outside rule see D-GlobalVarDeclaration. * S-VarDeclaration introduces a new name visible to the statements that follow. This is done by typing not only the variable declaration, but also all following statements in its scope. The typing environment for the statements is the union of the preexisting typing environment and the new variable. * In fact the typing rules of all statements carry the appendix "and here are the following statements in the scope" which is workable but possibly a bit goofy. Any cleverer idea? * For that matter I need the rule S-NoStmts: "a sequence of zero statements types just fine". * In D-Struct, I vaguely state that a struct shall not be a transitive field of itself. Is there a better way to enforce that succintly? Please let me know of anything I missed or got wrong. Once the baseline typing is in place, the hard part starts: defining the enhanced typing that enforces safe scoping rules. Thanks, Andrei |
September 15, 2016 Re: DIP1000 | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 14.09.2016 21:49, Andrei Alexandrescu wrote: > Some progress with the DIP1000 semantics. So I made a few changes to the > grammar of the mini-language (call it MiniD1000), which is at > http://erdani.com/d/DIP1000-grammar.html. Notable aspects: > > * Programs are a sequence of function definitions, struct definitions, > and global variable definitions. A pass is supposed to go through all > and create the initial typing environment. > ... There probably should be typing rules for that. > * Functions now only have one parameter. This is to reduce the size of > the language (two parameters increase the numbers of typing and > evaluation rules). Aliasing can be studied with globals. I'm unclear > whether we need the second parameter for investigating things like > cross-parameter aliasing; if so, I'll add it back. > ... I would support an arbitrary number of parameters. Leaving that out does not buy much in terms of simplicity, and it makes miniD quite a lot less relevant for arguing about the soundness of the full language. > * All functions are required to end with a return statement. This avoids > any need for flow control. > ... How so? You still allow return statements at arbitrary places in the function. > * There are no rvalues in MinbiD1000, Yes there are. Function calls might be rvalues because 'ref' is optional on them. > which I thought is quite clever. > That means `null` and integral literals are not expressions; they are > explicitly present in assignment statements. This simplifies MiniD1000 a > great deal, but again will require us to be careful when we "port" > whatever conclusions we draw back to D. > ... That should be fine. I think a better way is to special-case only the lhs though. I.e. x=e, *e₁=e₂, f(e₁,…,eₙ)=eₙ₊₁. > * The `if` statement has no `else` clause. It only increases the size of > the language without an increase in expressiveness. > > Next, I defined a baseline set of typing rules at > http://erdani.com/d/DIP1000-typing-baseline.html for "bad" MiniD1000: > all ref, scope and return attributes are ignored, so there's no special > protection against bad escapes. Based on these rules we'll later define > better rules that enforce safety. A few notes: > Looking good, except: - struct types are not checked for existence. E.g. the following program type checks: void main(){ Undefined x; Undefined* y=&x; } - 'fields' is not defined. (And how is it supposed to know what the fields of some syntactic type are?) Probably struct declarations should be part of the environment, then 'fields' can take the environment as an additional argument. I think it would make sense if your grammar enforced that all struct declarations come before all global variables which come before all function declarations. - Functions cannot call each other, because they cannot look up each other's names. > * For a general understanding on how to read these rules, see the > Featherweight Java (FJ) paper or see e.g. > http://www.hedonisticlearning.com/posts/understanding-typing-judgments.html. > > > * Like in FJ, sequences of zero or more things are shown with an overbar. > > * In order to identify the current function being compiled, I introduce > the symbol f_{crt}. For example, in rule S-VarDeclaration, a > prerequisite of the judgment is that f_{crt} exists, i.e. the rule > S-VarDeclaration only applies inside a function, not outside. For the > outside rule see D-GlobalVarDeclaration. > > * S-VarDeclaration introduces a new name visible to the statements that > follow. This is done by typing not only the variable declaration, but > also all following statements in its scope. The typing environment for > the statements is the union of the preexisting typing environment and > the new variable. > > * In fact the typing rules of all statements carry the appendix "and > here are the following statements in the scope" which is workable but > possibly a bit goofy. Any cleverer idea? > ... I think it is fine. > * For that matter I need the rule S-NoStmts: "a sequence of zero > statements types just fine". > > * In D-Struct, I vaguely state that a struct shall not be a transitive > field of itself. Well, the notation does not really make a lot of sense. > Is there a better way to enforce that succintly? > ... Typing rules for structs need to state that all struct declarations together are fine if each struct is fine individually given all other structs present in the environment. Then define something like transitiveFieldTypes(Γ,S) := μX. {T | ∃x. T x; ∈ (fields(Γ,S) ∪ ⋃{ fields(Γ,T') | T'∈X }) } (In English: transitiveFieldTypes(Γ,S) is the smallest set such that all field types of S and field types of other types in the set are also in the set.) You can then require that S∉transitiveFieldTypes(Γ,S) > Please let me know of anything I missed or got wrong. Once the baseline > typing is in place, the hard part starts: defining the enhanced typing > that enforces safe scoping rules. > ... Maybe we should define the semantics first (it is easier and it is necessary for providing formal counter-examples for the soundness of proposed typing rules). |
September 15, 2016 Re: DIP1000 | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | Well, aliasing can be reproduced with locals S s; int* r = getPayload(s); freePayload(s); int v = *r; //UAF Multiparameter functions can be declared to be equivalent to struct P { S* s; int* r; } P p; p.s = &s; p.r = getPayload(s); f(p); //as if f(S*,int*) |
September 15, 2016 Re: DIP1000 | ||||
---|---|---|---|---|
| ||||
Posted in reply to Kagamin | On 09/15/2016 04:44 AM, Kagamin wrote: > Well, aliasing can be reproduced with locals > > S s; > int* r = getPayload(s); > freePayload(s); > int v = *r; //UAF Nit: in MiniD1000 you'd need to declare vars first, assign them second: S s; int* r; r = getPayload(s); freePayload(s); int v; v = *r; (and there are no comments :o)) > Multiparameter functions can be declared to be equivalent to > > struct P { S* s; int* r; } > P p; > p.s = &s; > p.r = getPayload(s); > f(p); //as if f(S*,int*) Thanks for making this point. Andrei |
September 15, 2016 Re: DIP1000 | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 15.09.2016 14:57, Andrei Alexandrescu wrote:
>
>> Multiparameter functions can be declared to be equivalent to
>>
>> struct P { S* s; int* r; }
>> P p;
>> p.s = &s;
>> p.r = getPayload(s);
>> f(p); //as if f(S*,int*)
>
> Thanks for making this point.
This fails when considering scope and return annotations though. Parameters can be annotated or not annotated with scope/return scope independently, but the struct can either be annotated or not annotated, the fields cannot be treated independently (which is one of the weaknesses of DIP1000).
|
September 15, 2016 Re: DIP1000 | ||||
---|---|---|---|---|
| ||||
Posted in reply to Timon Gehr | On 9/15/16 5:14 PM, Timon Gehr wrote:
> struct can either be annotated or not annotated, the fields cannot be
> treated independently (which is one of the weaknesses of DIP1000).
I was planning to allow scope annotations for struct fields. -- Andrei
|
September 16, 2016 Re: DIP1000 | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On 16.09.2016 01:07, Andrei Alexandrescu wrote:
> On 9/15/16 5:14 PM, Timon Gehr wrote:
>> struct can either be annotated or not annotated, the fields cannot be
>> treated independently (which is one of the weaknesses of DIP1000).
>
> I was planning to allow scope annotations for struct fields. -- Andrei
I see. In that case, single-parameter functions only is probably okay.
|
Copyright © 1999-2021 by the D Language Foundation