Software Assurance Reference Dataset (page 2) - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Software Assurance Reference Dataset (page 2)

June 26, 2014

Re: Software Assurance Reference Dataset

Posted by Walter Bright
in reply to Araq

Walter Bright

Posted in reply to Araq

On 6/26/2014 2:01 PM, Araq wrote:
>>
>> Spark is a research language that does not work, as I've discovered and
>> discussed with you before. It cannot be determined the max stack usage at
>> compile time, again, this is the halting problem.
>>
>
> What?! It's easily solvable: Forbid recursion and indirect
> function calls

Oh well, there goes about 90% of D programs and pretty much all use of the D runtime library!

> and it's guaranteed that the program only requires
> a fixed size stack and you can compute an upper bound of the
> required stack size at compile-time. Which is BTW exactly what
> OpenCL does as GPUs tend to have no stacks.
>
> In what way is Spark a "research language that does not work"?

A while back, bearophile posted here some advocacy that Spark was using its contracts to prove things about the code. I experimented with it a bit and discovered that such proofs did not go beyond the trivial. As I recall, bearophile then agreed that it was a great idea that the implementation fell far short of.

> And how many language design issues need to be discovered until
> you admit that Safe-D is a "research language that does not work"?

I recommend that all such issues you discover be put on bugzilla, and marked with the 'safe' keyword, so they can be addressed.

I think there's a big difference between "only works for trivial cases" with no idea how to handle the rest and "does not work for all cases" and there being reasonable paths to handle them.

I admit that Safe D does not yet handle all the cases.

June 26, 2014

Re: Software Assurance Reference Dataset

Posted by Timon Gehr
in reply to Walter Bright

Timon Gehr

Posted in reply to Walter Bright

On 06/26/2014 10:29 PM, Walter Bright wrote:
> On 6/26/2014 2:52 AM, Timon Gehr wrote:
>> On 06/26/2014 11:35 AM, Walter Bright wrote:
>>> On 6/26/2014 2:19 AM, bearophile wrote:
>>>> One kind of problem left is to avoid stack overflows.
>>>
>>> In general, stack overflow checking at compile time is the halting
>>> problem.
>>
>> That is irrelevant to his point because he is not suggesting to solve the
>> general problem precisely. Analogously: In general, checking whether some
>> variable in e.g. Python is ever assigned a string value is undecidable
>> as well,
>> but this does not imply we cannot have 'int' variables in D.
>
> When you're dealing with security issues, which is what this about,

This is about _avoiding stack overflows_. It's written down literally in the quoted passage.

> you'll need a guarantee about stack overflow. Adding annotations is not
> helpful with this because they are not checkable.
> ...

Not true. Basic facts:

- In practice, programs are constructed explicitly to fulfill a particular purpose and, in particular, if they do never overflow the stack, they usually do so for a reason.

- Reasoning can be formalized and formal proofs can be written down in such a way that a machine can check whether the proof is valid.

- Annotations can include a formal proof.

We've had similar discussions before. Why do we _still_ have to argue about the _possibility_ of having a system that is helpful for proving stack overflows (or other bad behaviours) absent?

You can say "out of scope" or "not a priority" or "this should be realized in third-party tool support" but not "it is impossible".


> Again, what WORKS is a runtime check.

A runtime check will not avoid the problem, which is exhaustion of stack space.

June 26, 2014

Re: Software Assurance Reference Dataset

Posted by Andrei Alexandrescu
in reply to Araq

Andrei Alexandrescu

Posted in reply to Araq

On 6/26/14, 2:01 PM, Araq wrote:
>>
>> Spark is a research language that does not work, as I've discovered
>> and discussed with you before. It cannot be determined the max stack
>> usage at compile time, again, this is the halting problem.
>>
>
> What?! It's easily solvable: Forbid recursion and indirect
> function calls

I do agree that a useful subset of a language can be conservatively defined that doesn't require solving the halting problem. But that's not easy at all - it requires interprocedural analysis.

Andrei

June 26, 2014

Re: Software Assurance Reference Dataset

Posted by Andrei Alexandrescu
in reply to Timon Gehr

Andrei Alexandrescu

Posted in reply to Timon Gehr

On 6/26/14, 4:16 PM, Timon Gehr wrote:
> - Annotations can include a formal proof.

If a function can be annotated with what other functions it calls
(non-transitively), I agree that it can be guaranteed with local
semantic checking that a program won't overflow. However requiring such
annotations would be onerous, and deducing them would require whole
program analysis.

> We've had similar discussions before. Why do we _still_ have to argue
> about the _possibility_ of having a system that is helpful for
> proving stack overflows (or other bad behaviours) absent?
>
> You can say "out of scope" or "not a priority" or "this should be
> realized in third-party tool support" but not "it is impossible".

I also seem to reckon Walter is in the other extreme (he asserts it can't be done at all, period). My understanding is that it can be done but only with annotations or whole program analysis.

Andrei

June 27, 2014

Re: Software Assurance Reference Dataset

Posted by H. S. Teoh
in reply to Andrei Alexandrescu

H. S. Teoh

Posted in reply to Andrei Alexandrescu

On Thu, Jun 26, 2014 at 04:43:33PM -0700, Andrei Alexandrescu via Digitalmars-d wrote:
> On 6/26/14, 2:01 PM, Araq wrote:
> >>
> >>Spark is a research language that does not work, as I've discovered and discussed with you before. It cannot be determined the max stack usage at compile time, again, this is the halting problem.
> >>
> >
> >What?! It's easily solvable: Forbid recursion and indirect function calls
> 
> I do agree that a useful subset of a language can be conservatively defined that doesn't require solving the halting problem. But that's not easy at all - it requires interprocedural analysis.
[...]

And it may not be feasible for a compiler to automatically prove.

One possible approach is to have the user supply the proof of eventual termination, which can be mechanically verified by the compiler. (Checking a supplied proof for correctness is more tractable than coming up with the proof in the first place.) But to handle proofs of non-trivial code beyond just toy examples, you might end up needing a full-scale deductive proof subsystem in the compiler, which may or may not be practical for D's needs!

T

-- 
That's not a bug; that's a feature!

June 27, 2014

Re: Software Assurance Reference Dataset

Posted by Timon Gehr
in reply to Andrei Alexandrescu

Timon Gehr

Posted in reply to Andrei Alexandrescu

On 06/27/2014 01:47 AM, Andrei Alexandrescu wrote:
> On 6/26/14, 4:16 PM, Timon Gehr wrote:
>> ...
>>
>> You can say "out of scope" or "not a priority" or "this should be
>> realized in third-party tool support" but not "it is impossible".
>
> I also seem to reckon Walter is in the other extreme (he asserts it
> can't be done at all, period).

I don't think it makes sense to imply that I am defending an extreme position. I wasn't discussing design choices, truth is not a continuum and I was just objecting to the line of reasoning that went like "undecidability of the halting problem implies that formal reasoning is pointless.", which is clearly untrue.

> My understanding is that it can be done
> but only with annotations or whole program analysis.
> ...

Any way it is done, it doesn't come for free.

June 27, 2014

Re: Software Assurance Reference Dataset

Posted by H. S. Teoh
in reply to Andrei Alexandrescu

H. S. Teoh

Posted in reply to Andrei Alexandrescu

On Thu, Jun 26, 2014 at 04:47:24PM -0700, Andrei Alexandrescu via Digitalmars-d wrote:
> On 6/26/14, 4:16 PM, Timon Gehr wrote:
> >- Annotations can include a formal proof.
> 
> If a function can be annotated with what other functions it calls (non-transitively), I agree that it can be guaranteed with local semantic checking that a program won't overflow. However requiring such annotations would be onerous, and deducing them would require whole program analysis.
> 
> >We've had similar discussions before. Why do we _still_ have to argue about the _possibility_ of having a system that is helpful for proving stack overflows (or other bad behaviours) absent?
> >
> >You can say "out of scope" or "not a priority" or "this should be realized in third-party tool support" but not "it is impossible".
> 
> I also seem to reckon Walter is in the other extreme (he asserts it can't be done at all, period). My understanding is that it can be done but only with annotations or whole program analysis.
[...]

I think the compiler should be able to tell, at least for the simplest cases, whether something will definitely stop recursing. Proving that something will *not* stop recursing is unsolvable in the general case, but even if we restrict it to a solvable subset, it's still far from trivial for the compiler to invent such a proof (unless we restrict it so much that it excludes too many useful algorithms).

One approach is to have the user supply a proof that the compiler can then check -- in general, if a proof exists at all, it should be checkable. Such checks furthermore can probably be done fast enough so as to not adversely affect current compilation times (unless the proof is ridiculously complex, which for practical real-world applications I don't think will happen).

However, absence of proof is not proof of absence; just because neither the compiler nor the user is able to come up with a proof that something will stop recursing, doesn't mean that it definitely won't stop recursing. So the compiler cannot definitively reject the code as definitely overflowing the stack. But we *can* make it so that the user tells the compiler "please stop compilation if neither of us can supply a proof that this function will stop recursing". But it has to be opt-in, because there will be many real-world applications that cannot be proven to stop recursing, even though in practice they always will, so we cannot make this a language default.

One particular example that comes to mind is the compiler itself: as it parses the input program, there is in theory no guarantee that the input won't have an arbitrarily deep nesting, such that the AST generated by the parser will be infinitely deep, because you cannot statically prove that the input will terminate. You don't know if the input file is actually connected to the output pipe of a program that prints an infinite series of ever-deeper nested structs, for example.  However, in practice, such input never occurs, so we generally don't worry about such contrived possibilities.  But this does mean that there can be no proof of termination of recursion, even user-supplied ones -- because there *are* cases where the parser *will* recurse forever, even if it never happens in practice. Not accounting for that possibility invalidates the proof, so any proof will always be rejected.  Therefore, it is impossible to prove that recursion in the compiler is finite, even though in practice it always is.

T

-- 
"Life is all a great joke, but only the brave ever get the point." -- Kenneth Rexroth

June 27, 2014

Re: Software Assurance Reference Dataset

Posted by Walter Bright
in reply to Timon Gehr

Walter Bright

Posted in reply to Timon Gehr

On 6/26/2014 4:16 PM, Timon Gehr wrote:
> On 06/26/2014 10:29 PM, Walter Bright wrote:
>>> When you're dealing with security issues, which is what this about,
> This is about _avoiding stack overflows_. It's written down literally in the
> quoted passage.

Check the title of this thread, the linked issues, and bearophile's comment bringing up stack overflows.

It's about security, not niceness.

>> you'll need a guarantee about stack overflow. Adding annotations is not
>> helpful with this because they are not checkable.
>> ...
>
> Not true. Basic facts:
>
> - In practice, programs are constructed explicitly to fulfill a particular
> purpose and, in particular, if they do never overflow the stack, they usually do
> so for a reason.
>
> - Reasoning can be formalized and formal proofs can be written down in such a
> way that a machine can check whether the proof is valid.
>
> - Annotations can include a formal proof.
>
> We've had similar discussions before. Why do we _still_ have to argue about the
> _possibility_ of having a system that is helpful for proving stack overflows (or
> other bad behaviours) absent?
>
> You can say "out of scope" or "not a priority" or "this should be realized in
> third-party tool support" but not "it is impossible".

In the general case, it is impossible. And I suspect the subset of D programs that don't have indirection or recursion is so small as to not be worth the bother.

I do know there are a class of applications, for example in critical embedded systems, were recursion and indirection, and even allocation, are not allowed. Using D in such cases would require eschewing using Phobos, and some other care taken, but that isn't the issue here, which is security vulnerabilities.

>> Again, what WORKS is a runtime check.
> A runtime check will not avoid the problem, which is exhaustion of stack space.

We disagree on the problem. The problem I initiated this thread on is "security vulnerabilities". Terminating a program that overflows its stack is not a security vulnerability.

As for formal proofs, I'll let slip a guilty secret - I know so little about computer science proofs I wouldn't even recognize one if I saw one, let alone have the ability to craft one. So when you advocate formal proofs, directing it at me won't accomplish anything. To get formal proofs for D in the spec, in the code, in the compiler, I cannot deliver that. People like you are going to have to step forward and do them.

June 27, 2014

Re: Software Assurance Reference Dataset

Posted by Andrei Alexandrescu
in reply to Timon Gehr

Andrei Alexandrescu

Posted in reply to Timon Gehr

On 6/26/14, 5:29 PM, Timon Gehr wrote:
> On 06/27/2014 01:47 AM, Andrei Alexandrescu wrote:
>> On 6/26/14, 4:16 PM, Timon Gehr wrote:
>>> ...
>>>
>>> You can say "out of scope" or "not a priority" or "this should be
>>> realized in third-party tool support" but not "it is impossible".
>>
>> I also seem to reckon Walter is in the other extreme (he asserts it
>> can't be done at all, period).
>
> I don't think it makes sense to imply that I am defending an extreme
> position. I wasn't discussing design choices, truth is not a continuum
> and I was just objecting to the line of reasoning that went like
> "undecidability of the halting problem implies that formal reasoning is
> pointless.", which is clearly untrue.

I agree.

>> My understanding is that it can be done
>> but only with annotations or whole program analysis.
>> ...
>
> Any way it is done, it doesn't come for free.

That's not a fair characterization. Interprocedural analysis is quite a different ball game from classic semantic checking.


Andrei

June 27, 2014

Re: Software Assurance Reference Dataset

Posted by deadalnix
in reply to bearophile

deadalnix

Posted in reply to bearophile

On Thursday, 26 June 2014 at 09:19:05 UTC, bearophile wrote:
> Walter Bright:
>
>> It's an interesting list, and an opportunity for D. I once said that my job was to put Coverity out of business. The more of these issues D can automatically prevent with @safe, the better.
>
> One kind of problem left is to avoid stack overflows. I have had several such cases in my D code (perhaps because I use fixed-sized arrays a lot).
>
> I think they can be caused by:
> 1) Too much large allocations in stack frames;

If generated by the compiler, they be made @safe

> 2) By alloca();

it is @system

> 3) By recursion and co-recursion chains;

We should have a page reserved at the end of the stack so we can throw when reaching it. The compiler can ensure we don't bypass it in case 1.

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation