Jump to page: 1 2 3
Thread overview
D array expansion and non-deterministic re-allocation
Nov 16, 2009
Bartosz Milewski
Nov 16, 2009
Tim Matthews
Nov 16, 2009
Leandro Lucarella
Nov 16, 2009
Walter Bright
Nov 16, 2009
Nick Sabalausky
Nov 16, 2009
Walter Bright
Nov 16, 2009
Denis Koroskin
Nov 16, 2009
Walter Bright
Nov 17, 2009
Denis Koroskin
Nov 17, 2009
Sean Kelly
Nov 17, 2009
Walter Bright
Nov 16, 2009
grauzone
Nov 16, 2009
Walter Bright
Nov 16, 2009
Rainer Deyke
Nov 16, 2009
Walter Bright
Nov 16, 2009
Nick Sabalausky
Nov 16, 2009
Walter Bright
Nov 17, 2009
BCS
Nov 17, 2009
Walter Bright
Nov 18, 2009
Bartosz Milewski
Nov 18, 2009
Bill Baxter
Nov 18, 2009
Bartosz Milewski
Nov 16, 2009
Ali Cehreli
Nov 16, 2009
dsimcha
Nov 23, 2009
Michel Fortin
November 16, 2009
I read Andrei's chapter on arrays and there's one thing that concerns me. When a slice is extended, the decision to re-allocate, and therefore to cut its connection to other slices, is non-deterministic. How does that influence program testing and can it be exploited to attack a buggy system?

Here's a piece of code I came up with--it's not realistic but it describes a certain class of scenarios:

void f(byte[] code)
{
    doSomeChecks(code);
    execute(code);
}

void doSomeChecks(byte[] a)
{
   …
  // to avoid special-casing empty arrays
  a.length += 1; // extension
   …
  // get byte b (or a whole bunch) from somewhere
   a[0] = b; // write
  …
  // check what happens when a[0] = b
}

Of course this code is incorrect because doSomeChecks assumes that it holds a unique array, and the caller failed to make a copy before sending it to doSomeChecks. Bad bad programmer!

However imagine this scenario: Under testing conditions, the array extension just happens to always cause reallocation, so the write is NOT visible in the calling routine. All tests pass with flying colors. But since the re-allocation is, by language definition, non-deterministic, it might happen that, after the code has been deployed, the re-allocation stopped happening, say, on Sundays. If a hacker learns about that, he may be able to inject malicious code or data into the program.

In this case the non-determinism built into the language helped camouflage a serious bug. Should we, or shouldn't we, worry about it in practice?
November 16, 2009
Bartosz Milewski wrote:
> I read Andrei's chapter on arrays and there's one thing that concerns me. When a slice is extended, the decision to re-allocate, and therefore to cut its connection to other slices, is non-deterministic. How does that influence program testing and can it be exploited to attack a buggy system?
> 
> Here's a piece of code I came up with--it's not realistic but it describes a certain class of scenarios:
> 
> void f(byte[] code)
> {
>     doSomeChecks(code);
>     execute(code);
> }
> 
> void doSomeChecks(byte[] a)
> {
>    …   // to avoid special-casing empty arrays
>   a.length += 1; // extension
> >   // get byte b (or a whole bunch) from somewhere
>    a[0] = b; // write
> >   // check what happens when a[0] = b
> }
> 
> Of course this code is incorrect because doSomeChecks assumes that it holds a unique array, and the caller failed to make a copy before sending it to doSomeChecks. Bad bad programmer!
> 
> However imagine this scenario: Under testing conditions, the array extension just happens to always cause reallocation, so the write is NOT visible in the calling routine. All tests pass with flying colors. But since the re-allocation is, by language definition, non-deterministic, it might happen that, after the code has been deployed, the re-allocation stopped happening, say, on Sundays. If a hacker learns about that, he may be able to inject malicious code or data into the program. 
> 
> In this case the non-determinism built into the language helped camouflage a serious bug. Should we, or shouldn't we, worry about it in practice?

Isn't the new kind of arrays created to fix this already? See T[new].

PS: Are you still working on D's concurrency?
November 16, 2009
Bartosz Milewski wrote:
> I read Andrei's chapter on arrays and there's one thing that concerns
> me. When a slice is extended, the decision to re-allocate, and
> therefore to cut its connection to other slices, is
> non-deterministic.

It is not non-deterministic. Try it - you'll get the same results every time. It is implementation-defined behavior.


> How does that influence program testing and can it
> be exploited to attack a buggy system?

Exploitations rely on undefined-behavior, such as buffer overflows, not implementation-defined behavior. This issue is no more conducive to an "exploit" than any other garden variety programming issue. It's entirely different than a buffer overflow attack.
November 16, 2009
Tim Matthews, el 16 de noviembre a las 15:05 me escribiste:
> Isn't the new kind of arrays created to fix this already? See T[new].

T[new] is gone.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Parece McGuevara's o CheDonald's
November 16, 2009
"Walter Bright" <newshound1@digitalmars.com> wrote in message news:hdqea8$2mte$1@digitalmars.com...
> Bartosz Milewski wrote:
>> I read Andrei's chapter on arrays and there's one thing that concerns me. When a slice is extended, the decision to re-allocate, and therefore to cut its connection to other slices, is non-deterministic.
>
> It is not non-deterministic. Try it - you'll get the same results every time. It is implementation-defined behavior.
>
>
>> How does that influence program testing and can it
>> be exploited to attack a buggy system?
>
> Exploitations rely on undefined-behavior, such as buffer overflows, not implementation-defined behavior. This issue is no more conducive to an "exploit" than any other garden variety programming issue. It's entirely different than a buffer overflow attack.

Definition used by Implementation "Foo":
"Resize if there's room, otherwise reallocate" (sounds fairly realistic to
me)

Code in program using Implementation "Foo":
myArrayOnHeap ~= getUserInput();

Resizes or Reallocates?:
Deterministically dependant on "Is there room?" which is deterministically
dependant on two things:
1. The size of the user input (non-deterministic).
2. The amount of freespace after the current end of myArrayOnHeap (dependent
on the specific actions of the GC, which in turn, is dependent on other
behaviors of the program, which, for almost any useful program, are
dependent on non-deterministic runtime conditions, such as other user
input).

If 'A' ("resize or reallocate?") is deterministically determined by 'B', and 'B' ("is there room?") is non-deterministic ("dependent on things like user input"), then 'A' is, in effect, non-deterministic.

Simplified example of the same basic principle in action:

void main(char[][] args)
{
    if(args.length < 10)
        showPrettyPicture();
    else
        blowUpEarth();
}

Deterministic? Only in the same sense that "resize or realloc upon appending" is deterministic. Safe? Fuck no.


November 16, 2009
Nick Sabalausky wrote:
> Deterministic? Only in the same sense that "resize or realloc upon appending" is deterministic.

It's deterministic in the sense that if you run the program again with the same inputs, you will get the same result. This is a highly useful attribute for testing and debugging.

A non-deterministic problem will give you a different result every time you run it. Threading problems are an example of this, as are any dependencies on uninitialized data.

This particular issue is implementation-defined.

> Safe? Fuck no.

It's safe as in memory safe. This is as opposed to undefined-behavior, which is not memory safe. A buffer overflow is an example of undefined-behavior.
November 16, 2009
On Mon, 16 Nov 2009 09:24:13 +0300, Walter Bright <newshound1@digitalmars.com> wrote:

> Nick Sabalausky wrote:
>> Deterministic? Only in the same sense that "resize or realloc upon appending" is deterministic.
>
> It's deterministic in the sense that if you run the program again with the same inputs, you will get the same result. This is a highly useful attribute for testing and debugging.
>

It is *non*-deterministic. The decision to reallocate depends (or will depend) on LRU and it may be cleared by another thread (e.g. another thread may reset it manually or via a GC cycle run).
November 16, 2009
Denis Koroskin wrote:
> It is *non*-deterministic. The decision to reallocate depends (or will depend) on LRU and it may be cleared by another thread (e.g. another thread may reset it manually or via a GC cycle run).

The LRU is thread local.
November 16, 2009
Walter Bright wrote:
>> Safe? Fuck no.
> 
> It's safe as in memory safe. This is as opposed to undefined-behavior, which is not memory safe. A buffer overflow is an example of undefined-behavior.

Even if it's memory safe, it could cause a bug. Look at Bartosz example again; if the memory block gets copied really depends from the input data length.
November 16, 2009
grauzone wrote:
> Even if it's memory safe, it could cause a bug. Look at Bartosz example again; if the memory block gets copied really depends from the input data length.

Yes, it could cause a bug. But it isn't undefined-behavior, it isn't memory corruption, and it isn't non-deterministic.

    i = ++i;

can also cause a bug.
« First   ‹ Prev
1 2 3