January 09, 2013
On Wednesday, 9 January 2013 at 20:16:04 UTC, Andrei Alexandrescu wrote:
> On 1/9/13 12:09 PM, Mehrdad wrote:
>> It's memory-safe too. What am I missing here?
>
> What you're missing is that you define a store that doesn't model object references with object addresses. That's what I meant by "references are part of the language". If store is modeled by actual memory (i.e. accessing an object handle takes you to the object), you must have GC for the language to be safe. If store is actually indirected and gives up on the notion of address, then sure you can implement safety checks. The thing is everybody wants for references to model actual object addresses; indirect handles as the core abstraction are uninteresting.
>
>
> Andrei

Quote from OpenBSD's malloc implementation:
"On a call to free, memory is released and unmapped from the process address space using munmap."

I don't see why this approach is less safe than a GC... in fact, I claim it's safer, because it's far simpler to implement and thus less likely to contain bugs. In addition, it's easy to make performance vs. safety trade-offs simply by linking with another memory allocator.
January 09, 2013
On Wed, Jan 09, 2013 at 08:40:28PM +0100, Rob T wrote: [...]
> According to my definition of memory safety, a memory leak is still a memory leak no matter how it happens. I can however see an alternate definition which is likely what you are suggesting, where so long as you are not accessing memory that is not allocated, you are memory safe. There must be more to it than that, so if you can supply a more correct definition, that would be welcome.
[...]

And here we finally get to the root of the problem. Walter's definition
of memory-safe (or what I understand it to be) is that you can't:
(1) Access memory that's been freed
(2) Access memory that was never allocated
(3) As a result of the above, read garbage values or other data that you
  aren't supposed to be able to access from memory.

The context of this definition, from what I understand, is security breaches that exploit buffer overruns, stack overruns, and pointer arithmetic to read stuff that one isn't supposed to be able to read, or to write stuff into places where one shouldn't be able to write. A good number of security holes are caused by being able to do such things, due to the lack of memory safety in C/C++.

Running out of memory is moot, because the OS will just kill your app (or an exception will be thrown and the runtime will terminate), so that presents no exploit path.

Dereferencing null is also moot, because you'll just get an exception or a segfault, which is no help for a potential exploit.

A memory leak isn't something directly exploitable (though it *can* be used in a DoS attack), so it doesn't fall under the definition of "memory safety" either.

If you want to address memory leaks or dereferencing nulls, that's a different kettle o' fish.


T

-- 
Those who've learned LaTeX swear by it. Those who are learning LaTeX swear at it. -- Pete Bleackley
January 09, 2013
On Wednesday, 9 January 2013 at 08:52:37 UTC, Benjamin Thaut wrote:
> On 09.01.2013 00:21, deadalnix wrote:
>>
>> That is a real misrepresentation of the reality. Such people avoid the
>> GC, but simply because they avoid all kinds of allocation altogether,
>> preferring allocating up-front.
>
> But in the end they still don't want a GC, correct?
>
> Kind Regards
> Benjamin Thaut


If everything is preallocated and reused, does it really matter whether there is a GC or not?
January 09, 2013
On 1/9/13 1:11 PM, Tove wrote:
> On Wednesday, 9 January 2013 at 20:16:04 UTC, Andrei Alexandrescu wrote:
>> On 1/9/13 12:09 PM, Mehrdad wrote:
>>> It's memory-safe too. What am I missing here?
>>
>> What you're missing is that you define a store that doesn't model
>> object references with object addresses. That's what I meant by
>> "references are part of the language". If store is modeled by actual
>> memory (i.e. accessing an object handle takes you to the object), you
>> must have GC for the language to be safe. If store is actually
>> indirected and gives up on the notion of address, then sure you can
>> implement safety checks. The thing is everybody wants for references
>> to model actual object addresses; indirect handles as the core
>> abstraction are uninteresting.
>>
>>
>> Andrei
>
> Quote from OpenBSD's malloc implementation:
> "On a call to free, memory is released and unmapped from the process
> address space using munmap."
>
> I don't see why this approach is less safe than a GC... in fact, I claim
> it's safer, because it's far simpler to implement and thus less likely
> to contain bugs. In addition, it's easy to make performance vs. safety
> trade-offs simply by linking with another memory allocator.

No. When you allocate again and remap memory, you may get the same address range for an object of different type.
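To make the address-reuse problem concrete, here is a minimal sketch in Python of a toy heap, not OpenBSD's malloc. The names ToyHeap, alloc, free and load are hypothetical, and the addresses are invented. It assumes an allocator that unmaps on free but may later hand the same range out again:

```python
# Toy model of a heap whose free() really releases the address range.
# All names (ToyHeap, alloc, free, load) and addresses are hypothetical.

class ToyHeap:
    def __init__(self):
        self.cells = {}        # address -> (type_name, value) for live objects
        self.free_list = []    # addresses released by free(), reused LIFO
        self.next_addr = 0x1000

    def alloc(self, type_name, value):
        # Reuse a released address if one is available, as a real allocator may.
        if self.free_list:
            addr = self.free_list.pop()
        else:
            addr = self.next_addr
            self.next_addr += 0x10
        self.cells[addr] = (type_name, value)
        return addr

    def free(self, addr):
        del self.cells[addr]          # "unmapped": access faults from now on
        self.free_list.append(addr)

    def load(self, addr):
        # A raw address carries no type or lifetime information.
        if addr not in self.cells:
            raise MemoryError("fault: address is not mapped")
        return self.cells[addr]


heap = ToyHeap()
p = heap.alloc("int", 42)      # p is a raw address
heap.free(p)

try:
    heap.load(p)               # caught: the range is unmapped at this moment
    fault_seen = False
except MemoryError:
    fault_seen = True

q = heap.alloc("char*", "secret")   # the allocator hands the range out again
stale = heap.load(p)                # dangling p now silently reads the new object
```

The fault only protects the window between free and the next allocation. Once the range is remapped, the stale address p is indistinguishable from the valid address q.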

Andrei
January 09, 2013
On 1/9/13 1:09 PM, Mehrdad wrote:
> On Wednesday, 9 January 2013 at 20:16:04 UTC, Andrei Alexandrescu wrote:
>> What you're missing is that you define a store that doesn't model
>> object references with object addresses. That's what I meant by
>> "references are part of the language". If store is modeled by actual
>> memory (i.e. accessing an object handle takes you to the object), you
>> must have GC for the language to be safe. If store is actually
>> indirected and gives up on the notion of address, then sure you can
>> implement safety checks. The thing is everybody wants for references
>> to model actual object addresses; indirect handles as the core
>> abstraction are uninteresting.
>>
>> Andrei
>
>
>
> But why can't Reference hold the actual address too?

If it holds the actual address you can't implement memory reclamation and keep it safe.
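For contrast, the indirected store mentioned in the quoted text can be sketched as a handle table with generation counters. This is only an illustration in Python under assumed names (HandleStore, StaleHandle); it is not how any particular language runtime does it:

```python
# Sketch of an indirected store: a handle is (slot index, generation),
# never an address. HandleStore and StaleHandle are hypothetical names.

class StaleHandle(Exception):
    pass


class HandleStore:
    def __init__(self):
        self.slots = []        # each slot is [generation, value]
        self.free_slots = []   # indices of slots available for reuse

    def alloc(self, value):
        if self.free_slots:
            index = self.free_slots.pop()
            self.slots[index][1] = value
        else:
            index = len(self.slots)
            self.slots.append([0, value])
        return (index, self.slots[index][0])

    def deref(self, handle):
        index, generation = handle
        # The safety check: the handle must name the slot's current generation.
        if not 0 <= index < len(self.slots) or self.slots[index][0] != generation:
            raise StaleHandle(handle)
        return self.slots[index][1]

    def free(self, handle):
        self.deref(handle)             # rejects double free and stale handles
        index, _ = handle
        self.slots[index][0] += 1      # invalidates every outstanding handle
        self.slots[index][1] = None
        self.free_slots.append(index)


store = HandleStore()
h1 = store.alloc("first")
store.free(h1)
h2 = store.alloc("second")     # reuses the slot, but with a new generation

try:
    store.deref(h1)
    stale_rejected = False
except StaleHandle:
    stale_rejected = True
```

Reclamation is safe here only because every access pays for the table lookup and the generation check, which is exactly the indirection a raw address does away with.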

Andrei
January 09, 2013
On Wednesday, January 09, 2013 22:14:15 SomeDude wrote:
> If everything is preallocated and reused, does it really matter whether there is a GC or not?

It would if the GC were running in the background (which isn't currently the case for D's default GC), but other than that, it would just affect program shutdown, because that's when the GC would actually run. If the GC isn't run, it can't affect anything.

- Jonathan M Davis
January 09, 2013
On Wednesday, 9 January 2013 at 21:14:56 UTC, Andrei Alexandrescu wrote:
> If it holds the actual address you can't implement memory reclamation and keep it safe.
>
> Andrei


You mean because of circular references, or something else?

And are you considering reference counting to be garbage collection like Walter does, or are you claiming refcounting won't solve this problem but GC will?
January 09, 2013

On 09.01.2013 19:57, Benjamin Thaut wrote:
> On 09.01.2013 16:49, Andrei Alexandrescu wrote:
>> On 1/9/13 4:25 AM, Benjamin Thaut wrote:
>>> The compiler is not shared-lib ready. At least not on Windows. It does
>>> not support exporting data symbols. E.g.
>>>
>>> export uint g_myGlobal;
>>>
>>> This is mostly a problem for "hidden" data symbols, like vtables, module
>>> info objects, type info objects and other stuff D relies on.
>>
>> Are there bugzilla entries for this?
>>
>> Andrei
>
> Yes, it's pretty old too. If you read through the discussion in the ticket
> and through the code Rainer Schuetze provided you will have a list of
> all the issues that need to be fixed for shared dlls to work:
> http://d.puremagic.com/issues/show_bug.cgi?id=4071

I doubt it is easily mergeable now, but the major points are listed in the bug report. Some of the patches are meant as tests of whether the approach is feasible and should be discussed (like the -exportall switch, which might be exporting a bit too much, but could not be implemented by a def file due to optlink limitations).

>
> In the following patch:
> http://d.puremagic.com/issues/attachment.cgi?id=601&action=edit
> Rainer Schuetze does manual patching for data symbols. But this is
> hardcoded to only work for his phobos shared dll. The function it is
> done in is called dll_patchImportRelocations. If I understand DLLs
> correctly this should usually be done by the import library that is
> created by the compiler for a shared dll. Maybe Rainer can shed some more
> light on this.

The import library can only help with function calls by providing the call target and creating an indirect jump through the import table to the actual function in the other DLL.
Data accesses need another indirection through the import table in the code if they want to access the actual data in another DLL. This indirection is not generated by the compiler. That's why a pass is made to patch all relocations into the import table to their respective targets (which also eliminates the call indirections). It also has the benefit of being able to use the same object files for static or dynamic linking.
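As a rough illustration of that patching pass, here is a simplified model in Python. It is not the code from the patch: the function name patch_import_relocations, the table layout and all addresses are invented, and real relocation records carry more information than a single word.

```python
# Simplified model of patching relocations that point into the import table.
# Addresses, layout and names are invented for illustration only.

# Import table of the client image: slot address -> address of the symbol
# in the other DLL, as filled in by the OS loader.
import_table = {
    0x5000: 0x70001000,   # slot for a data symbol
    0x5008: 0x70002000,   # slot for a function
}

# Relocated words in the client image: location -> value stored there.
# Without an extra indirection in the generated code, an imported symbol's
# word holds the address of the import *slot*, not of the target itself.
image = {
    0x4010: 0x5000,   # data access through an import slot
    0x4020: 0x5008,   # call through an import slot
    0x4030: 0x4100,   # ordinary relocation inside the image, left alone
}


def patch_import_relocations(image, import_table):
    """Redirect every relocated word that points at an import slot to the
    slot's target. Returns the number of words that were patched."""
    patched = 0
    for location, value in list(image.items()):
        if value in import_table:
            image[location] = import_table[value]
            patched += 1
    return patched


patched = patch_import_relocations(image, import_table)
```

After the pass, both the data access and the call refer to their targets directly, and the relocation that never touched the import table is unchanged.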

The hardcoding of the DLL name was meant for testing purposes. What's needed is a method to figure out whether the target DLL is written in D and that data relocations are actually wrong. That would support sharing data between multiple DLLs as well (it currently only allows sharing objects created in the D runtime).


Just to make it clear: I distinguish 2 kinds of DLLs written in D:

1. A DLL that contains the statically linked D runtime and interfaces to other DLLs and the application without sharing ownership (as if all other DLLs are written in C). This works pretty well on Windows (Visual D is such a DLL).

2. A DLL that shares ownership of memory, objects, threads, etc. with the executable and other DLLs if they are also written in D. This is realized by placing the D runtime into its own DLL that is implicitly loaded with the other binary. (In contrast to some rumors that I remember that on posix systems the runtime would be linked into the application image.) This is what the patches in the bugzilla entry implement.

January 09, 2013
On 1/9/2013 1:23 PM, Mehrdad wrote:
> And are you considering reference counting to be garbage collection like Walter
> does,

It's not something I came up with. It is generally accepted that ref counting is a form of garbage collection.
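A minimal sketch of why reference counting fits under that umbrella, and of the circular-reference case raised earlier in the thread. This is a toy model in Python with hypothetical names (RcHeap, new, add_ref, release), not a description of any real runtime:

```python
# Toy reference-counted heap. Objects are reclaimed automatically when
# their count reaches zero. All names are hypothetical.

class RcHeap:
    def __init__(self):
        self.counts = {}   # object id -> reference count
        self.fields = {}   # object id -> ids of the objects it refers to
        self.next_id = 0

    def new(self):
        oid = self.next_id
        self.next_id += 1
        self.counts[oid] = 1      # the creator holds one reference
        self.fields[oid] = []
        return oid

    def add_ref(self, owner, target):
        # owner now holds a reference to target
        self.fields[owner].append(target)
        self.counts[target] += 1

    def release(self, oid):
        self.counts[oid] -= 1
        if self.counts[oid] == 0:
            # Reclaim the object and drop the references it held.
            children = self.fields.pop(oid)
            del self.counts[oid]
            for child in children:
                self.release(child)

    def live(self):
        return set(self.counts)


heap = RcHeap()

# A chain a -> b is reclaimed without any explicit free of b.
a, b = heap.new(), heap.new()
heap.add_ref(a, b)
heap.release(b)
heap.release(a)
after_chain = heap.live()

# A cycle x <-> y keeps itself alive after the program lets go of it.
x, y = heap.new(), heap.new()
heap.add_ref(x, y)
heap.add_ref(y, x)
heap.release(x)
heap.release(y)
after_cycle = heap.live()
```

The chain shows the automatic reclamation that makes it a form of garbage collection. The cycle shows the well-known gap that plain counting leaves and that tracing collectors do not have.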
January 09, 2013
On 1/9/2013 1:11 PM, H. S. Teoh wrote:
> Walter's definition of memory-safe

It is not *my* definition. It is a common programming term with a generally accepted definition.