March 04, 2012
> A misbehaving plugin could easily corrupt your process. Destroying data
> is always much worse than crashing.

At this point I usually say memory corruption is not an option for type-safe languages, but D doesn't really provide runtime type-safety guarantees, or does it?

I think in the future (D 4.0 or something) we could seriously consider something like proof-carrying code to take memory/type safety to the next level. People interested in this will be aware of Google's effort in this direction, NaCl ( http://code.google.com/p/nativeclient/ ).
March 04, 2012
> 1. SEH isn't portable. There's no way to make it work under non-Windows systems.

OK, after some digging around it appears (prima facie) that Linux doesn't have anything close to SEH. I am aware of POSIX signals, but I am not sure whether they work for individual threads in a process. Last I checked, the whole process has to be hosed when you receive a segfault and there isn't much you can do about it. I am a Linux newbie, but I am almost seriously considering implementing SEH for Linux (in the kernel). Any Linux gurus here who think this is a good idea?
March 04, 2012
> If you're dealing with plugins from an unknown source, it's a good design to separate plugins and such as entirely separate processes. Then, when one goes down, it cannot bring down anyone else, since there is no shared address space.
>
> They can communicate with the OS-supplied interprocess communications API.

Yes, I think this is a good idea in general, but the process/IPC overhead can be substantial if you have a lot of (small) plugins. I think Google Chrome uses this trick (among others) to good effect in providing fault tolerance ( http://www.geekosystem.com/google-chrome-hacking-prize/ ).
March 05, 2012
On Saturday, 3 March 2012 at 02:51:41 UTC, Walter Bright wrote:
> Adding in software checks for null pointers will dramatically slow things down.

What about the debug/release difference? Isn't the point of debug mode to allow checks such as assert, RangeError, etc.? "Segmentation fault: 11" prevents memory from being corrupted, but it isn't helpful in locating a bug.
March 05, 2012
On 03/03/2012 02:06 PM, Walter Bright wrote:
> On 3/3/2012 2:13 AM, bearophile wrote:
>> Walter:
>>
>>> Adding in software checks for null pointers will dramatically slow
>>> things
>>> down.
>>
>> Define this use of "dramatically" in a more quantitative and objective
>> way,
>> please. Is Java "dramatically" slower than C++/D here?
>
> You can try it for yourself. Take some OOP code of yours, and insert a
> null check in front of every dereference of the class handle.

I have a hard time buying this as a valid reason to avoid inserting such checks.  I do think they should be optional, but they should be available, if not default, with optimizations for signal handlers and such taken in the cases where they apply.

Even if it slows my code down 4x, it'll be a huge win for me to avoid this stuff.  Because you know what pisses me off a helluva lot more than slightly slower code?  Spending hours trying to figure out what made my program say "Segmentation fault".  That's what.

I hate hate HATE vague error messages that don't help me.  I really want to emphasize how super dumb and counterproductive this is.

If I find that my code is too slow all of a sudden, then let me turn off the extra checks.  Otherwise, I expect my crashes to give me some indication of what happened.

This is reminding me that I can't do stuff like this:

import std.stdio;

class Bar
{
	int foo;
}

void main()
{
	Bar bar;
	try {
		bar.foo = 5;
	} catch ( Exception e ) {
		writefln("%s",e);
	}
}

DMD 2.057 on Gentoo Linux, compiled with "-g -debug".  It prints this:
Segmentation fault

Very frustrating!
(And totally NOT worth whatever optimization this buys me.)
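
A minimal sketch of the behaviour being asked for, done by hand with std.exception.enforce. The helper name setFoo is made up for illustration and is not something the compiler or runtime provides; the point is only that an explicit check turns the null into an ordinary Exception that the catch block above would see.

```d
import std.exception : enforce;
import std.stdio : writeln;

class Bar
{
    int foo;
}

// Hypothetical helper (not from the original post): writes to bar.foo,
// but turns a null reference into a catchable Exception first.
void setFoo(Bar bar, int value)
{
    enforce(bar !is null, "bar is null");
    bar.foo = value;
}

void main()
{
    Bar bar; // class references default to null
    try
    {
        setFoo(bar, 5);
    }
    catch (Exception e)
    {
        writeln(e.msg); // prints "bar is null"
    }
}
```

This is exactly the per-dereference check whose cost is being debated, so it only shows what the failure would look like, not that the check is cheap.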
March 05, 2012
On Monday, 5 March 2012 at 02:32:12 UTC, Chad J wrote:
> I hate hate HATE vague error messages that don't help me.

In a lot of cases, getting more info is very, very easy:

$ dmd -g -debug test9
$ ./test9
Segmentation fault
$ gdb ./test9
GNU gdb (GDB) 7.1
[...]
(gdb) r
Starting program: /home/me/test9
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x08067a57 in _Dmain () at test9.d:12
12                      bar.foo = 5;
(gdb) where
#0  0x08067a57 in _Dmain () at test9.d:12
#1  0x0806eaf8 in _D2rt6dmain24mainUiPPaZi7runMainMFZv ()
#2  0x0806e605 in _D2rt6dmain24mainUiPPaZi7tryExecMFMDFZvZv ()
#3  0x0806eb3f in _D2rt6dmain24mainUiPPaZi6runAllMFZv ()
#4  0x0806e605 in _D2rt6dmain24mainUiPPaZi7tryExecMFMDFZvZv ()
#5  0x0806e5b4 in main ()
(gdb) print bar
$1 = (struct test9.Bar *) 0x0

My gdb is out of the box unmodified; you don't need anything
special to get basic info like this.

There's two cases where null annoys me though:

1) if it is stored somewhere where it isn't supposed to be.
Then, the location of the dereference doesn't help - the
question is how it got there in the first place.

2) Segfaults in the middle of a web app, where running it under
the same conditions again in the debugger is a massive pain in
the butt.


I've trained myself to use assert (or functions with assert
in out contracts/invariants) a lot to counter these.
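
A rough sketch of that habit, assuming a made-up Registry container (not from this thread). The in contract rejects a null at the point where it would be stored, which is case 1 above, and the invariant re-checks the stored data around every public call. Both are built from assert, so they disappear with -release. Compilers of the 2.057 era spell the `do` keyword as `body`.

```d
import std.stdio : writeln;

class Bar
{
    int foo;
}

// Hypothetical container, used only to illustrate contracts and invariants.
class Registry
{
    private Bar[] items;

    // Checked before and after every public member function call
    // (when contracts are compiled in).
    invariant()
    {
        foreach (item; items)
            assert(item !is null, "null Bar stored in Registry");
    }

    // The in contract fails where the null would be stored,
    // not later where it would be dereferenced.
    void add(Bar item)
    in
    {
        assert(item !is null, "Registry.add called with null");
    }
    do
    {
        items ~= item;
    }

    size_t length()
    {
        return items.length;
    }
}

void main()
{
    auto r = new Registry;
    r.add(new Bar);
    writeln(r.length); // prints 1
}
```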
March 05, 2012
On Mon, Mar 05, 2012 at 03:43:15AM +0100, Adam D. Ruppe wrote: [...]
> There's two cases where null annoys me though:
> 
> 1) if it is stored somewhere where it isn't supposed to be. Then, the location of the dereference doesn't help - the question is how it got there in the first place.

And having the compiler insert explicit null checks doesn't help here either.


> 2) Segfaults in the middle of a web app, where running it under the same conditions again in the debugger is a massive pain in the butt.

I've come to the conclusion, after years of fighting to make the debugger work over the network to debug embedded apps, that fprintf is a lot less painful than using a debugger. (Yes, I heard that groan.) A well-placed fprintf can narrow down the location of the problem considerably. A nicely wrapped multiprocess-safe fprintf that appends to a debug file, complete with getpid() information, is even better. Especially as a debug library optionally linked into the app. :-) The only downside is that if your app takes a long time to build (or takes too much effort to install), then a debugger is the better ticket.
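
Something along these lines, sketched in D rather than C. The function name debugLog and the log format are invented for this example. It uses std.stdio.File in append mode, std.process.thisProcessID in place of getpid(), and an advisory file lock so that lines from several processes don't interleave.

```d
import std.process : thisProcessID;
import std.stdio : File;

// Hypothetical helper: append one line to a shared debug log, tagged with
// the process ID and the call site.
void debugLog(string path, string msg,
              string file = __FILE__, size_t line = __LINE__)
{
    auto f = File(path, "a"); // append mode; creates the file if needed
    f.lock();                 // advisory lock held while we write
    scope (exit) f.unlock();
    f.writefln("[pid %s] %s(%s): %s", thisProcessID, file, line, msg);
    f.flush();                // push the line out before releasing the lock
}

void main()
{
    debugLog("debug.log", "reached the suspicious branch");
}
```

The default arguments pick up the caller's file and line, so each log entry points back at the statement that produced it.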


[...]
> I've trained myself to use assert (or functions with assert in out contracts/invariants) a lot to counter these.

Yeah, asserts and DbC are extremely useful for detecting the problem at its source, rather than who knows how much later down the road, when practically all traces of the source are gone. Thing is, you have to do this consistently, everywhere in your code. And so does everyone else on the project. Leave out one place, and it will be that very place that eventually causes problems. Murphy's law at work. :-)


T

-- 
Having a smoking section in a restaurant is like having a peeing section in a swimming pool. -- Edward Burr
March 05, 2012
On 03/04/2012 09:43 PM, Adam D. Ruppe wrote:
> On Monday, 5 March 2012 at 02:32:12 UTC, Chad J wrote:
>> I hate hate HATE vague error messages that don't help me.
>
> In a lot of cases, getting more info is very, very easy:
>
> $ dmd -g -debug test9
> $ ./test9
> Segmentation fault
> $ gdb ./test9
> GNU gdb (GDB) 7.1
> [...]
> (gdb) r
> Starting program: /home/me/test9
> [Thread debugging using libthread_db enabled]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x08067a57 in _Dmain () at test9.d:12
> 12 bar.foo = 5;
> (gdb) where
> #0 0x08067a57 in _Dmain () at test9.d:12
> #1 0x0806eaf8 in _D2rt6dmain24mainUiPPaZi7runMainMFZv ()
> #2 0x0806e605 in _D2rt6dmain24mainUiPPaZi7tryExecMFMDFZvZv ()
> #3 0x0806eb3f in _D2rt6dmain24mainUiPPaZi6runAllMFZv ()
> #4 0x0806e605 in _D2rt6dmain24mainUiPPaZi7tryExecMFMDFZvZv ()
> #5 0x0806e5b4 in main ()
> (gdb) print bar
> $1 = (struct test9.Bar *) 0x0
>
>
>
> My gdb is out of the box unmodified; you don't need anything
> special to get basic info like this.
>

News to me.  I've had bad runs with that back in the day, but maybe things have improved a bit.

>
>
>
> There's two cases where null annoys me though:
>
> 1) if it is stored somewhere where it isn't supposed to be.
> Then, the location of the dereference doesn't help - the
> question is how it got there in the first place.
>

True, but that's a different problem space to me.  Non-nullable types would be really cool right about now.

> 2) Segfaults in the middle of a web app, where running it under
> the same conditions again in the debugger is a massive pain in
> the butt.
>

THIS.  This is why I expect what I expect.  It's not web apps in my case.  It's that I simply cannot expect users to run my code in a debugger.  That is just /not acceptable/.

>
> I've trained myself to use assert (or functions with assert
> in out contracts/invariants) a lot to counter these.

*quiver*
It's not that I don't like assertions, contracts, or invariants.  These are very cool.  The problem is that they don't help me when I missed a spot and didn't use assertions, contracts, or invariants.  Back to spending a bunch of time inserting writefln statements to do something that I should be able to accomplish with my eyeballs and a stack trace pretty much instantaneously.
March 05, 2012
On Monday, 5 March 2012 at 03:24:32 UTC, Chad J wrote:
> News to me.  I've had bad runs with that back in the day, but maybe things have improved a bit.

Strangely, I've never had a problem with gdb and D,
as far back as 2007.
(at least for the basic stack trace kind of stuff).

But, yeah, they've been improving a lot of things
recently too.

> Non-nullable types would be really cool right about now.

Huh, I thought there was one in phobos by now.

You could spin your own with something like this:

struct NotNull(T) {
  T t;
  alias t this;
  @disable this();
  @disable this(typeof(null));
  this(T value) {
     assert(value !is null);
     t = value;
  }

  @disable typeof(this) opAssign(typeof(null));
  typeof(this) opAssign(T rhs) {
      assert(rhs !is null);
      t = rhs;
      return this;
  }
}


This will catch usages of the null literal at
compile time, and other null references at runtime
as soon as you try to use it.

With the disabled default constructor, you are forced
to provide an initializer when you use it, so no
accidental null will slip in.

The alias this means NotNull!T is substitutable for T,
so you can drop it into existing apis.
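
A small usage sketch, untested against any particular compiler release. The struct is repeated so the example stands alone; Bar and readFoo are placeholder names for an existing class and an existing API that knows nothing about NotNull.

```d
import std.stdio : writeln;

class Bar
{
    int foo;
}

// The NotNull wrapper from above, repeated so this compiles on its own.
struct NotNull(T)
{
    T t;
    alias t this;
    @disable this();
    @disable this(typeof(null));
    this(T value)
    {
        assert(value !is null);
        t = value;
    }

    @disable typeof(this) opAssign(typeof(null));
    typeof(this) opAssign(T rhs)
    {
        assert(rhs !is null);
        t = rhs;
        return this;
    }
}

// Placeholder for an existing API that takes a plain Bar.
int readFoo(Bar b)
{
    return b.foo;
}

// Both misuses are rejected at compile time.
static assert(!__traits(compiles, NotNull!Bar(null)));
static assert(!__traits(compiles, { NotNull!Bar b; }));

void main()
{
    auto bar = NotNull!Bar(new Bar);
    bar.foo = 5;           // member access is forwarded through alias this
    writeln(readFoo(bar)); // implicit conversion to Bar; prints 5
}
```

One caveat: because t is a public field, writing bar.t = null directly still bypasses the checks, so this is a convention aid rather than a guarantee.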

> It's that I simply cannot expect users to run my code in a debugger.

:) I'm lucky if I can get more from my users than
"the site doesn't work"!

> The problem is that they don't help me when I missed a spot and didn't use assertions, contracts, or invariants.

Aye, I've had it happen. The not null types might help,
though tbh I've never used anything like this in practice
so maybe not. I don't really know.
March 05, 2012
On 03/04/2012 11:39 PM, Adam D. Ruppe wrote:
> On Monday, 5 March 2012 at 03:24:32 UTC, Chad J wrote:
>> News to me. I've had bad runs with that back in the day, but maybe
>> things have improved a bit.
>
> Strangely, I've never had a problem with gdb and D,
> as far back as 2007.
> (at least for the basic stack trace kind of stuff).
>
> But, yeah, they've been improving a lot of things
> recently too.
>
>> Non-nullable types would be really cool right about now.
>
> Huh, I thought there was one in phobos by now.
>
> You could spin your own with something like this:
>
> struct NotNull(T) {
>   T t;
>   alias t this;
>   @disable this();
>   @disable this(typeof(null));
>   this(T value) {
>      assert(value !is null);
>      t = value;
>   }
>
>   @disable typeof(this) opAssign(typeof(null));
>   typeof(this) opAssign(T rhs) {
>       assert(rhs !is null);
>       t = rhs;
>       return this;
>   }
> }
>
>
> This will catch usages of the null literal at
> compile time, and other null references at runtime
> as soon as you try to use it.
>
> With the disabled default constructor, you are forced
> to provide an initializer when you use it, so no
> accidental null will slip in.
>
> The alias this means NotNull!T is substitutable for T,
> so you can drop it into existing apis.
>

That's cool.  Maybe someone should stick it in Phobos?  I haven't had time to try it yet though.  I also didn't know about @disable; that's a nifty addition.

>> It's that I simply cannot expect users to run my code in a debugger.
>
> :) I'm lucky if I can get more from my users than
> "the site doesn't work"!
>

Ugh!

This sort of thing has happened in non-web code at work.  This is on an old OpenVMS system with a DIBOL derivative language and people accessing it from character-based terminals.  Once I finally got the damn system capable of broadcasting emails reliably (!!) and without using disk I/O (!!), I started having it send me stack traces of things before it dies.  The only thing left that's really annoying about this is that I still have no way of determining whether an exception is going to be caught or not before I send out the email, so I can't use it in cases where things are expected to throw sometimes (e.g. end-of-file exception, key-not-found exception).  So I can only do this effectively for errors that are pretty much guaranteed to be bad news.

I hope Phobos will have (or already has) the ability to print stack traces without crashing from an exception.  There are (surprisingly frequent) times when something abnormal happens and I want to know why, but it is safe to continue running the program and the last thing I want to do is crash on the user.  In those cases it is very useful for me to grab a stack trace and send it to myself in an email.
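
A hedged sketch of one way this might be done: druntime (core.runtime, not Phobos proper) exposes defaultTraceHandler, which returns trace information for the current call stack without anything being thrown. The wrapper name currentStackTrace is invented here. Whether the result is non-null, and whether frames carry symbol names and line numbers, depends on the platform and on building with -g.

```d
import core.runtime : defaultTraceHandler;
import std.stdio : write;

// Hypothetical helper: capture the current call stack as text, no throw needed.
string currentStackTrace()
{
    auto trace = defaultTraceHandler(null);
    if (trace is null)
        return "(no stack trace available on this platform)\n";
    return trace.toString() ~ "\n";
}

void main()
{
    // Something odd happened, but it is safe to carry on:
    // record where we are and keep running.
    write(currentStackTrace());
}
```

The returned string could just as well be handed to whatever sends the email instead of being written to stdout.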

I can definitely see web stuff being a lot less cut-and-dried than this though, and also having a lot of blind spots in technologies that you can't control very easily.

>> The problem is that they don't help me when I missed a spot and didn't
>> use assertions, contracts, or invariants.
>
> Aye, I've had it happen. The not null types might help,
> though tbh I've never used anything like this in practice
> so maybe not. I don't really know.