February 05, 2016
On Friday, 5 February 2016 at 00:56:16 UTC, Chris Wright wrote:
> True. That works if this is baked into your compiler, or if your compiler has plugin support. And you'd have to compile with this plugin or the relevant options turned on by default in order for you not to duplicate work.

On Friday, 5 February 2016 at 00:56:28 UTC, Ola Fosheim Grøstad wrote:
> Not sure what you mean by adding a warning. You can probably find sanitizers that do it, but the standard does not require warnings for anything (AFAIK). That is up to compiler vendors.

Quoting myself (emphasis added):

On Thursday, 4 February 2016 at 22:57:00 UTC, tsbockman wrote:
> Actually, I'm surprised that this works even in C - I would have expected at least a COMPILER (or linker?) warning; this seems like it should be easy to detect automatically.

All along I have been saying this is something that *compilers* should warn about. As far as I can recall, I never suggested using linters or sanitizers, changing the C standard, or even using compiler plugins.

(I did suggest the linker as an alternative, but you all have already explained why that can't work for C.)
February 05, 2016
On Thursday, 4 February 2016 at 22:57:00 UTC, tsbockman wrote:
> The first place entry is particularly ridiculous; is there any modern language that would make it so easy to commit such an awful "mistake"?

D allows that. This is why I recommend putting `static assert(foo.sizeof == expectation);` in code that interfaces with external things, like C code, or D .di stuff.
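
For the C side of such an interface, a similar guard can be sketched with C11's `_Static_assert`. The struct `sample` and the expected numbers below are hypothetical and assume a 4-byte `float`; this is only an illustration of the idea, not code from the contest entry.

```c
#include <stddef.h> /* offsetof, size_t */

/* Hypothetical struct shared with external code. */
struct sample {
    float a;
    float b;
};

/* Compilation stops here if the layout drifts from what the other
   side of the interface expects. The numbers assume a 4-byte float. */
_Static_assert(sizeof(struct sample) == 8,
               "struct sample: unexpected size");
_Static_assert(offsetof(struct sample, b) == 4,
               "struct sample: unexpected offset of b");

/* Runtime mirror of the size check, so the value can be logged. */
size_t sample_size(void)
{
    return sizeof(struct sample);
}
```

The check costs nothing at run time; it only documents, in a compiler-enforced way, what the author believed the layout to be.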

#include <math.h> /* sqrt */

that line is an interesting one too: the trick is depending on namespace pollution by the include. In D, you might write `import core.stdc.math : sqrt;` and make that misleading comment part of the code.... though then you could perhaps exploit that module bug (314?).

February 05, 2016
On Friday, 5 February 2016 at 01:14:05 UTC, Adam D. Ruppe wrote:
> D allows that. This is why I recommend putting `static assert(foo.sizeof == expectation);` in code that interfaces with external things, like C code, or D .di stuff.
>
> #include <math.h> /* sqrt */

D *doesn't* allow that though - at least, not in a monolithic, idiomatic D program: there wouldn't be any duplicate declaration of `spectral_contrast()` to mess up.

Yes, you can force the matter using `extern(C)` like anonymous demonstrated earlier - but using `extern(C)` for internal linkage in an all-D program would certainly attract scrutiny from reviewers; it would score poorly on the "underhanded-ness" test.

As to the ".di" stuff - I've not used them. Care to educate me? How can they cause similar problems?

> that line is an interesting one too: the trick is depending on namespace pollution by the include. In D, you might write `import core.stdc.math : sqrt;` and make that misleading comment part of the code.... though then you could perhaps exploit that module bug (314?).

314 definitely has potential. Should we start an "Underhanded D" contest? Sounds like bad marketing, but a lot of fun :-P
February 04, 2016
On 2/4/2016 3:10 PM, H. S. Teoh via Digitalmars-d wrote:
> The C preprocessor accepts all sorts of nasty, nonsensical things.

The preprocessor makes C++ into an inherently unreliable, unsafe programming language. I've talked to some C++ committee members about this, asking why there is no push to get rid of (or at least deprecate) all use of the preprocessor. The general reaction I get is that it is unimportant to do so.

February 05, 2016
On Fri, 05 Feb 2016 01:10:53 +0000, tsbockman wrote:

> All along I have been saying this is something that *compilers* should warn about.

The compiler doesn't have all the information you need. You could add it to the build system or the linker as well as the compiler. Adding it to the linker is almost identical to my previous suggestion of adding optional name mangling to C.
February 05, 2016
On Friday, 5 February 2016 at 03:46:37 UTC, Chris Wright wrote:
> The compiler doesn't have all the information you need. You could add it to the build system or the linker as well as the compiler. Adding it to the linker is almost identical to my previous suggestion of adding optional name mangling to C.

What information, specifically, is the compiler missing?

The compiler already computes the name and type signature of each function. As far as I can see, all that is necessary is to:

1) Insert that information (together with what file and line number it came from) into a big list in a temporary file.
2) After all modules have been compiled, go back and sort the list by function name.
3) Finally, scan the list for entries that share the same name, but have incompatible type signatures. Emit warning messages as needed. (The compiler should be used for this step, because it already has a lot of information about C's type system built into it that can help define "incompatible" sensibly.)
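
The three steps above can be sketched roughly as follows, in C. This is a minimal illustration under loud assumptions: `sig_record` and `check_signatures` are hypothetical names, the list lives in memory rather than in a temporary file, and "incompatible" is reduced to a plain string comparison, whereas a real compiler would compare types properly.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Step 1: one record per declaration seen during compilation.
   The signature is plain text here (an assumption for brevity). */
struct sig_record {
    const char *name;      /* function name */
    const char *signature; /* e.g. "double(double)" */
    const char *file;      /* where the declaration came from */
    int line;
};

static int by_name(const void *a, const void *b)
{
    const struct sig_record *ra = a;
    const struct sig_record *rb = b;
    return strcmp(ra->name, rb->name);
}

/* Steps 2 and 3: sort by name, then compare every record in a name
   group against the first record of that group. Prints one warning
   per disagreeing record and returns the number of conflicting names. */
size_t check_signatures(struct sig_record *recs, size_t n)
{
    size_t conflicts = 0;
    size_t i = 0;

    qsort(recs, n, sizeof recs[0], by_name);

    while (i < n) {
        size_t j = i + 1;
        int conflict = 0;
        while (j < n && strcmp(recs[j].name, recs[i].name) == 0) {
            if (strcmp(recs[j].signature, recs[i].signature) != 0) {
                fprintf(stderr,
                        "warning: '%s' is %s at %s:%d but %s at %s:%d\n",
                        recs[i].name,
                        recs[i].signature, recs[i].file, recs[i].line,
                        recs[j].signature, recs[j].file, recs[j].line);
                conflict = 1;
            }
            j++;
        }
        conflicts += (size_t)conflict;
        i = j; /* jump to the next name group */
    }
    return conflicts;
}
```

Sorting first keeps the scan linear: only records that share a name ever need to be compared with each other.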

As far as I can see, this requires an extra pass, but no additional information. What am I missing?
February 05, 2016
On Friday, 5 February 2016 at 01:33:14 UTC, tsbockman wrote:
> As to the ".di" stuff - I've not used them. Care to educate me? How can they cause similar problems?

Well, technically, a .di file is just a .d file renamed, but it tends to have the bodies stripped out. Separate compilation is a supported feature of D.

The way you'd do it is something like this:

struct Foo {
   float a;
   float b;
}

void bar(Foo* f) {
   f.b = whatever;
}


Then compile it with -lib and make a "header" file manually:

struct Foo {
   double a;
   double b;
}
void bar(Foo*);


You can now create D modules that import this and link against the compiled library. Very similar to C's model...

But I redefined Foo! The name mangling won't catch this. bar will be mangled to take `Foo` as an argument and the linker will catch if we change that, but it doesn't know what Foo actually is.

By changing that, we introduce the problem.
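
To make the consequence concrete, here is a minimal C sketch of the same mismatch. The two declarations get distinct names (`foo_lib` and `foo_hdr`, both hypothetical) only so they can sit in one file; in the separately compiled case they would both be called `Foo`, which is exactly why nothing notices the difference.

```c
#include <stddef.h> /* offsetof, size_t */

/* Layout the library was actually compiled with. */
struct foo_lib {
    float a;
    float b;
};

/* Layout the hand-written "header" claims. */
struct foo_hdr {
    double a;
    double b;
};

/* Returns 1 when the two declarations disagree on the total size
   or on where the field `b` lives, 0 otherwise. */
int foo_layout_mismatch(void)
{
    return sizeof(struct foo_lib) != sizeof(struct foo_hdr)
        || offsetof(struct foo_lib, b) != offsetof(struct foo_hdr, b);
}
```

On a typical platform with a 4-byte `float` and an 8-byte `double`, the caller allocates more bytes than the library touches and the library writes `b` at an offset that lands inside the caller's `a`.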

> 314 definitely has potential. Should we start an "Underhanded D" contest? Sounds like bad marketing, but a lot of fun :-P

it might be :)
February 05, 2016
On Friday, 5 February 2016 at 04:25:09 UTC, Adam D. Ruppe wrote:
> On Friday, 5 February 2016 at 01:33:14 UTC, tsbockman wrote:
>> As to the ".di" stuff - I've not used them. Care to educate me? How can they cause similar problems?
>
> Well, technically, a .di file is just a .d file renamed, but it tends to have the bodies stripped out. Separate compilation is a supported feature of D.
>
> The way you'd do it is something like this:
>
> struct Foo {
>    float a;
>    float b;
> }
>
> void bar(Foo* f) {
>    f.b = whatever;
> }
>
>
> Then compile it with -lib and make a "header" file manually:
>
> struct Foo {
>    double a;
>    double b;
> }
> void bar(Foo*);
>
>
> You can now create D modules that import this and link against the compiled library. Very similar to C's model...
>
> But I redefined Foo! The name mangling won't catch this. bar will be mangled to take `Foo` as an argument and the linker will catch if we change that, but it doesn't know what Foo actually is.
>
> By changing that, we introduce the problem.
>
>> 314 definitely has potential. Should we start an "Underhanded D" contest? Sounds like bad marketing, but a lot of fun :-P
>
> it might be :)

Thanks for the explanation. That does sound basically the same as the C issue.

Since .di files are normally generated automatically, this seems like an easily solvable problem:

1) When compiling a library and its attendant .di file(s), generate a unique version identifier (such as a UUID or a hash of the completed binary) and append it to both the library and each .di file.

2) Whenever someone tries to link against the library, verify that the version ID matches. If it does not, issue a prominent warning.
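
A rough sketch of those two steps, in C. Everything here is an assumption for illustration: `version_id` and `ids_match` are hypothetical names, FNV-1a is just a stand-in for "a UUID or a hash of the completed binary", and how the ID would actually be embedded in the library and the .di files is left open.

```c
#include <stddef.h>
#include <stdint.h>

/* Step 1: derive an identifier from some bytes (for example the
   finished library). FNV-1a, 64-bit, is a simple non-cryptographic
   hash chosen here only to keep the sketch short. */
uint64_t version_id(const unsigned char *data, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL; /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 0x100000001b3ULL;          /* FNV prime */
    }
    return h;
}

/* Step 2: at link time, compare the ID stamped into the library
   with the ID stamped into the .di file that was imported.
   Returns 1 if they agree, 0 if a warning should be issued. */
int ids_match(uint64_t lib_id, uint64_t di_id)
{
    return lib_id == di_id;
}
```

A real implementation would more likely want a cryptographic hash or a UUID, since FNV-1a makes no promises about collisions.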

Problem solved? Or is this harder than it looks?

(Of course there are various details to consider, such as how to efficiently share one set of .di files across many platforms/compiler settings; this is just a rough sketch.)
February 05, 2016
On Fri, 05 Feb 2016 04:02:41 +0000, tsbockman wrote:

> On Friday, 5 February 2016 at 03:46:37 UTC, Chris Wright wrote:
>> The compiler doesn't have all the information you need. You could add
>> it to the build system or the linker as well as the compiler. Adding it
>> to the linker is almost identical to my previous suggestion of adding
>> optional name mangling to C.
> 
> What information, specifically, is the compiler missing?

It doesn't know what targets I'm ultimately creating, and it doesn't know what files have been modified that I'm about to compile (but haven't compiled yet).

Example 1:

I compile one .c file referencing a function:
void foo(int);

That's going to end up in libfoo.so.

I compile another .c file in the same directory defining a function:
void foo(float);

That's going to end up in libbar.so.

No bug here. (The linker should tell us if someone depends on foo from libbar and foo from libfoo in the same executable.)

How does your putative compiler plugin handle it? Either I have to define a build rule for every source file to specify where to put this symbol cache (and you need to add parameters for the plugin to look for multiple caches, because libfoo and libbar share a lot of source files), or the plugin gives me false positives.

Example 2:

I compile a.c:
int foo(int i) { return i + 1; }

In the course of refactoring, I delete that function from a.c and add it to b.c with modifications:
int foo(int i, int increment) { return i + increment; }

My build script recompiles b.c before it recompiles a.c. Your compiler plugin produces a build error, halting my build. I have to make clean && make in order to proceed -- and that's assuming I know your tool doesn't work well with incremental compilation.

The first problem might be uncommon, but the second would crop up constantly. They have the same fix: collect the information when you compile, evaluate it when you link.
February 04, 2016
On Fri, Feb 05, 2016 at 12:14:11AM +0000, tsbockman via Digitalmars-d wrote: [...]
> This isn't even a particularly expensive (in compile-time costs) check to perform anyway; all that is necessary is to store a temporary table of symbol signatures somewhere (it doesn't need to be in RAM), and check that any duplicate entries are consistent with each other before linking.

That's a lot more expensive than you think. There's a reason most modern linkers do not do full cross-referencing of symbols -- because doing so would be excruciatingly slow and consume gobs of memory. Even a 32GB machine would not be able to hold *all* the symbols in some very large software projects, and looking things up on disk is unacceptably slow for software of those sizes. Most modern linkers instead use faster algorithms that rely on clever scheduling of the order of symbol resolution, just so they *don't* have to cross-reference all symbols at once.

Besides, all this is unnecessary work. All you need to do is to have C compilers mangle function names.  Mission accomplished.

(However, this *will* break a lot of existing inter-language code that relies on being able to spell out symbols explicitly. So it probably will not fly.  But, in theory, it *is* possible...)
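
As a toy illustration of what mangling buys, here is a sketch in C. The scheme is entirely hypothetical (the name, then `__`, then one letter per parameter type: `i` for int, `f` for float, `d` for double) and does not correspond to any real ABI; `mangle` and `same_symbol` are made-up helpers.

```c
#include <stdio.h>
#include <string.h>

/* Builds "<name>__<param_codes>" in `out`.
   Returns 1 on success, 0 if the buffer was too small. */
int mangle(char *out, size_t out_size,
           const char *name, const char *param_codes)
{
    int n = snprintf(out, out_size, "%s__%s", name, param_codes);
    return n >= 0 && (size_t)n < out_size;
}

/* Two declarations resolve to the same linker symbol only if
   their mangled names are identical. */
int same_symbol(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

With this scheme `void foo(int)` becomes `foo__i` and `void foo(float)` becomes `foo__f`, so a mismatched declaration turns into an unresolved symbol at link time instead of silently binding to the wrong function.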

And to paraphrase one of my favorite Walter quotes: fixing inconsistent function signatures is only plugging one hole in a cheese grater. C has far more dangerous gotchas than just function signature mismatches.


T

-- 
They say that "guns don't kill people, people kill people." Well I think the gun helps. If you just stood there and yelled BANG, I don't think you'd kill too many people. -- Eddie Izzard, Dressed to Kill