Thread overview
Separate compilation identifier mess and the bug trail that ensued
Jul 04, 2018
Atila Neves
Jul 25, 2018
kinke
Jul 26, 2018
Atila Neves
Jul 26, 2018
kinke
Jul 26, 2018
kinke
July 04, 2018
Once upon a time, one could not use __traits(getUnitTests) and separate compilation, and it begat https://issues.dlang.org/show_bug.cgi?id=16995. The problem was that there was a counter being used to name unittests, and it varied between running the compiler on all files at once (or in groups) versus compiling files separately.

Due to a suggestion in the PR, I emulated what was done for naming lambdas and delayed naming the unittest until semantic analysis. This begat https://issues.dlang.org/show_bug.cgi?id=18097 since now one could not longer take the address of a unittest function. Well, one could but it wouldn't link because it'd be the wrong name by the time semantic was done with it.

I went back and undid the semantic setting of the identifier and used the file name in the unittest name instead. That begat https://issues.dlang.org/show_bug.cgi?id=18111. The name of the file isn't the absolute path, and so depending on how the compiler is invoked, one still gets linker errors.

I realised the stupidity of my ways - D already mangles according to module, and those names are always the same no matter how one compiles. I went back to a simpler time, but without the counter, relying on line/column numbers instead. That begat https://issues.dlang.org/show_bug.cgi?id=18880. Using `static foreach` or mixing in unittests with strings meant that the different unittests shared line and column numbers. Oops.

In trying to fix https://issues.dlang.org/show_bug.cgi?id=18868 Johan got a comment about the bug mentioned above and added a fix for that as well. Which breaks __traits(getUnitTests) and separate compilation, since it uses a counter to disambiguate between identical identifiers. But counters can't and won't work, because identifiers don't have fully qualified names until semantic, and when they're created they might have the exact same identifier as a symbol in a different module. Those don't need to be disambiguated of course, but there's no way for the compiler to know ahead of time. Chaos ensues.

I've been racking my brain thinking about to solve this. I can't use file names or rely on the module name (after the symbol has a module attached to it), since that has caused bugs in the past and would again. The file name doesn't work unless I somehow always get the absolute path. Using the module information means that at different points of compilation symbols change names.

Does anyone have a good idea of a way out?
July 23, 2018
On Wednesday, 4 July 2018 at 22:15:36 UTC, Atila Neves wrote:
> Once upon a time, one could not use __traits(getUnitTests) and separate compilation, and it begat
>
> Does anyone have a good idea of a way out?

using a hash? (has to also include loop variable for mixin but otherwise should be reliable)
July 25, 2018
On Wednesday, 4 July 2018 at 22:15:36 UTC, Atila Neves wrote:
> Using `static foreach` or mixing in unittests with strings meant that the different unittests shared line and column numbers. Oops.

To make those corner cases unique and independent from compiler invokations, maybe a stack of module-line-column infos could be used instead of a single one. During semantic, each mixin would push its Loc, and each static foreach iteration would push an artificial one incl. iteration index.
July 26, 2018
On Wednesday, 25 July 2018 at 22:19:41 UTC, kinke wrote:
> On Wednesday, 4 July 2018 at 22:15:36 UTC, Atila Neves wrote:
>> Using `static foreach` or mixing in unittests with strings meant that the different unittests shared line and column numbers. Oops.
>
> To make those corner cases unique and independent from compiler invokations, maybe a stack of module-line-column infos could be used instead of a single one. During semantic, each mixin would push its Loc, and each static foreach iteration would push an artificial one incl. iteration index.

If I do it during semantic analysis it would be easy: keep a counter per module. But I can't, due to the bugs reported above.

Basically there are situations in which the symbol is used (like taking the address of a unittest) before semantic analysis, where the symbol name would change.

Atila
July 26, 2018
On Thursday, 26 July 2018 at 09:09:07 UTC, Atila Neves wrote:
> If I do it during semantic analysis it would be easy: keep a counter per module.

Ah, I thought you wanted to go further (identical names, independent of defined versions for compiler invokation etc.). I looked at a PR of yours, where you changed back from finalizing the identifier in semantic, assuming that it already was like that originally, but I now see that you moved it to semantic shortly before. ;)

So if it has to be done before semantic and we have no clue about the D module at that time, can't we use a counter per Loc (filename, line, column)? Only creating one if there's a unittest/invariant/... declaration to be uniquely named of course, to keep the number of counters reasonably small.
July 26, 2018
On Thursday, 26 July 2018 at 18:43:24 UTC, kinke wrote:
> On Thursday, 26 July 2018 at 09:09:07 UTC, Atila Neves wrote:
>> If I do it during semantic analysis it would be easy: keep a counter per module.
>
> Ah, I thought you wanted to go further (identical names, independent of defined versions for compiler invokation etc.). I looked at a PR of yours, where you changed back from finalizing the identifier in semantic, assuming that it already was like that originally, but I now see that you moved it to semantic shortly before. ;)
>
> So if it has to be done before semantic and we have no clue about the D module at that time, can't we use a counter per Loc (filename, line, column)? Only creating one if there's a unittest/invariant/... declaration to be uniquely named of course, to keep the number of counters reasonably small.

E.g., via a static `int[Loc]` map in `generateIdWithLoc()`, and replacing the current counter per <base identifier, line, column> by a counter per <filename, line, column>, thereby isolating each module via unique path (const char* pointer comparison is most likely enough => bitwise Loc comparison). [The identifier could optionally be used as further key component.]