October 30, 2012
On Tue, 30 Oct 2012, bearophile wrote:
> Andrei Alexandrescu:
> 
> > Why can't the linking code be built into dmd? I am baffled :o).
> 
> This is possible, but a better question is how much work is required to do this?
> 
> Walter was very slowly translating the current linker from disassembly to C. If and once that program is all C, it's probably not too hard to convert it to D, merge it with the dmd binary, and improve it in some ways.
> 
> Bye,
> bearophile

Built in?  Absolutely not.  There's no way that it's architecturally wise to have the linker as a part of the compiler binary.

Able to usefully interact with the linker?  Absolutely.

To be clear, I'm certain that Andrei was kidding / making a joke at Walter's expense.
October 30, 2012
On Tue, 30 Oct 2012, Jonathan M Davis wrote:
> On Tuesday, October 30, 2012 01:45:31 bearophile wrote:
> > Andrei Alexandrescu:
> > > Why can't the linking code be built into dmd? I am baffled :o).
> > 
> > This is possible, but a better question is how much work is required to do this?
> > 
> > Walter was very slowly translating the current linker from disassembly to C. If and once that program is all C, it's probably not too hard to convert it to D, merge it with the dmd binary, and improve it in some ways.
> 
> Depending, it should be fairly easy to just wrap the linker call and have dmd process its output and present something saner when there's an error. That could be a bit fragile though, since it would likely depend on the exact formatting of linker error messages. Better integration than that could be quite a bit more work.
> 
> I think that it's fairly clear that in the long run, we want something like this, but I don't know if it's worth doing right now or not.
> 
> - Jonathan M Davis

If someone wants to work on it, I'm sure no one would stop them.  In fact, someone has already handled one specific case.

But for the Top Men to engage on?  Almost certainly not.

I was aiming for "recognize that there's room for improvement" and "improvement is important for adoption", not "get working on it now".

--

If someone wanted to take on an ambitious task, one of the key problems with output munging is the parsability of the output (which applies to the compiler, the linker, and every other tool in the compiler chain).  Few (none?) of them output text that's designed for parsability (though some make it relatively easy).  It would be interesting to design a structured format and write scripts to sit between the various components to handle adapting the output.

Restated via an example:

today:
  compiler invokes tools and just passes on output

ideal (_an_ ideal, don't nitpick):
  compiler invokes tool which returns structured output and uses that

intermediate that's likely easier to achieve:
  compiler invokes script that invokes tool (passing args) and fixes output to match structured output

pro:
 + compiler only needs to understand one format
 + one script per tool (also a con, but on the pro side, each script is focused on what it needs to understand and care about)
 + no need to tear into each tool to restructure its i/o code

cons:
 - will likely force some form of lowest common denominator
 - more overhead due to extra parsing and processes

I used the term script, but don't read much into that, just implying that it's small and doesn't have to do much.
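
A rough sketch of the intermediate approach, for anyone tempted: one small adapter per tool turns raw output into a common record the compiler could consume. Everything here is an assumption for illustration: the `Diagnostic` struct, the `parseOptlink` name, and the regex, which is modelled only on the single OPTLINK error line quoted elsewhere in this thread and not on any documented output format.

```d
import std.conv : to;
import std.regex : matchFirst, regex;
import std.stdio : writeln;
import std.string : lineSplitter;

/// Hypothetical common record; not an existing dmd or linker format.
struct Diagnostic
{
    string tool;     // which tool reported the problem
    string severity; // "error" is the only level this sketch recognizes
    int code;        // tool-specific error number
    string message;  // remaining text of the line
}

/// Adapter for one tool: extracts lines shaped like " Error 42: <text>".
/// The pattern is an assumption based on a single sample line.
Diagnostic[] parseOptlink(string output)
{
    auto pattern = regex(`^\s*Error\s+(\d+):\s*(.*)$`);
    Diagnostic[] found;
    foreach (line; output.lineSplitter)
    {
        auto m = matchFirst(line, pattern);
        if (!m.empty)
            found ~= Diagnostic("optlink", "error", m[1].to!int, m[2]);
    }
    return found;
}

void main()
{
    auto raw = "testx.obj(testx)\n Error 42: Symbol Undefined _D5testx3fooFZv\n--- errorlevel 1\n";
    foreach (d; parseOptlink(raw))
        writeln(d.tool, " ", d.severity, " ", d.code, ": ", d.message);
}
```

Lines that don't match (the object file name, the errorlevel trailer) are simply dropped, which is where the "lowest common denominator" con shows up: anything the adapter doesn't understand never reaches the compiler.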

Now that I've written it up, it might actually be fun to do, but I've got too many in-flight projects as it is, so I'll resist starting on it.

Later,
Brad
October 30, 2012
Brad Roberts:

> To be clear, I'm certain that Andrei was kidding / making a joke at Walter's expense.

Oh, I see, I have missed the joke again, sorry :-)

Bye,
bearophile
October 30, 2012
On 10/29/2012 4:37 PM, Andrei Alexandrescu wrote:
> On 10/29/12 6:13 PM, Walter Bright wrote:
>> On 10/29/2012 3:11 PM, Brad Roberts wrote:
>>> It's friction. It needs to be reduced.
>>
>> Short of building the linking code into dmd, the options are fairly
>> limited.
>
> Why can't the linking code be built into dmd? I am baffled :o).

No need for :o), it's a fair question.

The linking process itself is pretty simple. The problems come from designers who can't resist making things as complicated as possible. Just look at the switches for the various linkers, and what they purport to do. Then, look at all the complicated file formats it deals with:

res files
def files
linker script files
dwarf
codeview
magic undocumented formats
pe files
shared libraries
eh formats

And that's just the start.
October 30, 2012
On 10/29/2012 6:43 PM, Brad Roberts wrote:
> To be clear, I'm certain that Andrei was kidding / making a joke at
> Walter's expense.


Andrei never jokes about programming :-)

October 30, 2012
On 10/29/12 10:57 PM, Walter Bright wrote:
> On 10/29/2012 6:43 PM, Brad Roberts wrote:
>> To be clear, I'm certain that Andrei was kidding / making a joke at
>> Walter's expense.
>
>
> Andrei never jokes about programming :-)

The question was fair. The "backatcha" baffling was one of my better jokes. Well, I have my moments.

Andrei

October 30, 2012
"Walter Bright" <newshound2@digitalmars.com> wrote in message news:k6mun3$a8h$1@digitalmars.com...
>
> The object file format does not support line numbers for symbol references and definitions. None of the 4 supported ones (OMF, ELF, Mach-O, MsCoff) have that. Even the symbolic debug info doesn't have line numbers for references, just for definitions.

While this is true, you could scan the relocations for matching symbols, then use the debug information to get line numbers.  This would work for all function calls at least.


October 30, 2012
On 10/29/2012 9:51 PM, Daniel Murphy wrote:
> "Walter Bright" <newshound2@digitalmars.com> wrote in message news:k6mun3$a8h$1@digitalmars.com...
>>
>> The object file format does not support line numbers for symbol references
>> and definitions. None of the 4 supported ones (OMF, ELF, Mach-O, MsCoff)
>> have that. Even the symbolic debug info doesn't have line numbers for
>> references, just for definitions.
>
> While this is true, you could scan the relocations for matching symbols,
> then use the debug information to get line numbers.  This would work for all
> function calls at least.


If the symbol is undefined, then there is no debug info for it.
October 30, 2012
"Walter Bright" <newshound2@digitalmars.com> wrote in message news:k6npgi$1hsr$1@digitalmars.com...
> On 10/29/2012 9:51 PM, Daniel Murphy wrote:
> > "Walter Bright" <newshound2@digitalmars.com> wrote in message news:k6mun3$a8h$1@digitalmars.com...
> >>
> >> The object file format does not support line numbers for symbol references
> >> and definitions. None of the 4 supported ones (OMF, ELF, Mach-O, MsCoff)
> >> have that. Even the symbolic debug info doesn't have line numbers for
> >> references, just for definitions.
> >
> > While this is true, you could scan the relocations for matching symbols,
> > then use the debug information to get line numbers.  This would work for all
> > function calls at least.
>
>
> If the symbol is undefined, then there is no debug info for it.

There will be debug information for the call site if it is in the user's program.

For example:

void foo();

void main()
{
   foo();
}

>dmd testx -g
DMD v2.061 DEBUG
OPTLINK (R) for Win32  Release 8.00.12
Copyright (C) Digital Mars 1989-2010  All rights reserved.
http://www.digitalmars.com/ctg/optlink.html
testx.obj(testx)
 Error 42: Symbol Undefined _D5testx3fooFZv
--- errorlevel 1

>objconv -dr testx.obj

Dump of file: testx.obj, type: OMF32
Checksums are zero

LEDATA, LIDATA, COMDAT and FIXUPP records:
  LEDATA: segment $$SYMBOLS, Offset 0x0, Size 0x4B
  FIXUPP:
   Direct farword 32+16 bit, Offset 0x30, group FLAT. Symbol __Dmain (T6), inline 0x0:0x0
  COMDAT: name , Offset 0x0, Size 0xD, Attrib 0x00, Align 0, Type 0, Base 0
  FIXUPP:
   Relatv 32 bit, Offset 0x4, group FLAT. Symbol _D5testx3fooFZv (T6), inline 0x1000E
  LEDATA: segment _DATA, Offset 0x0, Size 0xE
  LEDATA: segment FM, Offset 0x0, Size 0x4
  FIXUPP:
   Direct 32 bit, Offset 0x0, group FLAT. Segment _DATA (T4), inline 0x0
  LEDATA: segment $$TYPES, Offset 0x0, Size 0x16

The FIXUPP record gives Offset 0x4 for the reference to _D5testx3fooFZv, and the debug information for main will give the line number of that offset.

I wouldn't want to implement it in assembly though.
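
The lookup described above could be sketched roughly as follows. This is only an illustration: the `LineEntry` struct, the `lineForOffset` name, and the offsets in the table are made up for the example and were not read from testx.obj. Real code would first have to decode the line number records from the object file's debug info.

```d
import std.stdio : writeln;

/// Hypothetical decoded line-number entry: code from `offset` onward belongs to `line`.
struct LineEntry
{
    uint offset;
    uint line;
}

/// Returns the line of the last entry starting at or before `fixupOffset`,
/// or 0 if the offset precedes every entry. `table` must be sorted by offset.
uint lineForOffset(const(LineEntry)[] table, uint fixupOffset)
{
    uint line = 0;
    foreach (entry; table)
    {
        if (entry.offset > fixupOffset)
            break;
        line = entry.line;
    }
    return line;
}

void main()
{
    // Made-up table for main(); only the fixup offset 0x4 comes from the dump.
    auto table = [LineEntry(0x0, 3), LineEntry(0x3, 5), LineEntry(0x8, 6)];
    writeln(lineForOffset(table, 0x4));
}
```

This only locates the call site. It says nothing about where the undefined symbol was declared, since no debug info is emitted for a bare declaration.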


October 30, 2012
On 10/29/2012 11:08 PM, Daniel Murphy wrote:
> void foo();

There will be no line information for the above.

> void main()
> {
>    foo();

For this, yes, but that is not what is being asked for.