August 14, 2005
In article <ddluoo$nm1$1@digitaldaemon.com>, John Reimer says...
>
>Jarrett Billingsley wrote:
>> Since my project nonagon isn't open-source (though that may change in the next few months), I distribute it as a lib with "headers" which just have declarations for all the classes and functions and whatnot.  In order to generate these "headers," I basically just have to remove all function bodies and replace them with semicolons.
>> 
>> The problem is that I did this manually at first, and it took me maybe 5-10 minutes to do it.  Now that nonagon has grown so much, it would probably take me close to an hour to replace all the function bodies with semicolons.
>> 
>> I wrote a "dumb" tool which basically counts the number of left and right braces, and based on the nesting level, it either outputs the text it reads or it doesn't.  The problem with this is that once I start adding things in like version blocks and nested classes, it'll start stripping things out that it's not supposed to.  It also doesn't strip out global-level function bodies.
>> 
>> Is there any kind of lexer/parser tool for D that will allow me to quickly strip out the function bodies?  Surely someone has run across this problem before.  Normally, I'd look into using the frontend, but, well, it's completely undocumented and written in C++.  And I really don't feel like spending a week and a half figuring out how to use the thing and then writing a tool.
>> 
>> Wasn't there some kind of D lexer tool written in D several months ago?  I can't find the NG thread..
>> 
>> 
>
>
>These kind of "strip" tools just amount to the same thing as C headers for a library.  D should and could do better.  Let's ditch the headers/import idea completely (aka "stripping") and create a tool (integrated into build perhaps?) that just reads the symbols directly from the *.lib file that the project links with (silent "stripping"). That way shipping the library itself would be the only thing necessary for closed projects.  This has been discussed before.  We don't really need more header files to mess with.  We're drifting back into the C/C++ age again if we go that route.
>
>-JJR

Great idea, and I think you're right, D really does need to do better in this area before it will be taken seriously by a lot of people as a 'commercial quality' tool (IMHO). Not saying they have the right attitude, but that's just the way it will be.

How about just integrating the 'library stripping tool' right into the reference compiler (and therefore the language spec.)? If an import couldn't be found in the import path, the compiler would 'strip' the libraries - and/or object files specified on the command line - for symbols?

After a quick glance at obj2asm output, the major challenge as-is looks to be member variable declarations and template code (enough info. for top-level function and variable declarations, classes, and structs looks to be available).
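
To illustrate why top-level names look recoverable: the symbols in the object files are mangled names, and the qualified-name part is a run of length-prefixed identifiers. Below is a minimal Python sketch, assuming symbols of the form `_D` followed by `<length><identifier>` pairs. The sample symbol and the helper name `qualified_name` are made up for illustration, and the type signature after the name is returned undecoded.

```python
def qualified_name(symbol):
    """Return (dotted name, undecoded remainder) for a D-style mangled symbol."""
    if not symbol.startswith("_D"):
        raise ValueError("not a D mangled symbol: %r" % symbol)
    pos = 2
    parts = []
    # Each identifier is stored as a decimal length followed by that many characters.
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])
        parts.append(symbol[end:end + length])
        pos = end + length
    # Whatever follows the name (the encoded type) is left for a real demangler.
    return ".".join(parts), symbol[pos:]


name, rest = qualified_name("_D7nonagon4mesh4loadFAaZv")
print(name)  # nonagon.mesh.load
print(rest)  # FAaZv
```

This only recovers module, class, and function names. It says nothing about member variables or template bodies, which is the gap mentioned above.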

- Dave


August 14, 2005
> D should and could do better.  Let's ditch the headers/import idea completely (aka "stripping") and create a tool (integrated into build perhaps?) that just reads the symbols directly from the *.lib file that the project links with (silent "stripping"). That way shipping the library itself would be the only thing necessary for closed projects.  This has been discussed before.  We don't really need more header files to mess with.  We're drifting back into the C/C++ age again if we go that route.

The problem with that is the current lib files are very low level and dangerous. To distribute a lib file you have to force users to use the same compiler version and any other libs you use have to be for that same version. Say you want to use libs from 2 different sources and both rely on a different compiler version. Well, now you're pretty much screwed. Even if they successfully link to a program, there still can be hidden problems, such as access violations because there was supposed to be a pointer to something at a certain location but now it's somewhere else. If D could have a new form of lib file that knew more about the language and didn't depend on compiler version as much, hopefully not even depend on compiler brand (implementation) much, and possibly use an intermediate code form, it would be a lot safer and better in many respects.
August 15, 2005
In article <op.svitpec7l2lsvj@esi>, Vathix says...
>
>> D should and could do better.  Let's ditch the headers/import idea completely (aka "stripping") and create a tool (integrated into build perhaps?) that just reads the symbols directly from the *.lib file that the project links with (silent "stripping"). That way shipping the library itself would be the only thing necessary for closed projects. This has been discussed before.  We don't really need more header files to mess with.  We're drifting back into the C/C++ age again if we go that route.
>
>The problem with that is the current lib files are very low level and dangerous. To distribute a lib file you have to force users to use the same compiler version and any other libs you use have to be for that same version. Say you want to use libs from 2 different sources and both rely on a different compiler version. Well, now you're pretty much screwed. Even if they successfully link to a program, there still can be hidden problems, such as access violations because there was supposed to be a pointer to something at a certain location but now it's somewhere else.

It's funny that you brought this up.

I'm working on a runtime loader/linker for DMD's OMF .obj files.  For two days' work, I now have a crude OMF parser that can digest most of DMD's output:

http://www.dsource.org/forums/viewtopic.php?p=5691#5691

It's not useful yet, but it does shed some light on what can be accomplished via this route.
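
For readers who have not looked inside an OMF file, the first step of any such parser is walking the record stream. The following Python sketch is not the parser from the link above. It assumes the usual OMF framing (a 1-byte record type, a 2-byte little-endian length, then the contents ending in a checksum byte), and the function name `iter_omf_records` and the hand-built sample bytes are illustrative only. It does not verify checksums or handle the page padding used in .lib files.

```python
import struct


def iter_omf_records(data):
    """Yield (record_type, payload) for each record in an OMF byte string."""
    pos = 0
    while pos + 3 <= len(data):
        # Record header: 1-byte type, then 2-byte little-endian length.
        rec_type, length = struct.unpack_from("<BH", data, pos)
        body = data[pos + 3:pos + 3 + length]
        if length == 0 or len(body) < length:
            raise ValueError("truncated record at offset %d" % pos)
        # The length counts the trailing checksum byte, which is dropped here.
        yield rec_type, body[:-1]
        pos += 3 + length


# Hand-built sample: a THEADR (0x80) naming "a.d", then a MODEND (0x8A).
sample = (bytes([0x80, 0x05, 0x00, 0x03]) + b"a.d" + b"\x00"
          + bytes([0x8A, 0x02, 0x00, 0x00, 0x00]))
for rec_type, payload in iter_omf_records(sample):
    print(hex(rec_type), payload)
```

Everything interesting (names, public definitions, fixups) lives inside the payloads, so this framing loop is only the easy part.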

The big problem you mention, having to wield fixup data and RVAs, is certainly not a trivial task.  It happens to be where I'm presently spending most of my time on my OMF loader; it'll take a week or more before I can really work the major kinks out.  However, all the needed information *is* in any given object format, and can be made to work. ;)

You're right about compiler dependencies and such, as there is a huge amount of ground to cover to be inclusive to everyone.  Just look at the mess GNU Binutils is (which is by no means complete) and you'll see just how bad it is out there: 400+ files and I'm *still* stuck writing a custom loader.  BFD indeed.

However, the advantages of runtime linking and library introspection are obvious, as has been proven by .NET and Java.  Even if you're looking to just merely write a tool to create headers on the fly,

>If D could have a new form of lib file that knew more about the language and didn't depend on compiler version as much, hopefully not even depend on compiler brand (implementation) much, and possibly use an intermediate code form, it would be a lot safer and better in many respects.

I agree completely.

If we're all seriously looking for an inclusive and hassle-free binary format, then ELF will work for just about anything you want to do.  It seems easier to parse (anything has to be easier than OMF) and can be made to operate the same way on Windows as well as on Linux (and probably Apple once they go x86).

http://www.skyfree.org/linux/references/ELF_Format.pdf

I'm looking to support this as well as OMF for my project.
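
As a rough illustration of why ELF is pleasant to parse, here is a Python sketch that decodes the first fields of an ELF header: the magic bytes, the class and byte-order bytes, and the `e_type` and `e_machine` fields at offsets 16 and 18. The function name `read_elf_ident` and the 20-byte sample are made up for the example, and a real reader would continue on to the section header table.

```python
import struct


def read_elf_ident(data):
    """Decode the first fields of an ELF header from raw bytes."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    # EI_DATA: 1 = little-endian, 2 = big-endian; it governs the fields below.
    endian = "<" if ei_data == 1 else ">"
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    return {
        "bits": {1: 32, 2: 64}.get(ei_class),
        "little_endian": ei_data == 1,
        "relocatable": e_type == 1,  # ET_REL, i.e. an object file
        "machine": e_machine,        # 3 is EM_386
    }


# Minimal fake header: 32-bit, little-endian, relocatable, x86.
header = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 1, 3)
print(read_elf_ident(header))
```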

It's supported under GCC, MinGW and probably a whole mess of others.  If that doesn't float your boat, then one can easily write a converter between COFF and ELF (if there isn't one out there already) by way of GNU Binutils.

- EricAnderton at yahoo
August 15, 2005
Heh... Eric, I was just about to mention you on this one.  Good timing! ;-)

The direction you are taking IS the solution to many of these issues.

Vathix, good points with regard to the questionable safety of parsing for symbols in the current lib/object formats.  But I think the problem is solvable with enough careful planning.  Finding a solution, at least, would be well worth the effort for D's sake.  Otherwise D stripped imports are no better than C headers: D cannot claim any superiority in this area at present (as Walter has, at times, been prone to do).

-JJR

August 15, 2005
Vathix wrote:
>> D should and could do better.  Let's ditch the headers/import idea  completely (aka "stripping") and create a tool (integrated into build  perhaps?) that just reads the symbols directly from the *.lib file that  the project links with (silent "stripping"). That way shipping the  library itself would be the only thing necessary for closed projects.   This has been discussed before.  We don't really need more header files  to mess with.  We're drifting back into the C/C++ age again if we go  that route.
> 
> 
> The problem with that is the current lib files are very low level and  dangerous. To distribute a lib file you have to force users to use the  same compiler version and any other libs you use have to be for that same  version. Say you want to use libs from 2 different sources and both rely  on a different compiler version. Well, now you're pretty much screwed.  Even if they successfully link to a program, there still can be hidden  problems, such as access violations because there was supposed to be a  pointer to something at a certain location but now it's somewhere else. If  D could have a new form of lib file that knew more about the language and  didn't depend on compiler version as much, hopefully not even depend on  compiler brand (implementation) much, and possibly use an intermediate  code form, it would be a lot safer and better in many respects.

I really don't know anything about lib files, so I don't know how much sense I'm going to make here, but here's an idea:
How about an intermediate file (between .d and .lib) that contains both stripped declarations and .lib content (whatever that is), so that it's easy for any compiler/linker/whatever to extract both declarations and/or .lib from that file (let's just call that file "dlib" for now).

In other words, embed stripped declarations directly into lib files.

What I'm proposing is that instead of having the compiler produce lib files, it produces the "dlib" file, or have it only produce that file on a special compiler switch.
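
A rough Python sketch of what such a "dlib" container might look like, purely as an assumption for discussion: a 4-byte marker, two little-endian lengths, the stripped declarations as text, then the untouched .lib bytes. The marker `DLIB`, the layout, and the function names `pack_dlib` and `unpack_dlib` are all hypothetical. No compiler produces this format.

```python
import struct

MAGIC = b"DLIB"  # hypothetical marker, not an existing format


def pack_dlib(declarations, lib_bytes):
    """Bundle stripped declarations (text) with raw .lib content (bytes)."""
    decl = declarations.encode("utf-8")
    header = MAGIC + struct.pack("<II", len(decl), len(lib_bytes))
    return header + decl + lib_bytes


def unpack_dlib(blob):
    """Split a bundle back into (declarations, lib_bytes)."""
    if blob[:4] != MAGIC:
        raise ValueError("not a dlib file")
    decl_len, lib_len = struct.unpack_from("<II", blob, 4)
    start = 12  # 4-byte marker plus two 4-byte lengths
    decl = blob[start:start + decl_len].decode("utf-8")
    lib = blob[start + decl_len:start + decl_len + lib_len]
    return decl, lib


bundle = pack_dlib("class Mesh { void load(char[] path); }", b"\x80\x05\x00")
decl, lib = unpack_dlib(bundle)
print(decl)
print(len(lib))
```

A compiler would read only the declarations, while a linker (or a small extraction tool) would pull out the .lib part unchanged.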

Is that a reasonable proposal? Or is it extremely stupid?
August 15, 2005
Hasan Aljudy wrote:
> Vathix wrote:
> 
>>> D should and could do better.  

> ...  that contains both
> stripped declarations and .lib content

Doesn't Burton Radons' "make" tool do that already?
(I might be mixing up things...)

Antonio
August 15, 2005
In article <ddpan3$6fg$1@digitaldaemon.com>, Hasan Aljudy says...
>
>Vathix wrote:
>>> D should and could do better.  Let's ditch the headers/import idea completely (aka "stripping") and create a tool (integrated into build perhaps?) that just reads the symbols directly from the *.lib file that  the project links with (silent "stripping"). That way shipping the  library itself would be the only thing necessary for closed projects.   This has been discussed before.  We don't really need more header files  to mess with.  We're drifting back into the C/C++ age again if we go  that route.
>> 
>> 
>> The problem with that is the current lib files are very low level and dangerous. To distribute a lib file you have to force users to use the same compiler version and any other libs you use have to be for that same  version. Say you want to use libs from 2 different sources and both rely  on a different compiler version. Well, now you're pretty much screwed.  Even if they successfully link to a program, there still can be hidden  problems, such as access violations because there was supposed to be a  pointer to something at a certain location but now it's somewhere else. If  D could have a new form of lib file that knew more about the language and  didn't depend on compiler version as much, hopefully not even depend on  compiler brand (implementation) much, and possibly use an intermediate  code form, it would be a lot safer and better in many respects.
>
>I really don't know anything about lib files, so I don't know how much
>sense am I gonna make here, but here's an idea:
>How about an intermediate file (between .d and .lib) that contains both
>stripped declarations and .lib content (whatever that is), so that it's
>easy for any compiler/linker/whatever to extract both declarations
>and/or .lib from that file (let's just call that file "dlib" for now).
>
>In other words, embed stripped declarations directly into lib files.
>
>What I'm proposing is that instead of having the compiler produce lib files, it produces the "dlib" file, or have it only produce that file on a special compiler switch.
>
>is that a reasonable proposal? or is it extreemly stupid?

I think that is very reasonable and from what I gather it can actually be done.

When the compiler tries to resolve an import that isn't in the path, it looks in the library and/or .obj files, right?

I believe it can also be done in such a way so that other languages could use the same libraries (they would just ignore the section of the object files containing the D symbols).

Just think:

- Import code and libs. in one file (end of version issues between headers and
libs.)

- One file per library to distribute.

- Reflection/introspection w/o having to strip executable library code (This would make runtime loading/linking much easier and portable, right Pragma?).

- Same library file could be used by other languages (they would have to provide their own forward declarations).

If you also store some type of implementation specified binary symbols of all of the code instead of just stripped declarations:

- You could potentially distribute closed-source libraries made up of just template code in binary format using the current D instantiation model (w/o implicit function template instantiation).

- The extra info. could be used by the compiler, e.g.: to inline functions. The info. not referenced would not make it into .exe's (stripped by most modern linkers) so it wouldn't bloat applications.

Now all that would be a big step forward, IMO!

- Dave


September 15, 2005
(Sorry for top-posting, I'm in the middle of something right now, and I just got a raw idea to throw at you guys!)

The different lib and dll formats specify things quite meticulously. Now, if we wanted to make a "file type" that is compatible _and_ contains data that we need, then:

why not just slap our own data at the end of the lib/dll?

So, for example, if a file format only describes entry points and function names, we might slap a description of return values, function parameters, and whatever else we consider important (e.g. compiler version, or some such) simply at the end of that file.

This of course has to be tested, so that no tools (binutils or other) crash with these files. With any luck, this might be an easy "chewing gum and cardboard" solution that we can use for the time being.
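
A small Python sketch of the idea, under stated assumptions: the extra data goes after the last byte of the original file, followed by its length and a fixed tag, so a reader can find it by looking at the end of the file. The tag `D-META01` and the function names `append_metadata` and `read_metadata` are invented for the example. Whether real linkers and librarians tolerate the trailing bytes is exactly the part that is untested.

```python
import struct

MARKER = b"D-META01"  # hypothetical 8-byte tag identifying the trailer


def append_metadata(lib_bytes, metadata):
    """Return the original bytes with a metadata trailer appended."""
    payload = metadata.encode("utf-8")
    return lib_bytes + payload + struct.pack("<I", len(payload)) + MARKER


def read_metadata(blob):
    """Return (original bytes, metadata), or (blob, None) if no trailer exists."""
    footer = len(MARKER) + 4  # tag plus 4-byte payload length
    if len(blob) < footer or blob[-len(MARKER):] != MARKER:
        return blob, None
    end = len(blob) - footer
    (size,) = struct.unpack_from("<I", blob, end)
    if size > end:
        return blob, None  # length field does not fit; treat as no trailer
    return blob[:end - size], blob[end - size:end].decode("utf-8")


original = b"\x80\x05\x00\x03a.d\x00"
tagged = append_metadata(original, "int load(char[] path); compiler=dmd")
lib, meta = read_metadata(tagged)
print(lib == original)
print(meta)
```

Because the front of the file is byte-for-byte unchanged, a tool that stops reading at the end of the format's own data never sees the trailer.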

(Just a Temporary Solution(TM) that the FAA airliner disaster investigators will find 5 years from now still being used. :-)


September 16, 2005
Why do some people get all up in arms about top-posting?  It's retarded.  It absolutely does not change the meaning or flow of conversation in a newsgroup.  This might be true if each post were a single sentence, but mostly they're not.  Besides, it's not hard to draw the lines between posts... Older ones are prefixed in '>'s everywhere, and the new stuff is not.  Personally, I don't enjoy scrolling to the very bottom to find a new post appended in reply to an older post, especially if that new post happens to be rather long.  For the longest time, I thought top-posting meant posting a new thread, since it was on the top level of discussion.  It must've sounded ridiculous of me to ask how to not top-post. =P

Sorry for interrupting the flow here Georg... ;)

An idea to extend existing library/binary file formats with extra information for reflection?  Sounds kind of hairy to me.

I would advise checking the specs of the most popular executable/linkable formats (ELF, COFF, OMF, etc.) to see if such a thing is already supported without modification; sort of like a "miscellaneous" section.  If such a thing were allowed, then a specification for a custom format containing the necessary reflection information for D should be drawn up, agreed upon, and implemented.  It should store its information within that miscellaneous section, with an obvious marker so as not to be confused with any other information stored within that section.

September 16, 2005
In article <dgd52g$2ff3$1@digitaldaemon.com>, James Dunne says...
>
>Why do some people get all up in arms about top-posting?

This is Alf P. Steinbach's signature file (from the C++ newsgroups):

---
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?