May 10, 2021
> On Sunday, 9 May 2021 at 22:41:23 UTC, Adam D. Ruppe wrote in github:
>This appears completely useless to me. First, it doesn't "just work" since you need to run cpp over the file before you can use it, and since that expands macros, it must be done for each different target, meaning it must be part of the build system. This offers no advantage over the (not great) existing solutions and shares many of their problems including creating huge files.

>Even if that just worked, C bindings tend to rely on the preprocessor to do common and necessary tasks like defining constants. Without that, you can't even reasonably use real world C headers. Possibly a limited processor could translate them, but even that data is gone after cpp goes through it, so there's no real hope to recover it into a usable form.

>I think trying to run the cpp over a whole D file like dpp does is problematic... but like something must be done. dstep's approach actually does a reasonably good job, at the cost of requiring a bit of hand editing and some setting up of other files too.

>I do think there's some potential in integrating with dmd... but this PR as it is looks like more harm than good. And I'm not sure it is worth the time sink it will take to get it to a positive point.

The advantage of having this built into the compiler itself is that you don't have to rely on a third party library for importing c functions into your d code. Accessibility is the key advantage here.

- Alex
May 09, 2021
On 5/9/2021 4:27 PM, Max Haughton wrote:
> `inline` in the C standard (although given as an example C11 footnote 138) technically *only* has guaranteed implications other than actually inlining the function, is ignoring it valid?

C11 6.7.4-6 says:

"A function declared with an inline function specifier is an inline function. Making a function an inline function suggests that calls to the function be as fast as possible. The extent to which such suggestions are effective is implementation-defined."

Ignoring it is not only valid, it's what modern C compilers do anyway.
May 09, 2021
On 5/9/2021 4:41 PM, Max Haughton wrote:
> is very common in C, D doesn't allow such a concept, therefore using the D AST seems like it could be a problem.

I've deferred altering the D semantic code for now, as it is not necessary to demonstrate the concept of ImportC, and the PR is big enough already.

I suggest this discussion be more meta in nature, rather than oriented toward the (plenty of) bugs in the implementation.
May 10, 2021
On Monday, 10 May 2021 at 00:47:08 UTC, 12345swordy wrote:
> The advantage of having this built into the compiler itself is that you don't have to rely on a third party library for importing c functions into your d code. Accessibility is the key advantage here.

That could be easily done with the existing dpp approach. You could even just make dmd recognize the .dpp extension then shell out to dpp and users would never know.

Or heck even if the deimos repo was bundled with the download users would never know and it would work even better for a lot of cases.


Anyway, I wrote this on chat and was gonna clean it up for my blog but I'll repeat it here instead:


my problems with dpp: 1) it is slow and always will be. it doesn't know what actually needs to be translated, so it just takes all the C and converts it to D and dumps it out. so dpp wastes time on huge amounts of useless work, then dmd wastes huge amounts of time sifting through it. I tried to optimize this by recognizing certain filename patterns and shifting it to imports, but it still has to do tons of work because of macros which brings me to:

2) preprocessor macros are bad and should be kept far away from D code. but since C headers often define them you're thrust into this hell. dpp makes it worse though because of its translation model - it has no semantic awareness of actual need. so it ends up doing silly things like turning import core.sys.linux into import core.sys.1. These tend to be fixed with some ad-hoc #undef at the end of the translated section, or sometimes translating certain macro patterns into superior D patterns, and enough work toward that will prolly achieve like 95% goodness. but there's still bound to be some case that got missed no matter what.

3) some C code is still slightly different than D code and even with the help of an AST you can't fix all this up - again blame macros for much of it - meaning you can still generate crap

It is possible that a dmd-integrated solution can address some of this. It can just parse the C definitions and keep a table of them in memory instead of translating it to a file. It can keep a separate macro namespace and apply them in a semantically-aware fashion to avoid generating a bunch of that intermediate code. when it does apply a macro, it would take the D AST, to C String it, run the processor in a sandbox, C parse the result, then paste that into the AST, limiting the pain.

But that's still not going to always work.

And a lot of those things use various compiler extensions and newer language features that dmc likely doesn't implement much of. Some require annoying #defines before #include and that's ... well possible but fairly tricky too, especially if it is trying to translate the declarations into the D module structure.

so i think dmd integration is a better approach than dpp. but im a lil skeptical if it will actually work as opposed to being some proof of concept dumped in and abandoned. The amount of work needed to bring this from proof of concept that sometimes works to actually usable thing that covers ground that the Deimos repo doesn't already have is going to be significant and probably better spent somewhere else.


May 10, 2021
On Monday, 10 May 2021 at 00:55:34 UTC, Adam D. Ruppe wrote:
> so i think dmd integration is a better approach than dpp.

That post was the positive side.

Here's what I said in the PR comment for the negative side:

This appears completely useless to me. First, it doesn't "just work" since you need to run cpp over the file before you can use it, and since that expands macros, it must be done for each different target, meaning it must be part of the build system. This offers no advantage over the (not great) existing solutions and shares many of their problems including creating huge files.

Even if that just worked, C bindings tend to rely on the preprocessor to do common and necessary tasks like defining constants. Without that, you can't even reasonably use real world C headers. Possibly a limited processor could translate them, but even that data is gone after cpp goes through it, so there's no real hope to recover it into a usable form.

I think trying to run the cpp over a whole D file like dpp does is problematic... but like something must be done. dstep's approach actually does a reasonably good job, at the cost of requiring a bit of hand editing and some setting up of other files too.

I do think there's some potential in integrating with dmd... but this PR as it is looks like more harm than good. And I'm not sure it is worth the time sink it will take to get it to a positive point.


======

Kinda the same thing: has potential but it is a lot of work to get it to actually realize that and there's other ways to get to that goal.

The preprocessor problem needs a solid action plan before we merge this thing or we'll be left with a half-done thing in limbo for years, if not forever.
May 10, 2021

On Monday, 10 May 2021 at 00:47:22 UTC, Walter Bright wrote:
> On 5/9/2021 4:27 PM, Max Haughton wrote:
>> `inline` in the C standard (although given as an example C11 footnote 138) technically *only* has guaranteed implications other than actually inlining the function, is ignoring it valid?
>
> C11 6.7.4-6 says:
>
> "A function declared with an inline function specifier is an inline function. Making a function an inline function suggests that calls to the function be as fast as possible. The extent to which such suggestions are effective is implementation-defined."
>
> Ignoring it is not only valid, it's what modern C compilers do anyway.

Which is why I bring up that inline is used in modern programs, but not for performance, specifically in header files for example.

#include <stdio.h>
inline int square(int num) {
    return num * num;
}

int main()
{
    printf("%d\n", square(3));
}

fails to link in C but not in C++.

I don't know whether this actually breaks anything since D doesn't really have a "you have permission to emit nothing" keyword.

May 09, 2021
[A meta comment in reply to Adam]

The end goal is to be able to import a C file and it will "just work". The following problems need to be solved to make this happen:

a. collecting user-supplied #defines

b. running the C preprocessor

c. collecting the final set of macro definitions at the end of the preprocessor run

d. converting the collection (C) to D equivalents, for example:

    #define ABC 3  => enum ABC = 3;

    #define f(g) ((g)+1)  =>  auto f(T)(T g) { return g + 1; }

Going further than that is likely to be intractable. Atila has spent a lot of time on this, and he has likely pushed it further.

e. Running the C compiler

==================

ImportC only addresses (e).

Does this make it useless? Frankly, I have no idea. I don't know any way of finding out other than making it work and seeing what happens. But there are some things that are knowable:

1. C changes only glacially. It's not like the Winchester Mystery House (https://www.winchestermysteryhouse.com/), constantly adding new rooms and different architectural styles.

2. C is a simple language. It only took 5 days to get this far with it. DMD's lexer, semantics, optimizer and code gen can all be leveraged. I know how to write a C compiler.

3. Writing a C preprocessor is a nightmare. Fortunately, I already did that https://github.com/facebookarchive/warp and we can use it if we choose.

4. A builtin C compiler will be very fast, and quite small.

5. We will have complete control over it. We can adjust it so it works best for us. We don't need to fork or get anyone's buy-in. We control the user experience.

6. The D code will be able to inline C code, and even CTFE it.

7. We can easily do inline assembler for it. (It's already there!)

8. There are a lot of wacky C extensions out there. We only need to implement currently used ones (not the 16 bit stuff), and only the stuff that appears in C headers.

9. Without a C compiler, we're stuck with, wedded to, and beholden to libclang. I wouldn't be surprised if the eventual cost of adapting ourselves to libclang exceeds the cost of doing our own C compiler.

10. By nailing (e), then (a..d) starts looking possible.

11. C++ can compile C code, which is a ginormous advantage for C++. Can we afford not to do that?

12. Being built-in, in the box, batteries included, rather than an add-on, has enormous positive implications for D. It's hard to overstate it. Think how great it was for Ddoc and unittest being built-in.
May 10, 2021

On Monday, 10 May 2021 at 01:13:25 UTC, Max Haughton wrote:
> #include <stdio.h>
> inline int square(int num) {
>     return num * num;
> }
>
> int main()
> {
>     printf("%d\n", square(3));
> }
>
> fails to link in C but not in C++.

Whether or not your code links in C depends on a number of factors. It is not as simple as you make it out to be.

Consider on my OpenBSD machine:

/home/brian/c $ gcc --version
gcc (GCC) 12.0.0 20210421 (experimental)
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

/home/brian/c $ gcc -o square square.c
ld: error: undefined symbol: square
>>> referenced by square.c
>>>               /tmp//cc3wDuve.o:(main)
collect2: error: ld returned 1 exit status
/home/brian/c $ gcc -O2 -o square square.c
/home/brian/c $ ./square
9

~Brian

May 09, 2021
On 5/9/2021 6:13 PM, Max Haughton wrote:
> ```
> #include <stdio.h>
> inline int square(int num) {
>      return num * num;
> }
> 
> int main()
> {
>      printf("%d\n", square(3));
> }
> ```
> fails to link in C but not in C++.

I just tried it with gcc, it links. It does not with clang. Examination of clang's output shows it treated it as extern. I don't really understand what's going on with clang here, but that code will work with DMD.
May 10, 2021
On Monday, 10 May 2021 at 01:53:26 UTC, Walter Bright wrote:
> On 5/9/2021 6:13 PM, Max Haughton wrote:
>> ```
>> #include <stdio.h>
>> inline int square(int num) {
>>      return num * num;
>> }
>> 
>> int main()
>> {
>>      printf("%d\n", square(3));
>> }
>> ```
>> fails to link in C but not in C++.
>
> I just tried it with gcc, it links. It does not with clang. Examination of clang's output shows it treated it as extern. I don't really understand what's going on with clang here, but that code will work with DMD.

It links on clang 11.0.1 here at -O2 or higher (including -Os and -Oz).
Also links with pcc -O and even tcc!