June 08, 2020
On Monday, 8 June 2020 at 08:47:36 UTC, Stefan Koch wrote:
> Is that a valid concern?
> Will _any_ project out there break because of that?

It's easy to imagine. If one has a very simple mathematical function taking, say, an int and a float and returning a float, and that function is called in a very hot loop, performance can drop noticeably if inlining suddenly fails. Especially if the mathematical function calls another function instead of using the compiler primitives directly.
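To make the scenario concrete, here is a minimal C++ sketch of the kind of code described above. The names `scale` and `accumulate_scaled` are hypothetical and chosen only for illustration; whether the call is actually inlined depends on the compiler and its optimisation settings.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: a tiny function taking an int and a float.
// `inline` here is only a request; the optimiser decides.
inline float scale(int n, float x) {
    return static_cast<float>(n) * x + 1.0f;
}

// Hypothetical hot loop: if scale() is not inlined, every iteration
// pays for a function call on top of one multiply and one add.
float accumulate_scaled(const float* xs, std::size_t count) {
    assert(xs != nullptr || count == 0);
    float sum = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        sum += scale(static_cast<int>(i), xs[i]);
    }
    return sum;
}
```

With a body this small, the call overhead can be comparable to the work itself, which is why a silent failure to inline would show up in a profile.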
June 08, 2020
On Monday, 8 June 2020 at 06:14:44 UTC, Manu wrote:
> 1. I only want the function to be present in the CALLING binary. I do not want an inline function present in the local binary where it was defined (unless it was called internally). I do not want a linker to see the inline function symbols and be able to link to them externally. [This is about linkage and controlling the binary or distribution environment]

I think this should be controlled by the visibility attributes, not `pragma(inline)`. If a function is `private`, there's no reason why the linker should see it at all. If it's `public`, it should be seen when linking the internal object files together, but not when linking a precompiled binary. `export` should always be visible.

>
> 2. I am unhappy that the optimiser chose to not inline a function call, and I want to override that judgement. [This is about micro-optimisation]

This is what `pragma(inline)` should be for. To some extent at least, it already is. If you compile a simple example with `ldc2 -O1`, it'll inline the function if you use the pragma but not otherwise. `ldc2 -O2` will inline it regardless, and DMD will never do it IIRC, at least not without the `-inline` switch. I agree that it should.

>
> 3. I want to treat the function like an AST macro; I want the function inserted at the callsite, and I want to have total confidence in this mechanic. [This is about articulate mechanical control over code-gen; ie, I know necessary facts about the execution context/callstack that I expect to maintain]

Sounds like a case for mixins, of either kind. But it'd be better if the compiler only ignored `pragma(inline, true)` when either linking externally or avoiding recursion. There's just no reason to ignore it otherwise. This is an implementation problem, not a spec problem IMO.
June 08, 2020
On Monday, 8 June 2020 at 11:53:11 UTC, Dukc wrote:
> On Monday, 8 June 2020 at 08:47:36 UTC, Stefan Koch wrote:
>> Is that a valid concern?
>> Will _any_ project out there break because of that?
>
> It's easy to imagine. If one has a very simple mathematical function taking, say, an int and a float and returning a float, and that function is called in a very hot loop, performance can drop noticeably if inlining suddenly fails. Especially if the mathematical function calls another function instead of using the compiler primitives directly.

I didn't ask about not inlining.
I asked about the address of a function changing from object file to object file.
June 08, 2020
On Monday, 8 June 2020 at 10:19:16 UTC, Walter Bright wrote:
>
> Why does it matter where it is emitted? Why would you want multiple copies of the same function in the binary?

Performance in HPC.

In C++, consider an `operator[]`. There would be a lot of function calls inside a kernel (some function with a lot of loops, easily one billion iterations of the innermost loop). If I then have some kind of stencil or any array accesses, calling a function each time on top of resolving the current pointer would be very costly.
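A rough C++ sketch of that situation, under assumptions: the `Field` wrapper and the `smooth` kernel are hypothetical names, and a real HPC kernel would be multi-dimensional and far larger. It only shows how many `operator[]` calls a single stencil update involves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D field wrapper; the name and layout are illustrative only.
class Field {
public:
    explicit Field(std::size_t n) : data_(n, 0.0) {}

    // Defined in-class, so implicitly `inline` in C++.
    double& operator[](std::size_t i) {
        assert(i < data_.size());
        return data_[i];
    }
    const double& operator[](std::size_t i) const {
        assert(i < data_.size());
        return data_[i];
    }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;
};

// Three-point averaging stencil: three operator[] reads and one
// operator[] write per interior cell. Boundary cells are left untouched.
void smooth(const Field& src, Field& dst) {
    assert(src.size() == dst.size());
    for (std::size_t i = 1; i + 1 < src.size(); ++i) {
        dst[i] = (src[i - 1] + src[i] + src[i + 1]) / 3.0;
    }
}
```

If none of those four accessor calls is inlined, a billion-iteration inner loop performs several billion real calls, each wrapping a single indexed load or store.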
June 08, 2020
On Mon, Jun 8, 2020 at 6:50 PM Basile B. via Digitalmars-d < digitalmars-d@puremagic.com> wrote:

> On Monday, 8 June 2020 at 06:14:44 UTC, Manu wrote:
> > Inline has been bugging me forever, it's usually not what I want. The spec says this-or-that, but I think we should take a step back, ignore what's written there, look at the problem space, determine the set of things that we want, and then make sure they're expressed appropriately.
> >
> > I think a first part of the conversation to understand, is that since D doesn't really have first-class `inline` (just a pragma, assumed to be low-level compiler control), I think most people bring their conceptual definition over from C/C++, and that definition is a little odd (although it is immensely useful), but it's not like what D does.
> >
> > In C/C++, inline says that a function will be emitted to the binary only when it is called, and the function is marked with internal linkage (it is not visible to the linker from the symbol table)
> > By this definition; what inline REALLY means is that the function is not placed in the binary where it is defined, it is placed in the binary where it is CALLED, and each CU that calls an inline function receives their own copy of the function. From here; optimisers will typically inline the call if they determine it's an advantage to do so.
> >
> > Another take on inline, and perhaps a more natural take (if your mind is not poisoned by other native languages), is that the function is not actually emitted to an object anywhere, it is rather wired directly inline into the AST instead of 'called'. Essentially a form of AST macro.
>
> No. I rather see "inline" as a hint for the backend.
> DMD is peculiar with its way of inlining.
>
> > I reach for inline in C/C++ for various different reasons at different times, and I'd like it if we were able to express each of them:
> >
> > 1. I only want the function to be present in the CALLING binary. I do not want an inline function present in the local binary where it was defined (unless it was called internally). I do not want a linker to see the inline function symbols and be able to link to them externally. [This is about linkage and controlling the binary or distribution environment]
>
> What if the function address is taken in a delegate?
> It still needs to be there, in the object matching the CU where it is declared, otherwise there will be surprises, e.g. &func in one CU and &func in another will have different addresses.


Right. This is fine and normal.


June 08, 2020
On Mon, Jun 8, 2020 at 7:10 PM Basile B. via Digitalmars-d < digitalmars-d@puremagic.com> wrote:

> On Monday, 8 June 2020 at 08:47:36 UTC, Stefan Koch wrote:
> > On Monday, 8 June 2020 at 08:45:02 UTC, Basile B. wrote:
> >
> >> What if the function address is taken in a delegate?
> >> It still needs to be there, in the object matching the CU where it is declared, otherwise there will be surprises, e.g. &func in one CU and &func in another will have different addresses.
> >
> > I have _never_ compared addresses of inline functions from
> > different CUs.
> > How would I even do that?
> > by storing the function pointer in a global which is visible
> > from multiple translation units?
> > Is that a valid concern?
> > Will _any_ project out there break because of that?
>
> It's a detail. I just meant that in the case where the address of a function that is marked for inlining is taken, it must still be emitted in the object matching the declaration unit.
>

It should be emitted to the CU where the address is taken.
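As a hedged C++ illustration of that last point: the names `twice`, `IntFn`, `get_twice` and `call_through` are hypothetical. With `static`, the function has internal linkage, so each translation unit that includes this code gets its own private copy, and taking the address in one of them generally means an out-of-line body is needed in that same unit.

```cpp
#include <cassert>

// `static` gives internal linkage: every translation unit that
// includes this definition has its own private copy of twice().
static inline int twice(int x) { return 2 * x; }

using IntFn = int (*)(int);

// Taking the address means a callable out-of-line body is generally
// needed in this translation unit, even if direct calls are inlined.
IntFn get_twice() { return &twice; }

// Calls through the pointer; the callee is not known at this call site.
int call_through(IntFn fn, int x) {
    assert(fn != nullptr);
    return fn(x);
}
```

Two translation units doing this would hand out different addresses for their respective copies, which is the behaviour being described as fine and normal earlier in the thread.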


June 09, 2020
On Mon, Jun 8, 2020 at 8:20 PM Walter Bright via Digitalmars-d < digitalmars-d@puremagic.com> wrote:

> On 6/7/2020 11:14 PM, Manu wrote:
> > I think a first part of the conversation to understand, is that since D doesn't really have first-class `inline` (just a pragma, assumed to be low-level compiler control), I think most people bring their conceptual definition over from C/C++, and that definition is a little odd (although it is immensely useful), but it's not like what D does.
>
> C/C++ inline has always been a hint to the compiler, not a command.
>

It's not a hint at all. It's a mechanical tool; it marks symbols with internal linkage, and it also doesn't emit them if it's never referenced. The compiler may not choose to ignore that behaviour, it's absolutely necessary, and very important.

> > In C/C++, inline says that a function will be emitted to the binary only when it is called, and the function is marked with internal linkage (it is not visible to the linker from the symbol table)
> > By this definition; what inline REALLY means is that the function is not placed in the binary where it is defined, it is placed in the binary where it is CALLED, and each CU that calls an inline function receives their own copy of the function.
>
> Why does it matter where it is emitted? Why would you want multiple copies of the same function in the binary?
>

I want zero copies if it's never called. That's very important.
I also want copies to appear locally when it is referenced; inline
functions should NOT require that you link something to get the code...
that's not inline at all.

> > Another take on inline, and perhaps a more natural take (if your mind is not poisoned by other native languages), is that the function is not actually emitted to an object anywhere, it is rather wired directly inline into the AST instead of 'called'. Essentially a form of AST macro.
>
> The problem with this is what is inlined and what isn't is rather fluid, i.e. it varies depending on circumstances and which compiler you use. For example, if you recursively call a function, it's going to have to give up on inlining it. Changes in the compiler can expand or contract inlining opportunities. Having it inline or issue an error is disaster for compiling existing code without constantly breaking it.
>

I understand why this particular take is more complicated; and as such I
wouldn't suggest we do it. I'm just offering it as one possible take on the
concept.
I think to make this form work, it must be handled in the frontend;
essentially an AST macro. Naturally, it would fail on recursive calls,
because that would be a recursive expansion.
I'm not suggesting we do this, except maybe what I describe as the #3 use
case could potentially be this.

> > I reach for inline in C/C++ for various different reasons at different times, and I'd like it if we were able to express each of them:
> >
> > 1. I only want the function to be present in the CALLING binary. I do not want an inline function present in the local binary where it was defined (unless it was called internally). I do not want a linker to see the inline function symbols and be able to link to them externally. [This is about linkage and controlling the binary or distribution environment]
>
> Why? What is the problem with the emission of one copy where it was defined?
>

That's the antithesis of inline. If I wanted that, I wouldn't mark it
inline.
I don't want a binary full of code that shouldn't be there. It's very
important to be able to control what code is in your binaries.

If it's not referenced, it doesn't exist.

> > 2. I am unhappy that the optimiser chose to not inline a function call, and I want to override that judgement. [This is about micro-optimisation]
>
> It's not always possible to inline a function.
>

Sure, but in this #2 case, I know it's possible, but the compiler chose not to. This #2 case is the 'hint' form.

> > 3. I want to treat the function like an AST macro; I want the function inserted at the callsite, and I want to have total confidence in this mechanic. [This is about articulate mechanical control over code-gen; ie, I know necessary facts about the execution context/callstack that I expect to maintain]
>
> The PR I have on this makes it an informational warning. You can choose to be notified if inlining fails.
>

That's not sufficient though for all use cases. This is a different kind of
inline (I think it's 'force inline').
This #3 mechanic is rare, and #1/2 are overwhelmingly common. You don't
want a sea of warnings to apply to cases of 1/2.
I think it's important to be able to distinguish #3 from the other 2 cases.
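For comparison, this is roughly how the 'force inline' form is usually spelled in C++, which has no standard way to say it. The `FORCE_INLINE` macro name and the two functions are hypothetical; `__forceinline` (MSVC) and `__attribute__((always_inline))` (GCC/Clang) are compiler extensions, and their exact guarantees differ between compilers.

```cpp
#include <cassert>

// Hypothetical portability macro; FORCE_INLINE is not a standard name.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline  // fallback: ordinary hint only
#endif

// Small helper that the caller wants expanded at every call site.
FORCE_INLINE int clamp_to_byte(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

// Adds two byte-range values and saturates the result at 255.
int saturating_add(int a, int b) {
    assert(a >= 0 && a <= 255 && b >= 0 && b <= 255);
    return clamp_to_byte(a + b);
}
```

Keeping this spelling separate from plain `inline` is what lets a project reserve hard diagnostics for the rare #3 cases without flooding the common #1 and #2 cases with warnings.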

> > Are there non-theoretical use cases I've missed that people have encountered?
>
> At its root, inlining is an optimization, like deciding which variables go into registers.
>

No, actually... it's not. It's not an 'optimisation' in any case except maaaaybe #2; it's about control of the binary output and code generation. Low level control of code generation is important in native languages; that's why we're here.


June 09, 2020
On Mon, Jun 8, 2020 at 8:30 PM kinke via Digitalmars-d < digitalmars-d@puremagic.com> wrote:

> On Monday, 8 June 2020 at 06:14:44 UTC, Manu wrote:
> > D's inline today doesn't do any of those things. It doesn't implement a mechanic that I have ever wanted or known a use for.
>
> We've had this discussion a while back. As of the latest LDC beta, `pragma(inline, true)` functions are now properly emitted into each referencing CU, and will be inlined in most cases (`alwaysinline` LLVM function attribute), even at -O0.
>

Yes, and thank you; it's always nice that we can fix broken things in LDC. But I think for inline it's actually quite important that the spec is useful, and that all compilers do the same thing; otherwise DMD experiences link errors that LDC properly avoids.

> In modules other than the owning one, the function 'copy' is emitted as `available_externally`, meaning that it's *only* available for inlining at the IR level, it will never make it to the assembler and object file.
>

Yeah I always forget the proper names for all the linkage flags that
symbols can have.
This approach is fine from a linkage point of view (there shouldn't be
errors), but I'd really like to not see the symbol in the owning module if
it's never called locally.
It's a shame to have a ton of noise in a binary when it's unnecessary.
Most D software is full of bloaty junk, but in the cases where I care about
this, it's actually a very tightly controlled binary ecosystem, which binds
between multiple languages, and only exposes necessary stuff at the ABI
boundary.
I've also seen cases where the function is not referenced in the owning
module, but it does not get stripped by the linker for whatever reason, and
then if the inline function has a call to an extern symbol that you don't
link, you'll get link errors, even though there are no calls.
By emitting it even when it's never been referenced, we're just inviting
link errors and inconsistent behaviour between different linkers. We can
trivially avoid that risk.

> In its owning module, the function is emitted as a regular function, as a fallback for non-inlined cases (and for when people take its address etc.). Our opinions diverge wrt. whether that's a problem - to me it's clearly no big deal, as the function is a) most likely small, and b) subject to linker stripping if unreferenced.


a) most likely small, but still not nothing; the symbol table is public ABI
material, and in some projects I've worked on, the symbol table is
carefully curated.
b) linker stripping is not reliable. We are unnecessarily inviting issues in
some cases, and there's just no reason for that.

If we're confident that link stripping is 100% reliable when a symbol is not referenced, then I have no complaint here.

Can you show what case a hard-symbol in the owning CU solves? Non-inlined
cases will still find it locally if it has internal linkage (or whatever
that link flag is called).
I think it's the same flag that `static` (or `inline`) in C++ specifies
right?

> Wrt. control of which CUs contain which functions, that's totally out of hand anyway due to the way templates are emitted.
>

I'm not sure what you mean. Templates work correctly; template instances are only generated and emitted to the calling CU.


June 08, 2020
On Tue, Jun 09, 2020 at 12:09:04AM +1000, Manu via Digitalmars-d wrote: [...]
>    I don't want a binary full of code that shouldn't be there. It's
>    very important to be able to control what code is in your binaries.
>    If it's not referenced, it doesn't exist.
[...]

Could you just use LTO for this?  LDC's LTO, for example, lets the linker discard unreferenced symbols.


T

-- 
A mathematician is a device for turning coffee into theorems. -- P. Erdos
June 08, 2020
On Monday, 8 June 2020 at 14:09:04 UTC, Manu wrote:
> [snip]
> It's not a hint at all. It's a mechanical tool; it marks symbols with internal linkage, and it also doesn't emit them if it's never referenced. The compiler may not choose to ignore that behaviour, it's absolutely necessary, and very important.
>[snip]

Perhaps it would be helpful to split the discussion into inline as a linkage attribute vs. inline as a tool for inline expansion? D has no linkage attribute that has the same behavior as inline or extern inline, and you believe that is important.
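To illustrate the linkage side of that split in C++ terms, here is a small sketch with hypothetical function names. As far as the C++ standard goes, `static` and unnamed namespaces give internal linkage, while a plain `inline` function keeps external linkage and may be defined identically in several translation units. None of the three says anything binding about inline expansion.

```cpp
// Internal linkage: the symbol is not visible to other translation units.
static int helper_static(int x) { return x + 1; }

namespace {
// Unnamed namespace: also internal linkage.
int helper_anon(int x) { return x + 2; }
}  // namespace

// Plain `inline`: external linkage, but identical definitions may
// appear in several translation units without a link error.
inline int helper_inline(int x) { return x + 3; }

// Uses all three so that none of them is unreferenced in this unit.
int use_all(int x) {
    return helper_static(x) + helper_anon(x) + helper_inline(x);
}
```

The expansion side is a separate axis: any of these may or may not be inlined at a given call site, depending on the optimiser.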