Thread overview
Debugging "missing callgraph edge for call stmt"
Jan 01, 2013
Johannes Pfau
Jan 02, 2013
Iain Buclaw
Jan 03, 2013
Johannes Pfau
Jan 03, 2013
Iain Buclaw
Jan 03, 2013
Johannes Pfau
Jan 03, 2013
Johannes Pfau
Jan 03, 2013
Iain Buclaw
Jan 03, 2013
Johannes Pfau
January 01, 2013
This is a bug which only seems to happen with gdc-4.7. At least I can't reproduce it with gcc 4.8 but I'd like to fix it in 4.7 as well if possible. (It prevents some phobos modules from being built in unittest mode).

First, here's the reduced test case:
----------------
void main()
{
    topNIndex();
}

void topNIndex()() //<= only fails if this is a template
{
    bool indirectLess(int a, int b) //<= adding 'static' makes it work
    {
        return a > b;
    }
    auto a = BinaryHeap!(indirectLess)();
}

struct BinaryHeap(alias less )
{
    void percolateDown()
    {
        less(0, 1);
    }
}
----------------

It's somehow related to closures in templated functions. Making topNIndex a function or making indirectLess static fixes the issue.

This is the compiler error:
----------------
bug/std/algorithm2.d:1: error: missing callgraph edge for call stmt:
algorithm2.topNIndex!().topNIndex.indirectLess (0B, 0, 1);

algorithm2.topNIndex!().topNIndex.BinaryHeap!(indirectLess).BinaryHeap.percolateDown/3
@0x7fe2ce7a6b40 (asm:
_D10algorithm214__T9topNIndexZ9topNIndexFZv82__T10BinaryHeapS63_D10algorithm214__T9topNIndexZ9topNIndexFZv12indirectLessMFiiZbZ10BinaryHeap13percolateDownMFZv)
analyzed needed reachable body finalized called by: calls:
_d_assert_msg/5 (1.00 per call) References: Refering this function:
bug/std/algorithm2.d:1: internal compiler error: verify_cgraph_node
failed
----------------

Any clue what to do about this?
* Was it probably a gdc specific issue fixed in master but missed when
  backporting?
* Could be an issue fixed in the GCC backend, but I can't find a
  matching bug report?
* Any clue how to debug this further?
January 02, 2013
On 1 January 2013 19:22, Johannes Pfau <nospam@example.com> wrote:

> This is a bug which only seems to happen with gdc-4.7. At least I can't reproduce it with gcc 4.8 but I'd like to fix it in 4.7 as well if possible. (It prevents some phobos modules from being built in unittest mode).
>
> First, here's the reduced test case:
> ----------------
> void main()
> {
>     topNIndex();
> }
>
> void topNIndex()() //<= only fails if this is a template
> {
>     bool indirectLess(int a, int b) //<= adding 'static' makes it work
>     {
>         return a > b;
>     }
>     auto a = BinaryHeap!(indirectLess)();
> }
>
> struct BinaryHeap(alias less )
> {
>     void percolateDown()
>     {
>         less(0, 1);
>     }
> }
> ----------------
>
> It's somehow related to closures in templated functions. Making topNIndex a function or making indirectLess static fixes the issue.
>
> This is the compiler error:
> ----------------
> bug/std/algorithm2.d:1: error: missing callgraph edge for call stmt:
> algorithm2.topNIndex!().topNIndex.indirectLess (0B, 0, 1);
>
>
> algorithm2.topNIndex!().topNIndex.BinaryHeap!(indirectLess).BinaryHeap.percolateDown/3
> @0x7fe2ce7a6b40 (asm:
>
> _D10algorithm214__T9topNIndexZ9topNIndexFZv82__T10BinaryHeapS63_D10algorithm214__T9topNIndexZ9topNIndexFZv12indirectLessMFiiZbZ10BinaryHeap13percolateDownMFZv)
> analyzed needed reachable body finalized called by: calls:
> _d_assert_msg/5 (1.00 per call) References: Refering this function:
> bug/std/algorithm2.d:1: internal compiler error: verify_cgraph_node
> failed
> ----------------
>
> Any clue what to do about this?
> * Was it probably a gdc specific issue fixed in master but missed when
>   backporting?
> * Could be an issue fixed in the GCC backend, but I can't find a
>   matching bug report?
> * Any clue how to debug this further?
>


Only two suggestions I can give are:

a) Make sure that indirectLess DECL_CONTEXT is topNIndex, and not main in
code generation (eg: you don't want it to look logically like this)

void main()
{
    bool indirectLess(int a, int b)
    {
        return a > b;
    }
    topNIndex();
}

void topNIndex()
{
    auto a = BinaryHeap!(indirectLess);
}


b) Make sure that indirectLess is codegen'd and pushed before topNIndex.


Regards,
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


January 03, 2013
Am Wed, 2 Jan 2013 00:12:03 +0000
schrieb Iain Buclaw <ibuclaw@ubuntu.com>:

> Only two suggestions I can give are:
> 
> a) Make sure that indirectLess DECL_CONTEXT is topNIndex, and not main in code generation ...
OK, DECL_CONTEXT is correct.

> b) Make sure that indirectLess is codegen'd and pushed before topNIndex.

What's the best way to verify this? I had a look at ObjectFile::outputFunction and the order is:

algorithm2.topNIndex!().topNIndex.indirectLess
algorithm2.topNIndex!().topNIndex
algorithm2.topNIndex!().topNIndex.BinaryHeap!(indirectLess).BinaryHeap.percolateDown



OK, I've stepped through way to much of gccs internals now trying to
find out why gcc thinks indirectLess is unreachable, could be a backend
bug but then I looked at the test case again:
Isn't the compiler actually right in removing indirectLess?

indirectLess as a nested function can't be accessed outside of the
topNIndex function, correct? It's then passed to BinaryHeap, OK.
BinaryHeaps percolateDown could access indirectLess, but:
topNIndex never calls percolateDown. But outside of topNIndex
BinaryHeap!(indirectLess).percolateDown can't be called, is that
correct? So in the end it's correct that indirectLess isn't reachable?

Then the problem is that gcc thinks BinaryHeap!(indirectLess).percolateDown is a unit entry point - afaics it's not as it can't be called outside of topNIndex.

GCC assumes all functions marked with TREE_PUBLIC are valid entry points...

So do we have to set TREE_PUBLIC=false for functions which are part of a template instance (or nested struct/class) which is only valid in a certain scope? (experimenting shows we then also have to set DECL_WEAK and DECL_COMDAT to false)

Or we have to set TREE_PUBLIC(indirectLess) = true. Currently we have indirectLess=false but percolateDown=true

Then there's the question why the outer function has to be a template as well for this to happen: It's because if it's not a template, gdc uses a completely different code path: There's a "!gen.functionNeedsChain (f)" check in outputFunction which is false for the failing case. (cgraph_finalize_function marks the function as reachable)

cgraphunit.c:cgraph_decide_is_function_needed
/* Determine if function DECL is needed.  That is, visible to something
   either outside this translation unit, something magic in the system
   configury.  */

  /* Externally visible functions must be output.  The exception is
     COMDAT functions that must be output only when they are needed.
     [...] */
  if (((TREE_PUBLIC (decl)
	|| (!optimize
	    && !node->same_body_alias
	    && !DECL_DISREGARD_INLINE_LIMITS (decl)
	    && !DECL_DECLARED_INLINE_P (decl)
	    && !(DECL_CONTEXT (decl)
		 && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)))
       && !flag_whole_program
       && !flag_lto)
      && !DECL_COMDAT (decl) && !DECL_EXTERNAL (decl))


BTW: I think that this isn't happening in gcc-4.8 is pure luck. Most of
the cgraph_analyze* functions have been rewritten in gcc-4.8. But I
think this specific case is still bogus (we tell the backend that
we can access percolateDown from outside the unit even though it
requires access to a function which is only valid in a limited scope)
January 03, 2013
On 3 January 2013 16:12, Johannes Pfau <nospam@example.com> wrote:

> Then there's the question why the outer function has to be a template as well for this to happen: It's because if it's not a template, gdc uses a completely different code path: There's a "!gen.functionNeedsChain (f)" check in outputFunction which is false for the failing case. (cgraph_finalize_function marks the function as reachable)
>
>
That might be just it then...


BTW: I think that this isn't happening in gcc-4.8 is pure luck. Most of
> the cgraph_analyze* functions have been rewritten in gcc-4.8. But I
> think this specific case is still bogus (we tell the backend that
> we can access percolateDown from outside the unit even though it
> requires access to a function which is only valid in a limited scope)
>

I have been rolling round my head to remove the notion of functions nested in functions within the D codegen.  Having only RECORD and UNION types set in DECL_CONTEXT.  So all public functions are TREE_PUBLIC=1 (unless there's some overriding attribute), and all nested functions are TREE_PUBLIC=0.


-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


January 03, 2013
Am Thu, 3 Jan 2013 16:47:00 +0000
schrieb Iain Buclaw <ibuclaw@ubuntu.com>:

> On 3 January 2013 16:12, Johannes Pfau <nospam@example.com> wrote:
> 
> > Then there's the question why the outer function has to be a template as well for this to happen: It's because if it's not a template, gdc uses a completely different code path: There's a "!gen.functionNeedsChain (f)" check in outputFunction which is false for the failing case. (cgraph_finalize_function marks the function as reachable)
> >
> >
> That might be just it then...

I admit I don't really know why this check is needed but even if I try to call cgraph_finalize_function for indirectLess the backend dies with the same error. Actually my statement above was wrong, cgraph_finalize_function only marks the function as reachable if it's TREE_PUBLIC/. The not-templated case works even without cgraph_finalize_function so that's not the issue.

The only other thing which worked was setting the cgraph
finalized flag manually and then explicitly calling
cgraph_mark_needed_node for indirectLess, but I don't think that's a
real solution.


> 
> I have been rolling round my head to remove the notion of functions nested in functions within the D codegen.  Having only RECORD and UNION types set in DECL_CONTEXT.  So all public functions are TREE_PUBLIC=1 (unless there's some overriding attribute), and all nested functions are TREE_PUBLIC=0.

I see! That's why percolateDown is public, but isn't it essentially a nested function in the test case?

It's in a template which is nested in a second function (topNIndex) and
it accesses a function(indirectLess) nested in that function by alias.
So isn't that the same as:
-topNIndex
----indirectLess
----percolateDown{indirectLess();} <--- nested function

(we have)
-topNIndex
----indirectLess
----BinaryHeap(alias indirectLess)
--------percolateDown{indirectLess()}; <--- is TREE_PUBLIC

January 03, 2013
> 
> It's in a template which is nested in a second function (topNIndex)
> and it accesses a function(indirectLess) nested in that function by
> alias. So isn't that the same as:
> -topNIndex
> ----indirectLess
> ----percolateDown{indirectLess();} <--- nested function
> 
> (we have)
> -topNIndex
> ----indirectLess
> ----BinaryHeap(alias indirectLess)
> --------percolateDown{indirectLess()}; <--- is TREE_PUBLIC
> 

I should add that indirectLess is a real nested function - not static - so it could access variables defined in topNIndex. This is why I think percolateDown can't be TREE_PUBLIC: It calls a function which is nested in another function and can access variables in that function. So percolateDown can indirectly access topNIndex's variables and is therefore, as far as I understand, nested.
January 03, 2013
On 3 January 2013 17:38, Johannes Pfau <nospam@example.com> wrote:

> Am Thu, 3 Jan 2013 16:47:00 +0000
> schrieb Iain Buclaw <ibuclaw@ubuntu.com>:
>
> > On 3 January 2013 16:12, Johannes Pfau <nospam@example.com> wrote:
> >
> > > Then there's the question why the outer function has to be a template as well for this to happen: It's because if it's not a template, gdc uses a completely different code path: There's a "!gen.functionNeedsChain (f)" check in outputFunction which is false for the failing case. (cgraph_finalize_function marks the function as reachable)
> > >
> > >
> > That might be just it then...
>
> I admit I don't really know why this check is needed but even if I try to call cgraph_finalize_function for indirectLess the backend dies with the same error. Actually my statement above was wrong, cgraph_finalize_function only marks the function as reachable if it's TREE_PUBLIC/. The not-templated case works even without cgraph_finalize_function so that's not the issue.
>
>
I would have thought that it be a little more factors than that.  Remember, TREE_PUBLIC only means whether or not the function is callable outside of the module we are compiling (TREE_PUBLIC=0 means that the function is not marked as global in the assembly output).

I think it would be a mixture between what DECL_CONTEXT is set for the affected functions, and the order that the functions generated are passed to cgraph_finalize_function.




> >
> > I have been rolling round my head to remove the notion of functions nested in functions within the D codegen.  Having only RECORD and UNION types set in DECL_CONTEXT.  So all public functions are TREE_PUBLIC=1 (unless there's some overriding attribute), and all nested functions are TREE_PUBLIC=0.
>
> I see! That's why percolateDown is public, but isn't it essentially a nested function in the test case?
>
>
What I said was an idea that *could* be implemented, but is not yet.

See ObjectFile::setupSymbolStorage for why templates are public by default...

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';


January 03, 2013
Am Thu, 3 Jan 2013 18:10:09 +0000
schrieb Iain Buclaw <ibuclaw@ubuntu.com>:

> On 3 January 2013 17:38, Johannes Pfau <nospam@example.com> wrote:
> 
> > Am Thu, 3 Jan 2013 16:47:00 +0000
> > schrieb Iain Buclaw <ibuclaw@ubuntu.com>:
> >
> > > On 3 January 2013 16:12, Johannes Pfau <nospam@example.com> wrote:
> > >
> > > > Then there's the question why the outer function has to be a template as well for this to happen: It's because if it's not a template, gdc uses a completely different code path: There's a "!gen.functionNeedsChain (f)" check in outputFunction which is false for the failing case. (cgraph_finalize_function marks the function as reachable)
> > > >
> > > >
> > > That might be just it then...
> >
> > I admit I don't really know why this check is needed but even if I try to call cgraph_finalize_function for indirectLess the backend dies with the same error. Actually my statement above was wrong, cgraph_finalize_function only marks the function as reachable if it's TREE_PUBLIC/. The not-templated case works even without cgraph_finalize_function so that's not the issue.
> >
> >
> I would have thought that it be a little more factors than that. Remember, TREE_PUBLIC only means whether or not the function is callable outside of the module we are compiling (TREE_PUBLIC=0 means that the function is not marked as global in the assembly output).

Sorry, I should try to reword that. There's indeed much more which determines if a function is ultimately reachable or not. I debugged much of that code (the cgraph part, I tried to stay away from gimple) in the last 24 hours. The backend does correctly detect, for example, that percolateDown calls indirectLess. But in the end it throws that information away and says indirectLess is not reachable, probably exactly because of DECL_CONTEXT. After trying to find out why the backend falsely marks indirectLess as unreachable, I revisited the test case and I think the backend is actually correct about indirectLess, it is unreachable and can be optimized away.

The backend is wrong about percolateDown. And this is where TREE_PUBLIC is the problem. TREE_PUBLIC short circuits any reachability checks. If a function is TREE_PUBLIC it's always potentially reachable. And we tell the backend that percolateDown is TREE_PUBLIC => reachable although it's not.

It is of course also a bug in the backend as the reachability check for indirectLess should probably detect that percolateDown is TREE_PUBLIC. But we still give the backend false information, percolateDown isn't TREE_PUBLIC and the wrong information causes the crash.

(Remember, the real issue is that when gcc verifies the percolateDown function, the indirectLess function is not available. But as percolateDown can never be called this is not a real runtime bug, it just confuses the verify verifier.)


> > >
> > > I have been rolling round my head to remove the notion of functions nested in functions within the D codegen.  Having only RECORD and UNION types set in DECL_CONTEXT.  So all public functions are TREE_PUBLIC=1 (unless there's some overriding attribute), and all nested functions are TREE_PUBLIC=0.
> >
> > I see! That's why percolateDown is public, but isn't it essentially a nested function in the test case?
> >
> >
> What I said was an idea that *could* be implemented, but is not yet.
> 
> See ObjectFile::setupSymbolStorage for why templates are public by default...
> 

OK I'll have a look at that later.