Aren't the template instances generated at the location of the template itself?
The external code would instantiate the template, adding the instance information to the structure.
Then sizeof would just read the set of instantiated templates.
This wouldn't require knowing anything about the code that instantiates the template.
The only requirement would be to compute the size of the structure AFTER all template instances are generated.
The only limitation would be that separate compilation passes could generate different structures, since each pass may see different template instances. But that's not a problem, because static templates already have the same limitation.
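
To make the idea concrete, here is a rough runtime analogy in C++. It is only a sketch, not how a compiler would actually do it: the names `Table`, `slot_of` and `use_from_elsewhere` are made up for illustration, and the slots are reserved on first use at run time rather than at instantiation time. The point it shows is the same, though: the table never needs to know who instantiates the template, it only needs its size to be read after all instances exist.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for "the structure": it only counts reserved slots.
struct Table {
    static std::size_t& count() {
        static std::size_t n = 0;
        return n;
    }
};

// Each distinct T reserves exactly one slot, the first time it is used.
// Later uses of the same T reuse the slot that was already reserved.
template <typename T>
std::size_t slot_of() {
    static const std::size_t index = Table::count()++;
    return index;
}

// "External" code: Table has no knowledge of these call sites.
inline std::size_t use_from_elsewhere() {
    slot_of<int>();
    slot_of<double>();
    slot_of<int>();         // same instance as before, no new slot
    return Table::count();  // size is read only after all uses
}
```

Reading `Table::count()` before the last `slot_of` call would give a smaller number, which is the same ordering requirement as computing the size only after all instances are generated.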