Thread overview | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
January 18, 2019 I can crash program (core.exception.InvalidMemoryOperationError) by using __FUNCTION__ in a class ctor | ||||
---|---|---|---|---|
| ||||
First, let me apologize for not being able to reduce this to a minimal test case. I have tried to do so but was unsuccessful. Offending code, and links to codebase at bottom. TL;DR: I believe this is a compiler error. Please help me prove it, or teach me where I've gone wrong. --- Briefly, I have a class, SAMRecord, that contains: a default constructor, a non-default ctor, and a destructor, each of which for debugging purposes emits a log-line that includes __FUNCTION__. When the default ctor passes __FUNCTION__ to the log function, the program will crash after about 15k or 30k objects are created when using dmd or ldc2, respectively. Further, if I change __FUNCTION__ to any other string (except __MODULE__, which induces the same behavior), the program operates flawlessly. The error is: core.exception.InvalidMemoryOperationError@src/core/exception.d(700): Invalid memory operation and is thrown at the the time that when under normal operation the GC would start clearing out the old objects. _Importantly, my code never calls the default constructor_ -- This part puzzles me the most. Additional data points: 1. I added a dummy log function to demonstrate it (likely?) not my library code offending 2. adding assert statements into the default constructor seems to elide the offending behaviour (i.e., I can leave the __FUNCTION__ in without memory error) 3. Commenting out the usage of __FUNCTION__ in the non-default ctor, which IS being used, does not prevent the error. Overall, the error seems to be related to passing __FUNCTION__ out of default class constructors, even when that constructor is not called explicitly by user code. Altering compiler behavior by e.g. adding assert(0) prevents the error. Thus, I conclude this is almost surely a compiler error, but I would be glad to be proven wrong. I would be happy to provide any more information, or to help develop an appropriate minimal reproducible bug, if given appropriate guidance. Otherwise, to replicate this you will need to install a C library. Details below. Thanks all in advance. James --- DMD: v2.081.2 LDC2: 1.11.0 repository: https://github.com/blachlylab/dhtslib/ commit: ee87434d0c48919aaa0ff6d04d6e0ca403564618 (current as of this post) requirement: htslib; https://github.com/samtools/htslib To replicate: 1. Download any BAM file. (random BAM file: https://www.encodeproject.org/files/ENCFF765NLQ/@@download/ENCFF765NLQ.bam ) 2. Update test/samreader.d to open your BAM file (sorry I have not parameterized this, have been tearing hair about re this crash for hours) 3. `dub build -c sam_test && ./samreader` Attempt at minimal reproducible test-case (that does NOT trigger the error): repo/test/bug_minimal.d Offending code section: (note that I wrote test_log as a substitute for hts_log_debug to prove that the error is still triggered even when not calling library code) ``` import core.stdc.stdlib: calloc, free; import std.format; import std.parallelism: totalCPUs; import std.stdio: writeln, writefln; import std.string: fromStringz, toStringz; import dhtslib.htslib.hts: htsFile, hts_open, hts_close; import dhtslib.htslib.hts: hts_itr_t; import dhtslib.htslib.hts: seq_nt16_str; import dhtslib.htslib.hts: hts_set_threads; import dhtslib.htslib.hts_log; import dhtslib.htslib.kstring; import dhtslib.htslib.sam; void test_log(string ctx, string msg) { import std.stdio; stderr.writeln(ctx, msg); } /** Encapsulates a SAM/BAM/CRAM record, using the bam1_t type for memory efficiency, and the htslib helper functions for speed. **/ class SAMRecord { /// bam1_t *b; /// this() { //debug(dhtslib_debug) hts_log_debug(__FUNCTION__, "ctor()"); /// This line triggers memory error when __FUNCTION__, but not when "Other string" test_log(__FUNCTION__, "ctor()"); /// This line will also trigger the memory error when __FUNCTION__, but not other strings //writeln(__FUNCTION__); // this will not trigger the memory error this.b = bam_init1(); //assert(0); // This will elide(?) the memory error //assert(1 == 2); // This will elide(?) the memory error } /// this(bam1_t *b) { debug(dhtslib_debug) hts_log_debug(__FUNCTION__, "ctor(bam1_t *b)"); this.b = b; } ~this() { writeln("Record dtor"); debug(dhtslib_debug) hts_log_debug(__FUNCTION__, "dtor"); //bam_destroy1(this.b); // we don't own it! } } ``` |
January 18, 2019 Re: I can crash program (core.exception.InvalidMemoryOperationError) by using __FUNCTION__ in a class ctor | ||||
---|---|---|---|---|
| ||||
Posted in reply to James Blachly | On 1/18/19 10:29 AM, James Blachly wrote: > First, let me apologize for not being able to reduce this to a minimal test case. I have tried to do so but was unsuccessful. Offending code, and links to codebase at bottom. > > TL;DR: I believe this is a compiler error. Please help me prove it, or teach me where I've gone wrong. > > --- > > Briefly, I have a class, SAMRecord, that contains: a default constructor, a non-default ctor, and a destructor, each of which for debugging purposes emits a log-line that includes __FUNCTION__. > > When the default ctor passes __FUNCTION__ to the log function, the program will crash after about 15k or 30k objects are created when using dmd or ldc2, respectively. Further, if I change __FUNCTION__ to any other string (except __MODULE__, which induces the same behavior), the program operates flawlessly. > > The error is: > > core.exception.InvalidMemoryOperationError@src/core/exception.d(700): Invalid memory operation > > and is thrown at the the time that when under normal operation the GC would start clearing out the old objects. > > _Importantly, my code never calls the default constructor_ -- This part puzzles me the most. > > Additional data points: > 1. I added a dummy log function to demonstrate it (likely?) not my library code offending > 2. adding assert statements into the default constructor seems to elide the offending behaviour (i.e., I can leave the __FUNCTION__ in without memory error) > 3. Commenting out the usage of __FUNCTION__ in the non-default ctor, which IS being used, does not prevent the error. > > Overall, the error seems to be related to passing __FUNCTION__ out of default class constructors, even when that constructor is not called explicitly by user code. Altering compiler behavior by e.g. adding assert(0) prevents the error. Thus, I conclude this is almost surely a compiler error, but I would be glad to be proven wrong. > > I would be happy to provide any more information, or to help develop an appropriate minimal reproducible bug, if given appropriate guidance. Otherwise, to replicate this you will need to install a C library. Details below. > > Thanks all in advance. > James > > --- > DMD: v2.081.2 > LDC2: 1.11.0 > repository: https://github.com/blachlylab/dhtslib/ > commit: ee87434d0c48919aaa0ff6d04d6e0ca403564618 (current as of this post) > requirement: htslib; https://github.com/samtools/htslib > > To replicate: > 1. Download any BAM file. (random BAM file: https://www.encodeproject.org/files/ENCFF765NLQ/@@download/ENCFF765NLQ.bam ) > 2. Update test/samreader.d to open your BAM file (sorry I have not parameterized this, have been tearing hair about re this crash for hours) > 3. `dub build -c sam_test && ./samreader` > > Attempt at minimal reproducible test-case (that does NOT trigger the error): repo/test/bug_minimal.d > > Offending code section: > (note that I wrote test_log as a substitute for hts_log_debug to prove that the error is still triggered even when not calling library code) > ``` > import core.stdc.stdlib: calloc, free; > import std.format; > import std.parallelism: totalCPUs; > import std.stdio: writeln, writefln; > import std.string: fromStringz, toStringz; > > import dhtslib.htslib.hts: htsFile, hts_open, hts_close; > import dhtslib.htslib.hts: hts_itr_t; > import dhtslib.htslib.hts: seq_nt16_str; > import dhtslib.htslib.hts: hts_set_threads; > > import dhtslib.htslib.hts_log; > import dhtslib.htslib.kstring; > import dhtslib.htslib.sam; > > void test_log(string ctx, string msg) > { > import std.stdio; > stderr.writeln(ctx, msg); > } > > /** > Encapsulates a SAM/BAM/CRAM record, > using the bam1_t type for memory efficiency, > and the htslib helper functions for speed. > **/ > class SAMRecord { > /// > bam1_t *b; > > /// > this() > { > //debug(dhtslib_debug) hts_log_debug(__FUNCTION__, "ctor()"); /// This line triggers memory error when __FUNCTION__, but not when "Other string" > test_log(__FUNCTION__, "ctor()"); /// This line will also trigger the memory error when __FUNCTION__, but not other strings > //writeln(__FUNCTION__); // this will not trigger the memory error > this.b = bam_init1(); > //assert(0); // This will elide(?) the memory error > //assert(1 == 2); // This will elide(?) the memory error > } > /// > this(bam1_t *b) > { > debug(dhtslib_debug) hts_log_debug(__FUNCTION__, "ctor(bam1_t *b)"); > this.b = b; > } > ~this() > { > writeln("Record dtor"); > debug(dhtslib_debug) hts_log_debug(__FUNCTION__, "dtor"); > //bam_destroy1(this.b); // we don't own it! > } > } > ``` https://github.com/blachlylab/dhtslib/blob/master/source/dhtslib/htslib/hts_log.d#L94 That is your problem. toStringz is going to attempt to allocate using the GC. Allocating during a GC collection causes an InvalidMemoryOperationError. Take that out of your destructor, and you should not cause that error. Alternatively, fix the logger so it doesn't use the GC. -Steve |
January 18, 2019 Re: I can crash program (core.exception.InvalidMemoryOperationError) by using __FUNCTION__ in a class ctor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | On 1/18/19 10:35 AM, Steven Schveighoffer wrote: > > https://github.com/blachlylab/dhtslib/blob/master/source/dhtslib/htslib/hts_log.d#L94 > > > That is your problem. toStringz is going to attempt to allocate using the GC. Allocating during a GC collection causes an InvalidMemoryOperationError. > > Take that out of your destructor, and you should not cause that error. > > Alternatively, fix the logger so it doesn't use the GC. If I had to guess why __FUNCTION__ causes it and others do not, then I would say it's because toStringz has particular cases where it DOESN'T allocate, and that depends completely on the length of the string. https://github.com/dlang/phobos/blob/master/std/string.d#L364-L365 -Steve |
January 18, 2019 Re: I can crash program (core.exception.InvalidMemoryOperationError) by using __FUNCTION__ in a class ctor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | On Friday, 18 January 2019 at 15:42:55 UTC, Steven Schveighoffer wrote:
> On 1/18/19 10:35 AM, Steven Schveighoffer wrote:
>
>>
>> https://github.com/blachlylab/dhtslib/blob/master/source/dhtslib/htslib/hts_log.d#L94
>>
>>
>> That is your problem. toStringz is going to attempt to allocate using the GC. Allocating during a GC collection causes an InvalidMemoryOperationError.
>>
>> Take that out of your destructor, and you should not cause that error.
>>
>> Alternatively, fix the logger so it doesn't use the GC.
>
> If I had to guess why __FUNCTION__ causes it and others do not, then I would say it's because toStringz has particular cases where it DOESN'T allocate, and that depends completely on the length of the string.
>
> https://github.com/dlang/phobos/blob/master/std/string.d#L364-L365
>
> -Steve
Steve: Thanks for your quick reply. Why does manipulation of the default constructor, which is never called, affect whether or not the destructor throws the error?
James
|
January 18, 2019 Re: I can crash program (core.exception.InvalidMemoryOperationError) by using __FUNCTION__ in a class ctor | ||||
---|---|---|---|---|
| ||||
Posted in reply to James Blachly | On 1/18/19 11:37 AM, James Blachly wrote: > On Friday, 18 January 2019 at 15:42:55 UTC, Steven Schveighoffer wrote: >> On 1/18/19 10:35 AM, Steven Schveighoffer wrote: >> >>> >>> https://github.com/blachlylab/dhtslib/blob/master/source/dhtslib/htslib/hts_log.d#L94 >>> >>> >>> >>> That is your problem. toStringz is going to attempt to allocate using the GC. Allocating during a GC collection causes an InvalidMemoryOperationError. >>> >>> Take that out of your destructor, and you should not cause that error. >>> >>> Alternatively, fix the logger so it doesn't use the GC. >> >> If I had to guess why __FUNCTION__ causes it and others do not, then I would say it's because toStringz has particular cases where it DOESN'T allocate, and that depends completely on the length of the string. >> >> https://github.com/dlang/phobos/blob/master/std/string.d#L364-L365 >> > > Steve: Thanks for your quick reply. Why does manipulation of the default constructor, which is never called, affect whether or not the destructor throws the error? It's a good question. My explanation above is slightly wrong, though, it's not just the length, but the address of the end byte. If the end byte is on a 4-byte boundary, then it allocates, otherwise it checks for a zero (which is very likely true, especially with string literals), and then just returns the pointer. When you have behavior changes based on where something is allocated, many bets are off. It's almost like the randomness of memory corruption. I would say, if you fix the destructor to not allocate, and your problems all go away, you should assume that's the fix and move on. Something to consider using instead of toStringz: https://github.com/dlang/phobos/blob/5e6fe2f9c8f72f4c3b0497dc363fc61d823ff489/std/internal/cstring.d#L77 This only allocates on the C heap (if necessary) and cleans itself up upon going out of scope. Use like this: import std.internal.cstring; auto cmsg = tempCString(msg); auto cctx = tempCString(ctx); hts_log(htsLogLevel.HTS_LOG_DEBUG, cctx, cmsg); -Steve |
January 18, 2019 Re: I can crash program (core.exception.InvalidMemoryOperationError) by using __FUNCTION__ in a class ctor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | On Friday, 18 January 2019 at 17:03:59 UTC, Steven Schveighoffer wrote:
> import std.internal.cstring;
The only downside here is the internal namespace has no public guarantees; the phobos devs can remove or modify it as they want without notice.
|
January 18, 2019 Re: I can crash program (core.exception.InvalidMemoryOperationError) by using __FUNCTION__ in a class ctor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Adam D. Ruppe | On 1/18/19 12:08 PM, Adam D. Ruppe wrote:
> On Friday, 18 January 2019 at 17:03:59 UTC, Steven Schveighoffer wrote:
>> import std.internal.cstring;
>
> The only downside here is the internal namespace has no public guarantees; the phobos devs can remove or modify it as they want without notice.
I'm 99.9% sure the behavior of tempCString won't be changed or removed in a way that breaks code.
-Steve
|
January 18, 2019 Re: I can crash program (core.exception.InvalidMemoryOperationError) by using __FUNCTION__ in a class ctor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | On Friday, January 18, 2019 10:17:02 AM MST Steven Schveighoffer via Digitalmars-d wrote:
> On 1/18/19 12:08 PM, Adam D. Ruppe wrote:
> > On Friday, 18 January 2019 at 17:03:59 UTC, Steven Schveighoffer wrote:
> >> import std.internal.cstring;
> >
> > The only downside here is the internal namespace has no public guarantees; the phobos devs can remove or modify it as they want without notice.
>
> I'm 99.9% sure the behavior of tempCString won't be changed or removed in a way that breaks code.
The safest thing for anyone wanting to use std.internal.cstring is to just copy the module into their own project, though arguably, it's one of those things that should be cleaned up and made public instead of being in internal (which would actually break any code currently using std.internal.cstring). Regardless, I would definitely _not_ advise anyone to just import something from internal, and I definitely don't want to see anyone trying to argue later that we can't change something from internal just because a bunch of people decided that something from there was useful and started importing it. While the risk of your code breaking later on because it's using something from internal _might_ be low, it is basically begging for it to be broken at some point.
- Jonathan M Davis
|
January 18, 2019 Re: I can crash program (core.exception.InvalidMemoryOperationError) by using __FUNCTION__ in a class ctor | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On 1/18/19 1:37 PM, Jonathan M Davis wrote:
> On Friday, January 18, 2019 10:17:02 AM MST Steven Schveighoffer via
> Digitalmars-d wrote:
>> On 1/18/19 12:08 PM, Adam D. Ruppe wrote:
>>> On Friday, 18 January 2019 at 17:03:59 UTC, Steven Schveighoffer wrote:
>>>> import std.internal.cstring;
>>>
>>> The only downside here is the internal namespace has no public
>>> guarantees; the phobos devs can remove or modify it as they want without
>>> notice.
>>
>> I'm 99.9% sure the behavior of tempCString won't be changed or removed
>> in a way that breaks code.
>
> The safest thing for anyone wanting to use std.internal.cstring is to just
> copy the module into their own project, though arguably, it's one of those
> things that should be cleaned up and made public instead of being in
> internal (which would actually break any code currently using
> std.internal.cstring). Regardless, I would definitely _not_ advise anyone to
> just import something from internal, and I definitely don't want to see
> anyone trying to argue later that we can't change something from internal
> just because a bunch of people decided that something from there was useful
> and started importing it. While the risk of your code breaking later on
> because it's using something from internal _might_ be low, it is basically
> begging for it to be broken at some point.
I'm not speaking about rules or restrictions, I'm talking about stability. tempCString is pretty stable and likely won't change in any way except implementation details.
-Steve
|
January 18, 2019 Re: I can crash program (core.exception.InvalidMemoryOperationError) by using __FUNCTION__ in a class ctor | ||||
---|---|---|---|---|
| ||||
On Fri, Jan 18, 2019 at 11:37:02AM -0700, Jonathan M Davis via Digitalmars-d wrote: [...] > The safest thing for anyone wanting to use std.internal.cstring is to just copy the module into their own project, though arguably, it's one of those things that should be cleaned up and made public instead of being in internal (which would actually break any code currently using std.internal.cstring). Regardless, I would definitely _not_ advise anyone to just import something from internal, and I definitely don't want to see anyone trying to argue later that we can't change something from internal just because a bunch of people decided that something from there was useful and started importing it. While the risk of your code breaking later on because it's using something from internal _might_ be low, it is basically begging for it to be broken at some point. [...] Shouldn't std.internal.* declare all symbols as private to std.*, like with `package` or something? Or is that not expressible currently? It's a pretty bad idea for user code to depend on std.internal.*... T -- Philosophy: how to make a career out of daydreaming. |
Copyright © 1999-2021 by the D Language Foundation