May 09, 2017
On Tue, May 09, 2017 at 08:18:09AM +0000, Patrick Schluter via Digitalmars-d wrote:
> On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
> > 
> > 	int my_func(mytype_t *input, outbuf_t *output_buf,
> > 	            char *buffer, int size)
> > 	{
> > 		/* Typical lazy way of null-checking (that will blow up
> > 		 * later) */
> > 		myhandle_t *h = input ? input->handle : 0;
> > 		writer_t *w = output_buf ? output_buf->writer : 0;
> > 		char *block = (char *)malloc(size);
> 
> Hey, you've been outed as a C++ programmer. A real C programmer never casts a void *. In that specific case, casting the malloc() return value can mask a nasty bug: if you have forgotten to include the header declaring the function, the compiler assumes an int-returning function, and the cast suppresses the compiler's righteous warning message. On 64-bit machines the returned pointer is then truncated to its lower 32 bits.  Unfortunately on Linux, as the heap starts in the lower 4 GiB of the address space, the code would run for a long time before it crashed. On Solaris/SPARC it would crash directly, as binaries are loaded at address 0x1_0000_0000 of the address space.

Ouch.  Haha, even I forgot about this particularly lovely aspect of C. Hooray, freely call functions without declaring them, and "obviously" they return int! Why not?
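
To illustrate (a minimal sketch, under C89 rules where implicit declarations are still legal):

	/* buggy.c -- note the missing #include <stdlib.h> */
	int main(void)
	{
		/* malloc is implicitly declared as "int malloc()" here */
		char *p = (char *)malloc(1000);	/* the cast silences the
						 * int-to-pointer warning */
		p[0] = 'x';	/* pointer was truncated to 32 bits on LP64;
				 * may "work" for years before it crashes */
		return 0;
	}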

There's an even more pernicious version of this, in that the compiler blindly believes whatever you declare a symbol to be, and the declaration doesn't even have to be in a .h file or anything even remotely related to the real definition. Here's a (greatly) reduced example (paraphrased from an actual bug I discovered):

	module.c:
	-------
	int get_passwd(char *buf, int size);
	int func() {
		char passwd[100];
		if (!get_passwd(passwd, 100)) return -1;
		do_something(passwd);
		return 0;
	}

	passwd.c:
	---------
	void get_passwd(struct user_db *db, struct login_record *rec) {
		... // stuff
	}

	old_passwd.c:
	-------------
	/* Please don't use this code anymore, it's deprecated. */
	/* ^^^^ gratuitous useless comment */
	int get_passwd(char *buf, int size) { ... /* old code */ }

Originally, in the makefile, module.o is linked with libutil.so, which in turn is built from old_passwd.o and a bunch of other stuff. Later on, passwd.o was added to libotherutil.so, which was listed after libutil.so in the linker command, so the symbol conflict was masked because the linker found the libutil.so version of get_passwd first.

Then one day, somebody changed the order of libraries in the makefile, and suddenly func() mysteriously starts malfunctioning because get_passwd now links to the wrong version of the function!

Worse yet, the makefile was written to be "smart", as in, it uses wildcards to pick up .so files (y'know, us lazy programmers don't wanna have to manually type out the name of every library). So when somebody tried to fix this bug by removing old_passwd.o from libotherutil.so altogether, the bug was still happening on other developers' machines, because a stale copy of the old version of libotherutil.so was still left in their source tree; so when *they* built the executable, it contained the bug, but the bug vanished when built from a fresh checkout. Who knows how many hours were wasted chasing after this heisenbug.


[...]
> > 		if (w->buffered) { /* oops, possible null deref */
> > 			strncpy(buffer, input->data, size); /* oops, unterminated string */
> > 			if (w->write(buffer, size) != 0)
> > 				/* hmm, is 0 the error status, or is it -1? */
> > 				/* also, what if w->write == null? */
> 
> Or is it inspired by fwrite, which returns the number of written records? In that case 0 return might be an error or not, depends on size.

Yep, fwrite has an utterly lovely interface. The epitome of API design. :-D
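
For the record, a minimal sketch of the unambiguous way to use it (buf, len, fp assumed already set up): pass size as 1 so the return value is a byte count, and a short count is an error regardless of len.

	size_t nwritten = fwrite(buf, 1, len, fp);
	if (nwritten < len) {
		/* short write: check ferror(fp) / errno for the cause */
	}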


[...]
> > 		if (fp) fclose(fp);	/* oops, uninitialized ptr deref */
> 
> Worse, you didn't check the return of fclose() on a FILE opened for writing.
> fclose() can fail if the disk was full. As the FILE is buffered, the
> last fwrite might not have flushed it yet. So it is the fclose() that
> will try to write the last block and that can fail, but the app
> wouldn't be able to even report it.
[...]

Haha, you're right.  NONE of the code I've ever dealt with even considers this case. None at all.  In fact, I don't even remember the last time I've seen C code that bothers checking the return value of fclose().  Maybe I've written it *once* in my lifetime when I was young and naïve, and actually bothered to notice the documentation that fclose() may sometimes fail. Even the static analysis tool we're using doesn't report it!!
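
For reference, the rarely-written check is just this (fp being a FILE * opened for writing):

	if (fclose(fp) != 0) {
		/* the final flush failed, e.g. on a full disk (ENOSPC);
		 * the file on disk is incomplete */
		perror("fclose");
		/* report / propagate the error here */
	}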

So again Walter was spot on: fill up the disk to 99% full, and 99% of C programs would start malfunctioning and showing all kinds of odd behaviours, because they never check the return code of printf, fprintf, or fclose, or any of a whole bunch of other library calls and syscalls that are regularly *assumed* to just work, when in reality they *can* fail.

The worst part of all this is, this kind of C code is prevalent everywhere in C projects, including those intended for supposedly security-aware software.  Basically, the language itself is just so unfriendly to safe coding practices that it's nigh impossible to write safe code in it.  It's *theoretically* possible, certainly, but in practice nobody writes C code that way.  It is a scary thought indeed, how much of our current infrastructure relies on software running this kind of code.  Something's gotta give, eventually. And it ain't gonna be pretty when it all starts crumbling down.


T

-- 
Caffeine underflow. Brain dumped.
May 09, 2017
On 05/09/2017 06:29 AM, Adrian Matoga wrote:
> On Tuesday, 9 May 2017 at 09:22:13 UTC, Joakim wrote:
>> On Tuesday, 9 May 2017 at 06:15:12 UTC, H. S. Teoh wrote:
>>> On Mon, May 08, 2017 at 06:33:08PM +0000, Jerry via Digitalmars-d wrote:
>>>> [...]
>>>
>>> Is that a subtle joke, or are you being serious?
>>>
>>> [...]
>>
>> Heh, I saw you wrote the post and knew it would be long, then I kept
>> scrolling and scrolling... :)
>>
>> Please, please, please submit this as a post on the D blog, perhaps
>> prefaced by the Walter/Scott exchange and with some links to the
>> issues you mention and the relevant portions of the D reference.  I
>> think it would do well.
>
> +1

+ a kajillion (give or take a few hundred)
May 09, 2017
On Tue, May 09, 2017 at 07:13:31AM -0700, Walter Bright via Digitalmars-d wrote:
> On 5/8/2017 1:55 PM, John Carter wrote:
> > On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
> > 
> > > Walter: I believe memory safety will kill C.
> > 
> > C/C++ has been granted an extension of life by the likes of valgrind and purify and *-sanitizer.
> 
> I agree. But one inevitably runs into problems relying on valgrind and other third party tools:
> 
> 1. it isn't part of the language

And it doesn't catch everything.


> 2. it may not be available on your platform

And it may not be compatible with your target architecture -- a lot of C code, especially in the embedded realm, targets curious archs that can be problematic for 3rd party tools that need to inject runtime instrumentation.


> 3. somebody has to find it, install it, and integrate it into the dev/test process

This is a big one. Many large C projects have their own idiomatic way of building, which is often incompatible with 3rd party tools.  This is a major demotivator for people to want to use those tools, because it's a big time investment to configure the tool to work with the build scripts, and an error-prone and painful process to rework the build scripts to work with the tool. "Why break our delicate tower-of-cards build system that's worked just fine for 20 years, just to run this new-fangled whatever 3rd party thingy promulgated by these young upstarts these days?"


> 4. it's incredibly slow to run valgrind, so there are powerful tendencies to skip it

Yes, it's an extra step that the developer has to run manually, when he's already under an unreasonable deadline to meet an unreasonable customer request upon which hinges a $1M deal that can't be turned down no matter how unreasonable it is.  He barely has enough time to write code that won't crash outright, never mind writing *proper* code. Yet another extra step to run manually? Forget it, not gonna happen.  Not until a major crash on the customer's site finally convinces the PTB to dictate the use of valgrind as part of the regular work routine.  Other than that, the chances of someone actually bothering to do it are slim indeed.


> valgrind is a stopgap measure, and has saved me much grief over the years, but it is NOT the solution.

Yes, it's a patch over the current festering wound so that, at least for the time being, the blood is out of sight.  But you can't wear that patch forever. Sooner or later the gangrene will be visible on the surface. :-D


T

-- 
Change is inevitable, except from a vending machine.
May 09, 2017
On Tuesday, 9 May 2017 at 16:55:54 UTC, H. S. Teoh wrote:
> On Tue, May 09, 2017 at 08:18:09AM +0000, Patrick Schluter via
[...]
> Ouch.  Haha, even I forgot about this particularly lovely aspect of C. Hooray, freely call functions without declaring them, and "obviously" they return int! Why not?
>
> There's an even more pernicious version of this, in that the compiler blindly believes whatever you declare a symbol to be, and the declaration doesn't even have to be in a .h file or anything even remotely related to the real definition. Here's a (greatly) reduced example (paraphrased from an actual bug I discovered):
>
> 	module.c:
> 	-------
> 	int get_passwd(char *buf, int size);

Yeah, this is a code smell: a non-static function prototype declared in a .c file. That automatically raises alarm bells for me now. The same issue, much more frequently observed, is extern variable declarations in .c files. That one is really widespread, and few see it as an anti-pattern. An extern global variable should always be declared in the header file, never in the .c file, for exactly the same reason as your example with the wrong prototype below: the linker will happily join non-matching types.
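
A minimal sketch of that data-symbol variant (hypothetical names):

	/* a.c */
	extern int timeout;	/* local declaration; no shared header keeps it honest */

	/* b.c */
	short timeout = 5;	/* the real definition has a different type */

The linker binds the two without a complaint, and every access to timeout in a.c now reads more bytes than were ever allocated for it.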

> 	int func() {
> 		char passwd[100];
> 		if (!get_passwd(passwd, 100)) return -1;
> 		do_something(passwd);
> 		return 0;
> 	}
>
> 	passwd.c:
> 	---------
> 	void get_passwd(struct user_db *db, struct login_record *rec) {
> 		... // stuff
> 	}
>
> 	old_passwd.c:
> 	-------------
> 	/* Please don't use this code anymore, it's deprecated. */
> 	/* ^^^^ gratuitous useless comment */
> 	int get_passwd(char *buf, int size) { ... /* old code */ }
>
> Originally, in the makefile, module.o is linked with libutil.so, which in turn is built from old_passwd.o and a bunch of other stuff. Later on, passwd.o was added to libotherutil.so, which was listed after libutil.so in the linker command, so the symbol conflict was masked because the linker found the libutil.so version of get_passwd first.
>
> Then one day, somebody changed the order of libraries in the makefile, and suddenly func() mysteriously starts malfunctioning because get_passwd now links to the wrong version of the function!
>
> Worse yet, the makefile was written to be "smart", as in, it uses wildcards to pick up .so files (y'know, us lazy programmers don't wanna have to manually type out the name of every library).

Yeah, we also had makefiles using wildcards. It took a long time but I managed to get rid of them.

> So when somebody tried to fix this bug by removing old_passwd.o from libotherutil.so altogether, the bug was still happening on other developers' machines, because a stale copy of the old version of libotherutil.so was still left in their source tree; so when *they* built the executable, it contained the bug, but the bug vanished when built from a fresh checkout. Who knows how many hours were wasted chasing after this heisenbug.
>
>
> [...]
>> > 		if (fp) fclose(fp);	/* oops, uninitialized ptr deref */
>> 
>> Worse, you didn't check the return of fclose() on a FILE opened for writing.
>> fclose() can fail if the disk was full. As the FILE is buffered, the
>> last fwrite might not have flushed it yet. So it is the fclose() that
>> will try to write the last block and that can fail, but the app
>> wouldn't be able to even report it.
> [...]
>
> Haha, you're right.  NONE of the code I've ever dealt with even considers this case. None at all.  In fact, I don't even remember the last time I've seen C code that bothers checking the return value of fclose().  Maybe I've written it *once* in my lifetime when I was young and naïve, and actually bothered to notice the documentation that fclose() may sometimes fail. Even the static analysis tool we're using doesn't report it!!

I discovered that one only a few months ago. I now have around 30 places in our code base to fix. It only matters for a FILE opened for writing; for a FILE opened for reading, the return value can be ignored.

>
> So again Walter was spot on: fill up the disk to 99% full, and 99% of C programs would start malfunctioning and showing all kinds of odd behaviours, because they never check the return code of printf, fprintf, or fclose, or any of a whole bunch of other library calls and syscalls that are regularly *assumed* to just work, when in reality they *can* fail.
>
> The worst part of all this is, this kind of C code is prevalent everywhere in C projects, including those intended for supposedly security-aware software.  Basically, the language itself is just so unfriendly to safe coding practices that it's nigh impossible to write safe code in it.  It's *theoretically* possible, certainly, but in practice nobody writes C code that way.  It is a scary thought indeed, how much of our current infrastructure relies on software running this kind of code.  Something's gotta give, eventually. And it ain't gonna be pretty when it all starts crumbling down.

Agreed. That's why I'm learning D now; it's probably the only language that could progressively replace our C code base in skunkworks fashion. I wanted to do it 5 or 6 years ago already, but couldn't, as we were on Solaris/SPARC back then. Now that we have migrated to Linux-AMD64 there's not much holding us back. The Oracle client is maybe still an issue, though.

May 09, 2017
On Tuesday, 9 May 2017 at 16:55:54 UTC, H. S. Teoh wrote:
> Ouch.  Haha, even I forgot about this particularly lovely aspect of C. Hooray, freely call functions without declaring them, and "obviously" they return int! Why not?

To be fair, most of your complaints can be fixed by enabling compiler warnings and by avoiding the use of de-facto-deprecated functions (strcpy, strcat, and friends).  The remaining problems theoretically shouldn't occur with disciplined use of commonly accepted C99 guidelines.  But I agree that even then, and with the use of sanitizers, writing correct C code remains very hard.
May 09, 2017
On Tue, May 09, 2017 at 11:09:27PM +0000, Guillaume Boucher via Digitalmars-d wrote:
> On Tuesday, 9 May 2017 at 16:55:54 UTC, H. S. Teoh wrote:
> > Ouch.  Haha, even I forgot about this particularly lovely aspect of C.  Hooray, freely call functions without declaring them, and "obviously" they return int! Why not?
> 
> To be fair, most of your complaints can be fixed by enabling compiler warnings and by avoiding the use of de-facto-deprecated functions (strcpy, strcat, and friends).

The problem is that warnings don't work, because people ignore them. Everybody knows warnings shouldn't be ignored, but let's face it, when you make a 1-line code change and run make, and the output is 250 pages long (large project, y'know), any warnings that are buried somewhere in there won't even be noticed, much less acted on.

In this sense I agree with Walter that warnings are basically useless, because they're not enforced. Either something is correct and compiles, or it should be an error that stops compilation. Anything else, and you start having people ignore warnings.

Yes I know, there's gcc -Werror and the analogous flags for the other compilers, but in a sufficiently large project, -Werror is basically impractical because too much of legacy code will just refuse to compile, and it's not feasible to rewrite / waste time fixing it.

As for avoiding de-facto-deprecated functions, I've already said it: *everybody* knows strcat is bad, and strcpy is bad, and so on and so forth.  So how come I still see new C code being written almost every day that continues to use these functions?  It's not that the coders refuse to cooperate... I've seen a lot of code in my project where people meticulously use strncpy instead of strcat / strcpy -- I presume out of the awareness that they are "bad".  But when push comes to shove and there's a looming deadline, all scruples are thrown to the winds and people just take the path of least resistance.  The mere fact that strcat and strcpy exist means that somebody, sometime, will use them, and usually to disastrous consequences.
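
Case in point, a minimal sketch of the trap and the C99 escape hatch (src being some arbitrary input string):

	char dst[8];
	strncpy(dst, src, sizeof dst);	/* NOT terminated if strlen(src) >= 8 */
	snprintf(dst, sizeof dst, "%s", src);	/* always terminated; truncates */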

And *that's* the fundamental problem with C (and in the same principle, C++): the correct way to write code is also a very onerous, fragile, error-prone, and verbose way of writing code. The "obvious" and "easy" way to write C code is almost always the wrong way.  The incentives are all wrong, and so there's a big temptation for people to cut corners and take the easy way out.

It's much easier to write this:

	int myfunc(context_t *ctx) {
		data_desc_t *desc = ctx->data;		/* oops, ctx never checked */
		FILE *fp = fopen(desc->filename, "w");	/* oops, fp never checked */
		char *tmp = malloc(1000);		/* oops, ditto; and is 1000 enough? */
		strcpy(tmp, desc->data1);		/* oops, possible overflow */
		fwrite(tmp, strlen(tmp), 1, fp);	/* oops, return value ignored */
		strcpy(tmp, desc->data2);		/* ditto */
		fwrite(tmp, strlen(tmp), 1, fp);	/* ditto */
		strcpy(desc->cache, tmp);		/* oops, is the cache big enough? */
		fclose(fp);				/* oops, return value ignored (see above!) */
		free(tmp);
		return 0;
	}

rather than this:

	int myfunc(context_t *ctx) {
		data_desc_t *desc;
		FILE *fp;
		char *tmp;
		size_t bufsz;

		if (!ctx) return INVALID_CONTEXT;
		desc = ctx->data;

		if (!desc->data1 || !desc->data2) return INVALID_ARGS;

		fp = fopen("blah", "w");
		if (!fp) return CANT_OPEN_FILE;

		bufsz = desc->data1_len + desc->data2_len + 1;
		tmp = malloc(bufsz);
		if (!tmp) return OUT_OF_MEMORY;

		strncpy(tmp, desc->data1, bufsz);
		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
		{
			fclose(fp);
			unlink("blah");
			return IO_ERROR;
		}

		strncpy(tmp, desc->data2, bufsz);
		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
		{
			fclose(fp);
			unlink("blah");
			return IO_ERROR;
		}

		if (desc->cache)
			strncpy(desc->cache, tmp, sizeof(desc->cache));

		if (fclose(fp) != 0)
		{
			WARN("I/O error");
			free(tmp);
			return IO_ERROR;
		}
		free(tmp);
		return OK;
	}

Most people would probably write something in between, which is neither completely wrong, nor completely right. But it works for 90% of the cases, and since it meets the deadline, it's "good enough".

Notice how much longer and more onerous it is to write the "correct" version of the code than the easy way. A properly-designed language ought to reverse the incentives: the default, "easy" way to write code should be the "correct", safe, non-leaking way.  Potentially unsafe, potentially resource-leaking behaviour should require work on the part of the coder, so that he'd only do it when there's a good reason for it (optimization, or writing @system code that needs to go outside the confines of the default safe environment, etc.).

In this respect, D scores much better than C/C++.  Very often, the "easy" way to write something in D is also the correct way. It may not be the fastest way for the performance-obsessed premature-optimizing C hacker crowd (and I include myself among them), but it won't leak memory, overrun buffers, act on random stack values from uninitialized local variables, etc.. Your program is correct to begin with, which then gives you a stable footing to start working on improving its performance.  In C/C++, your program is most likely wrong to begin with, so imagine what happens when you try to optimize that wrong code in typical C/C++ hacker premature optimization fashion.

(Nevermind the elephant in the room that 80-90% of the "optimizations" C/C++ coders -- including myself -- have programmed into their finger reflexes are actually irrelevant at best, because either compilers already do those optimizations for you, or the hot spot simply isn't where we'd like to believe it is; or outright de-optimizing at worst, because we've successfully defeated the compiler's optimizer by writing inscrutable code.)

Null dereference is one area where D does no better than C/C++, though even in that case, language features like closures help alleviate much of the kind of code that would otherwise need to deal with pointers directly. (Yes, I'm aware C++ now has closures... but most of the C++ code out in the industry -- and C++ coders themselves -- have a *long* ways to go before they can catch up with the latest C++ standards. Until then, it's lots of manual pointer manipulations that are ready to explode in your face anytime.)


> The remaining problems theoretically shouldn't occur with disciplined use of commonly accepted C99 guidelines.  But I agree that even then, and with the use of sanitizers, writing correct C code remains very hard.

That's another fundamental problem with the C/C++ world: coding by convention.  We all know all too well that *if* we'd only abide by such-and-such coding guidelines and recommendations, our code would actually stand a chance of being correct, safe, non-leaking, etc.. However, the problem with conventions is that they are just that: conventions. They get broken all the time, with disastrous consequences. I used to believe in convention -- after all, who wouldn't want to be a goodie-two-shoes coder who abides by all the rules and can take pride in his shiny, perfect code?  Unfortunately, after almost 20 years working in the industry and seeing "enterprise" code that makes my eyes bleed, I've lost all confidence that conventions are of any help. I've seen code written by supposedly "renowned" or "expert" C coders that represents some of the most repulsive, stomach-turning examples of antipatterns I've ever had the dubious pleasure of needing to debug.

D's stance of static verifiability and compile-time guarantees is an oft under-appreciated big step in the right direction.  In the long run, conventions will not solve anything; you need *enforcement*. The compiler has to be able to prove, at compile time, that function X is actually pure, or nothrow, or @safe, or whatever, for those things to have any value whatsoever. And for this to be possible, the language itself needs to have these notions built in, rather than have them tacked on by an external tool (one that people will be reluctant to use, or outright ignore, or that doesn't work with their strange build system, target arch, or whatever).

Sure, there are currently implementation bugs that make @safe not quite so safe in some cases, and too much of Phobos is still @safe-incompatible. But still, these are implementation quality issues. The concept itself is a sound and powerful one.  A compiler-verified attribute is far more effective than any blind-faith trust in convention ever will be: e.g., D's immutable vs. C++'s easy-to-cast-away const, which we merely *trust* people won't cast away. Yes, I'm aware of bugs in the current implementation that allow you to bypass immutable, but still, it's a QoI issue.  And yes, there are areas in the spec that have holes, etc.. But assuming these QoI issues and spec holes / inconsistencies are fixed, what we have is a powerful system that actually delivers compile-time guarantees about memory safety, rather than a system of conventions that you can never be sure somebody somewhere didn't break, and therefore can only *hope* is memory-safe.
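
To make the contrast concrete, a minimal sketch of the C/C++ side of that bargain:

	void sneaky(const char *s)
	{
		((char *)s)[0] = '!';	/* compiles cleanly; undefined behaviour
					 * if s points into read-only memory */
	}

The equivalent assault on a D immutable requires an explicit cast that is disallowed in @safe code, so the guarantee is enforced rather than merely hoped for.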


T

-- 
Life is too short to run proprietary software. -- Bdale Garbee
May 09, 2017
On 05/09/2017 08:30 PM, H. S. Teoh via Digitalmars-d wrote:
>
> In this sense I agree with Walter that warnings are basically useless,
> because they're not enforced. Either something is correct and compiles,
> or it should be an error that stops compilation. Anything else, and you
> start having people ignore warnings.
>

Not 100% useless. I'd much rather risk a warning getting ignored than NOT be informed of something the compiler noticed but decided "Nah, some people ignore warnings so I'll just look the other way and keep my mouth shut". (Hogan's Compiler Heroes: "I see NUH-TING!!")

And then the flip side is that some code smells are just too pedantic to justify breaking the build while the programmer is in the middle of some debugging or refactoring or some such.

That puts me strongly in the philosophy of "Code containing warnings: Allowed while compiling, disallowed when committing (with allowances for mitigating circumstances)."

C/C++ doesn't demonstrate that warnings are doomed to be useless and "always" ignored. What it demonstrates is that warnings are NOT an appropriate strategy for fixing language problems.


> As for avoiding de-facto-deprecated functions, I've already said it:
> *everybody* knows strcat is bad, and strcpy is bad, and so on and so
> forth.  So how come I still see new C code being written almost every
> day that continues to use these functions?  It's not that the coders
> refuse to cooperate... I've seen a lot of code in my project where
> people meticulously use strncpy instead of strcat / strcpy -- I presume
> out of the awareness that they are "bad".  But when push comes to shove
> and there's a looming deadline, all scruples are thrown to the winds and
> people just take the path of least resistance.  The mere fact that
> strcat and strcpy exist means that somebody, sometime, will use them,
> and usually to disastrous consequences.

The moral of this story: Sometimes, breaking people's code is GOOD! ;)


> And *that's* the fundamental problem with C (and in the same principle,
> C++): the correct way to write code is also a very onerous, fragile,
> error-prone, and verbose way of writing code. The "obvious" and "easy"
> way to write C code is almost always the wrong way.  The incentives are
> all wrong, and so there's a big temptation for people to cut corners and
> take the easy way out.

Damn straight :)


> (Nevermind the elephant in the room that 80-90% of the "optimizations"
> C/C++ coders -- including myself -- have programmed into their finger
> reflexes are actually irrelevant at best, because either compilers
> already do those optimizations for you, or the hot spot simply isn't
> where we'd like to believe it is; or outright de-optimizing at worst,
> because we've successfully defeated the compiler's optimizer by writing
> inscrutable code.)

C++'s fundamental paradigm has always been "Premature-optimization oriented programming". C++ promotes POOP.


> That's another fundamental problem with the C/C++ world: coding by
> convention.  We all know all too well that *if* we'd only abide by
> such-and-such coding guidelines and recommendations, our code would
> actually stand a chance of being correct, safe, non-leaking, etc..

Luckily, there IS a way to enforce that proper coding conventions are actually adhered to: It's called "compile-time error". :)


May 10, 2017
On Wednesday, 10 May 2017 at 00:30:42 UTC, H. S. Teoh wrote:
> 		strncpy(tmp, desc->data1, bufsz);
> 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
> 		{
> 			fclose(fp);
> 			unlink("blah");
> 			return IO_ERROR;
> 		}
>
> 		strncpy(tmp, desc->data2, bufsz);
> 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
> 		{
> 			fclose(fp);
> 			unlink("blah");
> 			return IO_ERROR;
> 		}

I think you cause a memory leak in these branches because you forget to free tmp before returning.

Side note: scope(exit) is one of the best inventions in PLs ever.
May 09, 2017
On Wed, May 10, 2017 at 01:32:33AM +0000, Jack Stouffer via Digitalmars-d wrote:
> On Wednesday, 10 May 2017 at 00:30:42 UTC, H. S. Teoh wrote:
> > 		strncpy(tmp, desc->data1, bufsz);
> > 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
> > 		{
> > 			fclose(fp);
> > 			unlink("blah");
> > 			return IO_ERROR;
> > 		}
> > 
> > 		strncpy(tmp, desc->data2, bufsz);
> > 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
> > 		{
> > 			fclose(fp);
> > 			unlink("blah");
> > 			return IO_ERROR;
> > 		}
> 
> I think you cause a memory leak in these branches because you forget to free tmp before returning.

Well, there ya go. Case in point. Even when you're consciously trying to write "proper" C code, there are just far too many ways things can go wrong that slip-ups are practically inevitable.

Eventually, the idiom that I (and others) converged on looks something like this:

	int myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
		void *resource1 = NULL, *resource2 = NULL, *resource3 = NULL;
		int ret = RET_ERROR;

		/* Vet arguments */
		if (!blah || !bleh || !bluh)
			return ret;

		/* Acquire resources */
		resource1 = acquire_resource(blah->blah);
		if (!resource1) goto EXIT;

		resource2 = acquire_resource(bleh->bleh);
		if (!resource2) goto EXIT;

		resource3 = acquire_resource(bluh->bluh);
		if (!resource3) goto EXIT;

		/* Do actual work */
		if (do_step1(blah, resource1) == RET_ERROR)
			goto EXIT;

		if (do_step2(blah, resource2) == RET_ERROR)
			goto EXIT;

		if (do_step3(blah, resource3) == RET_ERROR)
			goto EXIT;

		ret = RET_OK;
	EXIT:
		/* Cleanup everything */
		if (resource3) release(resource3);
		if (resource2) release(resource2);
		if (resource1) release(resource1);

		return ret;
	}

In other words, we just converged to what essentially amounts to a try-catch block with the manual equivalent of RAII. After a while, this is pretty much the only way to have any confidence at all that there isn't any hidden resource leaks or other such errors in the code.

Of course, this is only the first part of the equation. There's also managing buffers and arrays safely, which still needs to be addressed. We haven't quite gotten there yet, but at least some of the code now has started moving away from C standard library string functions completely, and towards a common string buffer type where you work with a struct wrapper with functions for appending data, extracting the result, etc.. IOW, we're slowly reinventing a properly-encapsulated string type that's missing from the language.
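
The shape of it is roughly this (a minimal sketch; the names are hypothetical, not our actual code):

	#include <stdlib.h>
	#include <string.h>

	typedef struct {
		char   *data;	/* always NUL-terminated */
		size_t  len;	/* length, excluding the NUL */
		size_t  cap;	/* bytes currently allocated */
	} strbuf_t;

	/* append n bytes of s; returns 0 on success, -1 on OOM */
	int strbuf_append(strbuf_t *sb, const char *s, size_t n)
	{
		if (sb->len + n + 1 > sb->cap) {
			size_t newcap = sb->cap ? sb->cap : 16;
			while (newcap < sb->len + n + 1)
				newcap *= 2;
			char *p = realloc(sb->data, newcap);
			if (!p)
				return -1;	/* sb unchanged; caller still owns it */
			sb->data = p;
			sb->cap = newcap;
		}
		memcpy(sb->data + sb->len, s, n);
		sb->len += n;
		sb->data[sb->len] = '\0';
		return 0;
	}

One type, one place to get the bounds checking right, instead of a thousand ad-hoc strncpy call sites.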

So eventually, after so much effort chasing down pointer bugs, buffer overflows, resource leaks, and the rest of C's endless slew of pitfalls, we're gradually reinventing RAII, try-catch blocks, and string types all over again. It's like historians are repeating each other^W^W^W^W^W I mean, history is repeating itself. :-D  It makes me wonder how much longer it will take for us to gradually converge onto the features that today we're enjoying in D. Will it take another decade of segfaults, untraceable pointer bugs, security holes, and memory leaks?  Who knows. I'm secretly hoping that between now and then, D finally takes off and we can finally shed this dinosaur-age language that should have died after the 70's, or the 80's at the latest, yet still persists to this day.


> Side note: scope(exit) is one of the best inventions in PLs ever.

Ironically, D has gone so far past the woes that still plague C coders every day that scope(exit) is hardly ever used in D anymore. :-P  It has its uses, certainly, but in my regular D code, I'm already benefitting so much from D's other features that I can hardly think of a use case for scope(exit) anymore, in the context of idiomatic D.  I do regularly find myself wishing for scope(exit) in my C code, though!


T

-- 
Live a century, learn a century. And you'll still die a fool.
May 09, 2017
On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky (Abscissa) via Digitalmars-d wrote:
> On 05/09/2017 08:30 PM, H. S. Teoh via Digitalmars-d wrote:
> > 
> > In this sense I agree with Walter that warnings are basically useless, because they're not enforced. Either something is correct and compiles, or it should be an error that stops compilation. Anything else, and you start having people ignore warnings.
> > 
> 
> Not 100% useless. I'd much rather risk a warning getting ignored than NOT be informed of something the compiler noticed but decided "Nah, some people ignore warnings so I'll just look the other way and keep my mouth shut".  (Hogan's Compiler Heroes: "I see NUH-TING!!")

I'd much rather the compiler say "Hey, you! This piece of code is probably wrong, so please fix it! If it was intentional, please write it another way that makes that clear!" - and abort with a compile error.

This is actually one of the things I like about D. For example, if you wrote:

	switch (e) {
		case 1: return "blah";
		case 2: return "bluh";
	}

the compiler will refuse to compile the code until you either add a default case, or make it a final switch (in which case the compiler will refuse to compile the code unless every possible case is in fact covered).

Now imagine if this was merely a warning that people could just ignore.

Yep, we're squarely back in good ole C/C++ land, where an unexpected value of e causes the code to amble down an unexpected path, with the consequent hilarity that ensues.

IOW, it should not be possible to write tricky stuff by default; you should need to ask for it explicitly so that intent is clear.  Another switch example:

	switch (e) {
		case 1: x = 2;
		case 2: x = 3;
		default: x = 4;
	}

In C, the compiler happily compiles the code for you.  In D, at least the latest dmd will give you deprecation warnings (and presumably, in the future, actual compile errors) for forgetting to write `break;`. But if the fallthrough was intentional, you document that with an explicit `goto case ...`. IOW, the default behaviour is the safe one (no fallthrough), and the non-default behaviour (fallthrough) has to be explicitly asked for.  Much, much better.


> And then the flip side is that some code smells are just too pedantic to justify breaking the build while the programmer is in the middle of some debugging or refactoring or some such.
> 
> That puts me strongly in the philosophy of "Code containing warnings: Allowed while compiling, disallowed when committing (with allowances for mitigating circumstances)."

I'm on the fence about the former.  My current theory is that being forced to write "proper" code even while refactoring actually helps the quality of the resulting code.   But I definitely agree that code with warnings should never make it into the code repo.  The problem is that it's not enforced by the compiler, so *somebody* somewhere will inevitably bypass it.


> C/C++ doesn't demonstrate that warnings are doomed to be useless and "always" ignored. What it demonstrates is that warnings are NOT an appropriate strategy for fixing language problems.

Point.  I suppose YMMV, but IME unless warnings are enforced with -Werror or equivalent, after a while people just stop paying attention to them, at least where I work.  It's entirely possible that it's a bias specific to my job, but somehow I have a suspicion that this isn't completely the case.  Humans tend to be lazy, and ignoring compiler warnings is rather high up on the list of things lazy people tend to do. The likelihood increases with the presence of other factors like looming deadlines, unreasonable customer requests, ambiguous feature specs handed down from the PTBs, or just plain having too much on your plate to be bothering with "trivialities" like fixing compiler warnings.

That's why my eventual conclusion is that anything short of enforcement will ultimately fail. Unless there is no way you can actually get an executable out of badly-written code, there will always be *somebody* out there that will write bad code. And by Murphy's Law, that somebody will eventually be someone in your team, and chances are you'll be the one cleaning up the mess afterwards.  Not something I envy doing (I've already had to do too much of that).


[...]
> The moral of this story: Sometimes, breaking people's code is GOOD! ;)

Tell that to Walter / Andrei. ;-)


[...]
> > (Nevermind the elephant in the room that 80-90% of the "optimizations" C/C++ coders -- including myself -- have programmed into their finger reflexes are actually irrelevant at best, because either compilers already do those optimizations for you, or the hot spot simply isn't where we'd like to believe it is; or outright de-optimizing at worst, because we've successfully defeated the compiler's optimizer by writing inscrutable code.)
> 
> C++'s fundamental paradigm has always been "Premature-optimization oriented programming". C++ promotes POOP.

LOL!!

Perhaps I'm just being cynical, but my current unfounded hypothesis is that the majority of C/C++ programmers don't use a profiler, and don't *want* to use a profiler, because they're either ignorant that such things exist (unlikely), or they're too dang proud to admit that their painfully-accumulated preconceptions about optimization might possibly be wrong.

Or maybe my perceptions are just heavily colored by the supposedly "expert" C coders I've met, who wrote supposedly better code that I eventually realized was actually not better, but in many ways actually worse -- less readable, less maintainable, more error-prone to write, and at the end of the day arguably less performant because it ultimately led to far too much boilerplate and other sources of code bloat, excessive string copying, too much indirection (cache unfriendliness), and other such symptoms that C coders often overlook.

(And meanwhile, the mere mention of the two letters "G C" and they instantly recoil, and rattle off an interminable list of 20-years-outdated GC-phobic excuses, preferring rather to die the death of a thousand pointer bugs (and memory leaks, and overrun buffers) than succumb to the Java of the early 90's with its klunky, poorly-performing GC of spotted repute that has long since been surpassed. And of course, any mention of any evidence that Java *might* actually perform better than poorly-written C code in some cases will incite instant vehement denial. After all, how can an "interpreted" language possibly outperform poorly-designed, over-engineered C scaffolding that necessitates far too much excessive buffer copying and destroys cache coherence with far too many unnecessary indirections? Inconceivable!)


> > That's another fundamental problem with the C/C++ world: coding by convention.  We all know all too well that *if* we'd only abide by such-and-such coding guidelines and recommendations, our code would actually stand a chance of being correct, safe, non-leaking, etc..
> 
> Luckily, there IS a way to enforce that proper coding conventions are actually adhered to: It's called "compile-time error". :)

Exactly. Not compiler warnings... :-D


T

-- 
You have to expect the unexpected. -- RL