May 10, 2017
On Wed, May 10, 2017 at 04:38:48AM -0700, Ali Çehreli via Digitalmars-d wrote:
> On 05/09/2017 10:26 PM, H. S. Teoh via Digitalmars-d wrote:
> > On Wed, May 10, 2017 at 01:32:33AM +0000, Jack Stouffer via Digitalmars-d wrote:
> >> On Wednesday, 10 May 2017 at 00:30:42 UTC, H. S. Teoh wrote:
[...]
> >>> 		strncpy(tmp, desc->data2, bufsz);
> >>> 		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
> >>> 		{
> >>> 			fclose(fp);
> >>> 			unlink("blah");
> >>> 			return IO_ERROR;
> >>> 		}
> >>
> >> I think you cause a memory leak in these branches because you forget to free tmp before returning.
> >
> > Well, there ya go. Case in point.
> 
> I caught that too but I thought you were testing whether we were listening.  ;)

Haha, I guess I'm not as good of a C coder as I'd like to think I am. :-D
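
(For the record, the fix is a free() on that error path. A minimal sketch of the corrected branch, assuming tmp was malloc()'d earlier in the elided code:)

		strncpy(tmp, desc->data2, bufsz);
		tmp[bufsz - 1] = '\0';  /* strncpy doesn't NUL-terminate on truncation */
		if (fwrite(tmp, strlen(tmp), 1, fp) != 1)
		{
			fclose(fp);
			unlink("blah");
			free(tmp);      /* the leak Jack spotted */
			return IO_ERROR;
		}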


[...]
> > 		/* Acquire resources */
> > 		resource1 = acquire_resource(blah->blah);
> > 		if (!resource1) goto EXIT;
> >
> > 		resource2 = acquire_resource(bleh->bleh);
> > 		if (!resource1) goto EXIT;
> 
> Copy paste error! :p (resource1 should be resource2.)
> 
> >
> > 		resource3 = acquire_resource(bluh->bluh);
> > 		if (!resource1) goto EXIT;
> 
> Ditto.

Ouch.  Ouch.  :-D

But then again, I've actually seen similar copy-paste errors in real code before, too. Sometimes they had gone unnoticed for >5 years (I kid you not; I actually checked the dates in svn blame / svn log).


[...]
> As an improvement, consider hiding the checks and the goto statements in macros:
> 
>     resource2 = acquire_resource(bleh->bleh);
>     exit_if_null(resource2);
> 
>     err = do_step2(blah, resource1);
>     exit_if_error(err);
> 
> Or something similar... Obviously, it requires a certain standardization: functions never write raw goto statements themselves, yet all have an EXIT area, etc.  It makes C code very uniform, which is a good thing, as you notice nonstandard idioms quickly.

Yes, eventually this is the only sane and consistent way to deal with these problems.  Unfortunately, in C this can only be done by convention, which means that some non-conforming code will inevitably slip through and cause havoc.
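
Concretely, such a convention might look something like this (a rough sketch: the macro bodies, the release_resource() calls, and the assumption that do_stepN() returns 0 on success are mine; only the exit_if_null/exit_if_error names and the EXIT label come from the example above):

	#define exit_if_null(p)   do { if ((p) == NULL) goto EXIT; } while (0)
	#define exit_if_error(e)  do { if ((e) != 0) goto EXIT; } while (0)

	int myfunc(blah_t *blah, bleh_t *bleh)
	{
		int err = -1;
		resource_t *resource1 = NULL, *resource2 = NULL;

		resource1 = acquire_resource(blah->blah);
		exit_if_null(resource1);

		resource2 = acquire_resource(bleh->bleh);
		exit_if_null(resource2);

		err = do_step1(blah, resource1);
		exit_if_error(err);

		err = do_step2(blah, resource2);
		exit_if_error(err);

	EXIT:
		/* Single cleanup area shared by success and failure paths */
		if (resource2) release_resource(resource2);
		if (resource1) release_resource(resource1);
		return err;
	}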

Also, this starts running dangerously near the slippery slope down into macro hell, where the project accretes its own idiomatic set of inscrutable macro usage conventions and eventually almost all of the C syntax has disappeared and the code no longer looks like C.  Then along comes New Recruit, and he makes a right mess with it because he doesn't understand the 15-level-deep nested macros in the global include/macros.h file that's become a 5200-line monstrosity of unreadable CPP hacks.  (Also not exaggerating: the very project I'm working on has a module that's written this way, and only the initiated dare dream of fixing bugs in those macros. Fortunately, they have not yet nested to 15 levels deep, so for the most part you just copy and paste existing working code and pray that it will Just Work by analogy. Actually understand what you just wrote? Pfeh! You don't have time for that. The customer wants the release by last week. Copy-n-paste cargo cult FTW!)


> This safer way, where everything must be done in steps of two lines, is one of the reasons why I became convinced that exceptions are superior to return codes.
[...]

Yeah, once practically every single statement in your function is an if-statement checking for error codes, you start wondering, why can't the language abstract this nasty boilerplate away for me?! And then the need for exceptions becomes clear.


T

-- 
Written on the window of a clothing store: No shirt, no shoes, no service.
May 10, 2017
On Wed, May 10, 2017 at 12:34:05PM +0000, Guillaume Boucher via Digitalmars-d wrote: [...]
> In modern C and with GLib (which makes use of a gcc/clang extension) you can
> write this as:
> 
> gboolean myfunc(blah_t *blah, bleh_t *bleh, bluh_t *bluh) {
>         /* Cleanup everything automatically at the end */
>         g_autoptr(GResource) resource1 = NULL, resource2 = NULL, resource3 = NULL;
>         gboolean ok;
> 
>         /* Vet arguments */
>         g_return_val_if_fail(blah != NULL, FALSE);
>         g_return_val_if_fail(bleh != NULL, FALSE);
>         g_return_val_if_fail(bluh != NULL, FALSE);
> 
>         /* Acquire resources */
>         ok = acquire_resource(&resource1, blah->blah);
>         g_return_val_if_fail(ok, FALSE);
> 
>         ok = acquire_resource(&resource2, bleh->bleh);
>         g_return_val_if_fail(ok, FALSE);
> 
>         ok = acquire_resource(&resource3, bluh->bluh);
>         g_return_val_if_fail(ok, FALSE);
> 
>         /* Do actual work */
>         ok = do_step1(blah, resource1);
>         g_return_val_if_fail(ok, FALSE);
> 
>         ok = do_step2(blah, resource1);
>         g_return_val_if_fail(ok, FALSE);
> 
>         return do_step3(blah, resource1);
> }
[...]

Yes, this would address the problem somewhat, but the problem, again, is that this is programming by convention.  The language doesn't enforce that you have to write code this way, and because it's not enforced, *somebody* will ignore it and write things the Bad Ole Way.  You're essentially writing in what amounts to a subdialect of C using GLib idioms, and that's not a bad thing in and of itself. But the larger language that includes all the old unsafe ways of writing code is still readily accessible.  By Murphy's Law, somebody will eventually write something that breaks the idiom and causes problems.

Also, because this way of writing code is not part of the language, the compiler cannot verify that you're using the macros correctly.  And it cannot verify that you didn't write goto labels or other things that might conflict with the way the macros are implemented. Lack of hygiene in C macros does not help in this respect.
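
To illustrate that hazard (a contrived sketch; the helper names here are hypothetical): the exit_if_null macro from before hard-codes a jump to a label named EXIT, and nothing stops someone from reusing that label name for an unrelated purpose. The convention says don't, but the compiler can't tell:

	#define exit_if_null(p)  do { if ((p) == NULL) goto EXIT; } while (0)

	int process(blah_t *blah)
	{
		resource_t *r = acquire_resource(blah->blah);
		exit_if_null(r);        /* author means "jump to the cleanup area" */

		do_real_work(r);

	EXIT:   /* someone repurposed EXIT as a success epilogue; now the
		   NULL-failure path falls into code that assumes r is valid */
		finalize_output(r);     /* dereferences r: crash when r == NULL */
		release_resource(r);
		return 0;
	}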

I don't dispute that there are ways of writing correct (or mostly correct) C code.  But the problem is that these ways of writing correct C code are (1) non-obvious to someone not already in the know, so you will always have people who either don't know about them or aren't sufficiently well-versed in them to use them effectively; and (2) not statically enforceable, because they are not part of the language. Lack of enforcement, in the long run, can only end in disaster, because programming by convention does not work.  It works as long as the convention is kept, but humans are fallible, and we all know how good humans are at keeping conventions over a sustained period of time (or even just over short periods of time).

Not even D is perfect in this regard, but it has taken significant steps in the right direction.  Correct-by-default (well, for the most part anyway, barring compiler bugs / spec issues) and static guarantees (verified by the compiler -- again, barring compiler bugs) are major steps forward.  Ultimately, I'm unsure how far a language can go with static guarantees: I think somewhere along the line human error will still be unpreventable, because you start running into the halting problem when verifying certain things. But I certainly think there's still a LOT that can be done by the language between here and there, much more than what we have today.


T

-- 
Mediocrity has been pushed to extremes.
May 10, 2017
On Wed, May 10, 2017 at 11:16:57AM +0000, Atila Neves via Digitalmars-d wrote: [...]
> The likelihood of a randomly picked C/C++ programmer not even knowing what a profiler is, much less having used one, is extremely high in my experience.  I worked with a lot of embedded C programmers with several years of experience who knew nothing but embedded C. We're talking dozens of people here. Not one of them had ever used a profiler. In fact, a senior developer (now tech lead) doubted I could make our build system any faster. I did, by 2 orders of magnitude.

Very nice!  Reminds me of an incident many years ago where I "optimized" a shell script that took >2 days to generate a report by rewriting it in Perl, which produced the report in 2 mins. (Don't ask why somebody thought it was a good idea to write a report generation script as a *shell script*, of all things. You really do not want to know.)


> When I presented the result to him he said in disbelief: "But, how? I mean, if it's doing exactly the same thing, how can it be faster?". Big O?  Profiler? What are those? I actually stood there for a few seconds with my mouth open because I didn't know what to say back to him.

Glad to hear I'm not the only one faced with senior programmers who show surprising ignorance in matters you'd think they really ought to know like the back of their hand.


> These people are also likely to raise concerns about performance during code review despite having no idea what a cache line is. They still opine that one shouldn't add another function call for readability because that'll hurt performance. No need to measure anything, we all know calling functions is bad, even when they're in the same file and the callee is `static`.

Yep, typical C coder premature optimization syndrome.

I would not be surprised if today there's still a significant number of C coders who believe that writing "i++;" is faster than writing "i=i+1;".  Ironically, these same people will also come up with harebrained schemes to avoid something they're prejudiced against, like C++ standard library string types, while ignoring the cost of constantly calling O(n) functions for string processing (strlen, strncpy, etc.).
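
The classic example (hypothetical code, not from any particular codebase): calling strlen() in a loop condition. strlen() itself walks the whole string, so the loop silently becomes O(n^2):

	#include <stddef.h>
	#include <string.h>

	/* Quadratic: strlen(s) re-scans the entire string on every iteration */
	size_t count_spaces_slow(const char *s)
	{
		size_t n = 0;
		for (size_t i = 0; i < strlen(s); i++)
			if (s[i] == ' ')
				n++;
		return n;
	}

	/* Linear: hoist the length out of the loop (or just test s[i] != '\0') */
	size_t count_spaces_fast(const char *s)
	{
		size_t n = 0;
		size_t len = strlen(s);
		for (size_t i = 0; i < len; i++)
			if (s[i] == ' ')
				n++;
		return n;
	}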

I remember many years ago, when I was still young and naïve, in one of my projects I spent days micro-optimizing my code to eliminate every last CPU cycle I could from my linked-list type, only to discover to my chagrin that the bottleneck was nowhere near it -- it was caused by a debugging fprintf() that I had forgotten to take out.  And I had only found this out because I finally conceded to run a profiler.  That was when this amazing concept finally dawned on me that I could possibly be *wrong* about my ideas of performant code, imagine that!

(Of course, then later on I discovered that my meticulously optimized linked list was ultimately worthless, because it had O(n) complexity, whereas had I just used a library type instead, I could've had O(log n) complexity.  But I had dismissed the library type because it was obviously "too complex" to possibly be performant enough for my oh-so-performance-critical code. (Ahem.  It was a *game*, and not even a good one. But it absolutely needed every last drop of juice I could squeeze from the CPU.  Oh yes.))


> I think a lot of us underestimate just how bad the "average" developer is. A lot of them write C code, which is like giving chainsaws to chimpanzees.
[...]

Hmm. What would giving them D be equivalent to, then? :-D


T

-- 
If you're not part of the solution, you're part of the precipitate.
May 10, 2017
On Wed, May 10, 2017 at 12:06:46PM +0000, Patrick Schluter via Digitalmars-d wrote:
> On Wednesday, 10 May 2017 at 06:28:31 UTC, H. S. Teoh wrote:
> > On Tue, May 09, 2017 at 09:19:08PM -0400, Nick Sabalausky
> [...]
> > Perhaps I'm just being cynical, but my current unfounded hypothesis is that the majority of C/C++ programmers ...
> 
> Just a nitpick, but could we also please stop conflating C and C++ programmers?  My experience is that C++ programmers are completely clueless when it comes to C programming.  They think they know C, but they're generally far off.  The thing is that C has evolved with C99 and C11, and the changes have not all been adopted by C++ (and Microsoft actively stalling the adoption of C99 in Visual C didn't help either).

OK, I'll try to stop conflating them... but the main reason I do is that I find myself stuck in between the two, having started myself on C (well, assembly before that, but anyway), then moved on to C++, only to grow skeptical of C++'s direction of development and eventually settle on a hybrid of the two commonly known as "C with classes" (i.e., a dialect of C++ without some of what I consider to be poorly-designed features).  Recently, though, I've mostly been working in pure C because of my job. I used to still use "C with classes" in my own projects, but after I found D, I essentially swore off ever using C++ in my own projects again.

My experience reviewing the C++ code that comes up every now and then at work, though, tells me that the typical C++ programmer is probably worse than the typical C programmer when it comes to code quality.  And C++ gives you just so many more ways to shoot yourself in the foot.  The joke used to go that C gives you many ways to shoot yourself in the foot, but C++ gives you many ways to shoot yourself in the foot and then encapsulate all the evidence away, packaged in one convenient wrapper.

(And don't get me started on C++ "experts" who invent extravagantly over-engineered class hierarchies that nobody can understand, 90% of which is actually completely irrelevant to the task at hand, resulting in such abysmal performance that people just bypass the whole thing and revert to copy-pasta-ism and C hacks in C++ code, causing double the carnage.  Once I had to invent a stupendous hack to bridge a C++ daemon with a C module whose owners flatly refused to link in any C++ libraries. The horrendous result had 7 layers of abstraction just to make a single function call, one of which involved fwrite()-ing the function arguments to a file, fork-and-exec'ing, and fread()-ing them from the other end. Why didn't I just open a socket to the daemon directly? Because the ridiculously over-engineered daemon only understands the reverse-encrypted Klingon protocol spoken by a makefile-generated IPC wrapper file containing 2000 procedurally-generated templates (I kid you not, I'm not talking about 2000 instantiations of one template, I'm talking about 2000 templates which are themselves procedurally generated), and the only way to speak this protocol was to use the resulting, ridiculously bloated C++ library. Which the PTBs had dictated I could not link into the C module. What else was a man to do?)


T

-- 
Try to keep an open mind, but not so open your brain falls out. -- theboz
May 10, 2017
On Wednesday, 10 May 2017 at 18:58:35 UTC, H. S. Teoh wrote:
> On Wed, May 10, 2017 at 11:16:57AM +0000, Atila Neves via Digitalmars-d wrote: [...]
>> [...]
>
> Very nice!  Reminds me of an incident many years ago where I "optimized" a shell script that took >2 days to generate a report by rewriting it in Perl, which produced the report in 2 mins. (Don't ask why somebody thought it was a good idea to write a report generation script as a *shell script*, of all things. You really do not want to know.)
>
> [...]

> Hmm. What would giving them D be equivalent to, then? :-D

I'm not sure! If I'd known you were going to ask that, I'd probably have picked a different analogy ;)

Atila
May 11, 2017
On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
> Walter: Anything that goes on the internet.
https://bugs.chromium.org/p/project-zero/issues/detail?id=1252&desc=5 - a vulnerability in an application that doesn't go on the internet.
May 11, 2017
On Thursday, 11 May 2017 at 09:39:57 UTC, Kagamin wrote:
> On Saturday, 6 May 2017 at 06:26:29 UTC, Joakim wrote:
>> Walter: Anything that goes on the internet.
> https://bugs.chromium.org/p/project-zero/issues/detail?id=1252&desc=5 - a vulnerability in an application that doesn't go on the internet.

To be fair, if you're not on the internet, you're unlikely to get any files that will trigger that bug in Microsoft's malware checker, as they noted that they first saw it on a website on the internet.  Of course, you could still get such files on a USB stick, which just highlights that unless you completely cut your computer off from the world, you can still get bitten, just more slowly and with fewer consequences than on the internet.

I wondered what that Project Zero topic had to do with Chromium; turns out it's a security team that Google started three years ago to find zero-day holes in almost any software.  That guy from the team also found the recently famous Cloudbleed bug that affected Cloudflare.

They have a blog up detailing the holes they've found in all kinds of stuff (security porn, if you will ;)):

https://googleprojectzero.blogspot.com
May 11, 2017
On Monday, May 08, 2017 23:15:12 H. S. Teoh via Digitalmars-d wrote:
> Recently I've had the dubious privilege of being part of a department wide push on the part of my employer to audit our codebases (mostly C, with a smattering of C++ and other code, all dealing with various levels of network services and running on hardware expected to be "enterprise" quality and "secure") and fix security problems and other such bugs, with the help of some static analysis tools. I have to say that even given my general skepticism about the quality of so-called "enterprise" code, I was rather shaken not only to find lots of confirmation of my gut feeling that there are major issues in our codebase, but even more by just HOW MANY of them there are.

In a way, it's amazing how successful folks can be with software that's quite buggy. A _lot_ of software works just "well enough" that it gets the job done but is actually pretty terrible. And I've had coworkers argue to me before that writing correct software really doesn't matter - it just has to work well enough to get the job done. And sadly, to a great extent, that's true.

However, writing software that works just "well enough" does come at a cost, and if security is a real concern (as it increasingly is), then that sort of attitude is not going to cut it. But since the cost often comes later, I don't think that it's at all clear that we're going to really see a shift towards languages that prevent such bugs. Up-front costs tend to have a powerful impact on decision making - especially when the cost that could come later is theoretical rather than guaranteed.

Now, given that D is also a very _productive_ language to write in, it stands to reduce up-front costs as well, and that, combined with its ability to reduce the theoretical security costs, could be a real win. But with how entrenched C and C++ are, and how much many companies are geared towards not caring about security or software quality so long as the software seems to get the job done, I think it's going to be a _major_ uphill battle for a language like D to gain mainstream use on anywhere near the level of languages like C and C++. But for those who are willing to use a language that makes it harder to write code with memory safety issues, there's a competitive advantage to be gained.

- Jonathan M Davis

May 11, 2017
On Wednesday, 10 May 2017 at 17:51:38 UTC, H. S. Teoh wrote:
> Haha, I guess I'm not as good of a C coder as I'd like to think I am. :-D
>

That comment puts you ahead of the pack already :)
May 11, 2017
On Thursday, 11 May 2017 at 09:39:57 UTC, Kagamin wrote:
> https://bugs.chromium.org/p/project-zero/issues/detail?id=1252&desc=5 - a vulnerability in an application that doesn't go on the internet.

This link got me thinking: when will we see the first class-action lawsuit for negligence over a failure to catch a buffer overflow (or other commonly known bug) that causes identity theft or loss of data?

Putting aside the moral questions, the people suing would have a good case, given the wide knowledge of these bugs and the availability of tools to catch/fix them. I think they could prove negligence/incompetence and win given the right circumstances.

Would be an interesting question to pose to any managers who don't want to spend time on security.