December 08, 2020
On Wed, Dec 09, 2020 at 01:23:37AM +0000, Paul Backus via Digitalmars-d wrote:
> On Tuesday, 8 December 2020 at 22:52:29 UTC, Dukc wrote:
[...]
> > Consider a module with `@safe:` or `@trusted:` at the top. The problem with the rule "external C functions can't be @safe" or "can't be @safe by default" is that you do not know why the annotation is at the top of the module. It could be because it has been reviewed, or it could be just to get functions calling it to quickly work.

IMO, if the module was reviewed, individual functions would be annotated. I would not trust a "review" that simply slaps a blanket @trusted on the top of the file.  Individually-annotated functions increase my confidence that somebody has at least put in the effort to look over each function.


> > Theoretically, if C code is considered `@safe` by default, you can tell them from each other. No attributes: `@system`, but considered `@safe` for now, meaning you want to review the module in the near future.  `@safe` or `@trusted`: someone has checked the module, nothing to worry about.
> 
> The problem with this is that there is existing *correct* D code that relies on "no attributes" meaning @system, and which would silently become incorrectly-annotated if the default were changed. For example, there are many external @system functions in the D runtime that do not have an explicit @system annotation.

Yeah, this is one of the main reasons there was a big backlash against that DIP.


[...]
> Of course, you can try to argue that it's the fault of the library maintainer for not realizing that they need to re-review all of their external function declarations--but why should they have to, when the compiler can just as easily flag those functions automatically? Isn't the whole reason we have automatic memory-safety checks in the first place to *avoid* relying on programmer discipline for this kind of thing?

Yeah, D's mantra all these years has been automatic verification rather than programming by convention.  This DIP undermines that principle.


T

-- 
Guns don't kill people. Bullets do.
December 09, 2020
On Wednesday, 2 December 2020 at 17:52:29 UTC, H. S. Teoh wrote:
> An equally bad thing about C strings is that utterly evil function known as strncpy.  Why is it evil?  Because it comes with the warning that the result may not be terminated if the target buffer is not large enough to contain the entire string.
>  And guess how many people gloss over or simply forget that detail?  Yep, I've fixed a whole bunch of bugs caused by that.
>

The only sin of strncpy() is its name. The problem is that people think it is a string function (even you fell for it), but it never was a string function. It is a buffer function, and a mem*/buf* prefix would have gone a long way to avoid its misuse as a string function. Beyond its truncation feature, it has a second functionality that most people do not know about and that makes it definitely different from the string functions: it overwrites the rest of the buffer with 0 up to the end, often making it a performance hog:

    char buffer[32000];
    strncpy(buffer, "a", sizeof buffer);

will write 32000 characters.
Historically it was invented for early Unix, to write the filename in the directory entry, which was size 14 at that time.

     strncpy(direntry, filename, 14);

strncpy() has its uses, but it is important to know that it is NOT a string function. The new warning in gcc since version 9 is annoying and has to be silenced in some cases (with pragmas), as there are legitimate uses of strncpy (unlike gets(), which is always wrong).
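
To make the two behaviours concrete, here is a minimal sketch in C. The helper names `set_field` and `copy_cstring` are made up for illustration; only strncpy() itself is the real library call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill a fixed-width field that is NOT a C string.
   Short names are zero-padded to the full width; a name that fills the
   field completely is stored without any terminating '\0'. */
static void set_field(char *field, size_t width, const char *name)
{
    strncpy(field, name, width);
}

/* Hypothetical helper: the usual workaround when a real C string is
   wanted. Copies at most size - 1 bytes and always terminates,
   truncating the source if it does not fit. */
static void copy_cstring(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return;                 /* no room even for the terminator */
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';
}
```

Calling set_field() on a 14-byte field with "a" leaves one 'a' followed by 13 zero bytes, while a 14-character name fills the field and leaves no terminator, so the field must never be handed to strlen() or printf("%s"). copy_cstring() still pays for the padding on every call.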

Except for that, I completely agree with the rest of your rant.
December 09, 2020
On Wednesday, 9 December 2020 at 01:23:37 UTC, Paul Backus wrote:
>
> The problem with this is that there is existing *correct* D code that relies on "no attributes" meaning @system, and which would silently become incorrectly-annotated if the default were changed. For example, there are many external @system functions in the D runtime that do not have an explicit @system annotation.

IIUC, such functions in existing .o files and libs would not be designated (mangled) @safe so I'd expect linker errors, not silence.  New compilations will have the source body and will, of course, reject non @safe code so, again, not silent. What have I misunderstood?  What is the "silent" problem?  Is there some transitive issue?

Note: @safe designation should be part of the external mangle of any future defaulted-and-verified-@safe function.  I don't see how it works otherwise.

>
> You flip the switch, your tests pass, and then months or years later, you discover that a memory-corruption bug has snuck its way into your @safe code, because no one ever got around to putting an explicit @system annotation on some external function deep in one of your dependencies. How would you react? Personally, I'd jump ship to Rust and never look back.

How do your tests pass?  How does the code even compile?  If the default moves from lax (@system) to strict (@safe) I see how a lot of code that formerly compiled would stop compiling/linking, an ongoing concern were the DIP edited and re-introduced, but I don't see how you get bugs "sneaking" in or lying dormant.  Absent explicit greenwashing by the programmer, how do the bugs sneak in?

>
> Of course, you can try to argue that it's the fault of the library maintainer for not realizing that they need to re-review all of their external function declarations--but why should they have to, when the compiler can just as easily flag those functions automatically? Isn't the whole reason we have automatic memory-safety checks in the first place to *avoid* relying on programmer discipline for this kind of thing?

Well, @safe by default is about as automatic/not-relying-on-discipline as it gets.  Unless annotated otherwise all functions with source are flagged at compile time if not verified @safe.  Extern declarations against old object files and libs should flag errors at link time.

Again I feel that I must be missing something.  What "programmer discipline" are you referring to?

December 09, 2020
On Wednesday, 9 December 2020 at 08:26:35 UTC, Patrick Schluter wrote:
> On Wednesday, 2 December 2020 at 17:52:29 UTC, H. S. Teoh wrote:
>>[...]
>
> The only sin of strncpy() is its name. The problem is that people think it is a string function (even you fell for it), but it never was a string function. It is a buffer function, and a mem*/buf* prefix would have gone a long way to avoid its misuse as a string function. Beyond its truncation feature, it has a second functionality that most people do not know about and that makes it definitely different from the string functions: it overwrites the rest of the buffer with 0 up to the end, often making it a performance hog:
>
> [...]
Simplest implementation of strncpy

    char *strncpy(char *dest, const char *src, size_t n)
    {
      memset(dest, 0, n);
      memcpy(dest, src, min(strlen(src),n));
    }

The Linux man page perpetuates the error: it documents strncpy() together with strcpy(), which is wrong IMO. As my implementation above shows, strncpy() is semantically closer to memcpy() than to strcpy().
December 09, 2020
On Wednesday, 9 December 2020 at 08:52:10 UTC, Patrick Schluter wrote:
> On Wednesday, 9 December 2020 at 08:26:35 UTC, Patrick Schluter wrote:
>> On Wednesday, 2 December 2020 at 17:52:29 UTC, H. S. Teoh wrote:
>>>[...]
>>
>> The only sin of strncpy() is its name. The problem is that people think it is a string function (even you fell for it), but it never was a string function. It is a buffer function, and a mem*/buf* prefix would have gone a long way to avoid its misuse as a string function. Beyond its truncation feature, it has a second functionality that most people do not know about and that makes it definitely different from the string functions: it overwrites the rest of the buffer with 0 up to the end, often making it a performance hog:
>>
>> [...]
> Simplest implementation of strncpy
>
>     char *strncpy(char *dest, const char *src, size_t n)
>     {
>       memset(dest, 0, n);
>       memcpy(dest, src, min(strlen(src),n));

     return memcpy(dest, src, min(strlen(src), n));

obviously

>     }
>
> The Linux man page perpetuates the error: it documents strncpy() together with strcpy(), which is wrong IMO. As my implementation above shows, strncpy() is semantically closer to memcpy() than to strcpy().
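
With the return folded in, a compilable version of that sketch could look like the following. Two things here are my own additions and not part of the snippet above: the function is renamed to the hypothetical `my_strncpy` so it does not clash with libc, and the min(strlen(src), n) expression is replaced by a bounded loop, since standard C has no min() and strlen() would read past n bytes if src is not terminated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name, chosen to avoid clashing with the libc strncpy(). */
static char *my_strncpy(char *dest, const char *src, size_t n)
{
    /* Length of src, but never looking at more than n bytes of it. */
    size_t len = 0;
    while (len < n && src[len] != '\0')
        len++;

    memset(dest, 0, n);             /* zero the whole field first */
    return memcpy(dest, src, len);  /* memcpy() returns dest */
}
```

Like the original, this is meant to show the semantics (a memset plus a bounded memcpy), not to be fast; it writes the copied part of the buffer twice.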


December 09, 2020
On Wednesday, 9 December 2020 at 01:23:37 UTC, Paul Backus wrote:
> The problem with this is that there is existing *correct* D code that relies on "no attributes" meaning @system, and which would silently become incorrectly-annotated if the default were changed. For example, there are many external @system functions in the D runtime that do not have an explicit @system annotation.
>
> You flip the switch, your tests pass, and then months or years later, you discover that a memory-corruption bug has snuck its way into your @safe code, because no one ever got around to putting an explicit @system annotation on some external function deep in one of your dependencies. How would you react? Personally, I'd jump ship to Rust and never look back.

Yes, this would be a problem, but I believe it'd be less of a problem than you think. If you use some third-party library, you need to think twice about how much you trust its `@safe`ty in any case, at least if you're that strict about it. No annotations anywhere is about the easiest thing to spot when considering that.

>
> Of course, you can try to argue that it's the fault of the library maintainer for not realizing that they need to re-review all of their external function declarations--but why should they have to, when the compiler can just as easily flag those functions automatically?

It might have some benefit: If non-annotated C libraries are considered `@safe`, it'll mean that not-so-quality code is using compromised `@safe`. Bad. But if they are considered `@system`, not-so-quality code will not be using `@safe` AT ALL. Even worse.

Now I understand there is a drawback for higher-quality code. You have to either copy-paste the C library header and add `@system:` to the top of it, or make a module that automatically wraps the header as `@system`. That's more work than just importing the C header, and thus will result in more greenwashed headers than if C headers were `@system` by default.

Also it sure sucks that the compiler would do the wrong thing by default, but would the pragmatic downsides be even worse for the Common Sense option? I don't know. But I'm saying that 1: it's a judgement call, not anything absolute, 2: either way is one we could live with, and would be far from making `@safe` meaningless.
December 09, 2020
On Wednesday, 9 December 2020 at 08:29:58 UTC, Bruce Carneal wrote:
>
> IIUC, such functions in existing .o files and libs would not be designated (mangled) @safe so I'd expect linker errors, not silence.  New compilations will have the source body and will, of course, reject non @safe code so, again, not silent. What have I misunderstood?  What is the "silent" problem?  Is there some transitive issue?
>
> Note: @safe designation should be part of the external mangle of any future defaulted-and-verified-@safe function.  I don't see how it works otherwise.

This does not work for extern(C) functions because their names are not mangled.
December 09, 2020
On 09.12.20 12:46, Dukc wrote:
> 
> It might have some benefit: If non-annotated C libraries are considered `@safe`, it'll mean that not-so-quality code is using compromised `@safe`. Bad. But if they are considered `@system`, not-so-quality code will not be using `@safe` AT ALL. Even worse.

That's a bit like saying it's bad if products produced using slave labour don't get a fair trade label.

Anyway, extern(C) code that may corrupt memory is not even necessarily buggy; some C functions just have an unsafe interface. If you want the @safe checks, make small @trusted wrappers around those functions so that the interface becomes safe, and explicitly annotate with @trusted those extern(C) functions whose interface you think is safe.

And if you don't care about @safe, that's fine too. If someone wants to use your library from @safe code they can add the required annotations themselves and send you a pull request.

> 
> 
> Also it sure sucks that the compiler would do the wrong thing by default, but would the pragmatic downsides be even worse for the Common Sense option? I don't know. But I'm saying that 1: it's a judgement call, not anything absolute, 2: either way is one we could live with, and would be far from making `@safe` meaningless.

I'm not really willing to debate the pragmatic upsides of encouraging dishonesty in a modular verification context. There are none.
December 09, 2020
On 09.12.20 14:06, Paul Backus wrote:
> On Wednesday, 9 December 2020 at 08:29:58 UTC, Bruce Carneal wrote:
>>
>> IIUC, such functions in existing .o files and libs would not be designated (mangled) @safe so I'd expect linker errors, not silence.  New compilations will have the source body and will, of course, reject non @safe code so, again, not silent. What have I misunderstood?  What is the "silent" problem?  Is there some transitive issue?
>>
>> Note: @safe designation should be part of the external mangle of any future defaulted-and-verified-@safe function.  I don't see how it works otherwise.
> 
> This does not work for extern(C) functions because their names are not mangled.

It does not even work for extern(D) functions because their return types are not mangled.
December 09, 2020
On Wednesday, 9 December 2020 at 11:46:41 UTC, Dukc wrote:
>
> It might have some benefit: If non-annotated C libraries are considered `@safe`, it'll mean that not-so-quality code is using compromised `@safe`. Bad. But if they are considered `@system`, not-so-quality code will not be using `@safe` AT ALL. Even worse.

Using compromised @safe is much, much worse than not using @safe at all. Not using @safe at all means you still have the option of migrating to @safe in the future. If you're using compromised @safe, you have no migration path.