Notes from C++ static analysis (page 4) - D Programming Language Discussion Forum

June 27, 2013

Re: Notes from C++ static analysis

Posted by Andrei Alexandrescu
in reply to Timon Gehr

On 6/26/13 4:48 PM, Timon Gehr wrote:
> On 06/27/2013 01:01 AM, Walter Bright wrote:
>> Oh, and the cake topper is IOStreams performs badly, too.
>
> Yes, but that's just a default.
>
> std::ios_base::sync_with_stdio(false);
> std::cin.tie(0);

That's the least of iostreams' efficiency problems.

Andrei

June 27, 2013

Re: Notes from C++ static analysis

Posted by Andrej Mitrovic
in reply to Andrei Alexandrescu

On 6/27/13, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> I think that's a bug in format that we need to fix.

Absolutely. We must remove informative error messages and implement sloppy APIs in the standard library.

June 27, 2013

Re: Notes from C++ static analysis

Posted by Andrei Alexandrescu
in reply to dennis luehring

On 6/26/13 12:53 PM, dennis luehring wrote:
> Am 26.06.2013 21:33, schrieb Andrei Alexandrescu:
>> On 6/26/13 11:08 AM, bearophile wrote:
>>> On the other hand this D program prints just
>>> "10" with no errors, ignoring the second x:
>>>
>>> import std.stdio;
>>> void main() {
>>>     size_t x = 10;
>>>     writefln("%d", x, x);
>>> }
>>>
>>> In a modern statically typed language I'd like such code to give a
>>> compile-time error.
>>
>> Actually this is good because it allows customizing the format string
>> to print only a subset of available information (I've actually used
>> this).
>
> why is there always a tiny need for such tricky stuff - isn't that only
> useful in very rare cases?

This is not tricky stuff; it simply allows the user to better separate format from data. The call offers the data, and the format string chooses what to show and how to show it. Obvious examples include logging lines with various levels of detail and internationalized/localized/customized messages that don't need to display all data under all circumstances.
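To illustrate the logging case, here is a minimal sketch (not from the original post). It assumes a hypothetical helper named render and made-up message values; it relies only on formattedWrite, which, like writefln, leaves unused arguments alone, and on positional specifiers such as %1$s.

```d
import std.array : appender;
import std.format : formattedWrite;
import std.stdio : writeln;

// Hypothetical helper: every call supplies the same three values,
// and the format string decides which of them appear.
string render(string fmt, string msg, string file, int line)
{
    auto buf = appender!string();
    // formattedWrite, like writefln, does not complain about unused arguments.
    formattedWrite(buf, fmt, msg, file, line);
    return buf.data;
}

void main()
{
    enum terse = "%1$s";               // message only
    enum verbose = "%2$s(%3$s): %1$s"; // location, then message
    writeln(render(terse, "disk full", "io.d", 42));   // disk full
    writeln(render(verbose, "disk full", "io.d", 42)); // io.d(42): disk full
}
```

The call site never changes; only the format string, which could come from a configuration file or a translation table, selects the level of detail.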

Checking printf for undefined behavior and mistakes affecting memory safety of the entire program is a noble endeavor, and kudos to the current generation of C and C++ compilers that warn upon misuse.

Forcing D's writef to match exactly the format string against the number of arguments passed is a waste of flexibility and caters to programmers who can't bring themselves to unittest or even look at the program output - not even once. Our motivation is to help those out of such habits, not support them.

Andrei

June 27, 2013

Re: Notes from C++ static analysis

Posted by Andrei Alexandrescu
in reply to Andrej Mitrovic

On 6/26/13 7:38 PM, Andrej Mitrovic wrote:
> On 6/27/13, Andrei Alexandrescu<SeeWebsiteForEmail@erdani.org>  wrote:
>> I think that's a bug in format that we need to fix.
>
> Absolutely. We must remove informative error messages and implement
> sloppy APIs in the standard library.

Apologies for the overly curt reply, which I have now canceled. OK, let's do this: sarcasm aside, do you have good arguments to back up your opinion?

Andrei

June 27, 2013

Re: Notes from C++ static analysis

Posted by Andrej Mitrovic

On Thursday, 27 June 2013 at 02:40:53 UTC, Andrei Alexandrescu wrote:
> You are wrong.

format has thrown exceptions with such code since v2.000 (that's the year 2007). It's only in v2.061 that it has finally gotten an informative error message in the exception it throws.

> Forcing D's writef to match exactly the format string against the number of arguments passed is a waste of flexibility and caters to programmers who can't bring themselves to unittest or even look at the program output - not even once.

If you are the type of programmer who often tests their own code, why are you passing more arguments than needed to format?

June 27, 2013

Re: Notes from C++ static analysis

Posted by Andrej Mitrovic
in reply to Andrei Alexandrescu

On 6/27/13, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Apologies for the overly curt reply, which I have now canceled. OK let's do this: aside from sarcasm, do you have good arguments to line up to back up your opinion?

format has thrown exceptions with such code since v2.000 (that's the year 2007). It's only in v2.061 that it has finally gotten an informative error message in the exception it throws.

I'm not against having the current lax argument count handling feature for writef, but for format it's enabled me to catch bugs at the call site.

The way I see it, write/writef is primarily used for debugging and benefits from having some lax features, whereas format is used in more heavy-duty work where it's important not to screw things up at the call site.

As for the unittesting argument (if it applies to format): in theory it is perfectly sound and I agree with you, but in the real world convenience trumps manual labor. We should unittest more, but it's super-convenient when format tells you that you've screwed something up, e.g. when you're writing D-style shell scripts or porting C code to D and don't have much time to unittest constantly.

---

But the bottom line is I don't think we need to force anything on anybody. If anything, we could split up the internal format implementation and provide format and safeFormat functions.

format("%s %s", 1);  // no exceptions
safeFormat("%s %s", 1);  // exception thrown

All current code will de facto still work (because otherwise it would already be getting runtime exceptions), and all future code could start using safer formatting functions.

We could have that, or some getopt-style configuration options, such as:

format("%s %s", std.format.config.safe, 1);

Or some other form of wizardry (perhaps a compile-time argument).

Anything goes. Let's not break too much of a sweat arguing over what should ultimately be a customization point.
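As a rough sketch of what such a split could look like for the surplus-argument case discussed in this thread: the names laxFormat and strictFormat below are hypothetical (they are not Phobos functions), and the sketch assumes only that formattedWrite returns the number of arguments it consumed.

```d
import std.array : appender;
import std.format : formattedWrite, FormatException;
import std.stdio : writeln;

// Hypothetical: formats like writefln, silently ignoring surplus arguments.
string laxFormat(Args...)(string fmt, Args args)
{
    auto buf = appender!string();
    formattedWrite(buf, fmt, args);
    return buf.data;
}

// Hypothetical: throws when some arguments were never consumed.
string strictFormat(Args...)(string fmt, Args args)
{
    auto buf = appender!string();
    immutable used = formattedWrite(buf, fmt, args);
    if (used != args.length)
        throw new FormatException("unused format arguments");
    return buf.data;
}

void main()
{
    writeln(laxFormat("%s", 1, 2)); // surplus argument is ignored
    try
    {
        writeln(strictFormat("%s", 1, 2));
    }
    catch (FormatException e)
    {
        writeln("rejected: ", e.msg);
    }
}
```

Both helpers share one implementation path, so the only difference between them is the check on the consumed-argument count.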

June 27, 2013

Re: Notes from C++ static analysis

Posted by Walter Bright
in reply to H. S. Teoh

On 6/26/2013 12:03 PM, H. S. Teoh wrote:
> But yeah, that's bad practice and the compiler should warn about it. The
> reason it doesn't, though, IIRC is because of generic code, where it
> would suck to have to special-case when two template arguments actually
> alias the same thing.

It can also occur in machine-generated code, such as what mixins produce.

June 27, 2013

Re: Notes from C++ static analysis

Posted by Jonathan M Davis
in reply to Andrei Alexandrescu

On Wednesday, June 26, 2013 19:18:27 Andrei Alexandrescu wrote:
> On 6/26/13 1:50 PM, bearophile wrote:
> > Andrei Alexandrescu:
> >> Actually this is good because it allows customizing the format string to print only a subset of available information (I've actually used this).
> > 
> > Your use case is a special case that breaks a general rule.
> 
> There's no special case here.

I have never heard anyone other than you even suggest this sort of behavior. Granted, I may just not talk with the right people, but that at least makes it sound like what you're suggesting is a very special case.

> > That
> > behaviour is surprising, and it risks hiding some information silently.
> 
> Doesn't surprise me one bit.

Well, it shocks most of us. We expect the number of arguments to a function to match the number of parameters, and with format strings, you're basically declaring what the parameters are, and then the other arguments to format or writefln are the arguments to the format string. Most of us don't even think about swapping out the format string at runtime.

> > I
> > think format() is more correct here.
> 
> I think it has a bug that diminishes its usefulness.

It's a difference in design. It's only a bug if it's not what it was designed to do. And I think that it's clear that format was never designed to accept more arguments than format specifiers given that it's never worked that way. That doesn't necessarily mean that it _shouldn't_ work that way, but the only bug I see here is that the designs of writefln and format don't match. Which one is better designed is up for debate.

> > If you want a special behaviour you
> > should use a special function such as partialWritefln that ignores arguments
> > not present in the format string.
> 
> That behavior is not special.

Well, it's special enough that most of us seem to have never even thought of it, let alone thought that it was useful or a good idea.

I don't know whether it's really better to have format and writefln ignore extra arguments or not. My gut reaction is definitely that it's a very bad idea and will just lead to bugs. But clearly you have use cases for it and think that it's very useful. So, maybe it _is_ worth doing. But I'd be inclined to go with Bearophile's suggestion and have a wrapper function or alternate implementation handle the ignoring of extra arguments. Then it would be clear in the code that that's what was intended, and we would get the default behavior that most of us expect. An alternative would be a template argument to writefln and format that lets you choose which behavior you want and defaults to not ignoring arguments.
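One way the template-argument alternative might be sketched is shown below. The enum ExtraArgs and the function formatWith are hypothetical names invented for illustration, not part of Phobos; the sketch assumes only that formattedWrite returns the number of arguments it consumed.

```d
import std.array : appender;
import std.format : formattedWrite, FormatException;
import std.stdio : writeln;

// Hypothetical policy selecting how surplus arguments are treated.
enum ExtraArgs { reject, ignore }

// Hypothetical formatter: the policy is a compile-time argument
// and defaults to rejecting unused arguments.
string formatWith(ExtraArgs policy = ExtraArgs.reject, Args...)(string fmt, Args args)
{
    auto buf = appender!string();
    immutable used = formattedWrite(buf, fmt, args);
    static if (policy == ExtraArgs.reject)
    {
        if (used != args.length)
            throw new FormatException("unused format arguments");
    }
    return buf.data;
}

void main()
{
    writeln(formatWith("%d", 10));                        // default: strict
    writeln(formatWith!(ExtraArgs.ignore)("%d", 10, 10)); // opt-in: lax
}
```

With this shape the lax behavior is visible at the call site, which is the property the wrapper-function suggestion was after.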

- Jonathan M Davis

June 27, 2013

Re: Notes from C++ static analysis

Posted by Peter Williams
in reply to Andrei Alexandrescu

On 27/06/13 12:17, Andrei Alexandrescu wrote:
> On 6/26/13 1:31 PM, Andrej Mitrovic wrote:
>> On 6/26/13, Andrei Alexandrescu<SeeWebsiteForEmail@erdani.org>  wrote:
>>> Actually this is good because it allows customizing the format string
>>> to print only a subset of available information (I've actually used
>>> this).
>>
>> Note that this works:
>>
>> writefln("%d", x, x);
>>
>> But the following throws since v2.061:
>>
>> writeln(format("%d", x, x));
>>
>> std.format.FormatException@C:\dmd-git\dmd2\windows\bin\..\..\src\phobos\std\string.d(2346): Orphan format arguments: args[1..2]
>>
>> I find the latter to be quite useful for debugging code, and wanted
>> this feature for a long time.
>
> I think that's a bug in format that we need to fix.

While you're fixing it, can you modify it so that the format string can specify the order in which the arguments are substituted? This is very important for i18n. I apologize if it can already do this, but I was unable to find any documentation of format()'s format string other than examples with %s at the appropriate places.

Peter

June 27, 2013

Re: Notes from C++ static analysis

Posted by H. S. Teoh

On Thu, Jun 27, 2013 at 01:56:31PM +1000, Peter Williams wrote: [...]
> While you're fixing it can you modify it so that the format string can specify the order in which the arguments are replaced?  This is very important for i18n.  I apologize if it can already do this but I was unable to find any documentation of format()'s format string other than examples with %s at the appropriate places.
[...]

You can use positional arguments for this purpose. For example:

	writefln("%2$s %1$s", "a", "b");

outputs "b a".
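For the i18n case, a small sketch along the same lines: the two message templates and the values below are made up for illustration, and the same arguments are passed in the same order to both.

```d
import std.format : format;
import std.stdio : writeln;

void main()
{
    // Hypothetical templates whose word order differs.
    enum senderFirst = "%1$s sent %2$s files";
    enum countFirst = "%2$s files were sent by %1$s";

    // The call site stays the same; only the template changes.
    writeln(format(senderFirst, "Alice", 3)); // Alice sent 3 files
    writeln(format(countFirst, "Alice", 3));  // 3 files were sent by Alice
}
```

Because both templates consume every argument, this works with format as well as with writefln.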


T

-- 
The easy way is the wrong way, and the hard way is the stupid way. Pick one.
