Thread overview
About Format String Attack for D's *writef*()
Oct 05, 2006
is91042
Oct 05, 2006
Derek Parnell
Oct 05, 2006
Lionello Lunesu
October 05, 2006
Format string attacks are a problem of C's printf().

Consider the following C code:

	char name[100];
	printf("Please input your name: ");
	scanf("%99s", name);	/* width limit keeps the input inside name[] */
	printf(name);		/* user input is used as the format string */

If a user types "John", the output is:

	Please input your name: John
	John

However, the user may type "John%sWang" and get:

	Please input your name: John%sWang
	JohnJohn%sWangWang (The exact output depends on the runtime environment, since the behavior is undefined.)

These bugs can be prevented by telling programmers:
"Never pass user input as the first argument of printf()",
because printf() interprets only its first parameter as a format string.
But the same rule is not enough to prevent such bugs with *writef*(),
because *writef*() is more powerful.
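
For the C case, that rule can be sketched as follows: the format string stays a literal and the user's text is passed only as an argument. The helper name greet_safe is hypothetical, and snprintf() is used instead of printf() only so that the result can be inspected.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: build a greeting in `out` without ever using the
   user's text as the format string. Returns what snprintf() returns. */
int greet_safe(char *out, size_t out_size, const char *name)
{
    assert(out != NULL && name != NULL);
    /* The format string is a literal; `name` is only data for %s,
       so any '%' inside it is copied through unchanged. */
    return snprintf(out, out_size, "Your name is %s.", name);
}
```

Called with the name "John%sWang", this should store "Your name is John%sWang." in the buffer, because the %s inside the name is never interpreted.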

The problem is *writef*() can interpret not only the first but also many
parameters as format strings.
Consider the following code.

	int x=123, y=321;
	writefln("This is a test: %s. ", x,
		"And this is another test: %s.", y);
	writefln("This is a test: %s.",
			"And this is another test: %s.", x, y);

And the output will be:

	This is a test: 123. And this is another test: 321.
	This is a test: And this is another test: %s..123321

This shows that *writef*() interprets any string argument as a format string if it was not consumed by an earlier format string.

Consider the following code.

	char[] user_name;
	writefln("Please Input your name: ");
	din.readf("%s", &user_name);
	writefln("Your name is ", user_name, ". And my name is Peter.");

If a user types "John", the output is:

	Please Input your name:
	John
	Your name is John. And my name is Peter.

However, the user may type "John%sWang":

	Please Input your name:
	John%sWang
	Your name is John. And my name is Peter.Wang

This behavior is strange and not what we expect.

Although we could use the same approach and require programmers to put a "%s" argument before every string that users can influence, I think it is not a good policy, because it puts a heavy extra burden on programmers and loses the convenience of *writef* treating many arguments as format strings.

So I suggest a solution: add a new type 'fstring', meaning "format string", and have *writef*() treat fstrings and strings differently. If a string is encountered, it is output as-is. If an fstring is encountered, it is interpreted as a format string, as before.

Moreover, to make creating an fstring easy, we could write the literal between f" and ". For example:

	writefln(f"Your name is %s", user_name, ". And my name is Peter.");
October 05, 2006
On Thu, 5 Oct 2006 07:01:30 +0000 (UTC), is91042 wrote:


> The problem is *writef*() can interpret not only the first but also many parameters as format strings.

Agreed.

The way I handle this is to only use the first parameter to specify the formatting tokens, and to specify one for each subsequent parameter.

Another is to make any user-entered data safe.

For example:

 import std.stdio;
 import std.cstream;
 import std.string;

 // Replace all occurrences of '%' with '%%'
 char[] safe(char[] a)
 {
    int i;
    int j;
    j = 0;
    while(j < a.length)
    {
        i = std.string.find(a[j..$], '%');
        if (i < 0)
            break;
        i += j;
        a = a[0..i] ~ "%" ~ a[i..$];
        j = i + 2;
    }

    return a;
 }

 void main()
 {
    char[] user_name;
    writefln("Please Input your name: ");
    din.readf("%s", &user_name);

    // Safer
    writefln("A,Your name is ", safe(user_name),
             ". And my name is Peter.");

    // My preference
    writefln("B,Your name is %s. And my name is Peter.", user_name);
 }

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Down with mediocrity!"
5/10/2006 6:01:41 PM
October 05, 2006
is91042 wrote:

> The problem is *writef*() can interpret not only the first but also many
> parameters as format strings.

This is a feature, not a bug...

> This shows that *writef*() interprets any string argument as a format
> string if it was not consumed by an earlier format string.
> 
> Consider the following code.
> 
> 	char[] user_name;
> 	writefln("Please Input your name: ");
> 	din.readf("%s", &user_name);
> 	writefln("Your name is ", user_name, ". And my name is Peter.");

This is the expected behaviour with writef, need to use "%s".
You get the same with printf, if you concatenate the strings.

Which is why I think using printf (in C) and writef (in D)
*by default* isn't very nice to newcomers, as it is harder...

There should be a simple function that just outputs a string.

> This behavior is strange and not what we expect.

You get the same "odd" behaviour in: writef("100% unexpected");
(need to escape % by using %%, when you specify a format string)
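
In C terms, the escaping mentioned above could be sketched like this. The helper name escape_percent and its return convention are assumptions made for illustration, not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy `src` into `dst`, doubling every '%' so the
   result is inert if it is later used as a printf-style format string.
   Returns 0 on success, -1 if `dst` is too small. */
int escape_percent(char *dst, size_t dst_size, const char *src)
{
    size_t j = 0;
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; src[i] != '\0'; i++) {
        size_t need = (src[i] == '%') ? 2 : 1;
        if (j + need >= dst_size)   /* keep room for the terminator */
            return -1;
        if (src[i] == '%')
            dst[j++] = '%';         /* extra '%' in front of the original */
        dst[j++] = src[i];
    }
    if (j >= dst_size)              /* covers dst_size == 0 */
        return -1;
    dst[j] = '\0';
    return 0;
}
```

With the input "100% unexpected", the buffer should end up holding "100%% unexpected", which a format function then prints as the original text.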

> Although we could use the same approach and require programmers to put
> a "%s" argument before every string that users can influence, I think
> it is not a good policy, because it puts a heavy extra burden on
> programmers and loses the convenience of *writef* treating many
> arguments as format strings.
> 
> So I suggest a solution: add a new type 'fstring', meaning "format
> string", and have *writef*() treat fstrings and strings differently.
> If a string is encountered, it is output as-is. If an fstring is
> encountered, it is interpreted as a format string, as before.

My suggestion was to instead add a "write" function, that would not
interpret the format character '%' but just output the string as-is ?

writeln("100% easier");
writeln("Your name is ", user_name, ". And my name is Peter.");
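
For comparison, C already has such a call in fputs(), which writes a string without interpreting '%'. A rough sketch of a multi-argument "write" built on top of it follows. The helper name write_pieces and its array-based signature are assumptions for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write each piece verbatim to `out`. fputs() does
   not interpret format specifiers, so '%' has no special meaning here.
   Returns the number of bytes written, or -1 on a write error. */
long write_pieces(FILE *out, const char *const *pieces, size_t count)
{
    long total = 0;
    assert(out != NULL && pieces != NULL);
    for (size_t i = 0; i < count; i++) {
        if (fputs(pieces[i], out) == EOF)
            return -1;
        total += (long)strlen(pieces[i]);
    }
    return total;
}
```

Passing the three pieces "Your name is ", "John%sWang" and ". And my name is Peter." should write them back to back, with the %s left untouched.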

See http://www.digitalmars.com/d/archives/digitalmars/D/21692.html
and http://www.digitalmars.com/d/archives/digitalmars/D/15627.html

--anders
October 05, 2006
Anders F Björklund wrote:
> is91042 wrote:
> 
>> The problem is *writef*() can interpret not only the first but also many
>> parameters as format strings.
> 
> This is a feature, not a bug...
> 
>> This shows that *writef*() interprets any string argument as a format
>> string if it was not consumed by an earlier format string.
>>
>> Consider the following code.
>>
>>     char[] user_name;
>>     writefln("Please Input your name: ");
>>     din.readf("%s", &user_name);
>>     writefln("Your name is ", user_name, ". And my name is Peter.");
> 
> This is the expected behaviour with writef, need to use "%s".
> You get the same with printf, if you concatenate the strings.
> 
> Which is why I think using printf (in C) and writef (in D)
> *by default* isn't very nice to newcomers, as it is harder...
> 
> There should be a simple function that just outputs a string.
> 
>> This behavior is strange and not what we expect.
> 
> You get the same "odd" behaviour in: writef("100% unexpected");
> (need to escape % by using %%, when you specify a format string)
> 
>> Although we could use the same approach and require programmers to put
>> a "%s" argument before every string that users can influence, I think
>> it is not a good policy, because it puts a heavy extra burden on
>> programmers and loses the convenience of *writef* treating many
>> arguments as format strings.
>>
>> So I suggest a solution: add a new type 'fstring', meaning "format
>> string", and have *writef*() treat fstrings and strings differently.
>> If a string is encountered, it is output as-is. If an fstring is
>> encountered, it is interpreted as a format string, as before.
> 
> My suggestion was to instead add a "write" function, that would not
> interpret the format character '%' but just output the string as-is ?

Good idea.
October 05, 2006
is91042 wrote:

> Consider the following code.
> 
> 	char[] user_name;
> 	writefln("Please Input your name: ");
> 	din.readf("%s", &user_name);
> 	writefln("Your name is ", user_name, ". And my name is Peter.");

BTW; "din" does not work in GDC on the Mac:
(i.e. std.stream.readf doesn't, actually...)

Please Input your name:
Anders
Your name is . And my name is Peter.


This is because there is no portable D standard
for how "typeid comparison" is supposed to work ?

In DMD, one typeid === another. In GDC, only ==.
(meaning that "arguments[j] is typeid()" breaks)


And I think that readf should go in std.stdio...
(along with freadf, and also std.string.unformat)

http://www.digitalmars.com/d/archives/digitalmars/D/11021.html

--anders