Interpolated strings and SQL (page 7)

January 11, 2024

Re: Interpolated strings and SQL

Posted by Walter Bright
in reply to Timon Gehr

I'd like to see an example of how DIP1027 does not prevent an injection attack.

January 11, 2024

Re: Interpolated strings and SQL

Posted by Walter Bright
in reply to Timon Gehr

Please post an example of a problem it cannot detect.

January 11, 2024

Re: enum Format

Posted by Walter Bright
in reply to Richard (Rikki) Andrew Cattermole

On 1/11/2024 9:36 PM, Richard (Rikki) Andrew Cattermole wrote:
> Making things crash at runtime, because the compiler did not apply the knowledge it has is just ridiculous.
> 
> Imagine going to ``http://google.com/itsacrash`` and crashing Google.
> 
> Or pressing a button too fast on an airplane and suddenly the fuel pumps turn off and then refuse to turn back on.
> 
> Instead of the compiler catching clearly bad logic that it has a full understanding of, you're disrupting service and making people lose money. This is not a good thing.

I agree that compile time checking is preferable. But there is a cost involved, as I explained more fully in another post. It isn't free.

Since the format string is a compile time creature, not a user input feature, if the fault only happened when the code is deployed, it means the code was *never* executed before it was shipped.

This is an inexcusable failure for any avionics system, or any critical system, since we have simple tools that check coverage.

BTW, professional code is full of assert()s. Asserts check for faults in the code logic that are not the result of user input, but are the result of programming errors. We leave them as asserts because nobody knows how to get compilers to detect them, or it is too costly to detect them.

In other words, this is not an absolute thing. It's a weighing of cost and benefit.
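
For readers following along, the trade-off can be seen with what Phobos already offers. This is only a minimal sketch, assuming a reasonably recent `std.format`; the variable names are made up for illustration. The same format mismatch is a compile error when the format string is a template argument, and an exception on first execution when it is a runtime argument.

```d
import std.exception : assertThrown;
import std.format : format;

void main()
{
    int count = 3;

    // Format string as a template argument: validated while compiling.
    string ok = format!"%d items"(count);
    assert(ok == "3 items");

    // The same call with a floating-point argument is rejected by the compiler.
    static assert(!__traits(compiles, { auto s = format!"%d items"(4.03); }));

    // Format string as a runtime argument: the mismatch compiles and only
    // throws when this line actually executes.
    assertThrown(format("%d items", 4.03));
}
```

The runtime variant is the case the coverage argument relies on: the fault shows up the first time the line runs, but not before.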

January 12, 2024

Re: enum Format

Posted by Richard (Rikki) Andrew Cattermole
in reply to Walter Bright

On 12/01/2024 8:00 PM, Walter Bright wrote:
> On 1/11/2024 9:36 PM, Richard (Rikki) Andrew Cattermole wrote:
>> Making things crash at runtime, because the compiler did not apply the knowledge it has is just ridiculous.
>>
>> Imagine going to ``http://google.com/itsacrash`` and crashing Google.
>>
>> Or pressing a button too fast on an airplane and suddenly the fuel pumps turn off and then refuse to turn back on.
>>
>> Instead of the compiler catching clearly bad logic that it has a full understanding of, you're disrupting service and making people lose money. This is not a good thing.
> 
> I agree that compile time checking is preferable. But there is a cost involved, as I explained more fully in another post. It isn't free.
> 
> Since the format string is a compile time creature, not a user input feature, if the fault only happened when the code is deployed, it means the code was *never* executed before it was shipped.
> 
> This is an inexcusable failure for any avionics system, or any critical system, since we have simple tools that check coverage.
> 
> BTW, professional code is full of assert()s. Asserts check for faults in the code logic that are not the result of user input, but are the result of programming errors. We leave them as asserts because nobody knows how to get compilers to detect them, or it is too costly to detect them.
> 
> In other words, this is not an absolute thing. It's a weighing of cost and benefit.

So I guess the question is, do you want to hear from a company that they lost X amount of business because they used a language feature that could have caught errors at compile time, but instead continually crashed in a live environment?

I do not.

That would be a total embarrassment.

I have an identical problem currently with ``@mustuse``.
It errors out at runtime if you try to access the value without first checking whether it holds an error.

It is hell. I could never recommend such an error-prone design. I am only putting up with it until the language is capable of something better.

https://issues.dlang.org/show_bug.cgi?id=23998
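
To make the complaint concrete, here is a rough sketch of that kind of design. It is not the actual code behind the issue above: `Result`, `hasError`, and `get` are hypothetical names chosen for illustration. `@mustuse` stops the result from being discarded, but reading the value without checking for an error still compiles and only fails while the program runs.

```d
import core.attribute : mustuse;
import std.exception : assertThrown, enforce;

// Hypothetical result type, for illustration only.
@mustuse struct Result(T)
{
    private T value;
    private string error;   // null means "no error"
    private bool checked;   // set once the caller has inspected the error state

    bool hasError()
    {
        checked = true;
        return error !is null;
    }

    T get()
    {
        // The misuse is only detected here, at runtime.
        enforce(checked, "Result value read without checking for an error first");
        return value;
    }
}

void main()
{
    auto r = Result!int(42);

    // Compiles without complaint, throws when executed.
    assertThrown(r.get());

    // The intended usage: check first, then read.
    if (!r.hasError())
        assert(r.get() == 42);
}
```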

January 12, 2024

Re: enum Format

Posted by Richard (Rikki) Andrew Cattermole
in reply to Richard (Rikki) Andrew Cattermole

Let's try something different.

Would you like me to write a small specification for an alternative method for passing metadata from the call site into the body that would allow a string interpolation feature to not use extra templates while still being compile time based?

I described this to Adam Wilson yesterday:

```d
func(@metadata("hi!") 2);

void func(T)(T arg) {
	enum MetaData = __traits(getAttributes, arg);
	pragma(msg, MetaData);
}
```

This is essentially what 1036e is attempting to do, but it does it with extra templates.

January 12, 2024

Re: enum Format

Posted by zjh
in reply to Richard (Rikki) Andrew Cattermole

On Friday, 12 January 2024 at 07:31:49 UTC, Richard (Rikki) Andrew Cattermole wrote:
> Let's try something different.
>
>     func(@metadata("hi!") 2);
>
>     void func(T)(T arg) {
>         enum MetaData = __traits(getAttributes, arg);
>         pragma(msg, MetaData);
>     }

I think the D language could create an attribute dictionary for any building block.

That way, the attribute soup could be simplified. It would be even better to simplify how attributes are read and set, which would make extracting metadata easier.

January 12, 2024

Re: Compile Time vs Run Time

Posted by Paolo Invernizzi
in reply to Walter Bright

On Friday, 12 January 2024 at 06:06:52 UTC, Walter Bright wrote:
> On 1/9/2024 3:49 PM, Paolo Invernizzi wrote:
>> You are underestimating what can be gained as value in catching SQL problems at compile time instead of runtime. And, believe me, it's not a matter of mocking the DB and relying on unittest and coverage.
>
> Please expand on that. This is a very important topic. I want to know all the relevant facts.

As a preamble, we are _currently_ doing all the SQL validation against schemas at compile time: the semantics of the query, the correctness of the relations involved, type matching with D (and Elm) types, and the permissions granted to the roles performing the query.

That's not a problem at all, it's just something like:

   sql!`select foo from bar where baz > 1` [1]

In the same way we check also this:

  sql!`update foo set bag = ${d_variable_bag}`

But attaching the sanitising functionality to whatever is inside `d_variable_bag`, checking its type, and actually binding the content for the SQL protocol is done by mixins, after the sql!string instantiation. As you can guess, that is by far the most common usage; the business logic is FULL of stuff like that.

The security aspect comes from the fact that you _always_ need to sanitise the data content of the D variable; the mixin takes care of that part, and you can't skip it.

That said, unit testing at runtime can be done against a real db, or by mocking it.

A real db is onerous: sometimes you need additional licenses and resource management, and it's time consuming. Just imagine writing D code, but getting errors back not during compilation but only when the "autotester" CI task has completed!

Keep in mind that using a real db is very common, for one simple reason: mocking a db to the point of being useful for unit testing is a PITA. The common approach is simply to skip that and mock the _results_ of the data retrieved by the query, in order to unit test the business logic. The queries are not checked until they run against the dev db.

The compile-time solution, instead, gives you immediate feedback on wrong queries and wrong type bindings, and that's invaluable, especially for two fundamental things: code refactoring and schema changes.

If the DB schema is changed, the application simply does not compile anymore until you align it with the changed schema. The compiler gently points you to the pieces of code you need to adjust, and the same happens if you change a D type that is bound somewhere to a SQL parameter. So you can refactor without fear, and if the application compiles, you are assured that everything is aligned.

It's like extending the correctness of the type system down to the db type system, and it's priceless.

So, long story short: we will be forced to use mixins if we can't rely on CT interpolation, but having it would simplify the codebase.
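
For readers who have not seen this style of checking, a toy version of the idea might look like the following. It is only a sketch under heavy assumptions: the schema is a hard-coded list, the "parser" just looks at the word after `from`, and `knownTables`, `tableOf`, and this `sql` template are hypothetical stand-ins, not the production code described above.

```d
import std.algorithm.searching : canFind;
import std.array : split;

// Hypothetical schema, hard-coded for the sketch.
enum string[] knownTables = ["bar", "dev_samples"];

// Evaluated at compile time (CTFE): returns the word following "from", or null.
string tableOf(string query)
{
    auto words = query.split();
    foreach (i, w; words)
        if (w == "from" && i + 1 < words.length)
            return words[i + 1];
    return null;
}

// The query is a template argument, so it must be known at compile time.
template sql(string query)
{
    static assert(knownTables.canFind(tableOf(query)),
        "unknown table in query: " ~ query);
    enum sql = query;
}

void main()
{
    enum q = sql!`select foo from bar where baz > 1`;
    assert(q == "select foo from bar where baz > 1");

    // A query against a table that is not in the schema does not compile.
    static assert(!__traits(compiles, sql!`select foo from nowhere`));
}
```

Removing `"bar"` from `knownTables` turns the first query into a compile error, which is the "schema changed, application no longer builds" behaviour in miniature.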

[1] Well, queries can sometimes look like this:

    with
        dsx as (select face_id, bounding_box_px, gaze_yaw_deg, gaze_pitch_deg from dev_eyes where eye = ${sx}),
        ddx as (select face_id, bounding_box_px, gaze_yaw_deg, gaze_pitch_deg from dev_eyes where eye = ${dx})
    select
        dfc.bounding_box_px as face, dfc.expression, dby.center_z_mm,
        dsx.bounding_box_px as eye_sx, dsx.gaze_pitch_deg, dsx.gaze_yaw_deg,
        ddx.bounding_box_px as eye_dx, ddx.gaze_pitch_deg, ddx.gaze_yaw_deg
    from dev_samples
        left join dev_bodies as dby using(sample_id)
        left join dev_faces as dfc using(body_id)
        left join dsx using(face_id)
        left join ddx using(face_id)
    where dev_samples.device_id = ${deviceId}
        and system_timestamp_ms = (select max(system_timestamp_ms) from dev_samples where dev_samples.device_id=${deviceId})
        and dfc.bounding_box_px is not null
    order by dby.center_z_mm

January 12, 2024

Re: enum Format

Posted by Timon Gehr
in reply to Walter Bright

On 1/12/24 06:28, Walter Bright wrote:
> On 1/11/2024 11:50 AM, Timon Gehr wrote:
>> On 1/11/24 03:21, Walter Bright wrote:
>>> As for it being a required feature of string interpolation to do this processing at compile time, that's a nice feature, not a must have.
>>
>> As far as I am concerned it is a must-have. For example, this is what prevents the SQL injection attack, it's a safety guarantee.
> 
> Why does compile time make it a guarantee and runtime not?
> ...

Because a SQL injection attack by definition is when a third party can control safety-critical parts of your SQL query at runtime.

The very fact that the whole prepared SQL query is known at compile-time, with runtime data only entering through the placeholders, conclusively rules this out. If the SQL query is constructed at runtime based on runtime data, `execi` is unable to check whether an SQL injection vulnerability is present.

> We do array bounds checking at runtime.

You can check array bounds at runtime. You cannot check where a runtime-known string came from at runtime. It's simply not possible.
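
A small sketch may help illustrate the distinction. This is not `execi` itself: `Prepared` and `prepare` are hypothetical names, and real placeholder binding is reduced to collecting strings. The point is only that when the query text is a template argument, a query built from runtime data cannot be passed at all.

```d
import std.conv : to;

// Hypothetical stand-in for a prepared statement: SQL text plus separately bound values.
struct Prepared
{
    string sql;
    string[] params;
}

// The SQL text is a template argument, so only compile-time constants are accepted.
Prepared prepare(string sql, Args...)(Args args)
{
    string[] bound;
    foreach (arg; args)
        bound ~= arg.to!string;   // values travel beside the query, never inside it
    return Prepared(sql, bound);
}

void main()
{
    string userInput = "1; drop table users";   // hostile runtime data

    auto p = prepare!"select * from users where id = ?"(userInput);
    assert(p.sql == "select * from users where id = ?");
    assert(p.params == ["1; drop table users"]);

    // A query assembled from runtime data cannot be used as the template argument.
    static assert(!__traits(compiles, prepare!("select " ~ userInput)()));
}
```

A function that takes the query as an ordinary `string` parameter has no way to make that last distinction, because both kinds of string look the same once the program is running.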

January 12, 2024

Re: enum Format

Posted by Timon Gehr
in reply to Walter Bright

On 1/12/24 06:33, Walter Bright wrote:
> On 1/11/2024 11:45 AM, Timon Gehr wrote:
>> My point was with DIP1036e it either works or does not compile, not that you called the wrong function.
> 
> What's missing is why is a runtime check not good enough?

There is no runtime check, it just does the wrong thing.

> The D compiler emits more than one safety check at runtime. For example, array bounds checking, and switch statement default checks.

Sure.

January 12, 2024

Re: Compile Time vs Run Time

Posted by Timon Gehr
in reply to Walter Bright

On 1/12/24 07:06, Walter Bright wrote:
> 
> The compile-time vs runtime issue is the only thing left standing where the advantage goes to DIP1036.

This is not true; DIP1027 also suffers from other drawbacks. For example:

- DIP1027 has already been rejected.
- Format string has to be passed as a runtime argument.
- Format string has to be parsed. (Whether at runtime or compile time.)
- Format string is not transparent to the library user, they have to manually escape '%'.
- No simple way to detect the end of the part of the argument list that is part of the istring.
- Cannot support nested istrings. (I guess the `enum Format: string;` would mitigate this to some extent.)
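
The '%' point can be illustrated with plain `std.format`, writing out by hand roughly what a DIP1027-style lowering would hand to the callee (a format string followed by the arguments). This is only a sketch; the variable name and text are made up.

```d
import std.format : format;

void main()
{
    int pct = 50;

    // Hand-written equivalent of the lowered form: format string first,
    // then the arguments. The literal '%' has to be doubled so that it is
    // not read as the start of a format specifier.
    string s = format("%s%% done", pct);
    assert(s == "50% done");
}
```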

DIP1027 has the following advantages:
- No interspersed runtime arguments that carry no runtime data, which makes it a bit easier to consume.
- Fewer template instantiations.


In any case, I think the compile-time vs runtime issue is the most significant. I do not want a solution that does not integrate well with metaprogramming, it's just not worth it.
