Interpolated strings and SQL
January 08
Here's how SQL support is done for DIP1036:

https://github.com/adamdruppe/interpolation-examples/blob/master/lib/sql.d

```
auto execi(Args...)(Sqlite db, InterpolationHeader header, Args args, InterpolationFooter footer) {
    import arsd.sqlite;

    // sqlite lets you do ?1, ?2, etc

    enum string query = () {
        string sql;
        int number;
        import std.conv;
        foreach(idx, arg; Args)
            static if(is(arg == InterpolatedLiteral!str, string str))
                sql ~= str;
            else static if(is(arg == InterpolationHeader) || is(arg == InterpolationFooter))
                throw new Exception("Nested interpolation not supported");
            else static if(is(arg == InterpolatedExpression!code, string code))
                {   } // just skip it
            else
                sql ~= "?" ~ to!string(++number);
        return sql;
    }();

    auto statement = Statement(db, query);
    int number;
    foreach(arg; args) {
        static if(!isInterpolatedMetadata!(typeof(arg)))
            statement.bind(++number, arg);
    }

    return statement.execute();
}
```
This:

1. The istring, after being converted to a tuple of arguments, is passed to the `execi` template.
2. It loops over the arguments, essentially turning it (ironically!) back into a format
string. The formats, instead of %s, are ?1, ?2, ?3, etc.
3. It skips all the Interpolation arguments inserted by DIP1036.
4. The remaining arguments are each bound to the indices 1, 2, 3, ...
5. Then it executes the sql statement.

Note that nested istrings are not supported.
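
To make step 2 concrete, here is a minimal sketch of just the query-building part, with no database involved. The four marker structs are hypothetical stand-ins declared locally so that the snippet compiles on its own; they are not the definitions from the linked repository, and `placeholderQuery` is a name invented for this illustration.

```d
import std.conv : to;

// Hypothetical stand-ins for the DIP1036 marker types (assumption:
// only their shape matters for this sketch).
struct InterpolationHeader {}
struct InterpolationFooter {}
struct InterpolatedLiteral(string text) {}
struct InterpolatedExpression(string code) {}

// Builds the "?1, ?2, ..." query from the argument *types* alone,
// so the whole thing can be evaluated at compile time.
enum string placeholderQuery(Args...) = () {
    string sql;
    int number;
    foreach (T; Args)
    {
        static if (is(T == InterpolatedLiteral!str, string str))
            sql ~= str;                        // literal text is copied verbatim
        else static if (is(T == InterpolatedExpression!code, string code))
        {
            // metadata only: the value itself is the next argument
        }
        else
            sql ~= "?" ~ to!string(++number);  // a real value becomes ?N
    }
    return sql;
}();

void main()
{
    // Roughly the type list that
    // i"SELECT * FROM items WHERE id = $(id) AND name = $(name)"
    // would produce, minus header and footer.
    enum query = placeholderQuery!(
        InterpolatedLiteral!"SELECT * FROM items WHERE id = ",
        InterpolatedExpression!"id", int,
        InterpolatedLiteral!" AND name = ",
        InterpolatedExpression!"name", string);
    static assert(query == "SELECT * FROM items WHERE id = ?1 AND name = ?2");
}
```

Since the literals live in the types, nothing here looks at run-time values, which is why the result can be an `enum`.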

Let's see how this can work with DIP1027:

```
auto execi(Args...)(Sqlite db, Args args) {
    import arsd.sqlite;

    // sqlite lets you do ?1, ?2, etc

    enum string query = () {
        string sql;
        int number;
        import std.conv;
        auto fmt = arg[0];
        for (size_t i = 0; i < fmt.length; ++i)
        {
            char c = fmt[i];
            if (c == '%' && i + 1 < fmt.length && fmt[i + 1] == 's')
            {
                sql ~= "?" ~ to!string(++number);
                ++i;
            }
            else if (c == '%' && i + 1 < fmt.length && fmt[i + 1] == '%')
            {
                sql ~= '%';  // "%%" is an escaped %, so emit a single literal %
                ++i;
            }
            else
                sql ~= c;
        }
        return sql;
    }();

    auto statement = Statement(db, query);
    int number;
    foreach(arg; args[1 .. args.length]) {
        statement.bind(++number, arg);
    }

    return statement.execute();
}
```
This:

1. The istring, after being converted to a tuple of arguments, is passed to the `execi` template.
2. The first tuple element is the format string.
3. A replacement format string is created by replacing all instances of "%s" with
"?n", where `n` is the index of the corresponding arg.
4. The replacement format string is bound to `statement`, and the arguments are bound
to their indices.
5. Then it executes the sql statement.

It is equivalent.
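
The format-string rewriting in step 3 can be tried in isolation. The following is only a sketch: `toNumberedPlaceholders` is a hypothetical helper name, and it assumes that `%%` in the format string should come out as a single literal `%`. Because it is an ordinary function, it runs at run time, and it also runs under CTFE whenever its input is a compile-time constant.

```d
import std.conv : to;

// Hypothetical helper: rewrites a DIP1027-style format string into
// SQLite's numbered-placeholder form.
string toNumberedPlaceholders(string fmt)
{
    string sql;
    int number;
    for (size_t i = 0; i < fmt.length; ++i)
    {
        immutable c = fmt[i];
        immutable hasNext = i + 1 < fmt.length;
        if (c == '%' && hasNext && fmt[i + 1] == 's')
        {
            sql ~= "?" ~ to!string(++number); // %s -> ?N
            ++i;
        }
        else if (c == '%' && hasNext && fmt[i + 1] == '%')
        {
            sql ~= '%';                       // %% -> literal % (assumption)
            ++i;
        }
        else
            sql ~= c;
    }
    return sql;
}

void main()
{
    // Run time: the format string is an ordinary value.
    string fmt = "SELECT a FROM t WHERE x = %s AND y LIKE %s";
    assert(toNumberedPlaceholders(fmt) == "SELECT a FROM t WHERE x = ?1 AND y LIKE ?2");

    // Compile time: possible only because the input is a constant here.
    static assert(toNumberedPlaceholders("100%% of %s") == "100% of ?1");
}
```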
January 09

Hello. It is fascinating to see string interpolation in D. Let me try to shed some light on it; I hope my thoughts will be useful.

  1. First of all, I’d like to note that in the DIP1027 variant of the code we see:

    > auto fmt = arg[0];

    (`arg` is an undeclared identifier here; I presume `args` was meant.) There is a problem: this line is executed at CTFE, but it cannot access `args`, which is a runtime parameter of `execi`. For this to work, the format string should go to a template parameter, and interpolated expressions should go to runtime parameters. How can DIP1027 accomplish this?

  2. > Note that nested istrings are not supported.

    To clarify: “not supported” means one cannot write

    db.execi(i"SELECT field FROM items WHERE server = $(i"europe$(number)")");
    

    Instead, you have to be more explicit about what you want the inner string to become. This is legal:

    db.execi(i"SELECT field FROM items WHERE server = $(i"europe$(number)".text)");
    

    However, it is not hard to adjust execi so that it fully supports nested istrings:

    struct Span {
        size_t i, j;
        bool topLevel;
    }
    
    enum segregatedInterpolations(Args...) = {
        Span[ ] result;
        size_t processedTill;
        size_t depth;
        static foreach (i, T; Args)
            static if (is(T == InterpolationHeader)) {
                if (!depth++) {
                    result ~= Span(processedTill, i, true);
                    processedTill = i;
                }
            } else static if (is(T == InterpolationFooter))
                if (!--depth) {
                    result ~= Span(processedTill, i + 1);
                    processedTill = i + 1;
                }
        return result;
    }();
    
    auto execi(Args...)(Sqlite db, InterpolationHeader header, Args args, InterpolationFooter footer) {
        import std.conv: text, to;
        import arsd.sqlite;
    
        // sqlite lets you do ?1, ?2, etc
    
        enum string query = () {
            string sql;
            int number;
            static foreach (span; segregatedInterpolations!Args)
                static if (span.topLevel) {
                    static foreach (T; Args[span.i .. span.j])
                        static if (is(T == InterpolatedLiteral!str, string str))
                            sql ~= str;
                        else static if (is(T == InterpolatedExpression!code, string code))
                            sql ~= "?" ~ to!string(++number);
                }
            return sql;
        }();
    
        auto statement = Statement(db, query);
        int number;
        static foreach (span; segregatedInterpolations!Args)
            static if (span.topLevel) {
                static foreach (arg; args[span.i .. span.j])
                    static if (!isInterpolatedMetadata!(typeof(arg)))
                        statement.bind(++number, arg);
            } else // Convert a nested interpolation to string with `.text`.
                statement.bind(++number, args[span.i .. span.j].text);
    
        return statement.execute();
    }
    

    Here, we just invoke .text on nested istrings. A more advanced implementation would allocate a buffer and reuse it. It could even be @nogc if it wanted.

  3. DIP1036 appeals more to me because it passes rich, high-level information about parts of the string. With DIP1027, on the other hand, we have to extract that information ourselves by parsing the string character by character. But the compiler already tokenized the string; why do we have to do it again? (And no, lower level doesn’t imply broader possibilities here.)

    It may have another implication: looping over characters might put the current CTFE engine in trouble if strings are large. Many more iterations need to be executed, and more memory is consumed in the process. We certainly need numbers here, but I thought it was important to at least bring attention to this point.

  4. What I don’t like in both DIPs is a rather arbitrary selection of meta characters: $, $$ and %s. In regular strings, all of them are just normal characters; in istrings, they gain special meaning.

    I suppose a cleaner way would be to use \(...) syntax (like in Swift). So i"a \(x) b" interpolates x while "a \(x) b" is an immediate syntax error. First, it helps to catch bugs caused by missing i. Second, the question, how do we escape $, gets the most straightforward answer: we don’t.

    A downside is that parentheses will always be required with this syntax. But the community preferred them anyway even with $.

January 09

I’ve just realized DIP1036 has an excellent feature that is not evident right away. Look at the signature of execi:

auto execi(Args...)(Sqlite db, InterpolationHeader header, Args args, InterpolationFooter footer) { ... }

InterpolationHeader/InterpolationFooter require you to pass an istring. Consider this example:

db.execi(i"INSERT INTO items VALUES ($(x))".text);

Here, we accidentally added .text. It would be an SQL injection… but the compiler rejects it! typeof(i"...".text) is string, and execi cannot be called with (Sqlite, string).
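
A small sketch of that rejection, under stated assumptions: `InterpolationHeader` and `InterpolationFooter` below are empty stand-in structs, `Db` is a placeholder for `Sqlite`, and this `execi` only counts its arguments instead of talking to a database. The calls are written out by hand in the form the compiler would generate for an istring.

```d
// Hypothetical stand-ins for the DIP1036 marker types.
struct InterpolationHeader {}
struct InterpolationFooter {}

// Placeholder for the real Sqlite handle.
struct Db {}

// Same parameter shape as the execi signature above.
int execi(Args...)(Db db, InterpolationHeader header, Args args, InterpolationFooter footer)
{
    return cast(int) Args.length; // how many arguments sit between the markers
}

void main()
{
    Db db;

    // Header, one value, footer: this matches the signature.
    assert(execi(db, InterpolationHeader(), 42, InterpolationFooter()) == 1);

    // An already-flattened string carries no header or footer,
    // so the call does not compile.
    static assert(!__traits(compiles, execi(db, "INSERT INTO items VALUES (42)")));
}
```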

January 09

On Tuesday, 9 January 2024 at 07:30:57 UTC, Nickolay Bukreyev wrote:

> However, it is not hard to adjust execi so that it fully supports nested istrings:

Shame on me. segregatedInterpolations(Args...) should end with this:

result ~= Span(processedTill, Args.length, true);
return result;
January 09
Thank you for your thoughts!

On 1/8/2024 11:30 PM, Nickolay Bukreyev wrote:
> 1.  First of all, I’d like to note that in the DIP1027 variant of the code we see:
> 
>      > `auto fmt = arg[0];`
> 
>      (`arg` is undeclared identifier here; I presume `args` was meant.)

Yes. I don't have sql on my system, so didn't try to compile it. I always make typos. Oof.

> There is a problem: this line is executed at CTFE,

It's executed at runtime. The code is not optimized for speed, I just wanted to show the concept. The speed doesn't particularly matter, because after all this is a call to a database which is going to be slow. Anyhow, DIP1036 also uses unoptimized code here.


> 3.  DIP1036 appeals more to me because it passes rich, high-level information about parts of the string. With DIP1027, on the other hand, we have to extract that information ourselves by parsing the string character by character. But the compiler already tokenized the string; why do we have to do it again? (And no, lower level doesn’t imply broader possibilities here.)

DIP1036 also builds a new format string.

>      It may have another implication: looping over characters might put the current CTFE engine in trouble if strings are large. Many more iterations need to be executed, and more memory is consumed in the process. We certainly need numbers here, but I thought it was important to at least bring attention to this point.

It happens at runtime.


> 4.  What I don’t like in both DIPs is a rather arbitrary selection of meta characters: `$`, `$$` and `%s`. In regular strings, all of them are just normal characters; in istrings, they gain special meaning.

I looked at several schemes, and picked `$` because it looked the nicest.

>      I suppose a cleaner way would be to use `\(...)` syntax (like in Swift). So `i"a \(x) b"` interpolates `x` while `"a \(x) b"` is an immediate syntax error. First, it helps to catch bugs caused by missing `i`.

I'm sorry to say, that looks like tty noise. Aesthetic appeal is a very important design consideration for D.

> Second, the question, how do we escape `$`, gets the most straightforward answer: we don’t.

It will rarely need to be escaped, but when one does need it, one needs it!


>      A downside is that parentheses will always be required with this syntax. But the community preferred them anyway even with `$`.

DIP1027 does not require ( ) if it's just an identifier. That makes for the shortest, simplest istring syntax. The ( ) usage will be relatively rare. The idea is the most common cases should require the least syntactical noise.

Also, the reason I picked the SQL example is because that is the one most cited as being needed and in showing the power of DIP1036 and because I was told that DIP1027 couldn't do it :-)

The intent of DIP1027 is not to provide the most powerful, richest mechanism. It's meant to be the simplest I could think of, with the most attractive appearance, minimal runtime overhead, while handling the meat and potatoes use cases.
January 09
On Tuesday, 9 January 2024 at 08:29:08 UTC, Walter Bright wrote:
> The intent of DIP1027 is not to provide the most powerful, richest mechanism. It's meant to be the simplest I could think of, with the most attractive appearance, minimal runtime overhead, while handling the meat and potatoes use cases.

If that's the case, then 1036 wins imho, simply by not doing any parsing of a format string.

Note that other use cases might not require building a format string.

What about logging functionality?

In the case of 1036, a log function could just dump all the text into a sink directly; for 1027 it would still need to parse the format string to find where to inject the arguments. This use case makes 1036 more favourable than 1027, by your own criteria for a good mechanism.
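
As a rough illustration of the 1036 side of this, here is a sketch of such a log function. Everything in it is hypothetical: the marker structs are local stand-ins, `logTo` is an invented name, and a plain `string` plays the role of the sink. The call in `main` is written out by hand in the form an istring would be lowered to.

```d
import std.conv : to;

// Hypothetical stand-ins for the DIP1036 marker types.
struct InterpolationHeader {}
struct InterpolationFooter {}
struct InterpolatedLiteral(string text) {}
struct InterpolatedExpression(string code) {}

// Appends every piece to the sink in order; no format string is parsed.
void logTo(Args...)(ref string sink, InterpolationHeader, Args args, InterpolationFooter)
{
    foreach (arg; args)
    {
        alias T = typeof(arg);
        static if (is(T == InterpolatedLiteral!str, string str))
            sink ~= str;            // literal text, known from the type
        else static if (is(T == InterpolatedExpression!code, string code))
        {
            // source text of the expression; not needed for plain output
        }
        else
            sink ~= to!string(arg); // an actual value
    }
}

void main()
{
    string sink;
    int n = 3;
    // Hand-written equivalent of i"found $(n) items".
    logTo(sink, InterpolationHeader(),
          InterpolatedLiteral!"found "(), InterpolatedExpression!"n"(), n,
          InterpolatedLiteral!" items"(), InterpolationFooter());
    assert(sink == "found 3 items");
}
```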
January 09

On Tuesday, 9 January 2024 at 08:29:08 UTC, Walter Bright wrote:

> It happens at runtime.

No. This line is inside enum string query = () { ... }();. So CTFE-performance considerations do apply.
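
A tiny self-contained sketch of why the `enum` matters (the function name `evaluatedAt` is made up for the illustration): an `enum` initializer has to be evaluated by the compiler, whereas an ordinary local is computed when the program runs. The built-in `__ctfe` variable makes the difference visible.

```d
// Reports which evaluation mode the call ran in. `__ctfe` is true only
// while the compiler is interpreting the function at compile time.
string evaluatedAt()
{
    return __ctfe ? "compile time" : "run time";
}

void main()
{
    enum viaEnum = evaluatedAt(); // enum initializer: must go through CTFE
    auto viaAuto = evaluatedAt(); // ordinary local: evaluated when main runs

    static assert(viaEnum == "compile time");
    assert(viaAuto == "run time");
}
```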

> I'm sorry to say, that looks like tty noise.

That’s sad. In my opinion, it is at least as readable, plus I see a few objective advantages in it. We don’t have to agree on this though.

> It will rarely need to be escaped, but when one does need it, one needs it!

Yes, but I see a benefit in reducing the number of characters that have to be escaped in the first place. While $ rarely appeared in examples we’ve been thinking of so far, if someone faces a need to create a string full of dollars, escaping them all will uglify the string.

> DIP1027 does not require ( ) if it's just an identifier. That makes for the shortest, simplest
> istring syntax. The ( ) usage will be relatively rare. The idea is the most common cases should
> require the least syntactical noise.

Totally agree. Personally, I prefer omitting parentheses in interpolations when a language supports such syntax, but it’s a matter of taste.

> Also, the reason I picked the SQL example is because that is the one most cited as being needed
> and in showing the power of DIP1036 and because I was told that DIP1027 couldn't do it :-)

DIP1027 is unable to do it at compile time. I cannot argue that compile-time string creation doesn’t give us much if we call an SQL engine afterwards. So we need another example where CTFE-ability is desired. Alexandru Ermicioi asked about logging; I agree it is nice to rule out format-string parsing from every log call.

January 09

On Tuesday, 9 January 2024 at 07:30:57 UTC, Nickolay Bukreyev wrote:

> I suppose a cleaner way would be to use \(...) syntax (like in Swift).

Also, when I said “like in Swift”, I in no way meant “Swift has it, therefore D should do the same.” I only meant that there is at least one other language that does it this way.

January 09

On Tuesday, 9 January 2024 at 09:25:28 UTC, Nickolay Bukreyev wrote:

> DIP1027 is unable to do it at compile time. I cannot argue that compile-time string creation doesn’t give us much if we call an SQL engine afterwards. So we need another example where CTFE-ability is desired. Alexandru Ermicioi asked about logging; I agree it is nice to rule out format-string parsing from every log call.

Compile-time string creation when dealing with SQL gives you the ability to validate the string for correctness at compile time.

Here is an example of what we are doing internally:

pinver@utumno fieldmanager % bin/yab build ldc_lab_mac_i64_dg
2024-01-09T10:48:07.889 [info] melkor.d:235:executeReadyLabel executing ldc_lab_mac_i64_dg:
/Users/pinver/dlang/ldc-1.36.0/bin/ldc2     -preview=dip1000 -i -Isrc -mtriple=x86_64-apple-darwin --vcolumns -J/Users/pinver/Lembas --d-version=env_dev_ --d-version=listen_for_nx_ --d-version=disable_ssl --d-version=disable_fixations --d-version=disable_metrics --d-version=disable_aggregator --d-debug -g -of/Users/pinver/Projects/DeepGlance/fieldmanager/bin/lab_mac_i64_dg /Users/pinver/Projects/DeepGlance/fieldmanager/src/application.d src/sbx/raygui/c_raygui.c
2024-01-09T10:48:13.423 [error] melkor.d:247:executeReadyLabel build failed:
src/ops/sql/semantics.d(489,31): Error: uncaught CTFE exception `object.Exception("42P01: relation \"snapshotsssss\" does not exist. SQL: select size_mm, size_px from snapshotsssss where snapshot_id = $1")`
src/api3.d(41,9):        thrown from here
src/api3.d(51,43):        called from here: `checkSql(Schema("public", ["aggregators":Table("aggregators", ["aggregated_till":Column("aggregated_till", Type.timestamp, true, false), "touchpoint_id":Column("touchpoint_id", Type.smallint, true, false)], [], [], ["pinver", "ipsos_analysis_operator", "i
/Users/pinver/Projects/DeepGlance/fieldmanager/src/application.d(644,45): Error: template instance `api3.forgeSqlCheckerForSchema!(Schema("public", ["aggregators":Table("aggregators", ["aggregated_till":Column("aggregated_till", Type.timestamp, true, false), "touchpoint_id":Column("touchpoint_id", T

or


pinver@utumno fieldmanager % bin/yab build ldc_lab_mac_i64_dg
2024-01-09T10:52:36.220 [info] melkor.d:235:executeReadyLabel executing ldc_lab_mac_i64_dg:
/Users/pinver/dlang/ldc-1.36.0/bin/ldc2     -preview=dip1000 -i -Isrc -mtriple=x86_64-apple-darwin --vcolumns -J/Users/pinver/Lembas --d-version=env_dev_ --d-version=listen_for_nx_ --d-version=disable_ssl --d-version=disable_fixations --d-version=disable_metrics --d-version=disable_aggregator --d-debug -g -of/Users/pinver/Projects/DeepGlance/fieldmanager/bin/lab_mac_i64_dg /Users/pinver/Projects/DeepGlance/fieldmanager/src/application.d src/sbx/raygui/c_raygui.c
2024-01-09T10:52:37.254 [error] melkor.d:247:executeReadyLabel build failed:

src/ops/sql/semantics.d(504,19): Error: uncaught CTFE exception `object.Exception("XXXX! role \"dummyuser\" can't select on table \"snapshots\". SQL: select size_mm, size_px from snapshots where snapshot_id = $1")`
src/api3.d(41,9):        thrown from here
src/api3.d(51,43):        called from here: `checkSql(Schema("public", ["aggregators":Table("aggregators", ["aggregated_till":Column("aggregated_till", Type.timestamp, true, false), "touchpoint_id":Column("touchpoint_id", Type.smallint, true, false)], [], [], ["pinver", "ipsos_analysis_operator", "i
/Users/pinver/Projects/DeepGlance/fieldmanager/src/application.d(644,45): Error: template instance `api3.forgeSqlCheckerForSchema!(Schema("public", ["aggregators":Table("aggregators", ["aggregated_till":Column("aggregated_till", Type.timestamp, true, false), "touchpoint_id":Column("touchpoint_id", T
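
For readers who have not seen this pattern, a toy sketch of the idea follows. It is not the checker that produced the output above: `checkedSql` and `knownTables` are hypothetical names, and the "validation" only looks at the word after `from`. The point is that a check that throws becomes a build error as soon as the call is forced through CTFE.

```d
import std.algorithm.searching : canFind;
import std.array : split;

// Hypothetical toy checker; a real one would parse the SQL and
// compare it against a full schema.
string checkedSql(string sql, const string[] tables)
{
    auto words = sql.split(" ");
    foreach (i, w; words)
        if (w == "from" && i + 1 < words.length && !tables.canFind(words[i + 1]))
            throw new Exception("relation \"" ~ words[i + 1]
                ~ "\" does not exist. SQL: " ~ sql);
    return sql;
}

enum knownTables = ["snapshots", "aggregators"];

void main()
{
    // Forced through CTFE by `enum`: a misspelled table name on this line
    // would stop the build with an uncaught CTFE exception.
    enum ok = checkedSql("select size_mm from snapshots where snapshot_id = $1", knownTables);
    static assert(ok == "select size_mm from snapshots where snapshot_id = $1");

    // The same check at run time, to show the failure path without
    // breaking the build.
    bool rejected;
    try
    {
        checkedSql("select size_mm from snapshotsssss where snapshot_id = $1", knownTables);
    }
    catch (Exception e)
    {
        rejected = true;
    }
    assert(rejected);
}
```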

CTFE support is a must IMHO

/P

January 09

On Tuesday, 9 January 2024 at 08:29:08 UTC, Walter Bright wrote:

> that looks like tty noise.

Oh, I realized you might be reading this without a fancy Markdown renderer. Backticks are part of Markdown syntax, not D. I only suggested using

i"a \(x) b"

rather than

i"a $(x) b"