January 02, 2016
On 2016-01-02 13:55, Ola Fosheim Grøstad wrote:

> They have probably never done professional work with an ORM...
>
> Nobody wants that.

Exactly, but tell them ;)

-- 
/Jacob Carlborg
January 02, 2016
On 2016-01-02 19:37, Sebastiaan Koppe wrote:

> Well, you can also generate the structs and specific serialization code.
> And depending on how advanced your dsl you can also auto generate
> database migration code. There are probably tons of other stuff you can
> do with it.
>
> All in all much better than extending the language.

I would rather do the opposite: generate the necessary SQL for a migration based on a struct, not the other way around.

-- 
/Jacob Carlborg
January 02, 2016
On Saturday, 2 January 2016 at 19:46:48 UTC, Jacob Carlborg wrote:
> On 2016-01-02 13:55, Ola Fosheim Grøstad wrote:
>
>> They have probably never done professional work with an ORM...
>>
>> Nobody wants that.
>
> Exactly, but tell them ;)

I did:

http://forum.dlang.org/post/mfdjcgykaizbzuuhoupv@forum.dlang.org

But Walter thinks that enforcing object-value comparisons in a rather limited way is a feature. It is a feature... but it is the wrong kind of feature... ;) It is an oversimplified and ineffective solution.

A more sensible approach would be to use type classes/concepts to define what kind of comparison relation a class is required to provide, and then deduce the comparison operators from what the programmer provides (e.g. if the programmer only defines "≤", the compiler can deduce "<", ">", "≥", "==", etc.).
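To illustrate, here is a rough sketch of that deduction in today's D using a template mixin (the names `DeriveComparisons` and `lessOrEqual` are made up for the example, not an existing library): the programmer writes only the "≤" relation, the mixin derives `opEquals` and `opCmp`, and the compiler rewrites `<`, `>`, `≥` and `==` in terms of those.

```d
import std.stdio;

// Hypothetical mixin deriving all comparison operators from a single
// user-supplied "less or equal" relation.
mixin template DeriveComparisons(T)
{
    bool opEquals(const T rhs) const
    {
        // a == b  iff  a <= b && b <= a
        return this.lessOrEqual(rhs) && rhs.lessOrEqual(this);
    }

    int opCmp(const T rhs) const
    {
        if (this.lessOrEqual(rhs))
            return rhs.lessOrEqual(this) ? 0 : -1;
        return 1;  // rhs < this
    }
}

struct Version
{
    int major, minor;

    // The only relation the programmer actually writes.
    bool lessOrEqual(const Version rhs) const
    {
        return major < rhs.major
            || (major == rhs.major && minor <= rhs.minor);
    }

    mixin DeriveComparisons!Version;
}

void main()
{
    assert(Version(1, 2) < Version(1, 3));   // deduced from "<=" alone
    assert(Version(2, 0) >= Version(1, 9));
    assert(Version(1, 2) == Version(1, 2));
    writeln("all comparison operators deduced from one relation");
}
```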

D moved into templates and generics without cleaning up the language, and is essentially duck-typed and macro-ish. Like C++. And Go (just in a different way).

But C++ might start on a change process with C++17/20. We'll see.

January 02, 2016
On Sat, 02 Jan 2016 16:40:16 +0000, Piotrek wrote:

> On Friday, 1 January 2016 at 10:00:43 UTC, Kapps wrote:
> 
>> This example shows the difficulty of doing this in D. You can't really
>> have something like `p.Name == "James"`, or `p.Age < 21`
>> translate to SQL properly without language changes, which I believe
>> Walter or Andrei were against. This has been the key problem when
>> things like Linq to Sql for D have been brought up before.
> 
> Not really. There is no translation stage to SQL or any other DSL in the proposal. So this problem doesn't exist and no language changes are needed.

So you want to create the following query:

  people.filter!(x => x.surname == "Slughorn");

And you've got ten million people in the collection, and you want this query to finish soonish. So you need to use an index. But a full index scan isn't so great; you want to do an index lookup if possible.

That's simple enough; we generate proxy types to record what properties you're using and what operations you're performing. PersonProxy records that you're accessing a field 'surname', gives a StringFieldProxy, and that records that you're checking for equality with the string "Slughorn". The lambda returns true when opEquals returns true.
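A minimal sketch of that recording machinery (names like `PersonProxy` and `StringFieldProxy` are illustrative, not an existing library): the proxy's `opEquals` appends to a log instead of comparing, and the log is what a planner would turn into an index lookup.

```d
import std.stdio;

struct Comparison { string field, op, value; }

struct StringFieldProxy
{
    string field;
    Comparison[]* log;

    bool opEquals(string rhs)
    {
        *log ~= Comparison(field, "==", rhs);
        return true;  // claim a match so evaluation just continues
    }
}

struct PersonProxy
{
    Comparison[] log;

    // Accessing "surname" hands back a proxy that records what you do with it.
    StringFieldProxy surname()
    {
        return StringFieldProxy("surname", &log);
    }
}

void main()
{
    auto query = (PersonProxy* x) => x.surname == "Slughorn";
    auto p = new PersonProxy;
    query(p);
    // p.log now holds the comparison the planner can satisfy via an index.
    assert(p.log == [Comparison("surname", "==", "Slughorn")]);
    writeln(p.log);
}
```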

But people write queries that are more complex than that, like:

  people.filter!(x => x.surname == "Slughorn" || x.age <= 17);

First time you run this, x.surname.opEquals("Slughorn") returns true and the expression as a whole returns true. You missed the second part of the expression. That's bad.

So we need to evaluate this lambda twice per parameter. (Actually, thanks to opCmp, sometimes you'll have to evaluate it three times.) We use that to build up a giant truth table, and then we can execute your query.
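Concretely, the truth-table construction amounts to re-running the lambda over boolean placeholders, one run per combination of comparison outcomes, i.e. 2^n runs for n atomic comparisons. A toy sketch for the two-comparison query above:

```d
import std.stdio;

void main()
{
    // The query rewritten over placeholders: each bool stands in for the
    // outcome of one atomic comparison.
    auto query = (bool surnameEq, bool ageLe) => surnameEq || ageLe;

    enum n = 2;                   // two atomic comparisons
    foreach (mask; 0 .. 1 << n)   // 2^n evaluations -- the blow-up
    {
        immutable a = (mask & 1) != 0;
        immutable b = (mask & 2) != 0;
        writefln("surname match: %-5s  age <= 17: %-5s  ->  keep: %s",
                 a, b, query(a, b));
    }
}
```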

And that "twice per parameter" thing is exponential, and we build up a truth table that's exponentially large with respect to the complexity of the query. Some queries I've written for production systems would take a week for this system to prepare to execute and require a petabyte of storage space.

This is, shall we say, less than ideal.

You might be able to support queries as large as ten comparisons in a reasonable timeframe. But for all but the most trivial queries, it'll be faster to use SQL.

Fortunately, trivial queries are common. You could write this system and have it work up to, say, five comparisons, and when your query exceeds that limit, it will throw an exception asking you to rewrite the query in SQL.

It wouldn't be able to tell you what the equivalent query is, however. It's using a truth table and doesn't have access to what you wrote. I mean, sure, it could try to work backwards from the truth table, but that's rather expensive.
January 02, 2016
On Saturday, 2 January 2016 at 19:48:26 UTC, Jacob Carlborg wrote:
> I would rather do the opposite. Generate the necessary SQL for a migration based on a struct, not the other way around.

I meant that you generate the struct from the DSL, and then migration code from that struct.
January 03, 2016
On 2016-01-02 21:48, Sebastiaan Koppe wrote:

> I meant that you generate the struct from the DSL, and then migration
> code from that struct.

I don't think I understand; it seems complicated.

-- 
/Jacob Carlborg
January 03, 2016
On Sunday, 3 January 2016 at 14:32:48 UTC, Jacob Carlborg wrote:
> On 2016-01-02 21:48, Sebastiaan Koppe wrote:
>
>> I meant that you generate the struct from the DSL, and then migration
>> code from that struct.
>
> I don't think I understand, it seems complicated.

Suppose you have this:

mixin(db(`
Entity Person
  Fields
    name -> string
    age -> integer
  Query
    byAge(a -> integer) -> age == a
`));

which generates something like this:

struct Person
{
  string name;
  int age;
}
auto getPersonByAge(DB db, int a)
{
  return db.prepare!Person("SELECT name,age FROM Person WHERE age = ?").query(a);
}

and then later in time:

mixin(db(`
Entity Person
  Fields
    name -> string
    age -> integer
    phone -> string
  Query
    byAge(a -> integer) -> age == a
`));

Given that you have access to both versions, it is easy to generate migration code for the phone field.
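For instance, once the generator has both field lists, diffing them into migration SQL is only a few lines (a hypothetical sketch, not part of any existing library):

```d
import std.algorithm.searching : canFind;
import std.format : format;
import std.stdio;

struct Field { string name; string sqlType; }

// Emit ALTER TABLE statements for columns present in `to` but not in `from`.
string[] migrate(string table, Field[] from, Field[] to)
{
    string[] stmts;
    foreach (f; to)
        if (!from.canFind!(g => g.name == f.name))
            stmts ~= format("ALTER TABLE %s ADD COLUMN %s %s;",
                            table, f.name, f.sqlType);
    return stmts;
}

void main()
{
    auto v1 = [Field("name", "TEXT"), Field("age", "INTEGER")];
    auto v2 = v1 ~ Field("phone", "TEXT");  // the second DSL version
    writeln(migrate("Person", v1, v2));     // one ALTER TABLE for "phone"
}
```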

Maybe it is contrived, but I think it shows you can do more with the DSL than just validating queries.
January 03, 2016
On Thursday, 31 December 2015 at 17:14:55 UTC, Piotrek wrote:
> The goal of this post is to measure the craziness of an idea to embed a database engine into the D language ;)
>
> I think about a database engine which would meet my three main requirements:
>   - integrated with D (ranges)
>   - ACID
>   - fast
>
> Since the days when I was working on financial data software I became allergic to SQL. I thought that NoSQL databases would fill the bill. Unfortunately they didn't. And I want to be able to write code like this without too much effort:
>
>   struct Person
>   {
>    string name;
>    string surname;
>    ubyte age;
>    Address address;
>   }
>
>  DataBase db = new DataBase("file.db");
>  auto coll = db.collection!Person("NSA.Registry");
>  auto visitationList = coll.filter!(p => p.name == "James");
>  writeln (visitationList);
>
> And other things like updating and deleting from db. I think you get my point.
>
> So I started a PoC project based on SQLite design:
> https://github.com/PiotrekDlang/AirLock/blob/master/docs/database/design.md#architecture
>
> The PoC code: https://github.com/PiotrekDlang/AirLock/tree/master/src/database
>
> Can you please share your thoughts and experience on the topic? Has anyone tried similar things?
>
> Piotrek

My two pence: if you want it to be fast, it must have a good implementation of indices. Your filter functions should not actually start collecting real records; instead they should simply change the way the cursor traverses the underlying data store. You will need good query 'compilation', like the big boys do, which works out which tables and indices to use and in which order, based on statistics of the data and indices.
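As a toy illustration of "change how the cursor traverses the store" (a hand-rolled sketch, not a real engine): keep a hash index on the filtered field, and let an equality filter walk one index bucket instead of scanning every record.

```d
import std.stdio;

struct Person { string surname; int age; }

// A collection that maintains a hash index on surname.
struct Collection
{
    Person[] rows;
    size_t[][string] bySurname;  // the "index": surname -> row positions

    void insert(Person p)
    {
        bySurname[p.surname] ~= rows.length;
        rows ~= p;
    }

    // Equality on the indexed field: an index lookup, not a full scan.
    Person[] whereSurname(string s)
    {
        Person[] result;
        foreach (i; bySurname.get(s, null))
            result ~= rows[i];
        return result;
    }
}

void main()
{
    Collection c;
    c.insert(Person("Slughorn", 60));
    c.insert(Person("Potter", 17));
    // Touches a single hash bucket regardless of how many rows exist.
    writeln(c.whereSurname("Slughorn"));
}
```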

If you want ACID, then SQL seems like a good approach to me; certainly I wouldn't want anything ORM-like for updating/inserting data.

There are a number of good libraries out there already; SQLite obviously springs to mind.

It would be a fun project, but perhaps a lot more work than you realise if you really want isolation levels, speed, etc.
January 03, 2016
You could just target your database at data analysis. Then you don't need to care about ACID, transactions etc. Just load all the data into memory, and start analyzing it.

Also, you'd typically be scanning over large parts of the data set for each query, so you may not need to support a full query language. Just what is needed for data analysis.

Later you can modify your engine to support ACID, more expressive query language etc.

On one of the projects I am working on right now, we will also implement our own database engine. We need it to integrate tightly with the rest of our architecture, and the only way to do that is to roll our own. We will also not be using SQL, because SQL is so limiting.

So, I'd say "go ahead" - you can only learn something from the project. I've "reinvented a lot of wheels" over the years, and each time I came out smarter than before. Not every reinvention was a success, but I always learned something from the process.
January 03, 2016
On 1/2/16 3:47 PM, Chris Wright wrote:
> On Sat, 02 Jan 2016 16:40:16 +0000, Piotrek wrote:
>
>> On Friday, 1 January 2016 at 10:00:43 UTC, Kapps wrote:
>>
>>> This example shows the difficulty of doing this in D. You can't really
>>> have something like `p.Name == "James"`, or `p.Age < 21`
>>> translate to SQL properly without language changes, which I believe
>>> Walter or Andrei were against. This has been the key problem when
>>> things like Linq to Sql for D have been brought up before.
>>
>> Not really. There is no translation stage to SQL or any other DSL in the
>> proposal. So this problem doesn't exist and no language changes are
>> needed.
>
> So you want to create the following query:
>
>    people.filter!(x => x.surname == "Slughorn");
>
> And you've got ten million people in the collection, and you want this
> query to finish soonish. So you need to use an index. But a full index
> scan isn't so great; you want to do an index lookup if possible.
>
> That's simple enough; we generate proxy types to record what properties
> you're using and what operations you're performing. PersonProxy records
> that you're accessing a field 'surname', gives a StringFieldProxy, and
> that records that you're checking for equality with the string "Slughorn".
> The lambda returns true when opEquals returns true.
>
> But people write queries that are more complex than that, like:
>
>    people.filter!(x => x.surname == "Slughorn" || x.age <= 17);
>
> First time you run this, x.surname.opEquals("Slughorn") returns true and
> the expression as a whole returns true. You missed the second part of the
> expression. That's bad.
>
> So we need to evaluate this lambda twice per parameter. (Actually, thanks
> to opCmp, sometimes you'll have to evaluate it three times.) We use that
> to build up a giant truth table, and then we can execute your query.
>
> And that "twice per parameter" thing is exponential, and we build up a
> truth table that's exponentially large with respect to the complexity of
> the query. Some queries I've written for production systems would take a
> week for this system to prepare to execute and require a petabyte of
> storage space.
>
> This is, shall we say, less than ideal.

This may in fact be good signal that an approach based on expression templates is not the most appropriate for D. -- Andrei