April 03, 2011
On 3-apr-11, at 16:44, Fawzi Mohamed wrote:

> On 3-apr-11, at 15:59, Christian Manning wrote:
>
>> [...]
>> I was going to reply with a link to your work but you beat me to it.
>> I think it's a great design and incorporating it or something
>> similar into the API may be the way to go.
>
> Indeed ddb looks really nice (I hadn't looked at it yet), given it though, I have to agree that just adding mySQL support is too little and not really innovative for 3 months work...

Looking more maybe I was a bit too harsh, if you define clearly the
goals of your API then yes it might be a good project.
The api doesn't have to be defined yet, but a more detailed definition
of its goals should be there, maybe with code example of some usages.
Questions that should be answered:

* support for static and dynamic types.
how access of dynamic and static types differs, should be as little as
possible, and definitely the access one uses for dynamic types should
work without changes on static types
* class or struct for row object
* support for table specific classes?
* reference to description of the table (to be able to get also
dynamic types by column name, but avoid using too much memory for the
structure)
* Nice to define table structure, and what happens if the db has
another structure.
* you want to support only access or also db creation and modification?

I feel that these things should be addressed in a complete proposal, with possible answers that might be changed later on depending on how things actually go.

Fawzi

April 03, 2011
On 3-apr-11, at 16:52, Piotr Szturmaj wrote:

> [...]
> Thanks. At this time, you can write an interface for MySQL, SQLite or other relational databases, using the same DBRow struct. Naming of course may be changed to DataRow, Row or other, depending on the choice of community.
>
> In regards of base interfaces like IConnection or (semi-)abstract class DBConnection, I think we should have common API for all clients, but only to some extent. There are many features available in some database servers, while not available in others, for example OIDs (object identifiers) are fundamental thing in PostgreSQL, but they simply don't exist in MySQL. So, PGCommand would give you information on lastInsertedOID, while MySQLCommand would not.
> This is also proven in ADO.NET, where each client is rarely used with common base interface, because it blocks many of its useful features.
>
> I think base interface should be defined only after some of the most popular RDBMS clients are finished. Also interface should be choosen to cover the most featured/advanced database client. This is why I started with PostgreSQL, as its the most powerful open-source RDBMS. If base interface will cover it, it will also cover some less powerful RDBMSes.

I think that you project looks nice, but see some of the comments in my other message.
I would for example consider separating table definition from row object, and while your row object is really nice, often one has either a single DB model, described in a few model files or goes with a fully dynamic model.
In large project one does not/should not, define RowTypes on the fly everywhere in the code.
So I would try to improve the way one describes a table, or a full database.
Your DBRow type is definitely nice, and is a good starting point, but there is definitely more work to do (not that you had said otherwise :).

Fawzi
April 03, 2011
Fawzi Mohamed wrote:
> I think that you project looks nice, but see some of the comments in my
> other message.
> I would for example consider separating table definition from row
> object, and while your row object is really nice, often one has either a
> single DB model, described in a few model files or goes with a fully
> dynamic model.
> In large project one does not/should not, define RowTypes on the fly
> everywhere in the code.

There's no need to declare all row types. DBRow support both static and dynamic models. For dynamic rows, DBRow uses Variant[] as its underlying type. This is previous sample code, but changed to use dynamic row:

auto cmd = new PGCommand(conn, "SELECT typname, typlen FROM pg_type");
auto result = cmd.executeQuery;

foreach (row; result)
{
    // here, row subtypes a Variant[]
    writeln(row[0], ", ", row[1]);
}

Btw. I've just updated documentation, so you can take another look :)
April 03, 2011
On 3-apr-11, at 18:37, Piotr Szturmaj wrote:

> Fawzi Mohamed wrote:
>> I think that you project looks nice, but see some of the comments in my
>> other message.
>> I would for example consider separating table definition from row
>> object, and while your row object is really nice, often one has either a
>> single DB model, described in a few model files or goes with a fully
>> dynamic model.
>> In large project one does not/should not, define RowTypes on the fly
>> everywhere in the code.
>
> There's no need to declare all row types. DBRow support both static and dynamic models. For dynamic rows, DBRow uses Variant[] as its underlying type. This is previous sample code, but changed to use dynamic row:
>
> auto cmd = new PGCommand(conn, "SELECT typname, typlen FROM pg_type");
> auto result = cmd.executeQuery;
>
> foreach (row; result)
> {
>    // here, row subtypes a Variant[]
>    writeln(row[0], ", ", row[1]);
> }
>
> Btw. I've just updated documentation, so you can take another look :)

Yes I saw that, that is exactly the reason I was telling about splitting the table definition in another object, so that also in the dynamic case one can use the column names (that normally are known, or can be retrieved from the db schema).
That would only add a pointer to each row (to its description), and would make it much nicer to use.
Your DBRow is very nice to use, and I like how it can accommodate both types, but it degrades too much for dynamic types imho.

Fawzi

April 03, 2011
Fawzi Mohamed wrote:
> Looking more maybe I was a bit too harsh, if you define clearly the
> goals of your API then yes it might be a good project.
> The api doesn't have to be defined yet, but a more detailed definition
> of its goals should be there, maybe with code example of some usages.
> Questions that should be answered:

I know your response is'nt to me, but please let me answer these questions from my point of view, based on my recent work on ddb.

> * support for static and dynamic types.
> how access of dynamic and static types differs, should be as little as
> possible, and definitely the access one uses for dynamic types should
> work without changes on static types

If you mean statically or dynamically typed data row then I can say my DBRow support both.

> * class or struct for row object

I'm using struct, because I think row received from database is a value type rather than reference. If one selects rows from one table then yes, it is possible to do some referencing based on primary key, but anyway I think updates should be done explicitly, because row could be deleted in the meantime. In more complex queries, not all of selected rows are materialized, i.e. they may be from computed columns, view columns, aggregate functions and so on. Allocation overhead is also lower for structs.

> * support for table specific classes?

Table specific classes may be written by user and somehow wrap underlying row type.

> * reference to description of the table (to be able to get also dynamic
> types by column name, but avoid using too much memory for the structure)

My PostgreSQL client already supports that. Class PGCommand has member "fields", which contain information about returned columns. You can even check what columns will be returned from a query, before actually executing it.

> * Nice to define table structure, and what happens if the db has another
> structure.

This is a problem for ORM, but at first, we need standard query API.

> * you want to support only access or also db creation and modification?

First, I'm preparing base "traditional" API. Then I want to write simple object-relational mapping. I've already written some code that generated CREATE TABLE for structs at compile time. Static typing of row fields is very helpful here.
April 03, 2011
Fawzi Mohamed wrote:
>
> On 3-apr-11, at 18:37, Piotr Szturmaj wrote:
>
>> Fawzi Mohamed wrote:
>>> I think that you project looks nice, but see some of the comments in my
>>> other message.
>>> I would for example consider separating table definition from row
>>> object, and while your row object is really nice, often one has either a
>>> single DB model, described in a few model files or goes with a fully
>>> dynamic model.
>>> In large project one does not/should not, define RowTypes on the fly
>>> everywhere in the code.
>>
>> There's no need to declare all row types. DBRow support both static
>> and dynamic models. For dynamic rows, DBRow uses Variant[] as its
>> underlying type. This is previous sample code, but changed to use
>> dynamic row:
>>
>> auto cmd = new PGCommand(conn, "SELECT typname, typlen FROM pg_type");
>> auto result = cmd.executeQuery;
>>
>> foreach (row; result)
>> {
>> // here, row subtypes a Variant[]
>> writeln(row[0], ", ", row[1]);
>> }
>>
>> Btw. I've just updated documentation, so you can take another look :)
>
> Yes I saw that, that is exactly the reason I was telling about splitting
> the table definition in another object, so that also in the dynamic case
> one can use the column names (that normally are known, or can be
> retrieved from the db schema).
> That would only add a pointer to each row (to its description), and
> would make it much nicer to use.
> Your DBRow is very nice to use, and I like how it can accommodate both
> types, but it degrades too much for dynamic types imho.

Ah, I see what you mean :) This is yet to be done feature :)

I assume you mean something like row["typname"]. Soon, I will add support for this.
April 03, 2011
On 3-apr-11, at 19:28, Piotr Szturmaj wrote:

> Fawzi Mohamed wrote:
>> Looking more maybe I was a bit too harsh, if you define clearly the
>> goals of your API then yes it might be a good project.
>> The api doesn't have to be defined yet, but a more detailed definition
>> of its goals should be there, maybe with code example of some usages.
>> Questions that should be answered:
>
> I know your response is'nt to me, but please let me answer these questions from my point of view, based on my recent work on ddb.

I think that your responses are very relevant, as it seems to me that your work is nice, and I find that if a GSoC is done in that direction it should definitely work together with the good work that is already done, let's don't create multiple competing projects if people are willing to work together.

>> * support for static and dynamic types.
>> how access of dynamic and static types differs, should be as little as
>> possible, and definitely the access one uses for dynamic types should
>> work without changes on static types
>
> If you mean statically or dynamically typed data row then I can say my DBRow support both.

yes but as I said I find the support for dynamic data rows weak.

>> * class or struct for row object
>
> I'm using struct, because I think row received from database is a value type rather than reference. If one selects rows from one table then yes, it is possible to do some referencing based on primary key, but anyway I think updates should be done explicitly, because row could be deleted in the meantime. In more complex queries, not all of selected rows are materialized, i.e. they may be from computed columns, view columns, aggregate functions and so on. Allocation overhead is also lower for structs.
>
>> * support for table specific classes?
>
> Table specific classes may be written by user and somehow wrap underlying row type.

well with the current approach it is ugly because your calls would be another type, thus either you remove all typing or you can't have generic functions, accepting rows, everything has to be a template, looping on a table or a row you always need a template.

>> * reference to description of the table (to be able to get also dynamic
>> types by column name, but avoid using too much memory for the structure)
>
> My PostgreSQL client already supports that. Class PGCommand has member "fields", which contain information about returned columns. You can even check what columns will be returned from a query, before actually executing it.

ok that is nice, and my point is that the type that the user sees by default should automatically take advantage of that

>> * Nice to define table structure, and what happens if the db has another
>> structure.
>
> This is a problem for ORM, but at first, we need standard query API.

I am not so sure about this, yes these (also classes for tables) are part of the ORM, but the normal users will more often be at the ORM level I think, and how exactly we want the things look like that the object level can influence the choice of the best low level interface.

>> * you want to support only access or also db creation and modification?
>
> First, I'm preparing base "traditional" API. Then I want to write simple object-relational mapping. I've already written some code that generated CREATE TABLE for structs at compile time. Static typing of row fields is very helpful here.

Very good I think that working on getting the API right there and having it nice to use is important.
Maybe you are right and the current DBRow is indeed the best abstraction, but I am not yet 100% sure, to me it looks like it isn't the best end user abstraction (but it might be an excellent low level object)

April 03, 2011
On 3-apr-11, at 19:54, Piotr Szturmaj wrote:

> Fawzi Mohamed wrote:
>>
>> On 3-apr-11, at 18:37, Piotr Szturmaj wrote:
>>
>>> Fawzi Mohamed wrote:
>>>> I think that you project looks nice, but see some of the comments in my
>>>> other message.
>>>> I would for example consider separating table definition from row
>>>> object, and while your row object is really nice, often one has either a
>>>> single DB model, described in a few model files or goes with a fully
>>>> dynamic model.
>>>> In large project one does not/should not, define RowTypes on the fly
>>>> everywhere in the code.
>>>
>>> There's no need to declare all row types. DBRow support both static
>>> and dynamic models. For dynamic rows, DBRow uses Variant[] as its
>>> underlying type. This is previous sample code, but changed to use
>>> dynamic row:
>>>
>>> auto cmd = new PGCommand(conn, "SELECT typname, typlen FROM pg_type");
>>> auto result = cmd.executeQuery;
>>>
>>> foreach (row; result)
>>> {
>>> // here, row subtypes a Variant[]
>>> writeln(row[0], ", ", row[1]);
>>> }
>>>
>>> Btw. I've just updated documentation, so you can take another look :)
>>
>> Yes I saw that, that is exactly the reason I was telling about splitting
>> the table definition in another object, so that also in the dynamic case
>> one can use the column names (that normally are known, or can be
>> retrieved from the db schema).
>> That would only add a pointer to each row (to its description), and
>> would make it much nicer to use.
>> Your DBRow is very nice to use, and I like how it can accommodate both
>> types, but it degrades too much for dynamic types imho.
>
> Ah, I see what you mean :) This is yet to be done feature :)
>
> I assume you mean something like row["typname"]. Soon, I will add support for this.

yes exactly, great
April 03, 2011
On 4/2/11 3:03 PM, Christian Manning wrote:
> Hello all,
>
> This is my first draft proposal for a Database API for Google Summer Of
> Code. I have never written a document such as this so any and all
> feedback is welcome.
>
> Thanks
[snip]

Thanks for your interest and for sharing your draft proposal.

Fawzi is doing an excellent job at making suggestions for improving the proposal. Let me add some.

Generally you need to create a compelling case that you know what your project entails, you have thoroughly studied the state of the art, and you are able to take the project to completion. Digital Mars' reputation is at stake here - we need to make sure that we're using Google's money and everybody's time to good end.

Here are some more additions to the proposal that would improve it:

* What is your level of understanding of D? How do you believe you could use D's templates for improving the API compared to JDBC? If you choose to copy JDBC's interface, how do you justify relying on dynamic typing alone?

* What coursework did you complete? As a second-year student this makes it easier for us to assess where you are in terms of expertise. Scores would help as well.

* Since you now know of existing work, have you contacted Piotr for collaboration? Would he give you his API to work on? Would he be available to help as a formal mentor or informally? What is the integration plan?

* If there project were totally successful, what features do you expect it has and what would be the impact?

* What is the absolute minimum level of functionality that would still qualify the project as successful?

* Also include Fawzi's suggestions focused on details of API definition.


Thanks,

Andrei

April 03, 2011
On 03/04/2011 19:30, Andrei Alexandrescu wrote:
> * What coursework did you complete? As a second-year student this makes
> it easier for us to assess where you are in terms of expertise. Scores
> would help as well.

By this do you mean you'd like to see my completed courseworks? Or just descriptions and scores (where available)?

I'll be working on my proposal to include yours and Fawzi's suggestions and post it as a reply to my first draft.

Thanks for the help so far in this stage, Andrei, Fawzi and Piotr.

Chris