April 02, 2011 [GSOC] Database API draft proposal | ||||
|---|---|---|---|---|
| ||||
Hello all, This is my first draft proposal for a Database API for Google Summer Of Code. I have never written a document such as this so any and all feedback is welcome. Thanks --------------------------------- Synopsis -------- An API for databases is a common component of many languages' standard library, though Phobos currently lacks this. This project will remedy this by providing such an API and also begin to utilise it with interfaces for some Database Management Systems (DBMS). I believe this will benefit the D community greatly and will help bring attention and developers to the language. Details ------- This project takes influence from the Java Database Connectivity API (JDBC), the Python Database API v2 and other similar interfaces. The idea is that any database interface created for use with D will follow the API so that the only thing to change is the database back-end being used. This will make working with databases in D a much easier experience. I plan to have several interfaces in a database module which are then implemented for specific DBMSs. For example: module database; interface Connection { //method definitions for connecting to databases go here. } Then in an implementation of MySQL for example: module mysql; import database; class Connect : Connection { //implement defined methods tailoring to MySQL. } What goes in to these interfaces will be decided in conjunction with the D community so that there is minimal conflict and it will benefit as many circumstances as possible. I believe this to be the best route to take as I cannot speak for everyone who will be using this. Using the API created I plan to create an example implementation, initially wrapping around the MySQL C API. This will be a good starting point for this project and more can be created, time permitting. About Me -------- My name is Christian Manning and I am a second year undergraduate studying Computer Science at De Montfort University. I've become interested in D over time after reading about it several years ago. I got myself "The D Programming Language" and went from there. Although I've not done anything useful in D as I've learnt mainly C and Java and am unable to use D for my university projects, I think I'm capable of achieving the goals of this project. | ||||
April 03, 2011 Re: [GSOC] Database API draft proposal | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christian Manning | On 04/02/2011 10:03 PM, Christian Manning wrote: > I plan to have several interfaces in a database module which are then > implemented for specific DBMSs. > For example: > > module database; > > interface Connection { > //method definitions for connecting to databases go here. > } > > Then in an implementation of MySQL for example: > > module mysql; > > import database; > > class Connect : Connection { > //implement defined methods tailoring to MySQL. > } I would recommend to use slightly longer names for generic interfaces, eg "IConnection" or "DBConnection". Then, authors of libraries / implementations for specific DBMS like MySQL can use the shorter ones, eg "Connection", which will be all what library clients will see and use. This also avoids the need for "lexical hacks" like "Connection" versus "Connect". What do you think? > What goes in to these interfaces will be decided in conjunction with the D > community so that there is minimal conflict and it will benefit as many > circumstances as possible. I believe this to be the best route to take as I > cannot speak for everyone who will be using this. > > Using the API created I plan to create an example implementation, initially > wrapping around the MySQL C API. This will be a good starting point for this > project and more can be created, time permitting. I have no idea of the actual size of such an interface design, but I doubt it can make you busy for 3 months full time, especially since there are (probably good) precedents for other languages. Maybe the example implementation should be specified as part of the project? Denis -- _________________ vita es estrany spir.wikidot.com | |||
April 03, 2011 Re: [GSOC] Database API draft proposal | ||||
|---|---|---|---|---|
| ||||
Posted in reply to spir | On 03/04/2011 13:10, spir wrote: > On 04/02/2011 10:03 PM, Christian Manning wrote: > >> I plan to have several interfaces in a database module which are then >> implemented for specific DBMSs. >> For example: >> >> module database; >> >> interface Connection { >> //method definitions for connecting to databases go here. >> } >> >> Then in an implementation of MySQL for example: >> >> module mysql; >> >> import database; >> >> class Connect : Connection { >> //implement defined methods tailoring to MySQL. >> } > > I would recommend to use slightly longer names for generic interfaces, > eg "IConnection" or "DBConnection". Then, authors of libraries / > implementations for specific DBMS like MySQL can use the shorter ones, > eg "Connection", which will be all what library clients will see and > use. This also avoids the need for "lexical hacks" like "Connection" > versus "Connect". > What do you think? When I was writing that it really didn't sit well and "DBConnection" in particular is a much better way of doing it to reduce some confusion there. >> What goes in to these interfaces will be decided in conjunction with >> the D >> community so that there is minimal conflict and it will benefit as many >> circumstances as possible. I believe this to be the best route to take >> as I >> cannot speak for everyone who will be using this. >> >> Using the API created I plan to create an example implementation, >> initially >> wrapping around the MySQL C API. This will be a good starting point >> for this >> project and more can be created, time permitting. > > I have no idea of the actual size of such an interface design, but I > doubt it can make you busy for 3 months full time, especially since > there are (probably good) precedents for other languages. Maybe the > example implementation should be specified as part of the project? I'm aware that it wouldn't take 3 months, but I don't know how long it will take to have the API agreed upon so that there's a general consensus. Another way I could do it is to decide on the API myself and begin implementing DBMSs with it and then adapt to the ideas brought forth by the community. Then, everyone's happy, just in a different time frame. Though, if there are a lot of changes wanted I'd have to change all of my implementations depending on how far I am at the time. What do you think about that path? Thanks for the feedback, it's much appreciated :) Chris | |||
April 03, 2011 Re: [GSOC] Database API draft proposal | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christian Manning | Well the comments in there are what is important, and will need to be specified better IMHO. The most important part in my opinion is how one chooses to represent a record. A big design choice is if the various fields are defined at compile time or at runtime. Also how does one add special behavior to a record? Do you use a subclasses of the generic record type (as ruby does for example)? D2 adds some more method to allow for generic accessors, so one can have a dynamic implementation, while still using static accessors. Maybe one should allow for both dynamic records and static ones. The efficient storage of results of a db query is an important point. Are you aware of http://dsource.org/projects/ddbi for D1? If one wants to have a nice efficient and well tested interface, supporting more than one DB then I think that there is enough work to do. Fawzi On 3-apr-11, at 14:33, Christian Manning wrote: > On 03/04/2011 13:10, spir wrote: >> On 04/02/2011 10:03 PM, Christian Manning wrote: >> >>> I plan to have several interfaces in a database module which are then >>> implemented for specific DBMSs. >>> For example: >>> >>> module database; >>> >>> interface Connection { >>> //method definitions for connecting to databases go here. >>> } >>> >>> Then in an implementation of MySQL for example: >>> >>> module mysql; >>> >>> import database; >>> >>> class Connect : Connection { >>> //implement defined methods tailoring to MySQL. >>> } >> >> I would recommend to use slightly longer names for generic interfaces, >> eg "IConnection" or "DBConnection". Then, authors of libraries / >> implementations for specific DBMS like MySQL can use the shorter ones, >> eg "Connection", which will be all what library clients will see and >> use. This also avoids the need for "lexical hacks" like "Connection" >> versus "Connect". >> What do you think? > > When I was writing that it really didn't sit well and "DBConnection" in particular is a much better way of doing it to reduce some confusion there. > >>> What goes in to these interfaces will be decided in conjunction with >>> the D >>> community so that there is minimal conflict and it will benefit as many >>> circumstances as possible. I believe this to be the best route to take >>> as I >>> cannot speak for everyone who will be using this. >>> >>> Using the API created I plan to create an example implementation, >>> initially >>> wrapping around the MySQL C API. This will be a good starting point >>> for this >>> project and more can be created, time permitting. >> >> I have no idea of the actual size of such an interface design, but I >> doubt it can make you busy for 3 months full time, especially since >> there are (probably good) precedents for other languages. Maybe the >> example implementation should be specified as part of the project? > > I'm aware that it wouldn't take 3 months, but I don't know how long it will take to have the API agreed upon so that there's a general consensus. Another way I could do it is to decide on the API myself and begin implementing DBMSs with it and then adapt to the ideas brought forth by the community. Then, everyone's happy, just in a different time frame. Though, if there are a lot of changes wanted I'd have to change all of my implementations depending on how far I am at the time. What do you think about that path? > > Thanks for the feedback, it's much appreciated :) > > Chris | |||
April 03, 2011 Re: [GSOC] Database API draft proposal | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christian Manning | On 04/03/2011 02:33 PM, Christian Manning wrote: > On 03/04/2011 13:10, spir wrote: >> On 04/02/2011 10:03 PM, Christian Manning wrote: >> >>> I plan to have several interfaces in a database module which are then >>> implemented for specific DBMSs. >>> For example: >>> >>> module database; >>> >>> interface Connection { >>> //method definitions for connecting to databases go here. >>> } >>> >>> Then in an implementation of MySQL for example: >>> >>> module mysql; >>> >>> import database; >>> >>> class Connect : Connection { >>> //implement defined methods tailoring to MySQL. >>> } >> >> I would recommend to use slightly longer names for generic interfaces, >> eg "IConnection" or "DBConnection". Then, authors of libraries / >> implementations for specific DBMS like MySQL can use the shorter ones, >> eg "Connection", which will be all what library clients will see and >> use. This also avoids the need for "lexical hacks" like "Connection" >> versus "Connect". >> What do you think? > > When I was writing that it really didn't sit well and "DBConnection" in > particular is a much better way of doing it to reduce some confusion there. > >>> What goes in to these interfaces will be decided in conjunction with >>> the D >>> community so that there is minimal conflict and it will benefit as many >>> circumstances as possible. I believe this to be the best route to take >>> as I >>> cannot speak for everyone who will be using this. >>> >>> Using the API created I plan to create an example implementation, >>> initially >>> wrapping around the MySQL C API. This will be a good starting point >>> for this >>> project and more can be created, time permitting. >> >> I have no idea of the actual size of such an interface design, but I >> doubt it can make you busy for 3 months full time, especially since >> there are (probably good) precedents for other languages. Maybe the >> example implementation should be specified as part of the project? > > I'm aware that it wouldn't take 3 months, but I don't know how long it will > take to have the API agreed upon so that there's a general consensus. Another > way I could do it is to decide on the API myself and begin implementing DBMSs > with it and then adapt to the ideas brought forth by the community. Then, > everyone's happy, just in a different time frame. Though, if there are a lot of > changes wanted I'd have to change all of my implementations depending on how > far I am at the time. What do you think about that path? I would go for the second, especially because there is a Python example (probably one of the best languages out there for such design questions). Just think at usual qualities: clarity / simplicity / consistency (and currently discussed Phobos style guidelines). Also: * Implementation example(s) is source of feedback for the interface quality. * Once you've done it, rewriting the exact same feature with a different design can be very fast (esp if the change is only about interface), because you master the application. I personly would appreciate an example for a simpler and/or non-relational, DBMS (maybe it's only me) (I'm thinking at key:value like Berkeley DB, object DBMS, SQLite...). Denis -- _________________ vita es estrany spir.wikidot.com | |||
April 03, 2011 Re: [GSOC] Database API draft proposal | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Fawzi Mohamed | Fawzi Mohamed wrote: > Well the comments in there are what is important, and will need to be > specified better IMHO. > > The most important part in my opinion is how one chooses to represent a > record. > A big design choice is if the various fields are defined at compile time > or at runtime. > Also how does one add special behavior to a record? Do you use a > subclasses of the generic record type (as ruby does for example)? I'm working on DB API for few months in my spare time. I'm delayed that much by my other projects. Please take a look at my ideas: http://github.com/pszturmaj/ddb Documentation: http://pszturmaj.github.com/ddb/db.html http://pszturmaj.github.com/ddb/postgres.html In my code, row is represented using struct DBRow!(Specs...). Fields may be known at compile time or not. DBRow besides base types, may be instantiated using structs, tuples or arrays. Untyped row (no compile time information) is DBRow!(Variant[]). Typed rows are very useful, for example you don't need to manually cast columns to your types, it's done automatically, e.g.: auto cmd = new PGCommand(conn, "SELECT typname, typlen FROM pg_type"); auto result = cmd.executeQuery!(string, "typName", int, "len"); foreach (row; result) { // here, row DBRow subtypes // a Tuple!(string, "typName", int, "len") writeln(row.typName, ", ", row.len); } What do you think? :) | |||
April 03, 2011 Re: [GSOC] Database API draft proposal | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Piotr Szturmaj | On 03/04/2011 14:42, Piotr Szturmaj wrote:
> Fawzi Mohamed wrote:
>> Well the comments in there are what is important, and will need to be
>> specified better IMHO.
>>
>> The most important part in my opinion is how one chooses to represent a
>> record.
>> A big design choice is if the various fields are defined at compile time
>> or at runtime.
>> Also how does one add special behavior to a record? Do you use a
>> subclasses of the generic record type (as ruby does for example)?
>
> I'm working on DB API for few months in my spare time. I'm delayed that
> much by my other projects. Please take a look at my ideas:
>
> http://github.com/pszturmaj/ddb
>
> Documentation:
> http://pszturmaj.github.com/ddb/db.html
> http://pszturmaj.github.com/ddb/postgres.html
>
> In my code, row is represented using struct DBRow!(Specs...). Fields may
> be known at compile time or not. DBRow besides base types, may be
> instantiated using structs, tuples or arrays. Untyped row (no compile
> time information) is DBRow!(Variant[]).
>
> Typed rows are very useful, for example you don't need to manually cast
> columns to your types, it's done automatically, e.g.:
>
> auto cmd = new PGCommand(conn, "SELECT typname, typlen FROM pg_type");
> auto result = cmd.executeQuery!(string, "typName", int, "len");
>
> foreach (row; result)
> {
> // here, row DBRow subtypes
> // a Tuple!(string, "typName", int, "len")
> writeln(row.typName, ", ", row.len);
> }
>
> What do you think? :)
I was going to reply with a link to your work but you beat me to it.
I think it's a great design and incorporating it or something similar into the API may be the way to go.
| |||
April 03, 2011 Re: [GSOC] Database API draft proposal | ||||
|---|---|---|---|---|
| ||||
Posted in reply to spir | On 03/04/2011 14:16, spir wrote:
> On 04/03/2011 02:33 PM, Christian Manning wrote:
>> On 03/04/2011 13:10, spir wrote:
>>> On 04/02/2011 10:03 PM, Christian Manning wrote:
>>>
>>>> I plan to have several interfaces in a database module which are then
>>>> implemented for specific DBMSs.
>>>> For example:
>>>>
>>>> module database;
>>>>
>>>> interface Connection {
>>>> //method definitions for connecting to databases go here.
>>>> }
>>>>
>>>> Then in an implementation of MySQL for example:
>>>>
>>>> module mysql;
>>>>
>>>> import database;
>>>>
>>>> class Connect : Connection {
>>>> //implement defined methods tailoring to MySQL.
>>>> }
>>>
>>> I would recommend to use slightly longer names for generic interfaces,
>>> eg "IConnection" or "DBConnection". Then, authors of libraries /
>>> implementations for specific DBMS like MySQL can use the shorter ones,
>>> eg "Connection", which will be all what library clients will see and
>>> use. This also avoids the need for "lexical hacks" like "Connection"
>>> versus "Connect".
>>> What do you think?
>>
>> When I was writing that it really didn't sit well and "DBConnection" in
>> particular is a much better way of doing it to reduce some confusion
>> there.
>>
>>>> What goes in to these interfaces will be decided in conjunction with
>>>> the D
>>>> community so that there is minimal conflict and it will benefit as many
>>>> circumstances as possible. I believe this to be the best route to take
>>>> as I
>>>> cannot speak for everyone who will be using this.
>>>>
>>>> Using the API created I plan to create an example implementation,
>>>> initially
>>>> wrapping around the MySQL C API. This will be a good starting point
>>>> for this
>>>> project and more can be created, time permitting.
>>>
>>> I have no idea of the actual size of such an interface design, but I
>>> doubt it can make you busy for 3 months full time, especially since
>>> there are (probably good) precedents for other languages. Maybe the
>>> example implementation should be specified as part of the project?
>>
>> I'm aware that it wouldn't take 3 months, but I don't know how long it
>> will
>> take to have the API agreed upon so that there's a general consensus.
>> Another
>> way I could do it is to decide on the API myself and begin
>> implementing DBMSs
>> with it and then adapt to the ideas brought forth by the community. Then,
>> everyone's happy, just in a different time frame. Though, if there are
>> a lot of
>> changes wanted I'd have to change all of my implementations depending
>> on how
>> far I am at the time. What do you think about that path?
>
> I would go for the second, especially because there is a Python example
> (probably one of the best languages out there for such design
> questions). Just think at usual qualities: clarity / simplicity /
> consistency (and currently discussed Phobos style guidelines).
>
> Also:
> * Implementation example(s) is source of feedback for the interface
> quality.
> * Once you've done it, rewriting the exact same feature with a different
> design can be very fast (esp if the change is only about interface),
> because you master the application.
>
> I personly would appreciate an example for a simpler and/or
> non-relational, DBMS (maybe it's only me) (I'm thinking at key:value
> like Berkeley DB, object DBMS, SQLite...).
>
> Denis
SQLite could definitely be on the table. However, I don't want to be over-ambitious at this stage and then not complete the project, and all advice I've read on applying for GSOC suggests this too. If I could be more certain on the time it would take for the API alone, then I would propose more.
Would it be suitable to have something like: "If the API is not in a good state by xx/xx/2011 then y implementation will not be undertaken" ?
| |||
April 03, 2011 Re: [GSOC] Database API draft proposal | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christian Manning | On 3-apr-11, at 15:59, Christian Manning wrote:
> On 03/04/2011 14:42, Piotr Szturmaj wrote:
>> Fawzi Mohamed wrote:
>>> Well the comments in there are what is important, and will need to be
>>> specified better IMHO.
>>>
>>> The most important part in my opinion is how one chooses to represent a
>>> record.
>>> A big design choice is if the various fields are defined at compile time
>>> or at runtime.
>>> Also how does one add special behavior to a record? Do you use a
>>> subclasses of the generic record type (as ruby does for example)?
>>
>> I'm working on DB API for few months in my spare time. I'm delayed that
>> much by my other projects. Please take a look at my ideas:
>>
>> http://github.com/pszturmaj/ddb
>>
>> Documentation:
>> http://pszturmaj.github.com/ddb/db.html
>> http://pszturmaj.github.com/ddb/postgres.html
>>
>> In my code, row is represented using struct DBRow!(Specs...). Fields may
>> be known at compile time or not. DBRow besides base types, may be
>> instantiated using structs, tuples or arrays. Untyped row (no compile
>> time information) is DBRow!(Variant[]).
>>
>> Typed rows are very useful, for example you don't need to manually cast
>> columns to your types, it's done automatically, e.g.:
>>
>> auto cmd = new PGCommand(conn, "SELECT typname, typlen FROM pg_type");
>> auto result = cmd.executeQuery!(string, "typName", int, "len");
>>
>> foreach (row; result)
>> {
>> // here, row DBRow subtypes
>> // a Tuple!(string, "typName", int, "len")
>> writeln(row.typName, ", ", row.len);
>> }
>>
>> What do you think? :)
>
> I was going to reply with a link to your work but you beat me to it.
> I think it's a great design and incorporating it or something similar into the API may be the way to go.
Indeed ddb looks really nice (I hadn't looked at it yet), given it though, I have to agree that just adding mySQL support is too little and not really innovative for 3 months work...
Fawzi
| |||
April 03, 2011 Re: [GSOC] Database API draft proposal | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Christian Manning | Christian Manning wrote:
> On 03/04/2011 14:42, Piotr Szturmaj wrote:
>> Fawzi Mohamed wrote:
>>> Well the comments in there are what is important, and will need to be
>>> specified better IMHO.
>>>
>>> The most important part in my opinion is how one chooses to represent a
>>> record.
>>> A big design choice is if the various fields are defined at compile time
>>> or at runtime.
>>> Also how does one add special behavior to a record? Do you use a
>>> subclasses of the generic record type (as ruby does for example)?
>>
>> I'm working on DB API for few months in my spare time. I'm delayed that
>> much by my other projects. Please take a look at my ideas:
>>
>> http://github.com/pszturmaj/ddb
>>
>> Documentation:
>> http://pszturmaj.github.com/ddb/db.html
>> http://pszturmaj.github.com/ddb/postgres.html
>>
>> In my code, row is represented using struct DBRow!(Specs...). Fields may
>> be known at compile time or not. DBRow besides base types, may be
>> instantiated using structs, tuples or arrays. Untyped row (no compile
>> time information) is DBRow!(Variant[]).
>>
>> Typed rows are very useful, for example you don't need to manually cast
>> columns to your types, it's done automatically, e.g.:
>>
>> auto cmd = new PGCommand(conn, "SELECT typname, typlen FROM pg_type");
>> auto result = cmd.executeQuery!(string, "typName", int, "len");
>>
>> foreach (row; result)
>> {
>> // here, row DBRow subtypes
>> // a Tuple!(string, "typName", int, "len")
>> writeln(row.typName, ", ", row.len);
>> }
>>
>> What do you think? :)
>
> I was going to reply with a link to your work but you beat me to it.
> I think it's a great design and incorporating it or something similar
> into the API may be the way to go.
Thanks. At this time, you can write an interface for MySQL, SQLite or other relational databases, using the same DBRow struct. Naming of course may be changed to DataRow, Row or other, depending on the choice of community.
In regards of base interfaces like IConnection or (semi-)abstract class DBConnection, I think we should have common API for all clients, but only to some extent. There are many features available in some database servers, while not available in others, for example OIDs (object identifiers) are fundamental thing in PostgreSQL, but they simply don't exist in MySQL. So, PGCommand would give you information on lastInsertedOID, while MySQLCommand would not.
This is also proven in ADO.NET, where each client is rarely used with common base interface, because it blocks many of its useful features.
I think base interface should be defined only after some of the most popular RDBMS clients are finished. Also interface should be choosen to cover the most featured/advanced database client. This is why I started with PostgreSQL, as its the most powerful open-source RDBMS. If base interface will cover it, it will also cover some less powerful RDBMSes.
| |||
Copyright © 1999-2021 by the D Language Foundation
Permalink
Reply