Jump to page: 1 2 3
Thread overview
[RFC] - mysql-native rewrite
Sep 28, 2013
simendsjo
Sep 28, 2013
Gary Willoughby
Sep 29, 2013
Gary Willoughby
Sep 29, 2013
simendsjo
Sep 29, 2013
Gary Willoughby
Sep 30, 2013
Kagamin
Sep 30, 2013
simendsjo
Sep 30, 2013
Kagamin
Sep 30, 2013
simendsjo
Sep 30, 2013
Kagamin
Sep 30, 2013
Gary Willoughby
Sep 30, 2013
Jacob Carlborg
Sep 30, 2013
simendsjo
Sep 30, 2013
Gary Willoughby
Sep 30, 2013
Jacob Carlborg
Sep 30, 2013
Chris
Sep 30, 2013
Dicebot
Sep 30, 2013
Chris
Sep 30, 2013
Jacob Carlborg
Sep 30, 2013
Chris
Sep 30, 2013
Brad Anderson
Sep 30, 2013
simendsjo
Oct 01, 2013
Jacob Carlborg
Oct 01, 2013
ilya-stromberg
Oct 01, 2013
simendsjo
Oct 01, 2013
Nick Sabalausky
Oct 08, 2013
ollie
Oct 09, 2013
simendsjo
September 28, 2013
I've been working on a more or less complete rewrite of the mysql-native module.

The main focus has been on making the source better decoupled and more maintainable.
In the process, I've changed the API too...

The existing production code can be found here: https://github.com/rejectedsoftware/mysql-native

New code: http://simendsjo.me/files/simendsjo/mysqln-rewrite/mysqln-rewrite.7z
Documentation: http://simendsjo.me/files/simendsjo/mysqln-rewrite/docs/

The files:
* adhoc.d - Execute simple queries. This is meant to be the new high-level public API
* commands.d - Lower level public API
* result.d - Common interface for both Text and Binary MySQL results
* connection.d - Phobos/Vibe.d connection
* db.d - Vibe.d connection pool

The rest are just internal files.

Todo:
* Implement all MySQL types
* Prepared statement release/purge
* Lazily fetch data
* Fix bugs in prepared statement binding
* Implement SEND_LONG_DATA
* More integrationtests and unittests

Most of the above would be pretty simple to implement.

I'm very uncertain about several aspects of my design:
* Class vs. struct
* Returned values from MySQL - e.g. should SELECT TRUE return long as it does in MySQL, or should we interpret it as bool
* Probably a lot I don't remember right now :)

Any comments would be very helpful.
..Or in D terms: DESTROY!
September 28, 2013
On Saturday, 28 September 2013 at 16:39:27 UTC, simendsjo wrote:
> I've been working on a more or less complete rewrite of the mysql-native module.
>
> The main focus has been on making the source better decoupled and more maintainable.
> In the process, I've changed the API too...
>
> The existing production code can be found here: https://github.com/rejectedsoftware/mysql-native
>
> New code: http://simendsjo.me/files/simendsjo/mysqln-rewrite/mysqln-rewrite.7z
> Documentation: http://simendsjo.me/files/simendsjo/mysqln-rewrite/docs/
>
> The files:
> * adhoc.d - Execute simple queries. This is meant to be the new high-level public API
> * commands.d - Lower level public API
> * result.d - Common interface for both Text and Binary MySQL results
> * connection.d - Phobos/Vibe.d connection
> * db.d - Vibe.d connection pool
>
> The rest are just internal files.
>
> Todo:
> * Implement all MySQL types
> * Prepared statement release/purge
> * Lazily fetch data
> * Fix bugs in prepared statement binding
> * Implement SEND_LONG_DATA
> * More integrationtests and unittests
>
> Most of the above would be pretty simple to implement.
>
> I'm very uncertain about several aspects of my design:
> * Class vs. struct
> * Returned values from MySQL - e.g. should SELECT TRUE return long as it does in MySQL, or should we interpret it as bool
> * Probably a lot I don't remember right now :)
>
> Any comments would be very helpful.
> ..Or in D terms: DESTROY!

This sounds great! I use this library a lot so i can probably give you some feedback when i have more time. I'm about to go out now but i'll try to take a look tomorrow.
September 29, 2013
On Saturday, 28 September 2013 at 16:39:27 UTC, simendsjo wrote:
> I'm very uncertain about several aspects of my design:
> * Class vs. struct
> * Returned values from MySQL - e.g. should SELECT TRUE return long as it does in MySQL, or should we interpret it as bool
> * Probably a lot I don't remember right now :)
>
> Any comments would be very helpful.
> ..Or in D terms: DESTROY!

Ok, i've taken a quick look and like where this is going. I'll try and give you a little guide to what i would do regarding the above points.

> * Class vs. struct

I tend to only use structs where the type is purely a data type. Something like a record or a type that can be manifested in different ways (such as an IP address). If i need to model a real world object like an engine or book, etc., i use a class.

> * Returned values from MySQL - e.g. should SELECT TRUE return long as it does in MySQL, or should we interpret it as bool

I would return what mysql yields. This would make sure this framework is not doing to much. Converting ints to bools would be the next layer's job (DBAL, ORM, etc.).

I have two more suggestions that i think is critical to a project such as this and that's documentation and unit tests.

Please from the start thoroughly annotate everything with ddoc to generate nice html docs later. I know this can be a pain when designing and developing an API but is absolutely necessary. If you leave documentation until the end it never gets done!

Write unit tests for everything. I've found that if you find it hard to write a unit test for any 'unit', etc then it's too tightly coupled and probably doing too much. Practise good SOLID design principles and unit tests should be easy to write. Not only that but unit tests provide developers with a good understanding of the public interface.

http://en.wikipedia.org/wiki/SOLID_(object-oriented_design)
September 29, 2013
On Sunday, 29 September 2013 at 15:26:19 UTC, Gary Willoughby wrote:
> On Saturday, 28 September 2013 at 16:39:27 UTC, simendsjo wrote:
>> I'm very uncertain about several aspects of my design:
>> * Class vs. struct
>> * Returned values from MySQL - e.g. should SELECT TRUE return long as it does in MySQL, or should we interpret it as bool
>> * Probably a lot I don't remember right now :)
>>
>> Any comments would be very helpful.
>> ..Or in D terms: DESTROY!
>
> Ok, i've taken a quick look and like where this is going. I'll try and give you a little guide to what i would do regarding the above points.
>
>> * Class vs. struct
>
> I tend to only use structs where the type is purely a data type. Something like a record or a type that can be manifested in different ways (such as an IP address). If i need to model a real world object like an engine or book, etc., i use a class.

I don't think it's that simple in this case. When I implement lazy fetching, both methods have it's advantages and disadvantages.
MySQL doesn't allow multiplexing on a connection. This means a command must complete before issuing another. If reading rows lazily and then issuing a new command, the choice is to either disallow the new command, or invalidate the previous. The simple way would be to just start a new command and invalidate the previous, but is this the best way? If we choose to disallow new commands, that means the user have to explicitly close a lazy command. If using classes, we can safely have several instances for a command (is this neccessary?), but then the destructor wont't be a safe bet, and the user have to call close.
If implemented as a struct, we have to disallow copying.

So... I really don't know what the best design would be.

>> * Returned values from MySQL - e.g. should SELECT TRUE return long as it does in MySQL, or should we interpret it as bool
>
> I would return what mysql yields. This would make sure this framework is not doing to much. Converting ints to bools would be the next layer's job (DBAL, ORM, etc.).

The problem is that MySQL is inconsistent here, and it depends on if it's a field or a constant. SELECT TRUE is not the same as SELECT boolean_field.
Also, it seems every constant integer is returned as LONGLONG. But.. The fields include a length property..

> I have two more suggestions that i think is critical to a project such as this and that's documentation and unit tests.
>
> Please from the start thoroughly annotate everything with ddoc to generate nice html docs later. I know this can be a pain when designing and developing an API but is absolutely necessary. If you leave documentation until the end it never gets done!

Yeah, the documentation isn't very good :/

Thought I'd get feedback on the API before spending a lot of time documenting it.

> Write unit tests for everything. I've found that if you find it hard to write a unit test for any 'unit', etc then it's too tightly coupled and probably doing too much. Practise good SOLID design principles and unit tests should be easy to write. Not only that but unit tests provide developers with a good understanding of the public interface.
>
> http://en.wikipedia.org/wiki/SOLID_(object-oriented_design)

Most of the code is easy to unittest, but for now I've just relied on integration tests against a database. I'll improve on the situation.
September 29, 2013
On Sunday, 29 September 2013 at 15:42:10 UTC, simendsjo wrote:
> I don't think it's that simple in this case. When I implement lazy fetching, both methods have it's advantages and disadvantages.
> MySQL doesn't allow multiplexing on a connection. This means a command must complete before issuing another. If reading rows lazily and then issuing a new command, the choice is to either disallow the new command, or invalidate the previous. The simple way would be to just start a new command and invalidate the previous, but is this the best way? If we choose to disallow new commands, that means the user have to explicitly close a lazy command. If using classes, we can safely have several instances for a command (is this neccessary?), but then the destructor wont't be a safe bet, and the user have to call close.
> If implemented as a struct, we have to disallow copying.
>
> So... I really don't know what the best design would be.

I reckon go with Command being a class and invalidate previous lazy loading on new commands being issued, you are after all issuing a *new* command. If multiplexing is needed a new connection can be made.
September 30, 2013
On Sunday, 29 September 2013 at 15:42:10 UTC, simendsjo wrote:
> the previous, but is this the best way? If we choose to disallow new commands, that means the user have to explicitly close a lazy command.

If a command isn't closed as soon as possible, it sounds like a resource leak. If you silently cancel previous query, it can hide logical bug.

> If using classes, we can safely have several instances for a command (is this neccessary?)

If you execute the same query repeatedly, there's no need to build the same command every time.

> The problem is that MySQL is inconsistent here, and it depends on if it's a field or a constant. SELECT TRUE is not the same as SELECT boolean_field.
> Also, it seems every constant integer is returned as LONGLONG. But.. The fields include a length property..

Is there some hint it's true and not something else?
September 30, 2013
On Monday, 30 September 2013 at 10:34:25 UTC, Kagamin wrote:
> On Sunday, 29 September 2013 at 15:42:10 UTC, simendsjo wrote:
>> the previous, but is this the best way? If we choose to disallow new commands, that means the user have to explicitly close a lazy command.
>
> If a command isn't closed as soon as possible, it sounds like a resource leak. If you silently cancel previous query, it can hide logical bug.

Yeah. We need to choose:
1) Starting a new command while another is in flight is an error
   You need to close explicitly if the command isn't finished
2) If another command has been started, it's er error to continue iteration of a previous command.

I'm in favor of 1).

>> If using classes, we can safely have several instances for a command (is this neccessary?)
>
> If you execute the same query repeatedly, there's no need to build the same command every time.

That's what prepared statements are for. If you don't even have parameters, it's just 9 bytes (4 of these bytes is an undocumented field called iteration_count..).

>> The problem is that MySQL is inconsistent here, and it depends on if it's a field or a constant. SELECT TRUE is not the same as SELECT boolean_field.
>> Also, it seems every constant integer is returned as LONGLONG. But.. The fields include a length property..
>
> Is there some hint it's true and not something else?

Yeah. BOOL is an alias for TINYINT(1). I think it's fair to always assume TINYINT(1) is bool. For other types I'm not quite sure though.. "SELECT 1" is a LONGLONG, but in D a literal is int unless specified otherwise. So there is a mismatch here.
September 30, 2013
On Monday, 30 September 2013 at 11:20:23 UTC, simendsjo wrote:
> Yeah. We need to choose:
> 1) Starting a new command while another is in flight is an error
>    You need to close explicitly if the command isn't finished
> 2) If another command has been started, it's er error to continue iteration of a previous command.
>
> I'm in favor of 1).

That's what .net does. Most of the time it indicates a resource leak.

> Yeah. BOOL is an alias for TINYINT(1). I think it's fair to always assume TINYINT(1) is bool. For other types I'm not quite sure though.. "SELECT 1" is a LONGLONG, but in D a literal is int unless specified otherwise. So there is a mismatch here.

int64 is the right type for a database integer, int32 is too small. After all, it's a database.
September 30, 2013
On Monday, 30 September 2013 at 11:58:48 UTC, Kagamin wrote:
> On Monday, 30 September 2013 at 11:20:23 UTC, simendsjo wrote:
>> Yeah. We need to choose:
>> 1) Starting a new command while another is in flight is an error
>>   You need to close explicitly if the command isn't finished
>> 2) If another command has been started, it's er error to continue iteration of a previous command.
>>
>> I'm in favor of 1).
>
> That's what .net does. Most of the time it indicates a resource leak.

Ok. I'll add the same semantics.

>> Yeah. BOOL is an alias for TINYINT(1). I think it's fair to always assume TINYINT(1) is bool. For other types I'm not quite sure though.. "SELECT 1" is a LONGLONG, but in D a literal is int unless specified otherwise. So there is a mismatch here.
>
> int64 is the right type for a database integer, int32 is too small. After all, it's a database.

I'll just add a hack for TINYINT(1) -> bool and keep everything else MySQL specific.

Have to admit that int32 has been plenty for me in the past. Never worked on databases with > 200M rows.

September 30, 2013
It's not only size. A popular use case is a log: you write lots of entries and delete old ones, total size doesn't grow, but the identifiers...
« First   ‹ Prev
1 2 3