Lateral Programming

Coding out of the box


Designing databases for flexibility

Posted by eutrilla on April 27, 2008

In the last few weeks I’ve noticed several posts about alternative database paradigms: CouchDB, SimpleDB, BigTable… All of them promise infinite scalability. This isn’t really important for me, since the projects I’ve worked on don’t require a big concurrent workload. Or at least, not so big that an RDBMS can’t handle it. But they have one more thing in common: an easier way to store and retrieve data, away from plain SQL.

I do agree that storing objects feels much more natural and easier to understand for developers than adding rows to one or several tables. I’ve always felt uncomfortable with simplistic SQL statements. I had to spend a lot of time thinking about how tables were connected, instead of how the domain objects were related, so I tried to figure out how I could isolate the application code from the database structure. ORM frameworks simplified things, but not completely: they replace specific SQL queries with specific mappings.

Object-oriented databases are an interesting concept. Forget about database schemas and ORM mappings, and design your domain model as you want. Then simply store it all, just as if it were kept in memory, while still being able to run queries on it. Tempting, isn’t it? Of course it has a price: they are really slow compared to relational databases.

The cloud databases follow the same lines, but instead of storing full-fledged objects, they handle maps of key-value pairs containing the named properties of the object. I guess that this is the trick to improve performance: by limiting the format of the objects, it is possible to optimise the way the data is queried. Also, the objects only store “simple” values, such as numbers and strings, but not other objects, so each entry is limited in size. They may contain the identifier of another object, so you can retrieve it later, but a read operation doesn’t return all the objects stored in a tree, for instance. That’s a strong point, and at the same time a weakness, of OODBs: getting the whole tree is easy, but getting just the root node is comparatively slow, since you are retrieving the whole tree anyway.
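To make the idea concrete, here is a minimal in-memory sketch of that kind of storage. It is not the API of CouchDB, SimpleDB or BigTable; the names `store`, `put` and `get` and the identifier format are made up for illustration. Entities are flat maps of simple values, and a link to another entity is just its identifier.

```python
from typing import Dict, Union

# Only "simple" values are allowed in an entity: numbers and strings.
Value = Union[int, float, str]

# Hypothetical store: entity identifier -> flat map of properties.
store: Dict[str, Dict[str, Value]] = {}


def put(entity_id: str, properties: Dict[str, Value]) -> None:
    """Store a flat entity, rejecting nested objects."""
    for name, value in properties.items():
        if not isinstance(value, (int, float, str)):
            raise TypeError(f"property {name!r} must be a number or a string")
    store[entity_id] = dict(properties)


def get(entity_id: str) -> Dict[str, Value]:
    """Return one entity only; referenced entities are not followed."""
    return dict(store[entity_id])


put("author:1", {"name": "Ada", "country": "UK"})
put("post:1", {"title": "Hello", "author_id": "author:1"})

post = get("post:1")               # just the post, not the whole tree
author = get(post["author_id"])    # following the link is a second, explicit read
```

Reading the root of a tree costs one lookup regardless of how many entities hang from it, which is the trade-off against OODBs described above.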

The problem with these cloud-based DBs is that their field of application is restricted in general to Web services. In many other cases, we are stuck with relational databases, at least for a while. But the way they store data doesn’t require a distributed database. It can very well be implemented on top of a relational database structure, using a DAO layer in code to manage the SQL details. Depending on the features required, this takes more or less work, but once finished it can be used for any type of object composed of key-value pairs. Over the last few years, I’ve implemented and used several such database designs in my day job, with different sets of functionality. They allowed us to speed up development and to make changes to the data model easily, including the creation of new entity types or the addition of properties to existing ones. And of course, it was still a relational database, so we could keep on writing specific SQL queries for critical or complex searches, and use all our usual SQL development tools.
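As a rough illustration of the approach, and not the actual designs mentioned above, the following sketch stores key-value objects in two relational tables using Python’s standard `sqlite3` module. The table names (`entity`, `property`), the `EntityDao` class and its methods are all assumptions chosen for this example, and every value is stored as text to keep it short.

```python
import sqlite3
from typing import Dict, List


class EntityDao:
    """Hypothetical DAO hiding the SQL behind save/load/find."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        # One row per object, plus one row per (object, property) pair.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS entity ("
            "id INTEGER PRIMARY KEY, type TEXT NOT NULL)"
        )
        conn.execute(
            "CREATE TABLE IF NOT EXISTS property ("
            "entity_id INTEGER NOT NULL REFERENCES entity(id), "
            "name TEXT NOT NULL, value TEXT, "
            "PRIMARY KEY (entity_id, name))"
        )

    def save(self, entity_type: str, properties: Dict[str, object]) -> int:
        """Insert a new object and return its generated identifier."""
        with self.conn:  # commits on success, rolls back on error
            cur = self.conn.execute(
                "INSERT INTO entity (type) VALUES (?)", (entity_type,)
            )
            entity_id = cur.lastrowid
            self.conn.executemany(
                "INSERT INTO property (entity_id, name, value) VALUES (?, ?, ?)",
                [(entity_id, name, str(value)) for name, value in properties.items()],
            )
        return entity_id

    def load(self, entity_id: int) -> Dict[str, str]:
        """Rebuild the key-value map of one object (values come back as text)."""
        rows = self.conn.execute(
            "SELECT name, value FROM property WHERE entity_id = ?", (entity_id,)
        ).fetchall()
        return dict(rows)

    def find(self, entity_type: str, name: str, value: object) -> List[int]:
        """Return the ids of objects of a type whose property equals a value."""
        rows = self.conn.execute(
            "SELECT e.id FROM entity e JOIN property p ON p.entity_id = e.id "
            "WHERE e.type = ? AND p.name = ? AND p.value = ? ORDER BY e.id",
            (entity_type, name, str(value)),
        )
        return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
dao = EntityDao(conn)
book_id = dao.save("book", {"title": "Dune", "pages": 412})
dao.save("book", {"title": "Emma", "pages": 474})
```

Adding a new entity type or a new property needs no schema change here, only new rows, and since the data still lives in ordinary tables, a hand-written query such as `SELECT COUNT(*) FROM entity WHERE type = 'book'` keeps working alongside the DAO.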

Some say that relational databases are dead. I’d say that they are very much alive and will be for some time at least, but that the way we use them may not be the best one in all cases, and that new approaches can be tried. The characteristics of RDBMSs may not be the best in terms of abstraction, but they make them a good backend on which databases with a higher level of abstraction can be built. In the next posts, I’ll try to share some of the solutions that I’ve found useful in the past.

