Relational databases still power most of our enterprise-level systems and this is not about to change. The implication for the development community is that there continues to be an impendence mismatch between the relational model of the database and the object-oriented semantics of the application domain model. This divide is so profound that our popular system development patterns strictly separate the business layer from the data layer. This dichotomy translates into a significant amount of work for a development team such as creating database schemas, tuning database performance, mapping the domain classes to data access classes and maintaining all of the above throughout the system lifetime. Just like we apply agile practices to the business layer, we must apply similar principles to the data layer, of course with a few twists. Here are some tips on how to maintain the database layer in an agile ecosystem:
- Version your database. This may simply mean maintaining an integer value in a table in your database schema to indicate the version of the schema. Over time the database schema will evolve and there will inevitably be a mismatch between the application domain code and the schema it is running against. By checking the supported database version, the application domain code can spot these issues up front.
- Check version before accessing database. Make the database access code smart to check the version of the database before performing any operations. If an older version of the application needs to work with a newer schema of the database, write a compatibility layer that makes the database appear like the old schema to the older version of the application. Placing all data access code in one place helps in this scenario.
- Create database upgrade and migration scripts. Create and maintain incremental database scripts that migrate a database schema from any released earlier version to any other later schema version up to the very latest version. In order to write good Unit Tests, we need to be able create any released version of the database. These scripts generally accomplish two tasks: Update the database schema and Migrate the data.
- Put database access code in one place. Put all database access code in one place and have the application domain code talk to this database access code. That is, do not disperse your SQL Statements all over the place. Furthermore, create a well-defined interface for the database access code. This will facilitate creation of a Mock database access code layer that will be instrumental in creating Unit Tests for the application domain code.
- Use source control. Implement source control over the Database Update scripts just like you would over any other kind of source code. This is essential to traceability of the database changes made.
- Unit test database access code. Write robust Unit Tests for the database access code. These tests are separate from the Unit Tests for the application domain code. The purpose of these tests is to verify the CRUD (Create, Read, Update and Delete) behavior of your database access code.
- Performance test database access code. Write performance tests to write large amounts of data using the database access code to spot inefficiencies.
- Create database utilities and scripts. Create tools and scripts to allow testers and developers to create any released version of the database. A particularly useful script is a script that allows us to copy/move a database in its entirety from one machine to another. If a client reports a problem, an identical environment can then be easily created for diagnosis. These tools and scripts not only create appropriate database schema and but also populate appropriate seed data in the tables.
- Integrate very often. Make database related activities part of the continuous integration cycle, just like your application domain code.
- Automate as much as you can. This includes writing automated tests, continuous integration server and writing simple tools and scripts to recreate a database schema with appropriate configuration/seed data. What cannot be used easily, does not usually get used.
- Write a utility for database health check. Write a utility library to verify the structure of a given database. Even though we versioned our database, it is possible some table is missing a column, a constraint is non-existent or a trigger is not present. A utility to verify the validity of a database at database object level (table, views, stored procedures..) is invaluable to diagnosing problems in testing and at client sites.
- Version compatibility matrix. Keep a compatibility matrix of what version of application works with what version of the database. Make it freely and widely available to development and onsite teams for early diagnosis of any database related issues.
- Use an ORM framework. When accessing database from your application, seriously consider employing an ORM framework. If you use a widely used, reputable ORM library, not only will you end up writing less code, you will also introduce fewer bugs. Building SQL queries by concatenating strings in code is error prone and opens up your application to numerous security vulnerabilities, such as SQL Injection attacks. By employing a reputable ORM, you can greatly reduce the security risk profile of your application.