|
| 1 | +# Add database versioning and migrations to output database |
| 2 | + |
| 3 | +Status: Pending |
| 4 | + |
| 5 | +Version: alpha |
| 6 | + |
| 7 | +Implementation owner: TBD |
| 8 | + |
| 9 | +## Abstract |
| 10 | + |
| 11 | +This proposal adds database versioning and migration capability to the databases produced by operator-registry commands. |
| 12 | + |
| 13 | +## Motivation |
| 14 | + |
| 15 | +To optimize registry database usage in OLM so that the registry database can be stored in a container image, a method is needed to add to the database over time. Because these database files will be long-lived, the database must be able to migrate to a new schema over time. This proposal addresses that problem by defining a method for migrating existing db files to newer database versions. |
| 16 | + |
| 17 | +## Proposal |
| 18 | + |
| 19 | +### Migration Method |
| 20 | + |
| 21 | +At a high level, the migration method is straightforward. The database will record its current migration version. When any additive operation is called with an existing database as input, the first step will be to run the migrations that upgrade the database schema to the latest version. |
| 22 | + |
| 23 | +Rather than reinventing the database migration process, operator-registry will use an existing database migration tool that has already defined a set of semantics and conventions. This proposal suggests using the [golang-migrate](https://github.com/golang-migrate/migrate) tool for this purpose. Golang-migrate uses a set of conventions to define migrations. |
| 24 | + |
| 25 | +First, it defines migrations as a set of sql files that live in a single flat folder. Each migration is a pair of up and down scripts: the up script moves the database to a particular version, the down script moves it back, and the two should be semantic opposites. Each file name consists of a version (a 64-bit unsigned integer shared by both scripts of the pair), an underscore, a title, a direction, and the `.sql` extension: `${version}_${title}.${direction(down/up)}.sql`. Versions are ordered from the lowest integer value to the highest, and the highest version applied defines the current database migration version recorded in the db. For more details on the migration format, please see https://github.com/golang-migrate/migrate/blob/master/MIGRATIONS.md. |
| 26 | + |
| 27 | +``` |
| 28 | + # example migrations folder |
| 29 | + db_migrations |
| 30 | + ├── 201909251510_first_migration.down.sql |
| 31 | + ├── 201909251510_first_migration.up.sql |
| 32 | + ├── 201909251522_add_users_table.down.sql |
| 33 | + └── 201909251522_add_users_table.up.sql |
| 34 | +``` |
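The naming convention above can be checked mechanically. The following is a minimal sketch, not `golang-migrate`'s own parser: the regular expression and the names `ParseMigrationName` and `UnpairedVersions` are assumptions made for illustration. It splits a file name into version, title, and direction, and reports versions that are missing either their up or their down script.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// migrationName matches "<version>_<title>.<up|down>.sql".
// This pattern is an illustrative assumption, not the library's parser.
var migrationName = regexp.MustCompile(`^([0-9]+)_(.+)\.(up|down)\.sql$`)

// Migration holds the parts encoded in a migration file name.
type Migration struct {
	Version   uint64
	Title     string
	Direction string // "up" or "down"
}

// ParseMigrationName splits a file name into version, title, and direction.
func ParseMigrationName(name string) (Migration, error) {
	m := migrationName.FindStringSubmatch(name)
	if m == nil {
		return Migration{}, fmt.Errorf("%q does not follow the naming convention", name)
	}
	v, err := strconv.ParseUint(m[1], 10, 64)
	if err != nil {
		return Migration{}, fmt.Errorf("bad version in %q: %v", name, err)
	}
	return Migration{Version: v, Title: m[2], Direction: m[3]}, nil
}

// UnpairedVersions returns, in ascending order, the versions that lack
// either an up script or a down script.
func UnpairedVersions(names []string) ([]uint64, error) {
	seen := map[uint64]map[string]bool{}
	for _, n := range names {
		mig, err := ParseMigrationName(n)
		if err != nil {
			return nil, err
		}
		if seen[mig.Version] == nil {
			seen[mig.Version] = map[string]bool{}
		}
		seen[mig.Version][mig.Direction] = true
	}
	var missing []uint64
	for v, dirs := range seen {
		if !dirs["up"] || !dirs["down"] {
			missing = append(missing, v)
		}
	}
	sort.Slice(missing, func(i, j int) bool { return missing[i] < missing[j] })
	return missing, nil
}

func main() {
	names := []string{
		"201909251510_first_migration.up.sql",
		"201909251510_first_migration.down.sql",
		"201909251522_add_users_table.up.sql", // down script deliberately absent
	}
	missing, err := UnpairedVersions(names)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("versions missing a script:", missing)
}
```

A check like this could run in CI so that a migration added without its counterpart is caught before it ships.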
| 35 | + |
| 36 | +Once that migration schema is defined, whenever any additive operation is run against the database we will first use the migrate API to upgrade the schema to the latest version. See https://godoc.org/github.com/golang-migrate/migrate#Migrate.Up for more details on that API. |
| 37 | + |
| 38 | +### Versioning fresh databases |
| 39 | + |
| 40 | +One other consideration is that the operator-registry database is not long-lived in the traditional sense, since it is often created from scratch. Common migration tools generally do not account for this case, so this proposal also defines a method of initializing the database at a particular migration version. Because the migration version is a matter of file name convention, we can infer it by parsing the migration folder and finding the highest migration script version. Then, when we initialize the database on startup, we can use the `golang-migrate` API to force the initial migration version. See https://godoc.org/github.com/golang-migrate/migrate#Migrate.Force for more details on that API. |
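As a rough illustration of the inference step, the sketch below picks the highest version prefix from a list of file names. The function name `LatestVersion` is hypothetical, and the hard-coded list stands in for a directory listing of the migrations folder. In a real implementation the result would then be handed to the `Force` call linked above.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// LatestVersion returns the highest numeric version prefix among the
// given migration file names. Names that do not look like
// "<version>_<anything>.sql" are skipped.
func LatestVersion(names []string) (uint64, error) {
	var latest uint64
	found := false
	for _, name := range names {
		if !strings.HasSuffix(name, ".sql") {
			continue
		}
		i := strings.Index(name, "_")
		if i <= 0 {
			continue
		}
		v, err := strconv.ParseUint(name[:i], 10, 64)
		if err != nil {
			continue // prefix is not an unsigned integer
		}
		if !found || v > latest {
			latest, found = v, true
		}
	}
	if !found {
		return 0, errors.New("no migration files found")
	}
	return latest, nil
}

func main() {
	// Stand-in for the contents of the migrations folder.
	names := []string{
		"201909251510_first_migration.up.sql",
		"201909251510_first_migration.down.sql",
		"201909251522_add_users_table.up.sql",
		"201909251522_add_users_table.down.sql",
		"README.md",
	}
	v, err := LatestVersion(names)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("initial migration version:", v)
}
```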
| 41 | + |
| 42 | +### Schema Definition |
| 43 | + |
| 44 | +One thing to note is the choice of how the database schema is defined. Currently that schema is defined in source code here in [/pkg/sqlite/load.go](https://github.com/operator-framework/operator-registry/blob/master/pkg/sqlite/load.go#L29) as a list of create table statements. |
| 45 | + |
| 46 | +However, once sql migrations are written, the database schema can be generated in two ways (through a migration upgrade or from scratch), as described above. There are a few ways to handle that. One is to leave the initial schema as is and run the migrations on a clean install as well, but that means the schema is no longer defined in one single human-readable place. This is commonly worked around by using a migration tool that can output the resulting sql schema for development purposes. |
| 47 | + |
| 48 | +The other option, and the one that this proposal adopts, is to keep the schema definition in `load.go` in sync with the changes defined in each migration. In this case, a migrated database and a database created from scratch can go out of sync if a migration is not written properly. This can be mitigated by an automated test that ensures that an initial bundle database, once migrated to the latest version, has the same schema as a clean database. |
| 49 | + |
| 50 | +As a result, this implementation will require writing tests to ensure that the schema versioning is in sync between upgrade and scratch creation. |
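The comparison at the heart of such a test might look like the sketch below. It assumes each schema has already been read into a map from table name to its `CREATE` statement, for example from sqlite's `sqlite_master` table. The helper names `normalize` and `SchemaDiff` are hypothetical. Comparing normalized statement text is a crude check, and a real test may need a more structural comparison.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// normalize collapses whitespace and case so that purely cosmetic
// differences between two statements are ignored.
func normalize(ddl string) string {
	return strings.ToLower(strings.Join(strings.Fields(ddl), " "))
}

// SchemaDiff compares two schemas, each given as table name -> CREATE
// statement, and returns the sorted names of tables that are missing
// from one side or whose statements differ.
func SchemaDiff(migrated, fresh map[string]string) []string {
	diff := map[string]bool{}
	for name, ddl := range migrated {
		other, ok := fresh[name]
		if !ok || normalize(ddl) != normalize(other) {
			diff[name] = true
		}
	}
	for name := range fresh {
		if _, ok := migrated[name]; !ok {
			diff[name] = true
		}
	}
	out := make([]string, 0, len(diff))
	for name := range diff {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical schemas; the table names are for illustration only.
	migrated := map[string]string{
		"package": "CREATE TABLE package(name TEXT PRIMARY KEY)",
		"channel": "CREATE TABLE channel(name TEXT)",
	}
	fresh := map[string]string{
		"package": "CREATE TABLE package(name TEXT PRIMARY KEY)",
		"channel": "CREATE TABLE channel(name TEXT, package_name TEXT)",
	}
	fmt.Println("tables out of sync:", SchemaDiff(migrated, fresh))
}
```

A test would fail whenever the returned slice is non-empty, which flags a migration that was added without a matching change in `load.go`, or the reverse.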
| 51 | + |
| 52 | +### Choice of Migration Tool |
| 53 | + |
| 54 | +This proposal chooses the `golang-migrate` project to drive both the migration conventions and the migrations themselves. The tool was chosen because it meets the following criteria: |
| 55 | + |
| 56 | +- Popular and well supported |
| 57 | +- Good documentation |
| 58 | +- Supports many database drivers, so if we move away from sqlite we can keep our migration workflows |
| 59 | +- No need to ship a migration binary; migrations can be run in-process by importing `golang-migrate` as a library |
| 60 | +- Has an API for upgrading and for reading the schema version that supports our scratch scenario without requiring us to adopt a particular database versioning convention |
| 61 | +- Allows defining migration versions as timestamps |