home development system design Linux databases Docker kubernetes about me contact
Databases and SQL

I've done a lot of work on various Databases, including designing schemas, complex queries, stored procedures, triggers, and the like.

Oracle version 4 was my first, back in 1984, then Ingres which morphed into Postgres, and eventually PostgreSQL. I've worked with MySQL and Sqlite (my favorite), and an interesting new DB called Rqlite.

Sqlite is considered by some as a "toy" database, but that's anything but true. It's tested more thoroughly than probably any other DB, and is used in millions of platforms. If you have a smartphone (who doesn't?) you're using Sqlite. It's thread safe, and concurrent reads/writes are not a problem. It has all the major features of the Big Guys, except for Outer Joins, which can be easily worked around. With 281 terabyte capacity, it can hardly be considered small. Sqlite's simplicity is what makes it attractive. All data is kept in a single file, which can be copied and sent via email, for example, or kept under version control via Git.

Rqlite is a distributed self-replicating cluster of DB instances based on Sqlite. It implements the Raft algorithm from the Godfather of relational databases, Dr. Mike Stonebraker at Berkeley and MIT. Raft is a simple but powerful mechanism to synchronize instances, where a leader is chosen by a majority vote from the cluster. I use the Go gorqlite client written by Rqlite's author, and it's amazingly fast and small. Perfect for containers. Currently Rqlite is limited to 2Gb of data per DB instance, though there are workarounds. For mid-sized databases this is worth looking into.

YugabyteDB is a PostgreSQL-compatible replicated database that uses the Raft algorithm. I haven't used it yet, but it looks interesting. Cockroach is another that's similar.

PostgreSQL has been my favorite database for many years. For one, it adheres to the official SQL standard more than any other database. Also, the extensions are significant. I've used the PLV8 extension to write functions in JavaScript, which is pretty cool. There are Go extensions to write stored procedures and triggers, so I would certainly like to try them when the opportunity presents itself.

Cloud data warehouses like Amazon's AWS Redshift and Google's BigQuery are attractive for the same reasons as other cloud services. For large databases used by cloud-hosted Docker and Kubernetes services, this is an obvious good choice. Because they use ANSI standard SQL, queries should be portable across platforms. From looking at usage and specifications, BigQuery looks like a better choice in some ways. BigQuery is serverless, has automatic maintenance and supports many more table columns. Here is an article comparing them.

No-SQL databases are popular, with Mongo probably the most popular, because it has such wonderful administrative tools. They certainly have their place, especially with document databases that don't have much structure. But, they're not a replacement for relational databases when relationships are important, which is sometimes more often than apparent. It's possible that one of the reasons for the popularity of no-SQL databases is because some programmers don't want to learn SQL. At least that's been my experience. No-SQL fans claim theirs are faster than relational databases, but a properly configured PostgreSQL is just as fast. A better reason to go with Mongo is easy admin and simple programming.

A key/value store written in Go is etcd and is especially suited for the cloud. It's meant to be used in clusters, which replicate themselves using the Raft protocol, mentioned above. I read that it's becoming a de facto choice for distributed systems. It uses gRPC for message passing using Protocol Buffers. Protocol Buffers are language agnostic so a service written in Go can communicate with a Java client, for example.

Most large organizations have dedicated Database Administrators who do nothing else, and are real experts. I don't claim to fit into that category, but as I hope you can see, I'm pretty knowledgeable. Whether it's choosing a database design or strategy, setting up and installing servers and software, writing schemas and queries, I can help you.