By: Gigaom
September 14, 2012 at 9:02 PM EDT
Github: database migration sparked outages
Github's planned migration to a new 3-node MySQL cluster didn't go as planned, resulting in outages Monday and Tuesday. In addition, Github's status site, which runs on Heroku, had its own problems, according to a Github post-mortem published Friday.

Github’s outages early this week emanated from what was supposed to be a “rather innocuous” database migration that turned out to be anything but. The company was replacing older MySQL databases with a new 3-node MySQL cluster, according to a Friday afternoon post to the Github blog.

The goal of the work was to streamline failover. In the old setup, failing over from one database to another required a cold start of MySQL; the new architecture does not.

The blog post, written by Githubber Jesse Newland, provides a detailed post-mortem of the events leading up to the snafu, but here’s the gist:

… three primary events contributed to the downtime of the past few days. First, several failovers of the ‘active’ database role happened when they shouldn’t have. Second, a cluster partition occurred that resulted in incorrect actions being performed by our cluster management software. Finally, the failovers triggered by these first two events impacted performance and availability more than they should have.

Another complicating factor was that Github’s status site, which runs independently on Heroku, experienced availability issues on Tuesday when traffic spiked. Github worked with Heroku to add a production database to handle the load, and then added a database slave to safeguard against similar occurrences.

Github’s distributed nature means that an outage at the mothership doesn’t mean all work grinds to a halt. “You can’t pull or push, but you can still make commits and branch to your local repository, then push when it comes back online,” said one GigaOM commenter. “That’s the whole key behind a [distributed version control system]. And even if you need to share files … you can create .patch files and send them through other means.”

