In brief: We now have a near-instant copy of Inkbunny's database on another server, to complement our existing nightly on- and off-site backups of both the database and our image/thumbnail archive.
Update (June 14): We added cascaded database replication from the secondary to our off-site backup, minimizing losses in the unlikely event of a disaster impacting both servers in our datacenter.
As Inkbunny grows, it becomes ever more important to be able to resume service quickly and with as little data loss as possible in the event of a technical fault. To this end we've introduced streaming replication to a hot standby, the last major task planned for the secondary server we leased last year.
Inkbunny's main server has mirrored SSDs for database storage. However, if both were to fail - say, due to a power surge - we might have to restore from a nightly backup. No more. Streaming replication provides a continually updated copy on our secondary server, at the same hosting facility but in a different hall. It's not synchronous - that would reduce main server performance - but the standby typically lags the main database by less than 200ms.
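For the curious, the moving parts look roughly like this - a generic PostgreSQL 9.x sketch with placeholder addresses and credentials, not our exact configuration:

    # postgresql.conf on the main server: generate and ship WAL to standbys
    wal_level = hot_standby      # write enough detail for a standby to replay
    max_wal_senders = 3          # allow a few replication connections

    # pg_hba.conf on the main server: let the standby connect for replication
    host  replication  replicator  10.0.0.2/32  md5

    # recovery.conf on the standby: continuously follow the main server
    standby_mode = 'on'
    primary_conninfo = 'host=10.0.0.1 user=replicator password=placeholder'

    # postgresql.conf on the standby: accept read-only queries while replaying
    hot_standby = on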
Write-ahead log compression has reduced the bandwidth replication requires to ~5Mbit/sec - just 1/20th of our main server's capacity, and about 1.5TB/month of our secondary server's 100TB/month allowance.
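The compression itself is a single setting, added in PostgreSQL 9.5, which compresses the full-page images that make up much of the write-ahead log's volume:

    # postgresql.conf on the main server
    wal_compression = on   # compress full-page writes in the WAL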
We benefit from using unlogged tables for search results, which we keep around for a while, but don't need to persist in the event of a crash - they're much faster, in part because their contents aren't replicated.
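As an illustration (the table and columns here are invented, not our actual schema), an unlogged table skips the write-ahead log entirely, so writes are cheaper and nothing is streamed to the standby; the trade-off is that its contents are truncated after a crash:

    -- Fast, crash-expendable storage: no WAL, so no replication either
    CREATE UNLOGGED TABLE search_results (
        session_id    integer NOT NULL,
        submission_id integer NOT NULL,
        rank          real
    );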
Of course, replication wouldn't save us from a bug in the site, attacks, or staff error, so we'll retain nightly/weekly off-site database backups. Images are also backed up nightly (if not cached sooner).
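A nightly logical backup can be as simple as a cron job along these lines (database name, paths and hosts are placeholders, not our actual scripts):

    # crontab: dump the database at 04:00, then copy it off-site
    0 4 * * *  pg_dump -Fc inkbunny > /backup/db-$(date +\%F).dump && rsync -a /backup/ offsite:/backup/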
This also opens up the possibility of cascading replication, where the secondary server feeds live off-site backups - useful in the event of a catastrophe affecting the entire datacenter. A project for another day!
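Cascading replication, supported since PostgreSQL 9.2, just means the secondary is itself allowed to send WAL onward, and the off-site machine follows it instead of the main server - roughly (addresses again placeholders):

    # postgresql.conf on the secondary: allow it to feed a downstream standby
    max_wal_senders = 2

    # recovery.conf on the off-site machine: follow the secondary, not the primary
    standby_mode = 'on'
    primary_conninfo = 'host=10.0.0.2 user=replicator password=placeholder'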
The change does not significantly reduce free disk space on the secondary server. We have enough for over three years' usage at current levels, and we're likely to upgrade both servers by 2018.
Technically it's the same butt, just mirrored. Since it's a hot standby, you can actually hit them both at the same time, but only one leaves a mark. We could use the standby as a read-only copy for testing, or potentially to offload long-running search queries from the main server.
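Concretely, the standby accepts ordinary read-only connections; you can tell which copy you're talking to with a built-in function (the table name below is illustrative):

    -- On the standby: reads work, writes are rejected
    SELECT pg_is_in_recovery();        -- true on the standby, false on the main server
    SELECT count(*) FROM submissions;  -- fine
    -- INSERT/UPDATE/DELETE fail: "cannot execute ... in a read-only transaction"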
Eventually we might even use it to provide a read-only mode for long-running database maintenance, but we haven't done any work towards that yet. Most of our updates run fairly quickly as it is.
That'd be where "read-only copy" comes in. The tricky part is that many of the changes we make (for example, in the current release) also involve database transformations that would make the site look broken, if not actually break it; and those would normally be replicated to the standby. I guess we could promote the standby temporarily, cutting the replication, then wipe it and rebase it once we were done.
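Roughly, that promote-and-rebase dance would look like this (paths and hosts are placeholders):

    # On the standby: break replication and become a standalone read/write server
    pg_ctl promote -D /var/lib/postgresql/data

    # Once the upgrade is done: stop it, clear the old data directory,
    # and re-clone from the main server
    pg_basebackup -h 10.0.0.1 -U replicator -D /var/lib/postgresql/data -X stream -P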
As for length of upgrade, we had a few scripts to run for this particular release that had to generate thumbnails and do some heavy database lifting. It would've delayed the release further to make it something we could feasibly run iteratively. But there are other upgrade processes that don't work so well with a million files, and we may need to reconsider those.
I usually take pride in being able to understand some technobabble... ^T.T^
All I can see from this is: "Technically possible, important details (that might put a damper on it), also those huge preview icons (<3), some upgrade processes might be reconsidered"
That sound about right? ^o.o^