GreenReaper

Squeezing the last bit from Inkbunny's 64GB SSDs

Yesterday we reclaimed 8GB of space for Inkbunny's database and session storage by converting a 10GB SSD swap partition into two swap files - 2GB on our main SSD partition (now 10GB bigger) and 4GB on our hard disk array. This should mean we're good for the next few years with disk space.

Making the most of what we've got

Inkbunny's main server has 2x64GB SSDs (or 59.6GB in "real money") in a RAID1. This isn't a heck of a lot - I have a 500GB SSD in my home PC - but it's what we could afford two-and-a-bit years ago.
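For anyone wondering where the 59.6GB figure comes from, here is a quick sketch. It assumes the drives are sold as 64 decimal gigabytes (10^9 bytes each) while the operating system reports binary units (2^30 bytes each); the function name is just for illustration.

```python
def decimal_gb_to_gib(gb: float) -> float:
    """Convert decimal gigabytes (10**9 bytes) to binary gibibytes (2**30 bytes)."""
    return gb * 10**9 / 2**30


# A drive marketed as 64GB, expressed in the units the OS reports.
print(f"{decimal_gb_to_gib(64):.1f}")  # 59.6
```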

When setting the server up we made a guess at an appropriate allocation based on past usage - ~50GB for the database, ~10GB for swap. Over time the DB has grown, despite our best efforts to keep it small; we also moved PHP session files to the SSD. As such, space got tight, so we began eyeing that 10GB.

The basic idea of swap - more-accurately known as paging - is to ensure that RAM doesn't run out and is used efficiently. You can use paging to fit more applications and data "in memory" than you have RAM for - operating systems move data in and out of RAM, typically 4KB at a time. Operating without swap is possible, but tends to reduce performance; extreme memory pressure leads to process termination.

From our perspective, swap is best used to park data that might still be useful - such as infrequently-accessed database tables - so that RAM stays free for things that matter more, like caching frequently-accessed image files or user data. However, there's little point in having a lot of swap if you're running out of disk space.

UNIX-based systems have historically used dedicated "swap partitions", loosely comparable to a dedicated ReadyBoost device on Windows (although technically that uses a large file rather than a partition). Modern versions of Linux treat swap files in much the same way as swap partitions, and, unless fragmented, performance should be similar.

We usually don't use more than 2GB of swap, so we put a 2GB swap file on the SSDs, with another 4GB on our hard disk array at a lower priority, just in case - it'll be used once the first 2GB runs out. The trickiest part was getting the system to recognize the new partition table - we avoided a reboot by using partprobe, but we still had to stop everything which used the SSD.
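To illustrate the priority ordering, here is a minimal sketch that parses text in the format of Linux's /proc/swaps and lists the swap areas in the order the kernel would fill them (highest priority first). The file paths and priority values in the sample are hypothetical, not our actual configuration; only the 2GB and 4GB sizes come from the setup described above.

```python
from typing import NamedTuple


class SwapArea(NamedTuple):
    path: str
    kind: str
    size_kib: int
    used_kib: int
    priority: int


def parse_swaps(text: str) -> list[SwapArea]:
    """Parse /proc/swaps-style text and return areas, highest priority first."""
    areas = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        path, kind, size, used, prio = line.split()
        areas.append(SwapArea(path, kind, int(size), int(used), int(prio)))
    # Linux uses the area with the highest priority value until it is full.
    return sorted(areas, key=lambda a: a.priority, reverse=True)


# Hypothetical sample: paths and priorities are assumptions for illustration.
SAMPLE = """\
Filename              Type  Size     Used  Priority
/mnt/array/swapfile   file  4194300  0     5
/swapfile             file  2097148  0     10
"""

for area in parse_swaps(SAMPLE):
    print(area.path, area.priority)
```

On a live system you could pass in the contents of /proc/swaps instead of the sample string. Areas that share the same priority are used in round-robin fashion, which is why the two files here get different values.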

The big win was the 8GB freed up for use by the DB and sessions - a 16% increase in the size of our SSD partition, but a 45% increase in free space, and hence in endurance. There was a corresponding increase in available filesystem inodes - important because we have over a million session files in play, and, as they now fit in 4KB, they increase disproportionately to disk usage with bursts of traffic.
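The percentages can be checked with a little arithmetic. This sketch takes the ~50GB partition size and the 8GB reclaimed from the figures above; the free space before the change is not stated anywhere, so the value below is only what a 45% increase would imply.

```python
def percent_increase(before: float, after: float) -> float:
    """Percentage growth from `before` to `after`."""
    return 100 * (after - before) / before


partition_before_gb = 50  # approximate original allocation
reclaimed_gb = 8

# Growth in partition size: 50GB -> 58GB.
print(percent_increase(partition_before_gb, partition_before_gb + reclaimed_gb))  # 16.0

# Inferred, not measured: the free space that would make 8GB a 45% gain.
implied_free_before_gb = reclaimed_gb / 0.45
print(f"{implied_free_before_gb:.1f}")  # 17.8
```

In other words, the same 8GB looks small against the whole partition but large against the space that was actually left.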

So what does it all mean?

Elevated activity over the past six months meant our prior estimates were no longer valid; we risked running out of SSD space within two years. We now have close to three years before we need to upgrade our database and session storage, and a reduced risk of resource exhaustion issues in the meantime.

This means we'll be fine waiting until our current contract runs out in late 2017, at which time we might increase our SSD size or acquire an entirely new system. I'd like us to transition to a 1U, single-CPU system with 4x2TB HDDs and ~2x128/256GB M.2 PCIe SSDs… we'll see if that's an option by then!

In other news, disk utilization fell so low since we moved public file access to our caches that we've turned on an option to actively check for errors. Now that we're on RAID5, it's more important that any physical data corruption be found as soon as possible, which requires reading the files now and then. We're also keeping an eye on the S.M.A.R.T. values for one drive with a couple of reallocated sectors.
Added: 8 years, 10 months ago
 
Lyserdigi
8 years, 10 months ago
you're so awesome for doing these kinds of tech updates ^^
i personally love reading what is happening behind the scenes
Christoph
8 years, 10 months ago
*head spins* @.@
petsis
8 years, 10 months ago
I love reading your technical updates. It's all well explained and with links to everywhere. I like the effort you put into these posts. Being a young beginner sysadmin, i always learn something! :3

And people said i was wasting my time "on that furry site"... :P
GreenReaper
8 years, 10 months ago
Furry is web-scale! Running Inkbunny has given me a great deal of experience - it leads on my CV. :-)

In a corporate environment you typically get pigeon-holed into one speciality or another. Fan sites let you experience the whole stack, and their activity means that you get to deal with real-life issues. Hardware and development resources are also very limited, which promotes an optimization mindset.
petsis
8 years, 10 months ago
First of all: Oh god, i think i'm a link-addict. I followed almost every link there. Maybe your journals are like my link-dose :0

About being pigeon-holed, i work at a municipality. And, although i'm in charge of some important systems, my real work is on helpdesk, but we are all people who know about everything, and are specialized in something :P
With "real-life issues"... two different servers from two different services stopped working in my first two months on the job, and i was the one who knew how to fix them (and that's how i got to be the sysadmin).
Why did they break down? Argentina! Old hardware, bad providers, zero funding. Make everything work from what we have!

It's all the opposite of a corporation haha
And i love it.

Also, some of your journals helped me in the part of making the servers work, so thank you!
Eiko
8 years, 10 months ago
Very nice. Thank you for the updates and for your work keeping inkbunny running smoothly! ^^
fluffdance
8 years, 10 months ago
Magic-man, at it again!

Thanks a bunch for all your hard work, Reaps!  And the rest of the staff too!  :D

Now, if you wanna squeeze s'more out of those drives, I just got a new hydraulic press...  >:3
rosebuster
8 years, 10 months ago
Thanks for doing all the wonderful job! <3
maxinered
8 years, 10 months ago
That is a hell of a database. I am curious what software is used for it though. Any chances you reveal that bit of information?
GreenReaper
8 years, 10 months ago
Sure. We use PostgreSQL for the main database, which is about 20GB and grows ~400MB/month. The biggest tables and indexes relate to unread submission notifications and +fav tracking. There's a separate 6GB Piwik visitor tracking database stored in MySQL - we do it ourselves rather than send your data to Google. The rest is sessions - we use igbinary to make them a little smaller.

PostgreSQL has a bunch of types which make some operations neat. For example, we use geometric types point and box along with the contains operator in a query to determine the content server to use.
maxinered
8 years, 10 months ago
Yeah, I love PostgreSQL too. So many great types and so easy to look up tables, roles and databases. Not to mention easy login with system users (which is quite handy for system users like www-data). I'm surprised you use so many different databases though.

Is that for historical reasons or performance reasons?
GreenReaper
8 years, 10 months ago
It's because Piwik doesn't support PostgreSQL, while Inkbunny doesn't support MySQL.