Technical Computing
Filesystems and Performance - Part One: I/O Bound Systems.
by Scott Nolin
For science computing systems, system performance is often limited by the filesystem. This month I’ll talk about something that we’ve helped people work through on many systems at SSEC – the problem of the I/O bound system.
You have 100 quick and simple jobs that run hourly; each one reads some data, processes it, and puts it somewhere else. They run fine, and all is well. You know these jobs take very little CPU time and you have plenty of CPU power to spare. You want to take advantage of your extra processing power, so you add 1000 jobs that run hourly. Now none of your jobs ever seem to get done! Perhaps you even purchased a new, extremely high-performance computer and moved your jobs to it, but it still does not perform well!
Why might this be? In the case above and many others, your system may be i/o bound. In TC we often see this, especially when a lot of routine processing runs via cron: getting a new system and adding workload to it, or simply adding work to an existing system, can leave it i/o bound.
What this means is that your system is spending most of its time waiting for data to be written to or read from the disk drives. This can normally be detected by examining the system with tools such as iostat.
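To make "waiting around for the disk" concrete, here is a minimal sketch of the kind of measurement such tools report. It assumes a Linux system, where the aggregate "cpu" line of /proc/stat lists cumulative time counters with iowait as the fifth value. The function names are made up for illustration; this is not how iostat itself is implemented, only a rough approximation of one number it shows.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labelled exactly "cpu"; per-core lines are "cpu0", "cpu1", ...
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before_text, after_text):
    """Percentage of CPU time spent in iowait between two /proc/stat snapshots."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    # Use only user, nice, system, idle, iowait, irq, softirq, steal.
    # Guest time is already counted inside user time, so it is left out.
    deltas = [a - b for a, b in zip(after[:8], before[:8])]
    if len(deltas) < 5:
        raise ValueError("cpu line has no iowait field")
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval_seconds=5.0, path="/proc/stat"):
    """Take two snapshots interval_seconds apart and return the iowait percentage."""
    with open(path) as stat_file:
        before = stat_file.read()
    time.sleep(interval_seconds)
    with open(path) as stat_file:
        after = stat_file.read()
    return iowait_percent(before, after)
```

Calling sample_iowait() on a Linux machine while the hourly jobs are running gives a rough reading. A value that stays high while the CPUs are otherwise mostly idle suggests the jobs are waiting on the disks rather than computing.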
How to deal with this?
First, come talk to us in Technical Computing. We can look carefully at your problem and likely provide some really useful details about how to optimize it. Ideally talk with us before you purchase a new system and we can try to help you plan ahead if possible.
Some of the fixes that we have helped people implement:
- coding changes - Often simply writing the code with i/o in mind is all that is required. If the code is too difficult to change or not under your control, you must look at other solutions, but understanding the processing you do is the first step to tuning performance.
- faster disk - This can sometimes help, but is limited by how much you can spend and how fast a disk you can find. Flash-based disks also fall in this category, and while they’re popular they have a fairly narrow use case.
- ram disk - If the data for a problem is small enough to fit in memory, the judicious use of a ram disk can be very helpful.
- spread load across multiple disks - The SSEC mail server is a perfect example of a worst case disk workload. We process millions of tiny reads, which is a terrible task from a performance standpoint. We simply spread out the accounts on multiple disk arrays and achieved the required performance. The same can be done for any workload.
- use a scheduling system - Typically a scheduler is used on a cluster system, but with new multicore systems it often makes sense to use a scheduler on a single server. Learning the scheduler does take some work, but it is a powerful and convenient tool for spreading out your workload across resources such as processors and disks, or simply for using your disk i/o more efficiently.
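As a rough illustration of the scheduling idea on a single server, the sketch below caps how many jobs run at once so they do not all compete for the disk at the top of the hour. It is a stand-in for illustration only and not a real scheduler; the names run_throttled and shell_job and the default limit of 4 are assumptions, and the right limit depends on your disks and workload.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def shell_job(command):
    """Wrap a shell command string as a callable that returns its exit code."""
    def job():
        return subprocess.run(command, shell=True).returncode
    return job


def run_throttled(jobs, max_concurrent=4):
    """Run callables with at most max_concurrent active at a time.

    Results are returned in the same order the jobs were given.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        futures = [pool.submit(job) for job in jobs]
        return [future.result() for future in futures]
```

Instead of 1000 separate cron entries that all start together, a single cron entry could build the list with shell_job(...) and pass it to run_throttled, so only a few jobs read and write at any moment. A real scheduler adds queuing, priorities, and resource accounting on top of this basic idea.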
Next month, in Filesystems and Performance - Part Two, I'll talk about filesystem metadata performance problems. In TC we have found some specific and hard-to-resolve problems related to filesystem metadata for some workloads, especially on Linux systems. This topic may not affect as many people as this month's, but the problem is fiendishly troublesome for high performance computing systems.