Monday, June 23, 2008

SAN Nightmare, Part 0

Note: This is part 0 of an 8-part series. Read them in order; it'll make more sense. Part 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8

Part 0: Introduction

This series of entries is a timeline of select events that happened between May 9 and June 23, 2008. I'm writing it primarily so I can remember how bad the last two months have really been. In the process, maybe someone will stumble across this and decide that xSAN has the potential to have as much of a "distaste for your environment" (Apple's words) as it did for mine.

A bit of background: We had been running xSAN 1.4.2 software since late November 2007. Due to some issues we had with it (explained later on), we decided to upgrade to the new version in hopes of a fix. When we originally implemented the xSAN solution, we did so because we were intrigued by the idea of allowing multiple servers to share a single large pool of disk space. This would, in theory, allow us to do things like share a single public folder across several servers, or move groups of people from one server to another for load-balancing reasons without having to move their data. Furthermore, it allowed us to set up a model in which one computer on the SAN was dedicated to handling all of Retrospect's backup requests. Structuring backups this way freed up a considerable amount of CPU time on the file server itself for things such as serving files in a timely fashion.
