Posted: March 1st, 2010 | Author: charlie | Filed under: IT Management | Tags: capacity, performance, tuning, virtualization | 3 Comments »
When purchasing server hardware, do you tend to purchase more power than you need, or not enough? Specifying the correct server for your current need is a fine art, and it’s easy to get wrong. Here are some helpful hints and considerations to remember that will ensure you make the right server purchasing decision.
We’re going to focus on standalone (non-blade) servers for the moment, but many aspects are also applicable to blade servers. Blade servers are wonderful for centralized management of the hardware, but the specs of the individual server blades can vary tremendously.
Want to avoid trudging down to the datacenter late at night, or even worse, across the world if something breaks? Then don’t skimp on the management controller, lights out manager, or whatever the vendor is calling it. Many vendors ship a simple version by default: it may allow serial console access only, for example. Make sure to get the full-featured controller, because even if the hardware is only a few doors down, getting up from your desk should never be necessary.
If you aren’t thinking of switching vendors any time soon, you might think that the management interface will always work the same as it has on all your other servers. Unfortunately, that’s not the case. Sun x86 hardware, for example, has many different hardware management controllers to choose from. The more expensive and feature-rich servers have the better controllers, but don’t make the mistake of thinking the interface never changes. The unfortunate part is that you never know how well it works until you get a server on-site.
Hardware management comes in two forms: IPMI (most support), and the user interface. The user interface is more often than not, a Web-based java application that provides remote console access. Some are extremely buggy, and others work quite well from all Web browsers. We can’t make a recommendation, though, because these things change often.
Shucks, this one is a no-brainer: as much as you can afford. Within reason, that is. If you aren’t going to run virtual machines, and this server’s only job is to serve up some simple Web pages, then 16GB of RAM is likely overkill. Likewise, make sure you know what your application can support. Many java applications are limited to a heap size of 2 or 4GB.
It’s also overkill to purchase more than 4GB of RAM if you need to run a 32-bit operating system. Yes, Windows Server does some tricks and it can use more than 4GB, but it’s a huge performance it.
If virtualization is in your future, load up as much as possible. You also want to pay attention to how many DIMM slots the server has. The 8GB DIMMs are horribly expensive now, so you’ll probably want to stick with 4GB sticks. Just remember, if you fill all the slots in the server, the only memory upgrade path is to buy higher capacity DIMMs.
Do you want to run many threads at an even pace or just a few threads as fast as possible? Sun’s T2 processors aren’t fast by any measure, but they can run many threads at the same speeds consistently. These are ideal for database servers, but not for Web servers.
Will this server be executing a wide variety of processes over and over again, as opposed to just running the same big application server constantly? If so, make sure you pay attention to the amount of cache each core of the CPU has.
For virtualization, you want the fastest multi-core processors available, with the largest amount of L2 cache. Cache is very important as it minimizes the number of times the CPU needs to fetch data from slower RAM. It makes a very noticeable difference on heavily used servers.
Disks, Controllers, and RAID
If you need local storage, do pay attention to the type of disks you’re ordering. A SATA disk is likely to disappoint if you have an IO-heavy workload. SAS, and FC disks should perform equally well, since they are both SCSI disks underneath.
Even if you don’t need much local storage, you should always buy a server with a RAID controller that can mirror the operating system disks, unless you’re SAN booting of course. You don’t want the OS to crash just because of a failed disk. Likewise, if you’re keeping tons of local storage for some reason, make sure to get a RAID card that does RAID-5, so that you can at least lose one disk at a time without losing data. If performance is a concern you should really be using iSCSI or SAN storage, but you may also think about a RAID 0+1 configuration to avoid the slower RAID-5 parity calculations.
If you’re attaching to a SAN, make sure to include the correct HBA as well.
When servers started showing up with two or four gigabit NICs I must admit, I was confused. Why would someone need that many? Aside from large servers that do a lot of network IO, you might also want to separate out your iSCSI traffic from normal Ethernet. It’s also important these days to make sure that the network cards support TOE, or a TCP Offload Engine. This will task the network card with computing TCP checksums, freeing your CPUs for more important things.
In summary, most of these things may seem common sense, but you need to remember to ask all the right questions every time you spec a server. Here’s a good checklist:
- Adequate hardware management controller
- Enough (but not too much) RAM, that’s fast enough, but not faster than the CPU’s front-side bus
- Enough memory slots for expansion, if that seems likely
- Correct CPU for this server’s needs
- RAID-1 for the OS, and (optionally) other RAID levels for other local storage
- FC HBAs?
- Multiple gigabit NICs with TOE capabilities
3 Comments »
Posted: February 15th, 2010 | Author: charlie | Filed under: IT Management, Linux / Unix | Tags: configuration management, linux, virtualization | No Comments »
Virtualization (in the cloud or locally) is great; that much we can all agree on. Virtual machines (VMs) can tend to grow out of control, however, now that it’s so easy to create them. This should not be all that surprising, but many small to medium businesses are also dabbling in VMs, and they are suddenly overwhelmed by the VM growth.
Each VM is another server that an administrator must manage. Security updates must be applied and global configuration changes now need to be propagated to all these new machines. While it’s easy to create 3-4 (or more) servers on one physical piece of hardware, you’ll certainly struggle if you aren’t already set up to scale.
The number of physical machines in a small company may drop dramatically; maybe 40%, when virtualization is implemented. Unfortunately, the number of OS instances will generally increase by two-fold or more at the same time. The power and cooling savings are realized, as was promised by virtualization, but taking 20 servers to 12 servers, for example, will means you may soon have 40 OS instances to manage.
Puppet, from Reductive Labs
The reasons for VM proliferation depend on your culture, but the most common reason is that delegating control of an entire OS is easier than managing an application for customers. IT customer, be they engineers, application developers, or smaller IT units within an organization, frequently need more access then cenral IT is willing to give. The easy solution: give them a server of their own. Test environments, too, are best served by virtual machines.
To keep hardware (and power and cooling) costs down, many companies implement policies about the implementation of new services. New applications and servers need to be run on VMs first, unless it’s really requires its own server. Policies such as these are good, in that they limit wastefulness, but they do tend to exacerbate VM sprawl.
Sprawl aside; it’s worth noting that higher utilization levels on your servers does not mean that they’ll use an appreciably larger amount of power. In fact, the power savings claims are really true, and can be even greater if your utilization is low and you use VirtualCenter’s power management features. VMWare can migrate VMs to fewer servers if utilization isn’t high enough, and actually power off unnecessary servers. This works best with Dell hardware, but other large vendors are supported as well. Imagine: all your VMs migrating to a few blades in a blade server during the nighttime, and then as utilization increases during the day, blades quickly boot up and take the load as needed. Granted, I don’t personally know any enterprise environments that are brave enough to try it yet, but in theory the concept is wonderful.
Something magical happens when a company grows to around 50 operating systems. It’s too many to manage by simply logging in and running commands, so people start to write scripts. In Windows land, if it hasn’t already happened, you must implement Active Directory. For the Unix/Linux servers, configuration management becomes even more important. Writing a script that SSH’s to each server and runs a command doesn’t scale, no matter how hard people want it to. You need a real configuration management system (such as puppet or cfengine) to ensure that servers are configured exactly how you want, and that they will remain that way.
If you already operate in a large environment with good automated installations and configuration management systems, chances are scaling 100-fold won’t be a problem. Barring scaling issues with the management software itseld, that is. A good network-booting deployment system is only half the battle, because every server isn’t going to be configured identically. If you’re “doing it right,” you should be able to arbitrarily reinstall any server, walk away, and know that it’ll come back up patched and running all the services it’s supposed to. Servers, or rather the OS that runs on them, should be truly disposable.
Management of a “golden image” is promised by VMWare, probably because ITIL mentions it, but it doesn’t really help in practice. You have to create your images (somehow). There’s no mechanism to update a golden image with security patches and apply them to existing systems; you’ll generally have to reinstall the OS instances. And that’s what you should do periodically, but without some kind of configuration management system, you’ll also be manually installing and configuring the services that the VMs used to provide in order to restore service functionality.
VM growth, therefore, is no different from server growth. It may be easier and cheaper, but from the OS management viewpoint, you’re doing the same thing. Likewise, the availability of your services is also in danger. Running five VMs on a single piece of hardware means that a hardware failure takes out five servers instead of one. VMWare and Xen can both be clustered and run from shared storage, such that a hardware failure will result in the VMs immediately (instantly, even) being migrated to other servers. The problem is that VMotion requires the most expensive VMWare license, and a VirtualCenter server. Shared storage isn’t as big of an issues these days with iSCSI, but its still another aspect that must be configured. We’ll cover this issue in-depth in a future article, focusing on Xen and RHEL Clustering Services.
The point is: dealing with VM sprawl is no different than dealing with scaling up to support more physical servers. Use whatever mechanisms are available on your given platforms, and “do it right.” A VM is, and always will be, just another server.
No Comments yet... be the first »