Howdy all.

Looking to move from an SBS 2003 box to a pair of virtualisation hosts with the next server upgrade. I’ve been following conversations in the storage and virtualisation groups for some time now, so I’m beginning to become wise in the ways of things not to do :). No SAN required, and no RAID 5 for me.

Question though: for best file server performance, now and into the future, am I better off

a) using a guest Windows server with local storage (assuming we have a license free, or a lightly loaded DC, for instance), or

b) stretching the budget to include additional dedicated hardware for a NAS (Nexenta/whatever integrates with AD)?

So many software questions for the SMB sector are really aimed at IT shops with the needs of medium businesses, but I’m at the bigger end of non-IT small business: 30 local users, 1x SBS box that does everything, 1x TS box, 1Gb network. We don’t have huge amounts of data (~500GB). All-Windows environment, bar a few test machines.

Management has always been used to the idea of a second standby server, so having 2 hosts won’t be a problem. Yes, I know everything could run off of one, but we want to minimise downtime to a sensible degree, at least.

It’s confusing, because people generally recommend Windows Server over a NAS for speed, but then advise against virtualising file servers, also because of speed…

Appreciate some feedback.

Meh, personally I’d go with a NAS for any file server duty in an enterprise of that size.

It’s handy in a pickle to have completely separate storage if you need to reboot any of the VMs or hosts.

Goodo, thanks for that. I’m still interested in knowing what the performance is like, though.

I’m getting the feeling that someone is going to say that performance in a VM depends on the workload of the rest of the hosted virtual machines… Can we assume they won’t be all that heavy for comparison’s sake?

The issue with a NAS for an enterprise our size is that anything decent costs money, and we haven’t needed it before.

The average Nexenta throughput on large files is 20Mb/s+ on Gigabit networks.

Assuming non-overpowering loads (AutoCAD drawings being rendered every minute is what I’d class as huge file server IO), you wouldn’t take a big performance hit, if any.

Price-wise, for low-IO loads you might find it easier to use a smaller NAS with plain SATA drives than to buy near-line SATA or full SAS. Keep the cash for expensive hardware where seconds matter, like high-usage SQL databases.

Run a performance benchmark on your current system drives to see what your transfer rates are like and what the IO load on your current server is.
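If you don’t have Iometer handy, even a crude sketch like the one below (Python, purely illustrative; the test path and file size are assumptions, and a proper tool handles caching and queue depth far better) gives a ballpark for sequential throughput:

```python
# Quick-and-dirty sequential throughput check. Purely illustrative;
# a real benchmark tool (Iometer, CrystalDiskMark) is the better option.
import os
import time

TEST_FILE = r"D:\bench.tmp"        # assumption: a path on the volume under test
BLOCK = b"\0" * (1024 * 1024)      # 1 MiB write block
TOTAL_MB = 1024                    # write ~1 GiB so OS caching matters less

start = time.time()
with open(TEST_FILE, "wb") as f:
    for _ in range(TOTAL_MB):
        f.write(BLOCK)
    f.flush()
    os.fsync(f.fileno())           # push data to disk before stopping the clock
print(f"Sequential write: {TOTAL_MB / (time.time() - start):.1f} MB/s")

start = time.time()
read_mb = 0
with open(TEST_FILE, "rb") as f:
    while f.read(1024 * 1024):
        read_mb += 1
print(f"Sequential read:  {read_mb / (time.time() - start):.1f} MB/s (OS cache may inflate this)")

os.remove(TEST_FILE)
```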

Thanks PsyWulf. Nice to see some figures. I like Nexenta because of its clever caching options, but I guess I’d like to know that its best- and worst-case scenarios for speed would be well and truly acceptable for network file performance, especially compared with (virtualised) Server 2008 R2 running on local storage.

I’d particularly like to know where to find figures like those posted above. My brain suggests that the transfer rate of large files (contiguous reads) should hit the limit of the network card(s), provided there are enough drives. Smaller files would be the same if you’re lucky with cache hits. No?
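Doing the arithmetic on that hunch (treating these as rough assumptions rather than measurements):

```python
# Back-of-the-envelope check: is the gigabit link or the spindle count the
# ceiling for large contiguous reads? All figures are rough assumptions.
GIGABIT_RAW_MB_S = 1000 / 8          # 1Gb/s link is roughly 125 MB/s raw
PROTOCOL_OVERHEAD = 0.10             # assume ~10% lost to SMB/TCP overhead
SATA_7200_SEQ_MB_S = 80              # assumed sequential rate of one 7200rpm SATA drive

usable_link = GIGABIT_RAW_MB_S * (1 - PROTOCOL_OVERHEAD)
print(f"Usable link speed: ~{usable_link:.0f} MB/s")
print(f"Drives needed to saturate it (sequential): ~{usable_link / SATA_7200_SEQ_MB_S:.1f}")
```

So on those numbers, even a small handful of spindles should be able to fill the pipe for contiguous reads.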

Transfer speed is only half of it, though. My users would be up in arms if they had to wait half a second every time they tried to navigate one folder further down a directory tree (as an extreme example).
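If I wanted to put a number on that, even something as crude as the sketch below would do (the share path is a placeholder, and Explorer does more work per click than a bare listing):

```python
# Crude check of the "navigate one folder further" delay: time repeated
# directory listings on a network share.
import os
import time

SHARE_PATH = r"\\server\share\some\folder"   # hypothetical; point at a real share
RUNS = 20

timings_ms = []
for _ in range(RUNS):
    start = time.perf_counter()
    os.listdir(SHARE_PATH)                   # roughly one "open the next folder" action
    timings_ms.append((time.perf_counter() - start) * 1000)

# Later runs may be flattered by client-side caching, so watch the worst case too.
print(f"avg {sum(timings_ms) / RUNS:.1f} ms, worst {max(timings_ms):.1f} ms over {RUNS} runs")
```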

Many factors, including the amount of RAM and the processor in the NAS, affect speeds too (cheaper units use software RAID rather than hardware). The best way to get real-world numbers is to check out reviews of the unit you’re interested in.

I know from experience that the Thecus N4100Pro unit I’ve got running for one of our offices (30 users, 1.4TB worth of Office docs) runs like a jet, and it’s one of the cheaper home-user units. Nexenta blows it out of the water, so I can only imagine even pickier users wouldn’t notice.

Just use a VM appliance on the host. Your performance will be just fine.

Ok, so I’ve been testing IO loads and performance of our current gear, as best I can. Got some unusual results:

Server 1: [physical] SBS 2003, 32-bit, on a Xeon 5130 (2.0GHz, 2 cores + HT), 4GB RAM

Server 2: [virtual] Server 2008 R2, 1 vCPU, 2GB RAM, on VMware ESXi 4, physically identical to the above

Both on the same network, accessed from the same machines.

Seeing as server 1 does everything and is chronically slow, and server 2 is just a secondary DC, I’d have thought there’d be an even chance server 2 would be faster at disk access across the network. No. It was at least 25% slower, and up to 300% slower on file transfer speed. IOmeter’s total IOPS figures were much worse too.

Obviously I underestimated the importance of the number of cores (and possibly Server 2008’s need for more RAM). What specs do I need to throw at a VM before it’ll be comparable to a non-hosted file server, then?

Thanks for responding, John. Would you be suggesting using a virtual NAS appliance, as opposed to a Windows VM that’s only doing AD?

John White wrote:

Just use a VM appliance on the host. Your performance will be just fine.

ando.spice wrote:

Thanks for responding, John. Would you be suggesting using a virtual NAS appliance, as opposed to a Windows VM that’s only doing AD?

Yes. Lets you partition off the NAS work from the AD work, and keeps a NAS freakout from affecting your AD workflow.

Ok, so tell me: when the demands on the file server increase, what’s the best setup to be working from?

I’m still trying to test what hosted vs non-hosted Nexenta will be like on our network, but I’m a bit short on decent resources. Is there anything on the web that shows the performance hit you take when virtualising file servers?

I think I remember seeing SAM post once about how virtualising file servers is the worst use of the technology, unless you want to host something else on the same hardware. I half expected him to pipe in with a recommendation for a ReadyNAS or something.

No offense to Psywulf and John, but I’ve only heard two points of view, and they’re opposed. I’m still in limbo. Help?

Ok, let me reset.

Here’s what I’d do:

Buy a single box for all your servers and another as a warm backup: an HP DL180 G6. RAID 10 the 12-14 drives you put in it. The exact configuration depends on the CPU and I/O requirements you come up with by using Platespin Recon.

Migrate all your servers to the virtualization host.

Use a best-of-breed backup solution to back up the workloads from the primary host to the backup host on a regular, automated basis. A best-of-breed solution will include tools to test restores.

No NAS. Just local storage on the host which gets carved up and allocated to specific VMs.

This was almost exactly my intention. I’d probably be tempted to try to run some of the VMs on both and just back up in both directions, provided the backup solution allows it.

So clearly I don’t need shared storage for VMs, but I was curious whether there’s a speed benefit for file serving when it runs off a non-hosted NAS compared to a hosted box.

So I guess it comes down to: is Windows Server better at serving files to Windows clients than anything else?

And, am I right in thinking that any performance degradation because of multiple VMs hitting the same drives can be mitigated by throwing more drives at the host? Is the delay caused by the added virtualisation layer minimal?

Sorry to keep harping on. It is the point of the post though :stuck_out_tongue:

ando.spice wrote:

This was almost exactly my intention. I’d probably be tempted to try to run some of the VMs on both and just back up in both directions, provided the backup solution allows it.

Nooooo. You should size the primary host to run them all comfortably with some headroom. If there’s resource contention, you’re not doing it right.

So clearly I don’t need shared storage for VMs, but I was curious whether there’s a speed benefit for file serving when it runs off a non-hosted NAS compared to a hosted box.

Well, if you have a NAS inside the VM host umbrella, you’re bypassing physical network issues (hitting the virtual switches instead, right?). An appropriately sized host doesn’t have CPU or RAM constraints, and appropriately profiled systems don’t have I/O constraints. So … what’s the problem? :slight_smile: And it should be cheaper, since you’re buying incrementally more CPU, RAM, and storage instead of another box. Buuuut, if the resources are on the VM host, just … give them to the VMs and bypass the NAS completely…

So I guess it comes down to: is Windows Server better at serving files to Windows clients than anything else?

Goodness no. NAS appliance builders don’t use Windows Server. Even if they use Windows, it’s a special stripped-down OEM version.

And, am I right in thinking that any performance degradation because of multiple VMs hitting the same drives can be mitigated by throwing more drives at the host? Is the delay caused by the added virtualisation layer minimal?

There is a cost to using a bare metal hypervisor, but that’s not it. Again, profile your I/O ahead of time with Platespin Recon. That’ll tell you what IOPS you need from the array. Then you can sit down and figure out what combination of drives and array layout you need to handle those IOPS (plus headroom). 15K SAS? 10K SAS? NL SAS? 6 x 15K SAS in one array with the rest in an NL SAS array? Not a problem. Just profile what you need first.
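To make that concrete, here’s a back-of-the-envelope version of the exercise (the per-drive IOPS numbers are common rules of thumb and the workload figures are placeholders for whatever your profiling actually reports):

```python
# Rough spindle-count estimate from a profiled workload.
# Per-drive IOPS values are rules of thumb, not vendor guarantees.
DRIVE_IOPS = {"15K SAS": 175, "10K SAS": 125, "NL SAS": 75}

frontend_iops = 600          # example peak from profiling
write_fraction = 0.3         # example: 30% writes
raid10_write_penalty = 2     # RAID 10: each logical write costs two physical writes

backend_iops = (frontend_iops * (1 - write_fraction)
                + frontend_iops * write_fraction * raid10_write_penalty)

for drive_type, per_drive in DRIVE_IOPS.items():
    print(f"{drive_type}: ~{backend_iops / per_drive:.1f} drives "
          f"for {frontend_iops} front-end IOPS in RAID 10")
```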

ando.spice wrote:

Ok, so I’ve been testing IO loads and performance of our current gear, as best I can. Got some unusual results:

Server 1: [physical] SBS 2003, 32-bit, on a Xeon 5130 (2.0GHz, 2 cores + HT), 4GB RAM

Server 2: [virtual] Server 2008 R2, 1 vCPU, 2GB RAM, on VMware ESXi 4, physically identical to the above

Both on the same network, accessed from the same machines.

Seeing as server 1 does everything and is chronically slow, and server 2 is just a secondary DC, I’d have thought there’d be an even chance server 2 would be faster at disk access across the network. No. It was at least 25% slower, and up to 300% slower on file transfer speed. IOmeter’s total IOPS figures were much worse too.

Obviously I underestimated the importance of the number of cores (and possibly Server 2008’s need for more RAM). What specs do I need to throw at a VM before it’ll be comparable to a non-hosted file server, then?

You mention several specs but leave out the most important one: the storage subsystem. Yes, the core count makes a dramatic difference: the physical box has several real cores while the VM is time-sharing a single core.

But nothing matters like the drives. How do they compare?

There are pros and cons to both approaches here. Putting in a dedicated NAS device will cost quite a bit more but will give you zero contention on the storage system. Windows is not the big performer; it is big on features. The biggest Windows storage boxes are small, and high-end systems never use Windows. About the highest end you get are the bigger HP X3xxx series boxes: bigger than what you are doing here, but tiny in the storage world.

Solaris is chosen for raw speed and power, and that includes Nexenta. Linux comes next. Windows is fine for speed, but generally not as fast as those two. But you really aren’t in a range here where you are looking for those little speed boosts, typically.

If you are going to put in a dedicated NAS box, just about anything will do. You could easily make do with Nexenta or Windows on a SAM-SD, a ReadyNAS, or any number of low-cost, smaller options that will keep costs down in the $2K-$2.5K range.

If you virtualize your storage you get the best cost effectiveness, but you put a huge IO strain on your virtualization box. Virtualizing IO is its own special performance hell. You need to be careful with your drive subsystems and make sure that you have plenty to deal with your load peaks plus contention. You might consider splitting those drives off into their own array, depending on your access patterns.

But overall, you will get more bang for the buck virtualizing, but you will put a very large load on your virtualization host too.

When virtualizing, don’t assign a small number of CPUs to a VM to limit its impact; that will not work as you intend. Use priorities (shares) to do that instead. Make sure the boxes all have more than enough CPU, or else you will have performance issues even while the virtualization host is still idle.

Thanks for posting, Scott. Very informative.

I wouldn’t have thought the storage would matter, given that they’re identical. But (and I’ll point out that I inherited this setup) it’s 5x 320GB SATA II 7200rpm drives in a RAID 5 on a 64MB Intel RAID card (SRCS16).

Can different OSs handle hardware RAID better than others? That Intel card appears to actually have a proper chip on it.
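Doing the sums on that array while I’m at it (the per-drive IOPS figure and the RAID 5 write penalty of 4 are rules of thumb, and the write mix is a guess):

```python
# Rough capability of the existing array: 5 x 7200rpm SATA in RAID 5.
PER_DRIVE_IOPS = 75          # rule of thumb for a 7200rpm SATA drive
N_DRIVES = 5
RAID5_WRITE_PENALTY = 4      # each logical write = 2 reads + 2 writes on RAID 5
write_fraction = 0.3         # guessed share of writes in the workload

raw_backend = PER_DRIVE_IOPS * N_DRIVES
frontend = raw_backend / ((1 - write_fraction) + write_fraction * RAID5_WRITE_PENALTY)
print(f"Raw backend IOPS: {raw_backend}")
print(f"Usable front-end IOPS at {write_fraction:.0%} writes: ~{frontend:.0f}")
```

Which would explain some of the sluggishness before virtualisation even enters the picture.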

Scott Alan Miller wrote:

If you virtualize your storage you get the best cost effectiveness, but you put a huge IO strain on your virtualization box. Virtualizing IO is its own special performance hell. You need to be careful with your drive subsystems and make sure that you have plenty to deal with your load peaks plus contention. You might consider splitting those drives off into their own array, depending on your access patterns.

But overall, you will get more bang for the buck virtualizing, but you will put a very large load on your virtualization host too.

So this is why I was considering getting twin hosts that could each handle the load happily solo, but with the VMs spread between them to reduce contention and peak loads. If a box falls over I can run well enough, but I get better performance with both up. Does that sound reasonable?