Category Archives: Storage Systems

Solid State Storage: Enterprise State Of Affairs

Here In A Flash!

It’s been a crazy last few years in the flash storage space. Things really started taking off around 2006 when NAND flash and Moore’s Law got together. In 2010 it was clear that flash storage was going to be a major part of your storage makeup in the future. It may not be NAND flash specifically, though. It will be some kind of memory and not spinning disks.

Breaking The Cost Barrier.

For the last few years, I’ve always told people to price out on the cost of IO, not the cost of storage. Flash storage was mainly a niche product solving a niche problem, like speeding up random IO heavy tasks. With the cost of flash storage at or below standard disk based SAN storage, with all the same connectivity features and the same software features, I think it’s time to put flash storage on the same playing field as our old stalwart SAN solutions.

Right now at the end of 2012, you can get a large amount of flash storage. There is still this perception that it is too expensive and too risky to build out all flash storage arrays. I am here to prove at least cost isn’t as limiting a factor as you may believe. Traditional SAN storage can run you from 5 dollars a Gigabyte to 30 dollars a Gigabyte for spinning disks. You can easily get into an all flash array in that same range.

Here’s Looking At You Flash.

This is a short list of flash vendors currently on the market. I’ve thrown in a couple of non-SAN types and a couple of traditional SANs that have integrated flash storage in them. Please, don’t email me complaining that X vendor didn’t make this list or that Y vendor has different pricing. All the pricing numbers were gathered from published sources on the internet. These sources include the vendors’ own websites, published costs from TPC executive summaries and official third party price listings. If you are a vendor and don’t like the prices listed here then publicly publish your price list.

There are always two cost metrics I look at: dollars per Gigabyte in raw capacity and dollars per Gigabyte in usable capacity. The first number is pretty straightforward. The second metric can get tricky in a hurry. On a disk based SAN that pretty much comes down to what RAID or protection scheme you use. Flash storage almost always introduces deduplication and compression, which can muddy the waters a bit.
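To make the two metrics concrete, here is a minimal Python sketch. The function names, the 50% mirroring overhead and the 2:1 data reduction ratio are assumptions for illustration only. The 2.5TB and 25,000 dollar figures are the Nimbus S-Class numbers quoted further down, with 1TB counted as 1024GB.

```python
def cost_per_gb(price_dollars, capacity_gb):
    """Dollars per Gigabyte for a given price and capacity."""
    return price_dollars / capacity_gb


def usable_gb(raw_gb, protection_efficiency=1.0, data_reduction_ratio=1.0):
    """Estimate usable capacity from raw capacity.

    protection_efficiency: fraction of raw space left after RAID or other
        protection overhead (0.5 for mirroring, for example).
    data_reduction_ratio: multiplier from deduplication and compression
        (1.0 models incompressible data, the worst case).
    """
    return raw_gb * protection_efficiency * data_reduction_ratio


raw = 2.5 * 1024   # 2.5TB expressed in GB
price = 25_000     # dollars

# Raw cost: no protection, no data reduction.
print(round(cost_per_gb(price, raw), 2))                         # 9.77

# Usable cost with mirroring (assumed) and incompressible data.
print(round(cost_per_gb(price, usable_gb(raw, 0.5)), 2))         # 19.53

# Same mirroring, but with an assumed 2:1 data reduction ratio.
print(round(cost_per_gb(price, usable_gb(raw, 0.5, 2.0)), 2))    # 9.77
```

The last two lines show why a usable number that bakes in deduplication and compression needs a closer look: the same box is either twice the raw price or equal to it depending on how compressible your data really is.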

Fibre Channel/iSCSI vendor list

Nimbus Data

Appearing on the scene in 2006, they have two products currently on the market: the S-Class storage array and the E-Class storage array.

The S-Class seems to be their lower end entry but does come with an impressive software suite. It provides 10GbE and Fibre Channel connectivity. Looking around at the cost for the S-Class I found a 2.5TB model for 25,000 dollars. That comes out to 9.7 dollars per Gigabyte in raw space. The E-Class is their super scalable and totally redundant unit. I found a couple of quotes that put it in at 10.00 dollars a Gigabyte of raw storage. Already we have a contender!

Pure Storage

In 2009 Pure Storage started selling their flash only storage solutions. They include deduplication and compression in all their arrays and include that in the cost per Gigabyte. I personally find this a bit fishy since I always like to test with incompressible data as a worst case for any array. This would also drive up their cost. They claim between 5.00 and 10.00 dollars per usable Gigabyte and I haven’t found any solid source for public pricing on their array yet to dispute or confirm this number. They also have a generic “compare us” page on their website that at best is misleading and at worst plain lies. Since they don’t call out any specific vendor in their comparison page it’s hard to pin them for falsehoods, but you can read between the lines.

Violin Memory

Violin Memory started in earnest around 2005 selling not just flash based but memory based arrays. Very quickly they transitioned to all flash arrays. They have two solutions on the market today. The 3000 series which allows some basic SAN style setups but also has direct attachments via external PCIe channels. It comes in at 10.50 dollars a Gigabyte raw and 12 dollars a Gigabyte usable. The 6000 series is their flagship product and the pricing reflects it. At 18.00 dollars per Gigabyte raw it is getting up there on the price scale. Again, not the cheapest but they are well established and have been used and are resold by HP.

Texas Memory Systems/IBM

If you haven’t heard, TMS was recently purchased by IBM. They are based in Houston, TX, and I’ve always had a soft spot for them. They were also the first non-disk based storage solution I ever used. The first time I put a RamSan in and got 200,000 IOs out of the little box I was sold. Of course it was only 64 Gigabytes of space and cost a small fortune. Today they have a solid flash based, fibre attached and iSCSI attached lineup. I couldn’t find any pricing on the current flagship RamSan 820 but the 620 has been used in TPC benchmarks and is still in circulation. It is a heavyweight at 33.30 dollars a Gigabyte of raw storage.

Skyera

A new entrant into this space, they are boasting some serious cost savings. They claim 3.00 dollars per usable Gigabyte on their currently shipping product. The unit also includes options for deduplication and compression, which can drive the cost down even further. It is also a half depth 1U solution with a built-in 10GbE switch. They are working on a fault tolerant unit due out second half of next year that will up the price a bit but add Fibre Channel connectivity. They have a solid pedigree as they are made up of the guys that brought the SandForce controllers to market. They aren’t a proven company yet, and I haven’t seen a unit or been granted access to one either. Still, I’d keep an eye on them. At those price points and the crazy small footprint it may be worth taking a risk on them.

IBM

I’m putting the DS3524 on a separate entry to give you some contrast. This is a traditional SAN frame that has been populated with all SSD drives. With 112 200 GB drives and a total cost of 702,908.00 dollars, it comes in at 31.00 dollars a Gigabyte of raw storage. That is on the higher end but still in the price range I generally look to stay in.

SUN/Oracle

I couldn’t resist putting a Sun F5100 in the mix. At 3,099,000.00 dollars it is the most expensive array I found listed. It has 38.4 Terabytes of raw capacity, giving us an 80.00 dollars per Gigabyte price tag. Yikes!

Dell EqualLogic

Dell gobbled up EqualLogic, a SAN manufacturer that focused on iSCSI solutions, in 2008, a couple of years before the 3PAR deal fell apart. This isn’t a flash array. I wanted to add it as contrast to the rest of the list. I found a 5.4 Terabyte array with a 7.00 dollar per Gigabyte raw storage price tag. Not horrible but still more expensive than some of our all flash solutions.

Fusion-io

What list would be complete without including the current king of the PCIe flash hill, Fusion-io? I found a retail price listing for their 640 Gigabyte Duo card at 19,000 dollars, giving us 29.00 dollars per usable Gigabyte. Looking at the next lowest card, the 320 Gigabyte Duo at 7495.00 dollars ups the price to 32.20 per usable Gigabyte. They are wicked fast though :)

So Now What?

Armed with a bit of knowledge you can go forth and convince your boss and storage team that a SAN array fully based on flash is totally doable from a cost perspective. It may mean taking a bit of a risk but the rewards can be huge.

 

The Fundamentals of Storage Systems – Shared Consolidated Storage Systems

Shared Consolidated Storage Systems – A Brief History

Hey, “Shared Consolidated Storage Systems” did you just make that up? Why yes, yes I did.

For as long as we have had computers there has been a need to store and retrieve data. We have covered the basics of hard disks, RAID and solid state storage. We have looked at all of this through the aspect of being directly attached to a single server. It’s time we expand to attaching storage pools to servers via some kind of network. The reason I chose to say shared and consolidated storage instead of just SAN or Storage Area Network was to help define, broaden and give focus to what we really mean when we say SAN, NAS, Fibre Channel or even iSCSI. To understand where we are today we need to take a look back at how we got here.

Once, There Were Mainframes…

Yep, I know you have heard of these behemoths. They still roam the IT Earth today. Most of us live in an x86 world though. We owe much to Mainframes. One of these debts is networked storage. Way back when, I’m talking like the 1980’s now, Mainframes would attach to their storage via a system bus. This storage wasn’t internal the way we think of direct attached storage though. They had massive cables running from the Mainframe to the storage pods. The good folks at IBM and other big iron builders wanted to simplify the standard for connecting storage and other peripherals.

 

Who doesn’t love working with these cables?

You could never lose this terminator!

Out With The 1960’s And In with the 1990’s!

Initially IBM introduced its own standard in the late 80’s to replace the well aged bus & tag and other similar topologies with something that was more robust and could communicate over optical fiber. ESCON was born. The rest of the industry backed Fibre Channel, a protocol that works over optical fiber or copper based networks. More importantly, it would be driven by a standards body and not a single vendor. Eventually, Fibre Channel won out. In 1994 Fibre Channel was ratified and became the de facto standard. Even IBM got on board. Again, we are still talking about connecting storage to a single Mainframe, though longer connections were possible and the cabling got a lot cleaner. To put this in perspective, SQL Server 4.2 was shipping at that point with 6.0 right around the corner.

High Performance Computing and Editing Video.

One of the other drivers for Fibre Channel was the emerging field of High Performance Computing (HPC) and the need to connect multiple mainframes or other compute nodes to backend storage. Now we are really starting to see storage attached via a dedicated network that is shared among many computers. High end video editing and rendering farms also drove Fibre Channel adoption. Suddenly, those low end PC-based servers had the ability to connect to large amounts of storage just like the mainframes.

Commodity Servers, Enterprise Storage.

Things got interesting when Moore’s Law kicked into high gear. Suddenly you could buy a server from HP, Dell or even Gateway. With the flood of cheaper yet powerful servers containing either an Intel, MIPS, PPC or Alpha chip you didn’t need to rely on the mainframe so heavily. Couple that with Fibre Channel and suddenly you had the makings of a modern system. One of the biggest challenges in this emerging commodity server space was storage management. Can you deal with having hundreds of servers and thousands of disks without any real management tools? What if you needed to move some unused storage from server A to server B? People realized quickly that maintaining all these islands of storage was costly and also dangerous. Even if they had RAID systems, if someone didn’t notice the warnings you could lose whole systems, and the only people who knew something was up were the end users.

Simplify, Consolidate, Virtualize and Highly Available

Sound familiar? With the new age of networked storage we needed new tools and methodologies. We also gained some nifty new features. Network attached storage became much more than a huge hard drive. To me, if you are calling your storage solution a SAN it must have a few specific features.

Simplify

Your SAN solution must use standard interconnects. That means if it takes a special cable that only your vendor sells it doesn’t qualify. In this day and age, if a vendor is trying to lock you into specific interface cards and cables they are going to go the way of the dodo very quickly. Right now the two main flavors are fiber optics and copper twisted pair, a.k.a. Ethernet. It must also reduce your management overhead. This usually means a robust software suite above and beyond your normal RAID card interface.

Consolidate

It must be able to bring all your storage needs together under one management system. I’m not just talking disks. Tape drives and other storage technologies like deduplication appliances are in that category. The other benefit to consolidation is generally much better utilization of these resources. Again, this falls back to how robust the software stack is that your SAN or NAS comes with.

Virtualize

It must be able to abstract low level storage objects away from the attached servers allowing things like storage pools. This plays heavily into the ability to manage the storage that is available to a server and maintain consistency and up time. How easily can I add a new volume? Is it possible to expand a volume at the SAN level without having to take the volume off-line? Can other resources share the same volumes enabling fun things like clustering?

Highly Available

If you are moving all your eggs into one HUGE basket it better be one heck of a basket. Things like redundant controllers where one controller head can fail but the SAN stays on line without any interruption to the attached servers. Multiple paths into and out of the SAN so you can build out redundant network paths to the storage. Other aspects like SAN to SAN replication to move your data to a completely different storage network in the same room or across the country may be available for a small phenomenal add on fee.

If your SAN or NAS hardware doesn’t support these pillars then you may be dealing with something as simple as a box of disks in a server with a network card. Realize that most SANs and NAS’es are just that. Specialized computers with lots of ways to connect with them and some really kick-ass software to manage it all.

Until Next Time…

Now that we have a bit of history and a framework we will start digging deep into specific SAN and NAS implementations. Where they are strong and where they fall flat.

Speaking at PASS Summit 2012

It’s Not A Repeat

Speaking at the PASS Summit last year was one of the highlights of my career. I had a single regular session initially and picked up an additional session due to a drop in the schedule. Both talks were fun and I got some solid feedback.

The Boy Did Good

I won’t say great, there were some awesome sessions last year. I did do well enough to get an invite to submit for all the “invite only sessions”. I was stunned. I don’t have any material put together for a half day or a full day session yet and the window to submit sessions was a lot smaller this year. But I do have three new sessions and all of them could easily be extended from 75 minutes to 90 minutes. So, I submitted for both regular sessions and spotlight sessions and got one of both! WOO HOO!

The Lineup

I’ll be covering two topics near and dear to my heart.

How I Learned to Stop Worrying and Love My SAN [DBA-213-S]
Session Category: Spotlight Session (90 minutes)
Session Track: Enterprise Database Administration & Deployment

SANs and NASs have their challenges, but they also open up a whole new set of tools for disaster recovery and high availability. In this session, we’ll cover several different technologies that can make up a Storage Area Network. From Fibre Channel to iSCSI, there are similar technologies that every vendor implements. We’ll talk about the basics that apply to most SANs and strategies for setting up your storage. We’ll also cover SAN pitfalls as well as SQL Server-specific configuration optimizations that you can discuss with your storage teams. Don’t miss your chance to ask specific questions about your SAN problems.

I’ve built a career working with SAN and System Administrators. The goal of this session is to get you and your SAN Administrator speaking the same language, and to give you tools that BOTH of you can use to measure the health and performance of your IO system.

 

Integrating Solid State Storage with SQL Server [DBA-209]
Session Category: Regular Session (75 minutes)
Session Track: Enterprise Database Administration & Deployment

As solid state becomes more mainstream, there is a huge potential for performance gains in your environment. In this session, we will cover the basics of solid state storage, then look at specific designs and implementations of solid state storage from various vendors. Finally, we will look at different strategies for integrating solid state drives (SSDs) in your environment, both in new deployments and upgrades of existing systems. We will even talk about when you might want to skip SSDs and stay with traditional disk drives.

I’ve spoken quite a bit on solid state storage fundamentals. This time around I’ll be tackling how people like myself and vendors are starting to mix SSDs into the storage environment, where it makes sense and where it can be a huge and costly mistake.

Finally

I hope to see you at the Summit again this year! Always feel free to come say hi and chat a bit. Networking is as important as the sessions and you will build friendships that last a lifetime.

Building A New Storage Test Server

We’re Gonna Need A Bigger Boat

Not to sound too obvious, but I test IO systems. That means from time to time I have to refresh my environment if I want to test current hardware. Like you, I work for a living and can’t afford something like a Dell R910. Heck, I can’t afford to shell out for the stuff that Glenn Berry gets to play with these days. Yes, I work for the mighty Dell. No, they don’t give me loads of free hardware to just play with. That doesn’t mean I, or you, can’t have a solid test system that is expandable and a good platform for testing SQL Server.

The hardware choices, inexpensive doesn’t mean cheap

Well, most of the time. Realize I’m not building what I would consider a truly production ready server. Things like ECC memory and redundant power supplies are a must if you are building a “fire and forget” server to rack up. A good test server on the other hand doesn’t have the same up time requirements.

Case

A couple of years ago I would have bought something like an Aerocool Masstige. It will take a full size motherboard and has 10 5.25″ bays. This allows me to then put in something like this 3×5 5.25″ to 3.5″ mobile rack. With 10 bays I can put 15 hard drives in plus have one bay left over for something like a CD-ROM drive or another hard drive. The Aerocool Masstige does have two internal hard drive bays as well, making for a total of 18 3.5″ drives in one case. The cost does add up though. The case has been discontinued but can still be found for around 110.00. The three drive cages will run you another 100.00 each. Oh, and you need a power supply; that’s another 100.00. That brings the cost up to 510.00. Considering that a 3U Supermicro case with 15 bays will easily run you 700.00, that is not horrible for the amount of drive bays, but there are better options now.

Norco RPC-4224 4U Server Case
This thing is big, I mean really big. It is deep and tall. It was designed to be a rack mount server but sits just fine on a shelf if you have clearance in the back. I was looking at another version of this same case that houses 20 drives but the price difference just made this hard to pass up. This case isn’t a Supermicro case. It doesn’t have the build quality. To be honest though, I’m fine with that. What it does have is the ability to take a large range of ATX motherboards and a standard ATX power supply. Right now Newegg has this case on for 400.00. With a power supply that brings the total up to 500.00, still cheaper than the Supermicro and with a ton of drive bays to boot. If you have worked with servers and had to cable them up you may notice that the RPC-4224 has a very different backplane layout. Every group of four drives has its own backplane and four lane SFF-8087 connector. Usually, backplanes have a single or maybe two connectors for 8 lanes shared via an onboard SAS expander. Since this case doesn’t have that feature it actually makes it easier to build this thing for maximum speed. I can either buy a very large RAID controller with 24 SAS ports or I can buy my own SAS expanders. The only downside to the backplanes on this server is the fact they are SAS 3Gb/s and not the newer 6Gb/s ports. For spinning drives it isn’t that big of an issue, but if you are planning on stacking some SSDs in those bays it can hurt you if the SSDs support the newer protocol.
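As a rough sketch of why the 3Gb/s backplanes matter for SSDs, the arithmetic below converts SAS line rates to payload bandwidth. It assumes the 8b/10b encoding used by 3Gb/s and 6Gb/s SAS and ignores protocol overhead, so treat the results as ceilings rather than measured numbers. The function names and the 500MB/s drive rating are illustrative.

```python
def lane_mb_per_sec(line_rate_gbps):
    """Payload bandwidth of one SAS lane in MB/s, assuming 8b/10b encoding.

    Every 10 bits on the wire carry 8 bits of data, and 8 data bits
    make one byte, so a 3Gb/s lane tops out around 300MB/s.
    """
    return line_rate_gbps * 1000 * 8 / 10 / 8


def connector_mb_per_sec(line_rate_gbps, lanes=4):
    """Bandwidth of a multi-lane connector such as a four lane SFF-8087."""
    return lane_mb_per_sec(line_rate_gbps) * lanes


def effective_drive_mb_per_sec(drive_rating_mb, line_rate_gbps):
    """A drive can never go faster than the lane it is plugged into."""
    return min(drive_rating_mb, lane_mb_per_sec(line_rate_gbps))


print(lane_mb_per_sec(3))                   # 300.0
print(lane_mb_per_sec(6))                   # 600.0
print(connector_mb_per_sec(3))              # 1200.0 per four drive backplane
print(effective_drive_mb_per_sec(500, 3))   # 300.0, an SSD rated at 500MB/s is capped
print(effective_drive_mb_per_sec(500, 6))   # 500
```

A 15k spinning drive does not come close to 300MB/s, which is why the older backplane only becomes a bottleneck once SSDs go into the bays.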

The one warning I’ll make is this thing is very front heavy. Oddly enough having 24 drives stuffed in the front doesn’t make for good weight distribution.  Pro tip, don’t put the hard drives in until the server is where you want it. It is a lot easier to move the case if it isn’t as heavy as two car batteries.

CPU

Just like Glenn, I think the Core i7 2600k is a very good choice for this build. At 314.00 you are only paying a slight premium over the 2600 for a lot more flexibility, *cough*overclocking*cough*.

Motherboard

I thought long and hard on this one and settled on a GIGABYTE GA-Z68A-D3H-B3. This is a very reasonably priced motherboard at 129.00 with some nice features. First, it is based on the Intel Z68 chipset, which means I have video built into the system and don’t have to give up a PCIe slot to video. Secondly, it has USB 3.0, which makes it easy to hook up an external USB 3.0 drive and get some livable speeds. Thirdly, it has native SATA III 6Gb/s ports. Only two of the six ports run at that speed, but it does give me a few more drive options outside of an add-on RAID controller. Lastly, the PCIe slots on board are upgradeable to the new PCIe 3.0 standard. This means I don’t have to change my motherboard out to get a nice little bump in speed from newer PCIe RAID controllers or solid state cards.

Memory

Another perk of the Z68 chipset is that it will support up to 32GB of DDR3 RAM, when it becomes available that is. In the short to mid term I’ve got 16GB of Kingston HyperX 1600 DDR3 installed. That’s 115.00 in memory. I could have shaved a few dollars off but buying this as a four piece kit saves me from having to play the mix and match game with memory and hoping that it all works out.

IO System

This is where things get a little complicated. Since I need a lot of flexibility I need to have some additional hardware.

RAID Controller

I have an LSI MegaRAID 9260 6Gb/s card in the server now. At 530.00 it is a lot of card for the money. If you wanted to skip the SAS expanders and get a 24 port card you would be looking at between 1100.00 and 1500.00. What’s worse, you really won’t see a huge jump in performance. Hard disks are a real limiting factor here.

SAS Expanders

SAS expanders are a must. There will be times where I will power all 24 drives from a single RAID card that has 24 lanes. There will also be times where I have smaller controllers installed and need to aggregate those drives together across one or two connectors on a RAID controller. There are a couple of choices available to you. I opted for the Intel RES2SV240 expander over the HP 468406-B21. The Intel expander supports the SAS 6Gb/s protocol and has one additional killer feature: it doesn’t require a PCIe slot to run. It was designed to work in cases that support the MD2 form factor. That means it can be mounted on a chassis wall and fed with a standard Molex power connector. Why is this such a big deal? It means I can stack these in my case and keep my very valuable PCIe slots free for RAID controllers and SSD cards. Newegg has them at 279.00 but you can find them cheaper. The HP expander is listed at 379.00 and requires a PCIe slot for power.

Hard Drives

I opted for smaller 73GB 15,000 RPM Fujitsu drives. They aren’t the fastest drives out since they are a generation behind. What they lack in speed they make up in price. Normally, these drives new cost 150.00 a pop. But, I’m a risk taker. You can find refurbished or pulls for as little as 22 bucks a drive. Make sure you are dealing with a seller that will take returns! I personally have had pretty good luck dealing with wholesale companies that specialize in buying older servers and then reselling the parts. Almost all of them will offer at least a 30 day return. That means you need to do a little more work on your end and validate the drives during your return window. Now I have 24 15k drives for under 600.00 bucks.

I’m using a 2.5″ 7200RPM drive as my boot drive mounted inside the case.

SSD’s

You didn’t think I’d put together a new system and not have some solid state in it, did you? I’ve got a few SSDs floating around but wanted to buy the latest in consumer grade drives and see if they have upped the game any. I opted for the Corsair Force GT 60GB drive, four of them. At 125.00 each they are a solid buy for the performance you are getting. Based on the new SandForce SF2280 controller and able to deliver 85k IOps and 500MB/sec in reads and writes, they are a mighty contender. The other thing that pushed me to this drive was the fact it uses ONFI synchronous flash. I won’t hash out why it is better other than to say it produces more reliable results and is faster than its asynchronous or toggle NAND brothers.

Again, the case is so big on the inside I mounted two 1×2 3.5″ to 2.5″ drive bays to house them. That was an extra 50.00 a pop.

Let’s Recap

Case 400.00
Power supply 100.00
Motherboard 130.00
CPU 314.00
Memory 115.00
RAID HBA 530.00
SAS Expanders 558.00
24 15K drives 558.00
4 SSD’s 500.00

Grand total: 3205.00

What does this buy me? A server that can do 2GB/s in reads or writes and 160k IOps or more. I’ll let you in on another little secret, shop around! Don’t think you have to buy everything at once. Don’t be afraid to wait a week for your parts if you get free shipping. By taking a month to put this machine together I paid about 2700.00. A huge discount over the listed price getting 30% or more off some stuff like the expanders, RAID controller, SSD’s, Case and CPU.
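If you want to track your own build the same way, a few lines of Python will do it. This is only a sketch: the dictionary simply restates the recap list above, and the function names are made up for illustration.

```python
# List prices from the recap above, in dollars.
parts = {
    "Case": 400.00,
    "Power supply": 100.00,
    "Motherboard": 130.00,
    "CPU": 314.00,
    "Memory": 115.00,
    "RAID HBA": 530.00,
    "SAS Expanders": 558.00,
    "24 15K drives": 558.00,
    "4 SSDs": 500.00,
}


def total_cost(price_list):
    """Sum every line item in the build."""
    return sum(price_list.values())


def discount_pct(list_total, amount_paid):
    """Overall savings as a percentage of the list total."""
    return (list_total - amount_paid) / list_total * 100


list_total = total_cost(parts)
print(f"{list_total:.2f}")                            # 3205.00
print(f"{discount_pct(list_total, 2700.00):.1f}%")    # 15.8%
```

The overall number is smaller than the 30% on individual parts because not every line item was discounted.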

Just in case you were wondering what it looks like:

With the bonnet off (early test setup):

The SAS Backplanes cabled up:

Understanding Benchmarks

That Means What?

Vizzini: HE DIDN’T FALL? INCONCEIVABLE.
Inigo Montoya: You keep using that word. I do not think it means what you think it means.
– Princess Bride

If you are like me, you are constantly reading up on the latest hardware. Each site has its own spin on what makes up its review. All of them use some kind of synthetic benchmarking software. Some don’t rely too heavily on them because they can show the real world performance using playback tools. This method is used heavily on gaming hardware sites like [H]ard|OCP, where they decided long ago that purely synthetic benchmarks were at best inaccurate and at worst flat misleading. In the graphics card and processor space this is especially so. Fortunately, on the storage side of the house things are a little simpler.

 

 

What’s In A Workload

In the processor space measuring performance is a complicated beast. Even though every processor may be able to run the same software, they can vary wildly in how they do it. On the processor side of things I favor Geekbench right now since it uses known mathematical algorithms. John Poole is very open on how Geekbench works. Are the benchmarks relevant to database workloads? I’ll be exploring that in a future post.

In the storage space we have a pretty standard benchmarking tool in Iometer. This tool was initially developed by Intel and spread like wildfire throughout the industry. Intel quit working on it but did something very rare: turned it over to the Open Source Development Lab for continued development. You may ask why I favor Iometer over SQLIO. The answer is simple: complexity. Iometer allows me to simulate different read/write patterns in a very predictable manner. SQLIO doesn’t simulate complex patterns. It does reads or writes, random or sequential, for a fixed duration. This is fine for finding the peak performance of a specific IO size but doesn’t really tell you how your storage system might respond under varying workloads. You may notice that the only sites that use SQLIO are SQL Server sites, while the rest of the world generally uses Iometer. The problem is none of the sites that I regularly visit publish the exact Iometer settings they used to get the results they publish. Tom’s Hardware, Anandtech, Ars Technica and Storage Review all use Iometer in some fashion. After some digging and testing of like hard drives, I think most of the sites are using a mix of 67% reads and 33% writes, 100% random, at a 2KB block, which was defined by Intel and represents an OLTP workload. Storage Review did a nice writeup a decade ago on what they use for I/O patterns and Iometer. This isn’t the best fit for a purely SQL Server workload but isn’t the worst either. By moving from a 2KB block to an 8KB block we are now squarely in SQL Server I/O land.
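To picture what that access specification means, here is a small Python sketch that generates the kind of request stream described above. It only builds a list of operations in memory and never touches a disk, and it is not how Iometer itself is implemented. The function names, the 1GB test span and the 10,000 IOps figure are assumptions for illustration.

```python
import random


def generate_pattern(num_ops, block_size, read_pct, span_bytes, seed=42):
    """Build a list of (operation, offset, size) tuples.

    Offsets are chosen uniformly at random across the span and aligned
    to the block size, which models a 100% random workload.
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    blocks = span_bytes // block_size  # number of aligned slots in the span
    ops = []
    for _ in range(num_ops):
        op = "read" if rng.random() * 100 < read_pct else "write"
        offset = rng.randrange(blocks) * block_size
        ops.append((op, offset, block_size))
    return ops


def throughput_mb_per_sec(iops, block_size):
    """Convert an IOps figure to MB/s for a given block size in bytes."""
    return iops * block_size / 1_000_000


# 67% reads, 33% writes, 100% random, 8KB blocks over a 1GB span.
pattern = generate_pattern(10_000, 8192, 67, 1024 ** 3)
reads = sum(1 for op, _, _ in pattern if op == "read")
print(reads / len(pattern))  # close to 0.67

# The same hypothetical 10,000 IOps moves four times the data at 8KB.
print(throughput_mb_per_sec(10_000, 2048))   # 20.48
print(throughput_mb_per_sec(10_000, 8192))   # 81.92
```

The throughput helper is the reason block size has to be published alongside any IOps number: the same IOps figure means very different things at 2KB and 8KB.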

SQL Server Specific

Now we are starting to get to the root of the problem. None of the main hardware review sites focus on us at all. If we are lucky there will be a single column marked “Database workload”. So what do we do? You read, research and put together your own test suite. SQL Server I/O access patterns are pretty well documented. So, I put those general patterns in an Iometer configuration file and keep it in my back pocket. I have posted a revised file in the My Tools section here on the site.

For the storage stuff that is fine, but what about CPU and memory throughput? Things get a little murky here. Like Glenn Berry (blog|twitter) and me, you can use Geekbench to get a baseline on those two things, but again, this isn’t a SQL Server specific tool. In most cases sampling a workload via trace, getting a baseline on performance, then replaying that same workload on different servers will help, but it only tells you about your application. If you are looking for general benchmarks I personally wouldn’t put much stock in the old TPC-C tests anymore. They aren’t a realistic assessment of database hardware at this point. It is pretty easy to stack a ton of memory and throw a bunch of CPUs at the test to get some ridiculous numbers. I personally look at TPC-E for OLTP tests, since there is a decent sampling of SQL Server based systems, and TPC-H for data warehouse style benchmarks. As always, don’t expect the exact same numbers on your system that you see on the TPC benchmark scores. Even TPC tells you to take the numbers with a grain of salt.

My Personal Reader List

I personally follow Joe Chang (blog) for hard core processor and storage stuff. He has a keen mind for detail. I also read Glenn Berry (blog|twitter); he has some deep experience with large SQL Server deployments. Also, Paul Randal (blog|twitter), because he has more hardware at his house than I do and puts it to good use. I would advise you to always try and find out how the benchmark was performed before assuming that the numbers will fit your own environment.

What’s On My Todo List

I wrote a TPC-C style benchmark quite a while back in C#. I’m currently building up instructions for TPC-E and TPC-H using the supplied code and writing the rest myself in hopes of building up a benchmark database. This will in no way be an official TPC database or be without bias. I’m also always updating my Iometer and SQLIO tools, with full instructions on how I run my tests so you can validate them yourself.

As always if you have any suggestions or questions just post them up and I’ll do my best to answer.

Materials from SQL Server, Storage and You Part III

Thanks again to everyone who attended. Technical problems aside I had a great time and there were some great questions!

If you have a question please feel free to contact me, I’ll do my best to answer it.

Slide Deck

 

SQL Server, Storage and You Part 2!

I saw 500 faces and rocked them all!

To those who attended my webinar today THANK YOU! If you didn’t…. There is a third one coming on solid state! As always the slide deck is posted here. I’ll be adding this information and some additional details to my ongoing series on storage as well.

Pliant Technology, Enterprise Flash Drives For Your SQL Server: Part 1

Pliant Technology, New Kid On The Block

If you have been reading my storage series, and in particular my section on solid state storage, you know I have a pretty rigid standard for enterprise storage. Several months ago I contacted Pliant Technology about their Enterprise Flash Drives. It didn’t surprise me when they made the recent announcement about being acquired by SanDisk. Between Pliant’s enterprise ready technology and SanDisk’s track record at the consumer level I think they will be a new force to be reckoned with for sure. Pliant drives are already being sold by Dell and will now have much larger channel partnerships with the new acquisition. They are one of the very few vendors offering a 2.5″ or, even more rare, 3.5″ form factor using a dual port SAS interface. I have been hammering on this drive for months now. It has taken everything I can throw at it and asked for more.

Enterprise Flash Drives

Pliant sent me a Lightning LS 3.5″ 300S in a nondescript box. What surprised me is how heavy the drive is. I was expecting a featherweight drive like all the rest of the 2.5″ SSDs I’ve worked with. This drive is very well made indeed. Another thing was the fins on top of the drive, something I’m used to seeing on 15,000 RPM drives but not on something with no moving parts. It never got hot to the touch so I’m not sure if they are really needed. The bottom of the drive has all the details on a sticker.

If you look closely at the SAS connector you will see many more wires than visible pins. This is because it is a true dual port drive. If you could see the other side of the SAS connector you would see another set of little pins in the center divider for the second port.

Normally, this port is used as a redundant path to the drive so you can lose a host bus adapter and still function just fine. Technically, you could use Multi-Path IO to use both channels in a load balancing configuration. That is something I’ve never done on a traditional hard drive since you get zero benefit from the extra bandwidth. Solid state drives are a different beast though. A single drive can easily use the 300 MB/Sec available to a SAS 1.0 port. If you look at the specification sheet for this drive you will see they list read speeds of 525 MB/Sec and write speeds of 320 MB/Sec, both above the 300 MB/Sec available to a single SAS port. MPIO load balancing makes the magic happen. Since this drive was finalized before the 600 MB/Sec SAS 2.0 standard was in wide production, it only makes sense to use both ports for reads and writes. Since it doesn’t seem to be hitting more than 525 MB/Sec for reads, I don’t know how much the drive would benefit from an upgrade to SAS 2.0.
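The port math above is easy to check for yourself. This is a small Python sketch using only the numbers quoted in this post. The function names are my own, and it assumes an idealized world where MPIO balances perfectly across ports and there is no protocol overhead, which real hardware won’t quite match.

```python
# Figures quoted in the post, all in MB/sec.
SAS1_PORT_MB_S = 300    # one SAS 1.0 port
SAS2_PORT_MB_S = 600    # one SAS 2.0 port
DRIVE_READ_MB_S = 525   # spec sheet read speed
DRIVE_WRITE_MB_S = 320  # spec sheet write speed


def ports_needed(drive_mb_s, port_mb_s):
    """Smallest number of ports whose combined bandwidth covers the drive."""
    return -(-drive_mb_s // port_mb_s)  # ceiling division on integers


def usable_mb_s(drive_mb_s, port_mb_s, ports):
    """Throughput is capped by whichever is smaller: the drive or the links."""
    return min(drive_mb_s, port_mb_s * ports)


if __name__ == "__main__":
    for label, speed in (("read", DRIVE_READ_MB_S), ("write", DRIVE_WRITE_MB_S)):
        print(label, "ports needed on SAS 1.0:", ports_needed(speed, SAS1_PORT_MB_S))
        print(label, "usable on one SAS 1.0 port:", usable_mb_s(speed, SAS1_PORT_MB_S, 1))
        print(label, "usable on two SAS 1.0 ports:", usable_mb_s(speed, SAS1_PORT_MB_S, 2))
```

On a single SAS 1.0 port both reads and writes are capped by the link. With two ports the drive itself becomes the limit, which is the whole reason to bother with active/active MPIO here.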

Meet The HBA Eater

The big problem isn’t the MB/Sec throughput, it is the number of IOs this beast is capable of. Again, according to the spec sheet a single drive can generate 160,000 IO/Sec. That isn’t a typo. Even the latest and best consumer grade SSDs aren’t getting anywhere near that number. Most top out in the 35,000 range, with a few getting as high as 60,000. Lucky for us, LSI has released a new series of host bus adapters capable of coping. The SAS 9211-4i boasts four lanes of SAS 2.0 and a throughput of more than 290,000 IO/Sec. That is more than enough to test a single LS 300S.
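To put the IO numbers side by side, here is a rough Python sketch of the HBA budget. It only uses the spec sheet figures quoted above, treats "more than 290,000" as a flat 290,000, and assumes each drive could actually sustain its rated IO/Sec all at once, so take it as a back-of-the-envelope check rather than a prediction. The function names are made up for this example.

```python
# Spec sheet figures quoted in the post.
DRIVE_IOPS = 160_000  # one LS 300S
HBA_IOPS = 290_000    # SAS 9211-4i, quoted as "more than" this


def hba_utilization(drives, drive_iops=DRIVE_IOPS, hba_iops=HBA_IOPS):
    """Fraction of the HBA's rated IO/Sec consumed by drives running flat out."""
    return drives * drive_iops / hba_iops


def max_drives_at_full_rate(drive_iops=DRIVE_IOPS, hba_iops=HBA_IOPS):
    """How many drives fit under the HBA rating before the HBA is the limit."""
    return hba_iops // drive_iops


if __name__ == "__main__":
    for count in (1, 2):
        print(count, "drive(s):", f"{hba_utilization(count):.0%} of the HBA rating")
    print("drives at full rate:", max_drives_at_full_rate())
```

One drive leaves plenty of headroom. Two drives at their full rated speed would ask for more than the HBA’s quoted figure, so on paper the adapter becomes the bottleneck before the drives do.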

That answers the IO question, but we still have to deal with the dual port issue if we wish to get every ounce out of the LS 300S. I tried several different approaches to get the second port to show up in Windows as a usable active port. The drive chassis I had all claimed to support the feature, but all of them had issues. I actually bought an additional drive cage that also claimed to support dual port drives in an active/active configuration. Alas, it had issues as well. I was beginning to think there may be something wrong with the drive Pliant sent me! I finally just bought a mini-SAS cable that supported dual port drives.

As you can see, this cable is different. The two yellow wires are each a single SAS channel, and the other wires are for power. That means on my four port card I can hook up two dual port drives. Finally, Windows saw two drives and I was able to configure MPIO in an active/active configuration!

Until Next Time….

Now that we have all the hardware in place and configured we will take a look at the benchmarks and long term stress tests in the next article.

SQL Server, Storage and You

Just a note that I will start my three part webcast, SQL Server, Storage and You, next week on April 13th at 2PM CST. I’m excited to have this opportunity to speak to a much wider audience on something that I love so much. When Idera approached me last year about doing a three part series I was nervous to say the least. I’ve always taught in a live setting with students or attendees right in front of me. Luckily, this isn’t my first time doing something like this. As some of you know I was actually a mass communications/theater major in college and worked in radio. I’m having to reach back and dust off some of these skills. I am confident that it will go smoothly. Registration is free and they record the session for later viewing as well.

Register now

SQL Server, Storage and You – Part I: Storage Basics

Just like building a house, we must first lay the foundation. This presentation will take you through low level fundamentals that we will use later on as we grow your storage knowledge. We start with how data moves inside your server and how hard disks work. You will also get a primer on RAID configuration and how to mitigate drive failures and data loss. We wrap up with a file system primer and how to configure your storage with SQL Server in mind.

Thanks again to Idera and MSSQLTips!


SQLSaturday #63, Great Event!

So,

I actually had an early morning session, gave my Solid State Storage talk, and had a great time. The audience was awesome and asked very smart questions, and I didn’t run over time. The guys and gals here in Dallas have put on another great event and it isn’t even lunch time yet!

As promised, here is the slide deck from today’s session. As always, if you have any questions please drop me a line.

Solid State Storage Deep Dive