Mike Workman
 

May 09, 2008

QoS Contention, Or Sharing of High Performance LUNs and Filesystems

OK, so I will assume you accept that the old model of no prioritization of applications using storage in a storage pool is dumb. I will assume you’ve been reading my blog a bit and you now realize that there is a reason why networking instituted QoS, and why airlines have first, business, and economy classes of service. Right? If not, start here. If you don’t want to follow another stinking link, then let me summarize: not all passengers, packets, LUNs or Filesystems are equal, and this inequality allows the Axiom to extract more value for the customer out of the disk drives using QoS for Storage.

The skeptical folks out there try to shoot holes in the concept, which is a reasonable thing to do. Let’s pick a favorite: how is the Axiom any better than another system without QoS if, for example, you are sharing two high priority (call it Premium) LUNs?

To first order, it is not! If all you have is people that want first class tickets, there is no gain in differentiation, because you have defined a problem that allows none. That is simple! So if every Filesystem or LUN is equal, an old style system is just about as good as you can get anyway. Make sense? Well, with Axiom, the data placement on the platter and purposeful automatic striping across different spindles will give you a better and more automatic result than most of our competitors. On the other hand, the differentiation under those conditions is obtainable on any system as long as the data is laid out properly on the disk (manually if you aren’t using an Axiom), and you have enough cache.

OK, so let’s say we don’t define a silly example of two equal, high priority LUNs forced to share spindles. Let’s say we have 10 LUNs, and two are high priority relative to the rest. Let’s also assume that the Axiom is forced to share the spindles that they all reside on (it won’t share them unless it has to). Is this different? How can this be better than a system without QoS since there is sharing going on?

The answer is simple and clear. To lay it out in full would be complex, but an example should make it abundantly obvious: Let’s say you are about to walk up to the ticket counter at an airline. Would you rather be in the first class line, or the main cabin line? Well, assuming a normal distribution of passengers, the answer is obviously first class. We all intuitively know that although you may be sharing the agent(s) with two or three other passengers, you aren’t sharing with 100 other passengers. The same applies to Axiom’s QoS. The difference is real, and worth money.
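For those who prefer numbers to ticket counters, here is a toy sketch of the same idea. It assumes a strict-priority queue in front of a shared set of spindles that serves one I/O per tick. That is an assumption made for illustration, not a description of how the Axiom actually schedules I/O, and the workload (10 LUNs, 2 premium, 3 I/Os each) is hypothetical.

```python
def mean_wait(requests, prioritized):
    """Mean ticks each class waits when one I/O is served per tick.

    requests: list of (lun_name, is_premium) in arrival order.
    prioritized: if True, serve all premium I/Os before standard ones
    (strict priority, an assumption of this sketch).
    """
    order = list(requests)
    if prioritized:
        # Stable sort: premium first, arrival order kept within each class.
        order.sort(key=lambda r: not r[1])
    waits = {"premium": [], "standard": []}
    for tick, (_, is_premium) in enumerate(order):
        waits["premium" if is_premium else "standard"].append(tick)
    return {cls: sum(t) / len(t) for cls, t in waits.items()}


# Hypothetical workload: 10 LUNs, 2 of them premium, 3 I/Os each,
# arriving interleaved on the same shared spindles.
luns = [("lun%d" % i, i < 2) for i in range(10)]
requests = [lun for _ in range(3) for lun in luns]

fifo = mean_wait(requests, prioritized=False)
qos = mean_wait(requests, prioritized=True)
print("no QoS:", fifo)
print("QoS:   ", qos)
```

In this toy run the premium LUNs’ mean wait drops from 10.5 ticks to 2.5, while the standard LUNs’ mean wait rises only from 15.5 to 17.5. The first class line is short because few passengers are in it.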

Look, this is not some mumbo jumbo cloud of smoke. Spindles and cache equal IOPS. If you can differentiate the application requirements by IO, capacity, and workload, a storage pool can meet a much wider range of simultaneous application requirements using QoS than without it for a given quantity of spindles. This means that in a world of ever increasing disk drive sizes, Axiom is the architecture that extracts the maximum value out of the storage technology you can afford. Period.

And this, my friends, is worth money - to you.

May 01, 2008

EMC has a Multiprotocol Array? Good one.

Our goal is to make your life simpler. In my opinion, we put the concept of true NAS/SAN integration on the map. I think NetApps* beat us to the function, but their implementation was poor. Of course, we had a great SAN product, but 2.5 years ago we had one customer, so nobody really took us seriously either. So I will give them credit for being the first, but I think Pillar deserves credit for having a real SAN. Being fair, NetApp has improved their SAN implementation quite a bit in the last couple of years.

One criticism I have heard from a BIG independent storage vendor is that we build a utility platform that competes against NS, CX and low-end DMX products. Duh? That is the whole point! The critics say we “have to” support SAN/NAS together. I say we do support SAN/NAS together … and the competition doesn’t. But I am speaking of facts, albeit with a bias, so I will quote Brian Garrett of ESG, taken from the March cover story in Storage Magazine by Jacob Gsoedl.

"With Symmetrix, Clariion, Celerra and Centera, EMC has four different solutions, each with its own code base and architecture, and it would make sense for EMC to head to a unified solution," says Brian Garrett, technical director, ESG Lab at Enterprise Strategy Group.

Why buy several platforms to serve all your requirements when one will do? Really! Stove-piping just jacks up the TCO to the benefit of the storage vendor, not the customer.

The article in Storage Magazine goes on to say all kinds of nice stuff about Pillar – and frankly we deserve it, but if I say any more you guys will think the marketing team at Pillar wrote this instead of me. For those of you who are suspicious, check out the footnote below... you can be sure they didn’t write that one.

OK – I will put one good one in for the Marketing Team – here she goes:

The same story describes the Pillar Axiom in the following way: “Pillar has supported FC, iSCSI, NFS and CIFS in its Axiom arrays from day one. A scalable architecture, offloading of file-system protocol and RAID processing to so-called Slammers, and cluster support make the Axiom array family a great fit for SMBs and enterprises. Tight integration with Oracle tools like Oracle Enterprise Manager makes Axiom arrays a perfect fit in Oracle environments.”

-Mike

* I know it is NetApp now, but for crying out loud could they be more sensitive about their name? Who cares about their stinking name? NTAP, Network Appliance, NetApp, they’re all just fine. What nobody can understand is why they changed their logo to one that looks familiar if you’ve been to Stonehenge. Perhaps they have a Druid on board over there? If they ever commission a sculpture for a stage prop with this logo – they better make sure they have their feet and inches straight or they could end up imitating a scene from Spinal Tap.

April 23, 2008

Who cares about Performance under Fault Conditions?

Well, you do if you own or operate a storage array. Storage systems have lots of components, including mechanical ones like disk drives. The whole point of RAID is to deal with failures of parts. Moving parts fail most often. Most systems today (not all) have redundancy built in throughout the entire system. In an Axiom, all those redundant parts do work for the array all the time because it is an active-active architecture. Some arrays have active-passive architectures that waste components, which just sit around waiting for a failure.

So, what happens when a component fails? Well, in HA systems like Axiom, Clariion, and NetApp products, customers still have access to their data. Paramount in all but the most trivial storage applications is being able to get at your data regardless of any single failure.

What goes mostly unspoken is the effect of a failure on performance. Systems from NetApp and EMC can take a long time under load to rebuild their failed disks onto a spare drive. In fact, it can take more than a day! This matters because while the rebuild is in progress, the array is running without protection against another failure. The odds of a second failure are small, but get proportionately larger as the rebuild times grow longer.
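To put a rough number on that exposure, here is a minimal sketch. It assumes drive failures are independent and exponentially distributed, which is a textbook simplification, and the MTBF figure, drive count, and rebuild times are hypothetical values chosen only for illustration.

```python
import math


def second_failure_probability(remaining_drives, rebuild_hours, mtbf_hours):
    """Chance that at least one more drive fails before the rebuild ends.

    Assumes independent, exponentially distributed failures
    (a simplification; real drives do not behave this neatly).
    """
    combined_rate = remaining_drives / mtbf_hours  # failures per hour
    # 1 - exp(-rate * t), computed with expm1 for accuracy at small values.
    return -math.expm1(-combined_rate * rebuild_hours)


# Hypothetical numbers: 11 surviving drives, 1,000,000-hour MTBF.
p_fast = second_failure_probability(11, rebuild_hours=4, mtbf_hours=1_000_000)
p_slow = second_failure_probability(11, rebuild_hours=24, mtbf_hours=1_000_000)
print("4-hour rebuild: ", p_fast)
print("24-hour rebuild:", p_slow)
print("exposure ratio: ", p_slow / p_fast)
```

For small probabilities like these, the risk scales almost linearly with rebuild time: a rebuild that takes six times longer leaves you exposed to roughly six times the risk.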

To solve this problem, some storage manufacturers put yet another redundant disk drive into their arrays. So you pay for more unusable capacity and power to protect yourself against the vendor’s long rebuild times. This is a great technique against loss of data, but wasteful and expensive. In contrast, Pillar’s Axiom drastically reduces the drive rebuild time by using a distributed hardware RAID architecture. Distributed RAID gives the following clear, demonstrable benefits that have been measured by outside laboratories against our competitors:

  1. We rebuild faster than any array on the market.
  2. We perform better under faulted conditions by a HUGE margin, factors of 2-3.
  3. We perform under all faulted conditions with minor loss of performance, on the order of 0-8%, versus 50% loss of performance from some vendors.

While most everyone guarantees continuous access to data under fault, they really don’t want to talk about the systems’ performance under those fault conditions. Why does this matter? Well, backup window integrity, customer perceived performance, boot times, the list goes on and on. They all depend on predictable, reasonable performance of the system, not 3 to 1 variations in performance under fault.
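Take the backup window as an example. The sketch below assumes backup time scales inversely with array throughput, which is a simplification. The 8% and 50% figures are the performance losses quoted above, while the 8 TB data set and the 2 TB/hour healthy rate are hypothetical.

```python
def backup_hours(data_tb, healthy_tb_per_hour, perf_loss_fraction):
    """Hours to back up data_tb when throughput drops by perf_loss_fraction.

    Assumes backup time is simply data divided by degraded throughput.
    """
    degraded_rate = healthy_tb_per_hour * (1.0 - perf_loss_fraction)
    return data_tb / degraded_rate


# Hypothetical job: 8 TB at 2 TB/hour when the array is healthy.
for label, loss in [("healthy", 0.0), ("8% loss", 0.08), ("50% loss", 0.50)]:
    print(label, round(backup_hours(8, 2, loss), 2), "hours")
```

Under these assumptions, a four hour backup stretches to about four hours and twenty minutes with an 8% loss, but to eight hours with a 50% loss. One of those fits in the window you planned for. The other does not.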

If you want great performance under fault conditions of any type, buy the Axiom. Your mileage may vary, but it will vary a hell of a lot less with the Axiom than with our competitors.

April 18, 2008

Axiom 300 for up to 20TB

Contrary to the belief of some of our competitors, Axiom 300 is a full-fledged product. The software, look and feel, ease of use, and guided maintenance our customers love are the same as on the Axiom 500. Besides being Application-Aware, the Axiom 300 builds many features and functions that most competitors reserve for the Enterprise into a great product for smaller applications.

The difference between the AX300 and the AX500 lies in the Slammer hardware; unique motherboards reduce power, cache, and back-end connectivity to appropriate levels for smaller applications requiring fewer spindles (up to 54). There is not a lot of use in scaling to 832 spindles for smaller applications, so powering up all kinds of infrastructure to allow such scaling would be wasteful, and more costly than necessary.

While the Axiom 500 has unprecedented scalability in the marketplace, making it smaller did require a modified set of CPU boards for the Slammer. The great news for our customers who care to trade up is that they can, and with data in place. So in the event that their business scales much larger than they dreamed, they are protected!

Typically, as storage vendors shrink their product offerings for smaller budgets, many features and functions are dropped. In fact, at the bottom end of the scale, “storage products” end up being a tray of disk drives without much software at all. While this is affordable (cheap), it also means the customer will spend lots of time managing their storage manually. With the Axiom 300 software, customers have lots more help than they get with a RAID box. Try to get a RAID box to “Call Home”... or feature 6GB of cache memory.

April 09, 2008

QoS – The Even Simpler View!

We can all remember 36GB FC 15K RPM drives, and 160GB SATA desktop drives. I say remember because you can’t really even buy these anymore, unless you shop on eBay. Today, we talk of 300GB and 1TB capacities instead.

The HDD Industry is spectacular in my opinion. Much of what the IT industry has been able to accomplish owes itself to our ability to store massive amounts of stuff very economically. I am biased here as I spent 20 years of my life working in the HDD industry, and I loved it.

HDD performance has not kept pace with areal density increases. We all know this one. Performance in an HDD means motors and things that move, and we all know that those things will not keep pace with solid state technology advances.

So what can you do about it? Buy the drives you can buy, convince yourself that you can use the capacity, and then forgo the capacity because you need IOs. This means underutilization – don’t even think you are going to fill the disk drives to their capacity.

Unless you buy Pillar Axiom and use networked storage differently than you have for the last 10 years. Pillar allows you to extract both the capacity and the IOs from a set of spindles as long as you share those spindles amongst applications that have varying requirements for speed and capacity.
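The arithmetic behind this is easy to sketch. The example below assumes a drive count is set by whichever is larger, the capacity requirement or the IO requirement, and it ignores RAID overhead, cache, and the QoS machinery needed to keep the slow application out of the fast one’s way. The two applications and the 150 IOPS per drive figure are hypothetical; the 300GB drive size is the one mentioned above.

```python
import math

DRIVE_GB = 300     # drive capacity, from the sizes discussed above
DRIVE_IOPS = 150   # hypothetical per-drive IOPS, for illustration only


def drives_needed(capacity_gb, iops):
    """Drives required to satisfy both the capacity and the IOPS demand."""
    by_capacity = math.ceil(capacity_gb / DRIVE_GB)
    by_iops = math.ceil(iops / DRIVE_IOPS)
    return max(by_capacity, by_iops)


# Hypothetical applications: (capacity in GB, IOPS).
database = (500, 3000)   # small but IO-hungry
archive = (5000, 200)    # large but mostly idle

dedicated = drives_needed(*database) + drives_needed(*archive)
pooled = drives_needed(database[0] + archive[0], database[1] + archive[1])

total_gb = database[0] + archive[0]
print("dedicated drives:", dedicated,
      "utilization: %.0f%%" % (100 * total_gb / (dedicated * DRIVE_GB)))
print("pooled drives:   ", pooled,
      "utilization: %.0f%%" % (100 * total_gb / (pooled * DRIVE_GB)))
```

With these made-up numbers, dedicated spindles come to 37 drives at roughly half utilization, while a shared pool needs 22 drives at a little over 80%. The database brings the IO demand, the archive brings the capacity demand, and the pool serves both.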

If you want to use Axiom to do what our competitors do, you can. In fact it is extremely competitive. But with Axiom you can, and in my opinion should, use storage differently than you are used to: define a storage pool, tier your applications, and point them at the pool for their storage.

The result? Higher utilization, full use of the disk you paid for, less stove-piping of platforms, lower cost. For all of us, this represents some important relief from the ever increasing HDD capacity march that heightens the disparity between that capacity and the performance with which it can be delivered.