Tag Archives: Microsoft Exchange 2010

How to design a Scale Out NAS Architecture

Scale-out architecture and why it is important when architecting a storage solution.

I had an interesting discussion with an architectural firm the other day.  Most of the discussion was around scaling for the future.  In our discussion we talked about the linear scalability of the ISE technology and he pointed out that while that made a ton of sense for his block-access requirements he was a little concerned around the unstructured data, as well as some plans utilizing NFS for some of his server and desktop virtualization needs.  The last thing he wanted to worry about was changing his architecture in 12 to 24 months due to growth or technology changes.  So we started working on architecting a solution utilizing our new “scale-out” ISE-NAS solution.

You’ve probably heard a lot about scale-out type architectures. 3PAR sort of led the way with their ability to scale out (at least to eight) their storage controllers to their fixed-backend backplane-attached disk drives and it offers up a pretty unique solution (at least in a block storage architecture).  3PARs problem is they don’t really have an answer for the same scalability around unstructured data (NAS).  Don’t get me wrong, they list 5 NAS companies on their website, 1 is out of business and the other 4 have either been acquired by their competitors or is a straight up competitor.  This scale out architecture seems to have caught on in the emerging NAS Gateway devices like Symantec FileStore and Isilon.  Clearly both FileStore and Isilon are very different on the scale-out architecture.  More below.

 

So first things first, let’s describe what a “scale-out” architecture means, at least to me that is.  When architecting solutions, it’s always important to put a solution together that can grow with the business.  In other words, they know what they need today, and they have an idea what they might need in 12 months, but 24 – 48 is a complete crap shoot.  They could be 5X the size, or just 2X the size but the architecture needs to be in place to support either direction. What is sometimes not discussed is what happens when you run out of either front-side processing power, backend IOPS or usable capacity?  Most storage solutions give you 1 to 2(ea) clustered controllers, and a fixed number of disk-drives they can scale to dependent on the specific controller you purchase.  From a front-end NAS solution most of them only scale to 2 nodes as well.  If you need more processing power,  more backend IOPS or capacity, you buy a second storage solution or you spend money to upgrade storage controllers that are not even remotely close to being amortized off the CFO’s books.  If you look at the drawing above, you can clearly see what scale-out architecture should look like.  You need more front-side processing, no problem.  You need more backend IOPS or Capacity, no problem.  They scale independently of each other.  There is no longer the case of “You love your first <insert storage/NAS solution of choice> and you hate your third, fourth etc etc. Isilon is probably a great example of that.  They tout their “scale-out” architecture but it clearly has some caveats.  For example, If you need more processing power, buy another Isilon, you need more capacity buy another Isilon, you need more backside IOPS…well you get the idea :)  It’s not a very efficient “scale-out” architecture.  It’s closer to a Scale up !!

Let’s also not loose site on the fact that this is a solution that will need to be in place for about 4 to 5 years, or the amount of time in which your company will amortize it.   The last thing you want to have to worry about is a controller upgrade, or net-new purchase because you didn’t size correctly or you under/over guessed your growth or even worse, years 4 and 5 hardware maintenance.  This is especially true if the vendor “end of life’d” their product before it was written off the books !!!  Cha-CHING.

 

So this company I was working with fluctuates with employees depending on what jobs they are working on.  It could go from 50 people to 500 people in a moment’s notice and while they would LOVE to size for 500, most of the time they were around 50 to 100.  So as I mentioned above, we started architecting a solution that incorporated our ISE-NAS solution based on Symantec’s FileStore product. When coupled with our Emprise 5000 (ISE) gives them the perfect scale-out solution.  They can start with 2-nodes and grow to 16 by simply adding NAS engines (x86) to the front end.  If they need more capacity, or backend IOPS, we can scale any direction independent of the rest of the solution.  Coupled with our predictable performance we gave them the ultimate ability to size for today, and know exactly what they can scale to in the future.

In the world of “Unified Storage”, cloud computing and 3 to 5 year project plans, its important to consider architecture when designing a solution to plan for the future.  Scale-Out architecture just makes a lot of sense.  BUT – do your homework.  Just because they say “scale-out” doesn’t really mean they are the same.  Dual-Clustered controllers – or even eight-way – will eventually become the bottle neck and the last thing you want to worry about is having to do a wholesale swap-out/upgrade of your controller nodes to remove the bottleneck or worse, have to buy a second (or third) storage solution to manage!!

@StorageTexan

10,000 Exchange Users in 3U of space

 

This is a pretty cool video done by the Technical Marketing team at Xiotech.  10,000 Exchange users in 3U of space!!   No fancy/expensive SSD needed for this !!

In summary:

250 VDI instances or

10,000 Exchange Users or

750 DVD Quality Video’s or

25,000 MP3’s

In 3U of space.  Now that’s WICKED FAST !!!

 As they say in Minny – “that’s not to shabby !!”

By the way, if by chance 10,000 is just not enough users for you.  Don’t worry, add a second ISE and DOUBLE IT TO 20,000.  Need 30,000, then add a THIRD ISE.  100,000 users in 10 ISE or 30U of RackSpace.  Sniff Sniff….I love it !!!!!!!!!!!!

By the way – Check out what others are doing:

Pillar Data = 8,500 Exchange Users with 24GB of Cache !!!  I should say, our ISE comes with 1GB.  It’s not the size that counts, it’s HOW YOU USE IT !! :)

One Pillar Axiom 600 with one FC Slammer
24GB of cache <—-  WOW !!!!!
4 SSD Bricks for databases. Each with:
Two dedicated RAID controllers
13 50GB SSDs <— I’m going to guess that these aren’t very cheap.
2 SATA bricks for Logs. Each with:
13 500GB 7,200 RPM SATA disk drives

Hitachi AMS 2300 = 10,800 users – 400+ Pages PLUS !!! <– I have to say it again, WOW 400+ pages on this bad boy !!!

240 300GB 15K RPM SAS disks, <— Ahh ya  – TONZ of spindles !!!  We had 20(ea) 3.5″ Drives to do our testing. 
16GB of cache and
8(ea) 4Gb/s Fibre Channel paths was used for these tests.
Testing used 8 Sun Fire 4600 M2 servers with 32GB of RAM,
four dual-core AMD Opteron CPUs,
8(ea) Emulex 4Gbit/s Fibre Channel adapters and
Windows Server 2003 R2 Enterprise x64 with Service Pack 2.

The old pain in the DAS

The old pain in the DAS !! 

Is it me, or are others in the field seeing more and more companies being pushed to look at DAS solutions for their application environments?  It strikes me as pretty interesting.  I’m sure mostly it’s positioned around reducing the overall cost of the solution, but I also think it has a lot to do with assuring predictability around performance that you can expect when not having to fight for IOPS among other applications.  In fact, Devin Ganger did a great blog post around this very subject in regards to Exchange 2010.  It was a pretty cool read. I left a comment on his site, it pretty much matches (some may call it plagiarizing myself :) ) my discussion here but I’ve had some time to expand a little more.  As such, here are my  thoughts on my own site :)

Let’s take a look back 10 years or so.  DAS solutions at the time signified low cost, low-to moderate performance and basic RAID-enabled reliability, administration time overhead, downtime for any required modification, as well as an inability to scale.  Pick any 5 of those 6 and I think most of us can agree on it.  Not to mention the DAS solution was missing key features that the applications vendors didn’t include like full block copies, replication, deduplication, etc.  Back then we spent a lot of time educating people on the benefits of SAN over DAS.  We touted the ability to put all your storage eggs in one highly reliable, networked, very redundant, easy-to-replicate-to-a-DR site-solution, which could be shared amongst many servers, to gain utilization efficiency.  We talked about cool features like “Boot from SAN” as well as full block Snapshots, replicating your data around the world and the only real way to do that was via a Storage array with those features. 

Fast forward to today and the storage array controllers are not only doing RAID and Cache protection (which is super important), they also doing thin provisioning, CDP, replication, dedupe (in some cases), snapshots (full copy and COW or ROW), multi-tier scheduled migration, CIFS, NFS, FC, iSCSI, FCoE, etc etc.  It’s getting to the point that performance predictability is pretty much going away not to mention, it takes a spreadsheet to understand the licensing that goes along with these features.  Reliability of the code, and mixing of different technologies (1GB, 2GB, 4GB, FC Drive bays, SAS connections, SATA connections, JBODs, SBODs, loops) as well as all the various “plumbing” connectivity options most arrays offer today is not making it any more stable.  2TB drive rebuild times are a great example of adding even more for controllers to handle.  Not to mention, the fundamental building block of a Storage array is data protection.  Rob Peglar over at the Xiotech blog did a really great job of describing “Controller Feature Creep”.  If you haven’t read it, you should.  Also, David Black over at “The Black Liszt” discussed “Mainframes and Storage Blades” he’s got a really cool “back to the future” discussion. 

Today it appears that application and OS/hypervisor vendors have caught up with all the issues we positioned against just years ago.  Exchange 2010 is a great example of this.  VSphere is another one.  Many application vendors now have native deduplication and compression built into their file systems, COW snapshots are a no-brainer, and replication can be done natively by the app which gives some really unique DR capabilities.  Not to mention, some applications support the ability to migrate data from Tier 1, to Tier 2 and Tier 3 based not only on a single “last touched” attribute, but also on file attributes like content (.PDF, .MP3), importance, duration, deletion policy and everything else without caring about the backend storage or what brand it is.  We are seeing major database vendors support controlling all aspects of the volumes on which logs, tablespaces, redo/undo, temporary space, etc. are held.  Just carve up 10TB’s and assign it to the application and it will take care of thin provisioning and all sorts of other ‘cool’ features.   

At the end of the day the “pain in the DAS” that we knew and loved to compete against is being replaced with “Intelligent DAS” and application aware storage capabilities.  All this gives the end user a unique ability to make some pretty interesting choices.  They can continue down the path with the typical storage array controller route, or they can identify opportunities that leverage native abilities in the application and “Intelligent DAS” solutions on the market today to vastly lower their total cost of ownership.   The question the end user needs to ask is, ‘What functionality is already included in the application/operating system I’m running?’ vs. ‘What do I need my storage  system to provide because my application doesn’t have this feature?’  At the end of the day, it’s Win-Win for the consumer, as well as a really cool place to be in the industry.  Like I’ve said in my “Cool things Commvault is doing with REST”, when you couple Intelligent DAS and application aware storage with a RESTful open standards interface, it really starts to open up some cool things. 2010 is going to be an exciting year for Storage.  Commvault has already started this parade so now its all about “who’s next”. 

 @StorageTexan <– Click on me to follow on Twitter. 

 PS – I’ve added an e-mail subscription capability to my site (as well as RSS feeds).  In the upper right corner of this site you will see a button to sign up.  Each time I post a new blog, you will get an e-mail of it.  Also, you will need to confirm the subscription via the e-mail you receive.