I’m in a Fibre-Channel State of Mind

This post covers an area that is close to my heart: Fibre Channel and Fibre Channel-based systems.

The Monolith becomes the Mid-Range

I’ve been working in this space for 13 years. Back in 2001 I installed a pair of EMC Clariion FC4700 arrays, a product of EMC’s then-recent acquisition of Data General.

That acquisition turned out to be an extremely far-sighted move by EMC. At that time the “high-end” monolithic arrays such as the EMC Symmetrix 8400 Series dominated the landscape, in banking in particular. Many banks deployed EMC TimeFinder software for LUN snapshots to improve what had historically been atrocious backup performance: agents dragging data across a 10/100 Mb/s LAN. That was a nightmare, and not moving the data at all is always better than letting it touch the network.

This was a major leap forward, initially driven by extensive UNIX shell scripting. Over time that led to automation of the process via storage APIs and to its integration into software such as the NetApp SnapXXX products or Commvault IntelliSnap, the latter offering heterogeneous storage support through a single engine.
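
Conceptually, those scripts all did the same thing: quiesce the application, take the array snapshot, resume the application, then run the backup from the snapshot on another host. Here is a minimal Python sketch of that flow; every function and name in it is a hypothetical placeholder for the vendor-specific call, not any real vendor API.

    import time

    def quiesce_application(app: str) -> None:
        """Hypothetical helper: put the application into a consistent, suspended state."""
        print(f"quiescing {app}")

    def create_snapshot(array: str, lun: str) -> str:
        """Hypothetical helper: ask the array for a point-in-time copy of the LUN."""
        snap_id = f"{lun}-snap-{int(time.time())}"
        print(f"array {array}: created snapshot {snap_id}")
        return snap_id

    def resume_application(app: str) -> None:
        """Hypothetical helper: release the application as soon as the snapshot exists."""
        print(f"resuming {app}")

    def backup_from_snapshot(snap_id: str, media_server: str) -> None:
        """Hypothetical helper: present the snapshot to a media server and back it up
        there, so production data never has to cross the production LAN."""
        print(f"{media_server}: backing up {snap_id}")

    def snapshot_backup(app: str, array: str, lun: str, media_server: str) -> None:
        quiesce_application(app)
        try:
            snap_id = create_snapshot(array, lun)       # seconds, not hours
        finally:
            resume_application(app)                     # application impact ends here
        backup_from_snapshot(snap_id, media_server)     # the slow part happens off-host

    snapshot_backup("erp-db", "array01", "LUN_0042", "backup-media-01")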

The likes of the Clariion pushed this technology (full redundancy and software-defined capabilities) down into what we now call the mid-range. We saw NetApp evolve from this space, bringing true multi-protocol support and an application-aware view to the table. That marked them out as different, and probably still does to this day.

We also had a sprinkling of EMC SRDF to provide synchronous replication between sites. We should not forget what a mind-blowing concept that was. Most people take it for granted now, but having a write acknowledged by the cache of a partner array, sometimes miles away, count as the acknowledgement for the application was a world away from local DAS. These leaps forward get forgotten over time.

I have previously written about how the inability to perform software RAID to enable a fully fault-tolerant dual data centre topology is a backward step. Those articles are here:

http://www.virtualizationsoftware.com/vmware-metro-storage-cluster-part-1/

and here:

http://www.virtualizationsoftware.com/vmware-metro-storage-cluster-part-2/

Now let’s look back with Anger

One of my pet hates is revisionism, sometimes driven by a specific agenda, and sometimes not.

I came across, by accident, an article by Greg Ferro over at Etherealmind.com about a perceived lack of forward thinking within the storage industry.

http://etherealmind.com/myth-fibrechannel-over-token-ring/

It’s actually quite old, but age doesn’t turn a sow’s ear into a silk purse. It’s not just this article; it is typical of how commentators’ and analysts’ views can diverge from reality.

This post is written out of a need to redress what I feel is a sometimes anti-Fibre Channel mind-set that has evolved within some parts of the IT industry.

That viewpoint can have at its core an agenda to sell alternative systems that use different protocols/architectures, but that’s not true in all cases.

The old TCO debate

One of my least favourite concepts is Total Cost of Ownership (TCO). It is difficult to put an accurate dollar figure on, and it is simply not an empirical methodology.

However let’s examine it for a moment…

I have spoken to customers lately to gauge their opinion of Fibre Channel, and I have specifically asked them about cost and TCO. I have not been surprised to hear that many of them don’t seem to be in any hurry to throw it out. After all, Fibre Channel is arguably the most reliable technology in IT today. It has been for the roughly 20 years it has existed and probably will be for the next 10-20. Why? Because an FC fabric has no moving parts; it stands to reason. The only component likely to fail is a PSU, and those are hot-swappable.

I haven’t forgotten SCSI, another “legacy” technology still used today.

While articles get written about how FC is over-engineered and too expensive, customers rubbish TCO arguments that are based purely on capital costs rather than operational costs. Cost per switch or per port is not the only metric that contributes to TCO; a rough worked comparison follows the question list below.

For a truer picture, ask your customers these questions:

  • How many times has an FC HBA failed?
  • How many times has an FC switch failed?
  • How many times has an FC front-end storage array port failed?
  • How many times have you logged a support call in relation to FC?
  • How many times have you had to rezone a server to a storage array, or re-mask LUNs, because of FC issues?
  • Do you need to worry about concepts like Spanning Tree Protocol in FC?
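
To make that concrete, here is a back-of-the-envelope comparison in Python. Every figure in it is invented purely for illustration; the point is only that acquisition cost per port is one input among several, and that incident and administration costs over the life of the estate can dwarf it.

    # Rough sketch: 5-year TCO = capital cost + (incidents + admin effort) over time.
    # All numbers below are made-up placeholders; plug in your own.

    def five_year_tco(capex_per_port: float,
                      ports: int,
                      incidents_per_year: float,
                      cost_per_incident: float,
                      admin_hours_per_year: float,
                      hourly_rate: float,
                      years: int = 5) -> float:
        capex = capex_per_port * ports
        opex = years * (incidents_per_year * cost_per_incident
                        + admin_hours_per_year * hourly_rate)
        return capex + opex

    # Illustrative only: the pricier-per-port fabric generates fewer incidents.
    fc = five_year_tco(capex_per_port=800, ports=200,
                       incidents_per_year=1, cost_per_incident=5_000,
                       admin_hours_per_year=100, hourly_rate=75)
    eth = five_year_tco(capex_per_port=300, ports=200,
                        incidents_per_year=6, cost_per_incident=5_000,
                        admin_hours_per_year=250, hourly_rate=75)
    print(f"FC 5-year TCO:       ${fc:,.0f}")      # $222,500 with these inputs
    print(f"Ethernet 5-year TCO: ${eth:,.0f}")     # $303,750 with these inputs

With those (entirely invented) inputs the cheaper port ends up the dearer network, which is exactly the point customers keep making.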

What about FCoE?

FCoE is intended to reduce the need for separate networks, which is of course a good idea.

But don’t forget that even VMware’s guidance is to use a separate physical Ethernet network (or at least separate physical switches) to create a distinct fault domain if you use IP-based storage such as NFS or iSCSI. This is doubly true with vSphere 5, where separate fault domains are needed for datastore heartbeating, which has major cluster design implications in any case. So you end up with two networks either way.

Of the converged/FCoE implementations I was involved in during 2013, all three were built on Cisco UCS/Nexus, and all three required a protocol conversion back to native FC at the array edge.

Why? (For the record, I performed elements of the delivery, not the design.)

FCoE is (mainly) Cisco’s attempt to create a new technology, part of its so-called Data Center 3.0 strategy, and it has not caught on yet. I evaluated and championed FCoE and have been personally disappointed by its lack of adoption.

Scalability is key?

Finally, I have heard the argument that FC doesn’t scale past 1000 ports. Let’s just think about that for a second. 1000 ports could be:

  • 400 server ports
  • 400 switch ports
  • 100 ISLs (let’s be generous)
  • 100 Array ports (let’s be generous)

On that compute platform you could consolidate 50 VMs onto each server, which equates to 10,000 VMs. How many customers have 10,000 VMs?
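
Working that arithmetic through in a few lines of Python (and assuming dual-pathed servers, i.e. two fabric ports per host, which is what makes the numbers line up):

    # The 1,000-port budget from the list above.
    total_ports  = 1000
    server_ports = 400
    switch_ports = 400   # fabric-side ports facing those servers
    isl_ports    = 100
    array_ports  = 100
    assert server_ports + switch_ports + isl_ports + array_ports == total_ports

    ports_per_server = 2                           # assumption: dual-port HBA per host
    servers = server_ports // ports_per_server     # 200 hosts
    vms_per_server = 50
    print(servers * vms_per_server)                # 10,000 VMs behind a 1,000-port SAN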

Not many, I would suggest.

So that is a half-baked, corner-case argument that does not apply to 95% of customer sites.

The case for iSCSI

I am a big fan of Nimble Storage, and whenever I can I extol its virtues; there’s a lot to like about it. I recently did so to a very friendly contact at a large storage customer, someone who comes from a sales background covering all types of storage, though he had previously sold mainly iSCSI-based systems.

He said to me “I’m more a Fibre Channel kind of guy now”. This sums up a certain mind-set that is held by many people.

The real question here is: why do customers not want to use their native Ethernet networks as a transport for disk I/O?

I’m not answering THAT question.

 


2 thoughts on “I’m in a Fibre-Channel State of Mind”

  1. “Finally, I have heard the argument that FC doesn’t scale past 1000 ports. Let’s just think about that for a second.”

    I can’t. I’m still trying to not cry with laughter and wake up my sleeping girl.
    1000 ports?
    FC doesn’t Scale in *any* aspect?

    Whoever says that immediately qualifies as a complete idiot. SANs of more than 5000 ports have been pretty standard in real businesses, and scaling has been quite inherent since the days of FC loops ended. Both of which have been true for more than 10 years.

    I reckon anyone claiming otherwise most probably thinks so because they are just the kind of person who will *not* be asked to work in a critical environment.

    When I talked to medium-sized monitoring customers *after* my unix/storage days, the story of “yeah, we used to have iSCSI but it blew up once under load and we switched back to FC” came up often. The alternative is to invest in a dedicated dual 10GbE backbone, and still have more issues than plain FC.
    The larger enterprises never fell for the hoax.

    iSCSI is fun, helpful and flexible, but like FCoE without all the DCE tidbits it’s useless for production load. And with them, FC becomes a bargain.
    Not to mention that the basic choice is:
    have a good admin (who won’t find FC complicated) and know stuff will always be fine, or
    don’t have one and let it go to hell, whatever the tech.

    1. Hi Florian,

      Thanks for your comments. I’m with you on this. When I had an interaction with Greg Ferro on Twitter, it was Greg who claimed that “FC doesn’t scale past 1000 ports”. He also suggested that the storage industry has been negligent in advancing its technology. That’s why I wanted to point out where we came from when I started.

      I would agree with you that FC is performant, robust, reliable and bulletproof technology that scales and provides a long service life to customers. I still see 2/4 Gb/s HBAs in full service, working away quite happily. Also, the ability of vendors like EMC/HDS to carry out an RPQ (request for qualification) to support a site-specific configuration is a concept that I think has not trickled into any other area of IT. It provides great comfort in most situations and is common with FC-based solutions in enterprise clients.

      I would also agree in relation to iSCSI/FCoE versus FC. FC was designed for great performance but is very simple to manage, and highly secure, as it doesn’t typically overlap with user/consumer networks. And if you still need separate networks anyway, does the cost differential justify it? I have worked on Cisco UCS-based solutions with Nexus, and the tech is very impressive, but in my experience it required CCNP-plus-level knowledge to understand how it connects to the corporate LAN. Management complexity should always be balanced against performance, in my view.

      I also personally think that FC/storage/data management requires a different mindset than network engineering. We are concerned with consistency, in-order delivery of data, split-brain and other such concepts. TCP/IP networks are designed to tolerate dropped packets, which is not something I’m sure any SQL DBA would be happy about if he had to choose a solution.

      Best,
      Paul
