I’m currently in Santa Clara, CA, at the Next Generation Ethernet Data Center Technology Exploration Forum (TEF), hosted by the Ethernet Alliance. There has been a lot of attention on data centers over the last few years, with most of the emphasis on power and cooling, though more recently protocol convergence has increasingly become the topic of conversation. Growth trends continue to be worrying, driving the need for strategic planning. Building the mega data centers of today has been a “learn as you grow” process, with little time for planning ahead. TEFs are a great way of pausing to reflect on the subject matter. Today we are openly discussing where we are on the growth curve, what has gone well and not so well, surprise lessons learned, and how to meet future demand. Sessions include Higher Speed Ethernet, IO Virtualization, Unprecedented Growth, Low Latency Ethernet, Cloud Computing, Convergence, and End Users.
Data center intra- and interconnect continues to be a multiple-protocol affair. InfiniBand is used for high performance computing (HPC), Fibre Channel for storage area networking (SAN), and Ethernet for transport and OAM. InfiniBand provides not only the low latency needed for HPC applications, but also the direct processor-to-processor, shared-memory protocol needed for distributed and parallel computing architectures. Fibre Channel-based SANs provide not only information backup, but also rapid recovery from failures, as well as buffer credits for long-range transport of data. Ethernet provides the much-needed addressing, data routing, and Internet Protocol connectivity, but has struggled to offer much more. The very features that have been IP/Ethernet’s strength, enabling the internet generation, have also been its Achilles’ heel, slowing its perhaps imperialistic goal of conquering all the continents of the information age.
All modern IP networks are based upon the Open Systems Interconnection (OSI) Reference Model, a seven-layer protocol stack, and over the years each of these layers has had its own revolution. Wavelength Division Multiplexing (WDM) increased physical Layer 1 capacity by two to three orders of magnitude. System-on-Chip (SoC) silicon ASICs simultaneously increased Layer 2 switching capacity while bringing down the cost per switched frame. Ternary Content Addressable Memory (TCAM) and other high-speed memories formed the basis of today’s monster core IP routers at Layer 3. Yet Layer 4 is still awaiting its Eureka! moment. Faced with upcoming 100GE, terabit router/switches, and exabit networks, Layer 4 has not kept pace, in what has become known as the TCP (Transmission Control Protocol) bottleneck. And, unfortunately, most of the future promise of cloud services relies on the performance of Layer 4.
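The scale of that bottleneck is easy to see with a back-of-the-envelope calculation. The sketch below is my own illustration rather than anything presented at the forum; it uses the classic rule that a single TCP connection can move at most one window of data per round trip, and the 64 KB window and 10 ms round-trip time are arbitrary example values.

```python
# Rough sketch of the classic single-stream TCP window limit:
# throughput <= window_size / round-trip time, no matter how fast the link is.

def max_tcp_throughput_bps(window_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on single-stream TCP throughput, in bits per second."""
    return window_bytes * 8 / rtt_seconds

# Example: a default 64 KB window over a 10 ms round trip
window = 64 * 1024   # bytes
rtt = 0.010          # seconds
print(f"{max_tcp_throughput_bps(window, rtt) / 1e6:.1f} Mbit/s")
# -> roughly 52.4 Mbit/s, a tiny fraction of a 10GE or 100GE link,
#    which is why window scaling, parallel streams, and new Layer 4
#    proposals matter so much at data center speeds.
```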
The best way to think about the mechanics of Layer 4, and TCP’s sliding window, is the analogy of a rental car kiosk at the airport. Passengers arrive at the rental kiosk, sometimes in a steady stream, sometimes in bursts, and need to take a shuttle bus to the remote parking lot where they will pick up their car. If they are lucky enough to arrive just as the bus is leaving, they will have minimal delay in getting to their vehicle. If they arrive early, while the bus is sitting mostly empty, they will have to wait until either the bus is full or its scheduled departure time, adding some delay to their trip. If they arrive late and just miss the departing shuttle bus, they will be forced to wait for the next bus, and then wait again for enough passengers to load, substantially delaying their travel. TCP works in a similar fashion, transmitting segments (the shuttle buses) at a controlled rate, each segment carrying a batch of data (the travelers). When the internet was primarily used for file transfer and download, TCP was optimal. As voice and then video began to be transported over the internet, the User Datagram Protocol (UDP) was adopted from its use in the Domain Name System (DNS) to avoid TCP latency. UDP can be thought of as the moving walkway at the airport: everyone gets on in order, is helped along, and is tossed off at the end in pretty much the same order they hopped onto the conveyor belt. Unfortunately, UDP does not appear to be the savior of cloud services, as it has no flow control, and packets can be dropped at will. So, what to do?
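For readers who prefer code to analogies, the short sketch below (my own, purely illustrative) shows the difference at the socket API level: a TCP stream is connection-oriented, ordered, and flow-controlled, while a UDP socket simply fires datagrams with no delivery or ordering guarantees. The host names and ports are placeholder values.

```python
import socket

# --- TCP: the shuttle bus. The stack queues bytes, packs them into segments,
#     and the sliding window / flow control decides when each "bus" departs.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.connect(("example.com", 80))          # placeholder endpoint
tcp.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
reply = tcp.recv(4096)                    # bytes arrive in order, retransmitted if lost
tcp.close()

# --- UDP: the moving walkway. Each datagram is sent immediately; there is no
#     connection, no retransmission, and no flow control.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"hello", ("example.com", 9999))   # placeholder endpoint; may simply be dropped
udp.close()
```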
Here, today, industry leaders have converged to change that paradigm. As the name of the TEF suggests, the focus today is Ethernet and how it can meet future needs without the helping hand of other protocols. As is typical with anything involving engineers, there is an alphabet soup of acronyms and competing proposals on how best to meet the needs of the data center: iWARP, RDMA, RDDP/TCP, RDDP/SCTP, and RoCEE, to name just a few. While the subject matter is a bit deep to cover in a blog post, the concept is simple: take the best of each protocol and make a new protocol for Layer 4. This is often referred to as protocol convergence in data centers. To characterize this convergence as merely a way for routers to carry Fibre Channel over Ethernet (FCoE) is selling the work of these industry leaders short. Rather, the wide range of protocols being worked on seeks to increase the efficiency of data center communication by reducing latency, removing distance from the performance equation, and finally stenting the TCP bottleneck that has been clogging the main arteries of the internet for far too long. In the meantime, there has been a temporary reprieve, as microprocessors have moved to multiple cores rather than higher clock speeds: each core can be given its own TCP stream, a technique known as TCP striping.
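To make the striping idea concrete, here is a minimal sketch of my own (not any vendor’s implementation) that splits a payload across several parallel TCP connections, one per worker thread; the server address, port, stream count, and framing are all illustrative assumptions.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

SERVER = ("10.0.0.1", 5001)   # placeholder address and port
STREAMS = 4                   # e.g., one stream per core

def send_stripe(stripe_id: int, chunk: bytes) -> int:
    """Push one stripe of the payload over its own TCP connection."""
    with socket.create_connection(SERVER) as s:
        # A tiny header so the receiver can reassemble the stripes in order.
        s.sendall(stripe_id.to_bytes(4, "big") + len(chunk).to_bytes(8, "big"))
        s.sendall(chunk)
    return len(chunk)

def striped_send(payload: bytes) -> int:
    """Split the payload into STREAMS stripes and send them in parallel."""
    stripe = (len(payload) + STREAMS - 1) // STREAMS
    chunks = [payload[i * stripe:(i + 1) * stripe] for i in range(STREAMS)]
    with ThreadPoolExecutor(max_workers=STREAMS) as pool:
        sent = pool.map(send_stripe, range(STREAMS), chunks)
    return sum(sent)

# Each connection gets its own congestion window (and, with luck, its own core),
# so the aggregate transfer is no longer capped by a single stream's window/RTT limit.
```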
It is Layer 4’s turn to shine in the limelight, and the brightest minds in the industry are working together to improve data center performance and interconnect, in order to keep up with the heretofore unimaginable growth in demand.
For more information on Cloud Computing, and the different service models being explored, you can download a whitepaper on the topic here.