Data center network architectures

A data center is a pool of resources (computational, storage, network) interconnected using a communication network. [1] [2] A data center network (DCN) plays a pivotal role in a data center, as it interconnects all of the data center's resources. DCNs need to be scalable and efficient enough to connect tens or even hundreds of thousands of servers and to handle the growing demands of cloud computing. [3] [4] Today's data centers are constrained by the interconnection network. [5]

Types of data center network topologies

Data center networks can be divided into multiple separate categories. [6]

Types of data center network architectures

Three-tier

The legacy three-tier DCN architecture follows a multi-rooted tree-based network topology composed of three layers of network switches, namely the access, aggregate, and core layers. [10] The servers at the lowest layer are connected directly to one of the access layer switches. The aggregate layer switches interconnect multiple access layer switches. All of the aggregate layer switches are connected to each other by core layer switches, which are also responsible for connecting the data center to the Internet. The three-tier architecture is the common network architecture used in data centers. [10] However, it is unable to handle the growing demand of cloud computing. [11] The higher layers of the three-tier DCN are highly oversubscribed. [3] The major problems faced by the three-tier architecture include scalability, fault tolerance, energy efficiency, and cross-sectional bandwidth. The architecture also uses enterprise-level network devices at the higher layers of the topology, which are very expensive and power hungry. [5]
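
Oversubscription at a switch can be expressed as the ratio of the capacity facing the servers to the capacity facing the next layer up. The following is a minimal sketch of that arithmetic; the function name and the port counts and link speeds are hypothetical values chosen for illustration, not figures from the cited sources.

```python
def oversubscription_ratio(down_ports, down_gbps, up_ports, up_gbps):
    """Ratio of downstream capacity to upstream capacity for one switch."""
    upstream = up_ports * up_gbps
    if upstream <= 0:
        raise ValueError("upstream capacity must be positive")
    return (down_ports * down_gbps) / upstream


# Hypothetical access switch: 48 x 1 Gbps server ports, 2 x 10 Gbps uplinks.
access = oversubscription_ratio(48, 1, 2, 10)

# Hypothetical aggregate switch: 8 x 10 Gbps downlinks, 2 x 10 Gbps uplinks.
aggregate = oversubscription_ratio(8, 10, 2, 10)

# Ratios multiply along the path towards the core, which is why the
# higher layers of a tree end up the most oversubscribed.
end_to_end = access * aggregate

print(f"access {access:.1f}:1, aggregate {aggregate:.1f}:1, "
      f"end to end {end_to_end:.1f}:1")
```

With these assumed numbers, a server sending traffic through the core competes for roughly one tenth of its access link rate in the worst case.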

Fat tree

The fat tree DCN architecture reduces the oversubscription and cross-section bandwidth problems faced by the legacy three-tier DCN architecture. The fat tree DCN employs an architecture built from commodity network switches arranged in a Clos topology. [3] The network elements in a fat tree topology also follow a hierarchical organization of network switches in access, aggregate, and core layers. However, the number of network switches is much larger than in the three-tier DCN. The architecture is composed of k pods, where each pod contains (k/2)^2 servers, k/2 access layer switches, and k/2 aggregate layer switches. The core layer contains (k/2)^2 core switches, where each core switch is connected to one aggregate layer switch in each of the pods. The fat tree topology can offer up to a 1:1 oversubscription ratio and full bisection bandwidth, [3] depending on each rack's total bandwidth versus the bandwidth available at the tree's highest levels. In conventional tree-based designs, higher tree branches are typically oversubscribed relative to their lower branches by a ratio of 1:5, and the problem compounds at the highest tree levels, where ratios reach 1:80 or even 1:240. [12] The fat tree architecture uses a customized addressing scheme and routing algorithm. Scalability is one of the major issues in the fat tree DCN architecture, as the maximum number of pods is equal to the number of ports in each switch. [11]
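
The element counts above follow directly from the switch port count k. The sketch below tabulates them; the function name `fat_tree_size` and the dictionary keys are illustrative assumptions, and the total server count k^3/4 is simply k pods multiplied by (k/2)^2 servers per pod.

```python
def fat_tree_size(k):
    """Element counts of a fat tree built from k-port switches."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even integer >= 2")
    half = k // 2
    return {
        "pods": k,
        "access_switches": k * half,      # k/2 per pod
        "aggregate_switches": k * half,   # k/2 per pod
        "core_switches": half ** 2,       # (k/2)^2
        "servers_per_pod": half ** 2,     # (k/2)^2
        "servers": k * half ** 2,         # k pods * (k/2)^2 = k^3 / 4
    }


for ports in (4, 24, 48):
    size = fat_tree_size(ports)
    print(ports, size["servers"], size["core_switches"])
```

Because the number of pods cannot exceed the port count, the only way to grow the network in this model is to use switches with more ports: 48-port switches give 27,648 servers.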

DCell

DCell is a server-centric hybrid DCN architecture in which servers are directly connected to other servers. [4] A server in the DCell architecture is equipped with multiple network interface cards (NICs). DCell follows a recursively built hierarchy of cells arranged in multiple levels, where a higher-level cell contains multiple lower-level cells. A cell0 is the basic unit and building block of the DCell topology; it contains n servers and one commodity network switch, which is used only to connect the servers within that cell0. A cell1 contains k = n + 1 cell0 cells, and similarly a cell2 contains k * n + 1 cell1 cells. DCell is a highly scalable architecture: a four-level DCell with only six servers in each cell0 can accommodate around 3.26 million servers. Besides very high scalability, the DCell architecture exhibits very high structural robustness. [13] However, cross-section bandwidth and network latency are major issues in the DCell DCN architecture. [1]
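
The recursive growth can be checked with a short calculation. This is an illustrative sketch (the function name `dcell_server_counts` is an assumption) that relies only on the rule stated above: each level contains one more sub-cell than the number of servers in a sub-cell, so a cell1 holds n + 1 cell0 cells and a cell2 holds n(n + 1) + 1 cell1 cells.

```python
def dcell_server_counts(n, levels):
    """Servers in cell0, cell1, ... for a DCell with the given number of levels."""
    if n < 1 or levels < 1:
        raise ValueError("n and levels must be positive integers")
    counts = [n]  # a cell0 holds n servers
    for _ in range(levels - 1):
        servers = counts[-1]
        # The next level is built from (servers + 1) copies of the current cell.
        counts.append(servers * (servers + 1))
    return counts


print(dcell_server_counts(6, 4))  # [6, 42, 1806, 3263442]
```

The last value reproduces the figure of about 3.26 million servers for a four-level DCell with six servers per cell0.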

Others

Some of the other well-known DCNs include BCube, [14] CamCube, [15] FiConn, [16] Jellyfish, [17] and Scafida. [18] A qualitative discussion of different DCNs, along with the benefits and drawbacks associated with each one, is available. [2]

Challenges

Scalability is one of the foremost challenges for DCNs. [3] With the advent of the cloud paradigm, data centers are required to scale up to hundreds of thousands of nodes. Besides offering immense scalability, DCNs are also required to deliver high cross-section bandwidth. Current DCN architectures such as the three-tier DCN offer poor cross-section bandwidth and have a very high oversubscription ratio near the root. [3] The fat tree DCN architecture delivers a 1:1 oversubscription ratio and high cross-section bandwidth, but it suffers from low scalability, since the number of pods is limited to k, the total number of ports in a switch. DCell offers immense scalability, but it delivers very poor performance under heavy network load and one-to-many traffic patterns.

Performance analysis of DCNs

A quantitative analysis of the three-tier, fat tree, and DCell architectures, comparing their performance in terms of throughput and latency, has been performed for different network traffic patterns. [1] The fat tree DCN delivers high throughput and low latency compared to the three-tier and DCell architectures. DCell suffers from very low throughput under high network load and one-to-many traffic patterns. One of the major reasons for DCell's low throughput is the very high oversubscription ratio on the links that interconnect the highest-level cells. [1]

Structural robustness and connectivity of DCNs

DCell exhibits very high robustness against random and targeted attacks and retains most of its nodes in the giant cluster even after 10% targeted failure. [13] It also stays better connected under multiple failures, whether targeted or random, than the fat tree and three-tier DCNs. [19] One of the major reasons for the high robustness and connectivity of DCell is that its nodes have multiple connections to other nodes, which is not the case in the fat tree or three-tier architectures.
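
The giant-cluster measure mentioned above can be illustrated on toy graphs. The sketch below is not the methodology of the cited studies: the function names are assumptions, "targeted" is taken to mean removing the highest-degree nodes first, and the star and ring graphs are stand-ins for a hub-dependent topology and a multiply connected one.

```python
from collections import deque


def giant_component_fraction(adjacency, removed):
    """Fraction of the original nodes left in the largest connected component."""
    if not adjacency:
        return 0.0
    alive = set(adjacency) - set(removed)
    seen = set()
    largest = 0
    for start in alive:
        if start in seen:
            continue
        # Breadth-first search over the surviving nodes only.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour in alive and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        largest = max(largest, size)
    return largest / len(adjacency)


def targeted_failure(adjacency, fraction):
    """Pick the given fraction of nodes, highest degree first."""
    count = int(len(adjacency) * fraction)
    ranked = sorted(adjacency, key=lambda node: len(adjacency[node]), reverse=True)
    return ranked[:count]


# Toy graphs with 10 nodes each (illustrative, not real DCN topologies).
star = {0: list(range(1, 10)), **{i: [0] for i in range(1, 10)}}
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}

print(giant_component_fraction(star, targeted_failure(star, 0.1)))  # 0.1
print(giant_component_fraction(ring, targeted_failure(ring, 0.1)))  # 0.9
```

Losing the single hub shatters the star, while the ring keeps nine of its ten nodes connected, which mirrors the argument that multiple connectivity between nodes improves robustness.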

Energy efficiency of DCNs

Concerns about the energy needs and environmental impacts of data centers are intensifying. [5] Energy efficiency is one of the major challenges of today's information and communications technology (ICT) sector. The networking portion of a data center is estimated to consume around 15% of overall cyber energy usage. Around 15.6 billion kWh of energy was used solely by the communication infrastructure within data centers worldwide in 2010. [20] The share of energy consumed by the network infrastructure within a data center is expected to increase to around 50%. [5] The IEEE 802.3az standard, approved in 2010, makes use of the adaptive link rate technique for energy efficiency. [21] Moreover, the fat tree and DCell architectures use commodity network equipment that is inherently energy efficient. Workload consolidation is also used for energy efficiency: the workload is consolidated onto a few devices so that the idle devices can be powered off or put to sleep. [22]
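
The idea behind workload consolidation can be sketched as a packing problem. The example below uses a plain first-fit decreasing heuristic as a stand-in; it is not the algorithm of the cited work [22], and the function name, the load values, and the assumption that every device has the same capacity are all illustrative.

```python
def consolidate(loads, capacity):
    """Pack loads onto as few devices as possible (first-fit decreasing).

    Returns the resulting load of each device that must stay powered on.
    """
    active = []
    for load in sorted(loads, reverse=True):
        if load > capacity:
            raise ValueError("a single load exceeds the device capacity")
        for index, used in enumerate(active):
            if used + load <= capacity:
                active[index] = used + load
                break
        else:
            # No active device has room, so one more device stays on.
            active.append(load)
    return active


# Hypothetical utilisation (percent) of five lightly loaded devices.
loads = [20, 50, 10, 30, 40]
active = consolidate(loads, capacity=100)
print(active)                       # [100, 50]
print(len(loads) - len(active))     # 3 devices can be powered off or put to sleep
```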

References

  1. K. Bilal, S. U. Khan, L. Zhang, H. Li, K. Hayat, S. A. Madani, N. Min-Allah, L. Wang, D. Chen, M. Iqbal, C.-Z. Xu, and A. Y. Zomaya, "Quantitative Comparisons of the State of the Art Data Center Architectures," Concurrency and Computation: Practice and Experience, vol. 25, no. 12, pp. 1771-1783, 2013.
  2. M. Noormohammadpour, C. S. Raghavendra, "Datacenter Traffic Control: Understanding Techniques and Trade-offs," IEEE Communications Surveys & Tutorials, vol. PP, no. 99, pp. 1-1.
  3. M. Al-Fares, A. Loukissas, A. Vahdat, A scalable, commodity data center network architecture, in: ACM SIGCOMM 2008 Conference on Data Communication, Seattle, WA, 2008, pp. 63–74.
  4. C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, S. Lu, DCell: a scalable and fault tolerant network structure for data centers, ACM SIGCOMM Computer Communication Review 38 (4) (2008) 75–86.
  5. K. Bilal, S. U. Khan, and A. Y. Zomaya, "Green Data Center Networks: Challenges and Opportunities," in 11th IEEE International Conference on Frontiers of Information Technology (FIT), Islamabad, Pakistan, December 2013, pp. 229-234.
  6. Liu, Yang; Muppala, Jogesh K.; Veeraraghavan, Malathi; Lin, Dong; Hamdi, Mounir (2013). "Data Center Network Topologies: Research Proposals". In Liu, Yang; Muppala, Jogesh K.; Veeraraghavan, Malathi; Lin, Dong (eds.). Data Center Networks: Topologies, Architectures and Fault-Tolerance Characteristics. SpringerBriefs in Computer Science. Cham: Springer International Publishing. pp. 15–31. doi:10.1007/978-3-319-01949-9_3. ISBN 978-3-319-01949-9.
  7. Al-Fares, Mohammad; Loukissas, Alexander; Vahdat, Amin (2008). "A scalable, commodity data center network architecture". Proceedings of the ACM SIGCOMM 2008 conference on Data communication. Seattle, WA, USA: ACM Press. pp. 63–74. doi:10.1145/1402958.1402967. ISBN 978-1-60558-175-0. S2CID 65842.
  8. Niranjan Mysore, Radhika; Pamboris, Andreas; Farrington, Nathan; Huang, Nelson; Miri, Pardis; Radhakrishnan, Sivasankar; Subramanya, Vikram; Vahdat, Amin (2009-08-16). "PortLand: a scalable fault-tolerant layer 2 data center network fabric". ACM SIGCOMM Computer Communication Review. 39 (4): 39–50. doi:10.1145/1594977.1592575. ISSN 0146-4833.
  9. Al-Fares, Mohammad; Radhakrishnan, Sivasankar; Raghavan, Barath; Huang, Nelson; Vahdat, Amin (2010-04-28). "Hedera: dynamic flow scheduling for data center networks". Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation. NSDI'10. San Jose, California: USENIX Association: 19.
  10. Cisco, Cisco Data Center Infrastructure 2.5 Design Guide, Cisco Press, 2010.
  11. Bilal et al., "A Taxonomy and Survey on Green Data Center Networks," Future Generation Computer Systems.
  12. Greenberg, Albert, et al. "VL2: a scalable and flexible data center network." Proceedings of the ACM SIGCOMM 2009 conference on Data communication. 2009.
  13. K. Bilal, M. Manzano, S. U. Khan, E. Calle, K. Li, and A. Y. Zomaya, "On the Characterization of the Structural Robustness of Data Center Networks," IEEE Transactions on Cloud Computing, vol. 1, no. 1, pp. 64-77, 2013.
  14. Guo, Chuanxiong, et al. "BCube: a high performance, server-centric network architecture for modular data centers." ACM SIGCOMM Computer Communication Review 39.4 (2009): 63-74.
  15. Costa, P., et al. CamCube: a key-based data center. Technical Report MSR TR-2010-74, Microsoft Research, 2010.
  16. Li, Dan, et al. "FiConn: Using backup port for server interconnection in data centers." INFOCOM 2009, IEEE. IEEE, 2009.
  17. Singla, Ankit, et al. "Jellyfish: Networking data centers randomly." 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI). 2012.
  18. Gyarmati, László, and Tuan Anh Trinh. "Scafida: A scale-free network inspired data center architecture." ACM SIGCOMM Computer Communication Review 40.5 (2010): 4-12.
  19. M. Manzano, K. Bilal, E. Calle, and S. U. Khan, "On the Connectivity of Data Center Networks," IEEE Communications Letters, vol. 17, no. 11, pp. 2172-2175, 2013.
  20. Bilal, K.; Khan, S. U.; Zomaya, A. Y. (December 2013). "Green Data Center Networks: Challenges and Opportunities" (PDF). 2013 11th International Conference on Frontiers of Information Technology. pp. 229–234. doi:10.1109/FIT.2013.49. ISBN 978-1-4799-2503-2. S2CID 7136258.
  21. K. Bilal, S. U. Khan, S. A. Madani, K. Hayat, M. I. Khan, N. Min-Allah, J. Kolodziej, L. Wang, S. Zeadally, and D. Chen, "A Survey on Green Communications using Adaptive Link Rate," Cluster Computing, vol. 16, no. 3, pp. 575-589, 2013.
  22. Heller, Brandon; Seetharaman, Srinivasan; Mahadevan, Priya; Yiakoumis, Yiannis; Sharma, Puneet; Banerjee, Sujata; McKeown, Nick (2010). "ElasticTree: saving energy in data center networks" (PDF). Proceedings of the 7th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2010, April 28-30, 2010, San Jose, CA, USA. USENIX Association. pp. 249–264.