We intend to answer the following questions for this class of switches: How does one estimate the average delay, and possibly also the delay distribution, for such switches? This includes estimating resequencing delay. We will again consider packet delay performance rather than just cell delay. This is challenging because the packet reassembly and resequencing operations overlap in time.
Also, we will test a conjecture made on the basis of our previous research: for the long-range-dependent traffic characteristic of the Internet, load balanced switches may have inherent traffic shaping properties that lead to improved performance over competing switch architectures. After testing the validity of our conjecture using Internet traffic traces, we hope to uncover interesting switch design insights analogous to our above-mentioned work on the difference between packet delay and cell delay for packet switches.
We hope this work will bring about important new results based on the demonstrable interplay between switch hardware design and performance, and Internet traffic engineering. The research will be disseminated through publications, presentations and interactions with industry. The New York Center for Advanced Technology in Telecommunications, which has seed-funded the PI's research in this area, also facilitates interaction with switch equipment vendors such as Lucent Technologies and Fujitsu Network Communications.
Jonathan Chao has a track record in writing texts in the area of switching and broadband communications. These books have helped present research in switching and networking in general through course texts at the graduate level. He has also developed courses where students design network subsystems such as buffer managers, schedulers and switch fabrics as projects using VLSI design tools.
This is the approach taken when a workstation or server serves as a router. Given the existence of a routing table, the lookup is conceptually simple -- we just search through the routing table, looking for an entry that matches the destination address of the datagram, falling back to a default route if no matching entry is found.
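To make the conceptual algorithm concrete, here is a minimal sketch of such a linear search in Python, extended to prefer the longest matching prefix, as real routing tables do. The table contents and port numbers are made up for illustration.

```python
# Illustrative linear-search route lookup. Prefixes, prefix lengths,
# and port numbers below are invented for the example.

def ip_to_int(addr):
    """Convert a dotted-quad IPv4 address to a 32-bit integer."""
    a, b, c, d = (int(x) for x in addr.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def lookup(routing_table, dest, default_port=0):
    """Scan every entry; return the port of the longest matching prefix,
    or the default route if nothing matches."""
    dest_int = ip_to_int(dest)
    best_len, best_port = -1, default_port
    for prefix, plen, port in routing_table:
        mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
        if (dest_int & mask) == (ip_to_int(prefix) & mask) and plen > best_len:
            best_len, best_port = plen, port
    return best_port

table = [("192.168.0.0", 16, 1), ("192.168.4.0", 24, 2), ("10.0.0.0", 8, 3)]
lookup(table, "192.168.4.7")   # longest match is 192.168.4.0/24 -> port 2
lookup(table, "8.8.8.8")       # no match -> default port 0
```

As the next paragraphs note, the problem with this approach is not correctness but speed: the search time grows linearly with the table size.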
In practice, however, life is not so simple. Perhaps the most important complicating factor is that backbone routers must operate at high speeds, performing millions of lookups per second. Indeed, it is desirable for input port processing to proceed at line speed, i.e., for a lookup to take less time than it takes to receive a packet at the input port. In this case, input processing of a received packet can be completed before the next receive operation is complete.
To get an idea of the performance requirements for lookup, consider that a so-called OC48 link runs at 2.5 Gbps. With 256-byte packets, this implies a lookup speed of approximately a million lookups per second. Given the need to operate at today's high link speeds, a linear search through a large routing table is impossible. A more reasonable technique is to store the routing table entries in a tree data structure. Each level in the tree can be thought of as corresponding to a bit in the destination address.
To look up an address, one simply starts at the root node of the tree. If the first address bit is a zero, then the left subtree will contain the routing table entry for the destination address; otherwise it will be in the right subtree.
The appropriate subtree is then traversed using the remaining address bits -- if the next address bit is a zero, the left subtree of the initial subtree is chosen; otherwise, the right subtree is chosen. In this manner, one can look up the routing table entry in N steps, where N is the number of bits in the address. The reader will note that this is essentially a binary search through an address space of size 2^N. Refinements of this approach are discussed in [Doeringer].
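The bit-by-bit descent described above can be sketched as a small binary trie. This is an illustrative toy (4-bit addresses, string entries), not a production lookup structure:

```python
class TrieNode:
    def __init__(self):
        self.children = [None, None]  # index 0 = left (bit 0), 1 = right (bit 1)
        self.entry = None             # routing-table entry stored at this node

def insert(root, addr_bits, entry):
    """Walk the tree one address bit at a time, creating nodes as needed."""
    node = root
    for bit in addr_bits:
        if node.children[bit] is None:
            node.children[bit] = TrieNode()
        node = node.children[bit]
    node.entry = entry

def lookup(root, addr_bits):
    """Descend left on a 0 bit, right on a 1 bit; N address bits -> N steps."""
    node = root
    for bit in addr_bits:
        node = node.children[bit]
        if node is None:
            return None
    return node.entry

root = TrieNode()
insert(root, [1, 0, 1, 1], "port 3")   # toy 4-bit address 1011
lookup(root, [1, 0, 1, 1])             # -> "port 3"
```

Each lookup touches at most N nodes, which is exactly the N-step bound stated above.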
Several techniques have thus been explored to increase lookup speeds. Content addressable memories (CAMs) allow a 32-bit IP address to be presented to the CAM, which then returns the content of the routing table entry for that address in essentially constant time. Another technique for speeding lookup is to keep recently accessed routing table entries in a cache [Feldmeier]. Here, the potential concern is the size of the cache. Measurements in [Thompson] suggest that even for an OC-3 speed link, a very large number of distinct source-destination pairs might be seen in one minute in a backbone router.
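The caching idea can be sketched with a small least-recently-used (LRU) cache in front of a slow full-table lookup. The class name, capacity, and slow-path interface are all illustrative, not drawn from any real router:

```python
from collections import OrderedDict

class RouteCache:
    """Toy route cache: keeps the most recently used lookups, evicting the
    least recently used entry when capacity is exceeded."""
    def __init__(self, capacity, full_lookup):
        self.capacity = capacity
        self.full_lookup = full_lookup     # slow path (full routing table)
        self.cache = OrderedDict()

    def lookup(self, dest):
        if dest in self.cache:
            self.cache.move_to_end(dest)   # mark as recently used
            return self.cache[dest]
        port = self.full_lookup(dest)      # cache miss: full table search
        self.cache[dest] = port
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False) # evict least recently used
        return port
```

The measurement cited above explains the concern: if the number of active source-destination pairs greatly exceeds the cache capacity, misses dominate and the cache buys little.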
Most recently, even faster data structures, which allow a routing table entry to be located in log N steps [Waldvogel], or which compress routing tables in novel ways [Degermark], have been proposed. A hardware-based approach to lookup, optimized for the common case that the address being looked up has 24 or fewer significant bits, is discussed in [Gupta].
Once the output port for a packet has been determined via the lookup, the packet can be forwarded into the switching fabric. However, as we'll see below, a packet may be temporarily blocked from entering the switching fabric because packets from other input ports are currently using the fabric.
A blocked packet must thus be queued at the input port and then scheduled to cross the switching fabric at a later point in time. We'll take a closer look at the blocking, queueing and scheduling of packets at both input ports and output ports within a router in section 4.
The switching fabric is at the very heart of a router.
It is through this fabric that datagrams are actually moved from an input port to an output port. Switching can be accomplished in a number of ways, as indicated in Figure 4. Switching via memory. The simplest, earliest routers were often traditional computers, with switching between input and output ports done under direct control of the CPU (routing processor). An input port with an arriving datagram first signaled the routing processor via an interrupt. The packet was then copied from the input port into processor memory.
The routing processor then extracted the destination address from the header, looked up the appropriate output port in the routing table, and copied the packet to the output port's buffers. Many modern routers also switch via memory. A major difference from early routers, however, is that the lookup of the destination address and the storing (switching) of the packet into the appropriate memory location are performed by processors on the input line cards.
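The copy-lookup-copy sequence just described can be modeled in a few lines. All names here are illustrative; this is a toy model of the data flow, not any particular router's implementation:

```python
# Toy model of switching via shared memory: copy the packet into memory,
# look up the output port from the header, then copy the packet into that
# output port's buffer.
def switch_via_memory(packet, routing_table, output_buffers, memory):
    memory.append(packet)                         # copy 1: input -> memory
    out_port = routing_table[packet["dest"]]      # lookup on the header
    output_buffers[out_port].append(memory.pop()) # copy 2: memory -> output
    return out_port

bufs = {1: [], 2: []}
switch_via_memory({"dest": "B", "payload": "x"}, {"A": 1, "B": 2}, bufs, [])
# the packet now sits in output port 2's buffer
```

Note that every packet crosses the memory bus twice, which is one reason memory bandwidth bounds the throughput of this design.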
In some ways, routers that switch via memory look very much like shared-memory multiprocessors, with the processors on a line card storing datagrams into the memory of the appropriate output port. Cisco's Catalyst series switches [Cisco a] and Bay Networks Accelar series routers switch packets via a shared memory. Switching via a bus. In this approach, an input port transfers a packet directly to an output port over a shared bus. Although the routing processor is not involved in the bus transfer, since the bus is shared, only one packet at a time can be transferred over it. A datagram arriving at an input port and finding the bus busy with the transfer of another datagram is blocked from passing through the switching fabric and queued at the input port.
Because every packet must cross the single bus, the switching bandwidth of the router is limited to the bus speed. Given that bus bandwidths of over a gigabit per second are possible in today's technology, switching via a bus is often sufficient for routers that operate in access and enterprise networks. Bus-based switching has been adopted in a number of current router products, including the Cisco [Cisco b], which switches packets over a 1 Gbps Packet Exchange Bus.
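The blocking behavior of a shared bus can be illustrated with a simple time-slotted model: at most one packet crosses the bus per slot, and arrivals that find the bus claimed wait in their input queue. The slot model and the serve-lowest-port-first policy are illustrative choices, not any real arbitration scheme:

```python
from collections import deque

def simulate_bus(arrivals, slots):
    """arrivals: dict mapping slot -> list of (input_port, packet).
    Returns the order in which packets crossed the shared bus."""
    queues = {}        # input_port -> deque of waiting packets
    crossed = []
    for t in range(slots):
        for port, pkt in arrivals.get(t, []):
            queues.setdefault(port, deque()).append(pkt)
        # the single bus serves one packet per slot (lowest-numbered
        # nonempty input queue first -- an arbitrary illustrative policy)
        for port in sorted(queues):
            if queues[port]:
                crossed.append(queues[port].popleft())
                break
    return crossed

# Two packets arrive in the same slot; the bus is shared, so the second
# is blocked for one slot and crosses in the next.
simulate_bus({0: [(1, "A"), (2, "B")]}, slots=3)
```

Under sustained load from many ports, this one-at-a-time serialization is exactly why aggregate throughput cannot exceed the bus speed.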
Switching via an interconnection network. One way to overcome the bandwidth limitation of a single, shared bus is to use a more sophisticated interconnection network, such as those that have been used in the past to interconnect processors in multiprocessor computer architectures. A crossbar switch is an interconnection network consisting of 2N busses that connect N input ports to N output ports, as shown in Figure 4.
A packet arriving at an input port travels along the horizontal bus attached to the input port until it intersects with the vertical bus leading to the desired output port. If the vertical bus leading to the output port is free, the packet is transferred to the output port.
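Unlike the shared bus, a crossbar can carry several packets simultaneously, as long as they head to distinct output ports. A minimal sketch of this per-output contention (the grant policy here -- first request wins -- is an illustrative simplification of real crossbar scheduling):

```python
def schedule_crossbar(requests):
    """requests: list of (input_port, output_port) pairs for one round.
    Each output's vertical bus carries one packet at a time, so later
    requests for a busy output are deferred to a future round."""
    granted, busy_outputs = [], set()
    for inp, out in requests:
        if out not in busy_outputs:
            granted.append((inp, out))
            busy_outputs.add(out)   # vertical bus to this output now in use
    return granted

# Inputs 0 and 1 target different outputs: both cross in parallel.
schedule_crossbar([(0, 2), (1, 3)])
# Inputs 0 and 1 both target output 2: only one is granted this round.
schedule_crossbar([(0, 2), (1, 2)])
```

This parallelism is what lets a crossbar's aggregate throughput scale with the number of ports rather than being capped by a single bus.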