Special Report: Network Provisioning

Supercomputing and High-Performance Networking

Supercomputers are among the most vital components of the infrastructure for large-scale science. Continual increases in their computational speeds have enabled unprecedented simulations, computations and explorations in areas such as climate, genomics and astrophysics. Currently, the Japanese Earth Simulator provides a peak execution speed of 37 teraflops. Within a few years, supercomputers with speeds in excess of 100 teraflops are expected to be available. Such computational speeds present unprecedented challenges to networks, both in moving the massive amounts of generated data and in interactively steering the ultra-high-speed computations running on the supercomputers.

Figure 1. Supercomputer computational speeds have consistently outpaced network speeds.

Historically, network speeds have been consistently outpaced by the computational speeds of supercomputers, as shown in Fig. 1. Furthermore, the mismatch between the computational speeds of supercomputers and the throughputs of their network connections continues to grow, often isolating the supercomputers from wide-area remote access. Consequently, supercomputers are often restricted mainly to local use and/or batch jobs. Note that in Fig. 1 the computational speeds are on a log scale while the network speeds are on an (almost) linear scale, which indicates an ever-widening performance gap.

If this situation is not addressed, it is likely that: (a) massive amounts of data generated and/or required by the computations will not be transported in a timely manner, thereby choking or idling the computations; (b) batch executions could drive computations into undesired parameter domains due to the lack of active monitoring, thereby causing multiple reruns that waste computational resources; and (c) unstable control loops (typical of the current Internet) would result in the "flops on the floor" phenomenon, wherein the supercomputers idle while waiting for control messages to arrive over the network.

Due to the critical role played by supercomputers in several DOE large-scale science applications, it is particularly important to develop network technologies capable of specifically addressing supercomputing needs.

Figure 2. The end-to-end throughput continues to be a small fraction of the backbone speeds.

2.3 State of Networking for Large-Scale Science: An Assessment

Current networks have several limitations in meeting the requirements of DOE large-scale science applications in the areas identified in previous sections. While the needed transport speeds are currently available on backbone links based on dense wavelength division multiplexing (DWDM) technologies, several architectural and design factors in provisioning, TCP stacks, network interface cards, and related software currently limit typical application throughputs to less than 1 Gbps, as shown in Fig. 2. Experts in the field agree that sustained multi-Gbps throughputs at the application level will not be achieved by simply replacing the existing links with faster ones. For instance (to give a dated example), when the OC-3 (roughly 150 Mbps) backbone was upgraded to OC-12 (roughly 600 Mbps), the typical application throughput improved only marginally (25-50%) instead of the expected fourfold. Indeed, it took several years of protocol tuning and enhancements to reach 300 Mbps throughput at the application level. A similar fate awaits the simple-minded approach of just replacing the current links with OC-768 (40 Gbps) links or other high-speed optical links. In fact, harnessing the abundant backbone bandwidth and delivering it to the applications will require new advances in host systems as well as network components, including transport protocols, network-optimized system bus architectures, and dynamic provisioning of high-speed optical links. This last item may appear to be a non sequitur, but it follows directly from the fact that new transport protocols may demand segregated links on which they can run unimpeded, and this, in turn, will require the on-demand provisioning of those links.
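
The arithmetic behind the link-upgrade example can be made explicit with a short sketch. The function names below are illustrative, and the 1 TB dataset size is a hypothetical value chosen only to show the scale of the problem; the link rates and throughput gains are the rounded figures quoted in the text.

```python
"""Back-of-the-envelope arithmetic for the link-upgrade example."""


def utilization(app_mbps: float, link_mbps: float) -> float:
    """Fraction of the nominal link rate delivered to the application."""
    return app_mbps / link_mbps


def relative_utilization(throughput_gain: float, link_gain: float) -> float:
    """Link utilization after an upgrade, relative to utilization before it."""
    return throughput_gain / link_gain


def transfer_seconds(size_bytes: float, rate_gbps: float) -> float:
    """Time to move size_bytes at a sustained application rate of rate_gbps."""
    return size_bytes * 8 / (rate_gbps * 1e9)


if __name__ == "__main__":
    # From the text: a fourfold link upgrade yielded only a 25-50% throughput gain.
    for gain in (1.25, 1.50):
        print(f"throughput x{gain:.2f} on a x4 link -> "
              f"{relative_utilization(gain, 4.0):.1%} of the earlier utilization")

    # From the text: 300 Mbps eventually reached on a nominal 600 Mbps link.
    print(f"tuned OC-12 utilization: {utilization(300, 600):.0%}")

    # ASSUMPTION: a hypothetical 1 TB (1e12 byte) dataset, not a figure from the report.
    dataset_bytes = 1e12
    for rate_gbps in (1.0, 40.0):
        print(f"{rate_gbps:>4.0f} Gbps -> "
              f"{transfer_seconds(dataset_bytes, rate_gbps):,.0f} s")
```

Under these assumptions, a 25-50% throughput gain on a fourfold faster link means the application uses only about 31-38% of the share of the link it used before the upgrade, and even the tuned 300 Mbps figure is half of the nominal OC-12 rate. The hypothetical 1 TB transfer takes 8000 seconds at a sustained 1 Gbps, against 200 seconds if the full 40 Gbps could be delivered to the application.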


Photuris.com - Optical Data Networking