QsNetIII Adaptively Routed Network For HPC
1. QsNetIII, an Adaptively Routed Network for High Performance Computing
Duncan Roweth, Quadrics Ltd
Hot Interconnects August 2008
28/8/2008 Quadrics Ltd 1
2. Quadrics Background
• Develops interconnect products for the HPC market
– HPC Linux systems
– AlphaServer SC systems
• Quadrics is owned by the Finmeccanica group
• Quadrics was 12 years old in July
5. Quadrics Networks
• Elan1 / Elite1, 1994, Meiko Computing Surface 2
– Source chooses between pre-defined routes
• Elan3 / Elite3, 2000, first Quadrics product, QsNet
– First use of packet-by-packet adaptive routing
– Crosspoint router, x8
• Elan4 / Elite4, 2004, QsNetII
– Reduced latency, increased bandwidth
– Increased support for offloading collectives
• Elan5 / Elite5, 2008, QsNetIII
– General purpose crosspoint router, increased radix, x32
– Highly programmable adapter
6. What is Adaptive Routing?
• Switch networks typically provide many paths between any two points
• In an adaptively routed network, routers make packet-by-packet decisions on the route to use based on
– Queue occupancy
– Channel usage
– Error rates and state
– Class of traffic
7. Why is Adaptive Routing Important?
• Most HPC networks are statically routed
– They use pre-determined paths between nodes
• Static routing can work well
– If traffic pattern is known in advance
– If traffic pattern is persistent
– If traffic pattern is uniform (i.e. application is load balanced)
– If there are no errors
• These conditions are not met by real codes on production HPC systems (see LLNL and Sandia results)
• Adaptive routing solves these problems
– Delivering significantly better aggregate bandwidths and worst-case latencies on real systems running real codes
8. Benefits of Adaptive Routing
• Bandwidth achieved when 1024 nodes all communicate at the same time
• Plots show the distribution of measured bandwidths
System   Interconnect  Min  Max  Average
Atlas    Infiniband     95  762  263
Thunder  QsNetII       248  403  369
Data from Lawrence Livermore National Lab, published at the Sonoma OpenFabrics workshop April 2007
10. Ordering Considerations
• Adaptively routed packets can arrive out of order
– Problems for stream devices, e.g. multipath Ethernet
• Message ordering is required in HPC
– But within a message we are free to deliver the bulk data in arbitrary order
– "Get it there as fast as possible then tell me that it is done"
• QsNet ordering
– Packets contain the destination virtual address at which to write the data
– Bulk data transfers can arrive out of order and can be replayed
– Atomic transactions are sequenced
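The QsNet ordering rules above can be illustrated with a small sketch. It is not Quadrics code: the function name, the 256 byte segment size and the use of byte offsets in place of virtual addresses are assumptions made for illustration. Because every packet names the address at which its data is written, arrival order does not affect the result, and a replayed packet just rewrites the same bytes.

```python
import random


def deliver_out_of_order(message: bytes, segment_size: int = 256, seed: int = 0) -> bytes:
    """Split a message into addressed packets, deliver them shuffled, and reassemble."""
    # Each packet carries the destination offset (standing in for a virtual address).
    packets = [
        (offset, message[offset:offset + segment_size])
        for offset in range(0, len(message), segment_size)
    ]
    random.Random(seed).shuffle(packets)  # model an arbitrary arrival order

    destination = bytearray(len(message))
    # The first-arrived packet is delivered a second time to model a replay.
    for offset, data in packets + packets[:1]:
        destination[offset:offset + len(data)] = data

    # Only once all bulk data has landed would completion be signalled.
    return bytes(destination)


message = bytes(range(256)) * 8  # 2048 bytes, i.e. 8 segments
print(deliver_out_of_order(message) == message)
```

A stream device has no such address in each packet, which is why reordering is a problem there and not here.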
11. Adaptive Routing in QsNetIII
• More flexible than QsNetII
– Operates over arbitrary sets of links
– More opportunities to use the technique
– Higher radix switches
• Select a subset of lightly loaded output ports based on:
– Destination
– Link state, errors etc
– Number of pending acks (programmable threshold)
• Programmable algorithm for selecting from this subset:
– First free, next free, random
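The two-stage selection described on this slide might be sketched as follows. This is a minimal illustration, not the Elite5 logic: the `Port` fields, the function names and the exact meaning given to "first free", "next free" and "random" are assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class Port:
    index: int
    link_up: bool       # link state (assumed to fold in error status)
    pending_acks: int   # packets sent on this port and not yet acknowledged


def candidate_ports(ports, reachable, ack_threshold):
    """Stage 1: ports that reach the destination, are up, and are lightly loaded."""
    return sorted(
        (p for p in ports
         if p.index in reachable and p.link_up and p.pending_acks < ack_threshold),
        key=lambda p: p.index,
    )


def select_port(candidates, policy, last_used=-1, rng=random):
    """Stage 2: pick one port from the candidate subset."""
    if not candidates:
        return None
    if policy == "first_free":
        return candidates[0]
    if policy == "next_free":
        # Assumed meaning: round-robin, starting after the port used last time.
        for port in candidates:
            if port.index > last_used:
                return port
        return candidates[0]
    if policy == "random":
        return rng.choice(candidates)
    raise ValueError(f"unknown policy: {policy}")


ports = [Port(0, True, 5), Port(1, False, 0), Port(2, True, 1), Port(3, True, 0)]
subset = candidate_ports(ports, reachable={0, 1, 2, 3}, ack_threshold=4)
print([p.index for p in subset])                            # ports 2 and 3 remain
print(select_port(subset, "next_free", last_used=2).index)  # port 3
```

Port 0 is dropped because its pending acks reach the threshold, and port 1 because its link is down.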
12. Adaptive Routing: standard case
– All top switches are equivalent, select one
– Adaptive routing selects a lightly loaded path
13. Implementation of Fat Tree Networks
• Connect M × N-way node switches by N × M-way top switches
• In this case M = 16, N = 4
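As a rough check of the wiring rule, the sketch below enumerates the links for M = 16 and N = 4. It assumes that each N-way node switch has one up link to each of the N top switches; the function name is hypothetical.

```python
from collections import Counter


def fat_tree_links(m: int, n: int):
    """Return (node_switch, top_switch) pairs for M N-way node switches and N M-way top switches."""
    # Assumption: every node switch has exactly one up link to every top switch.
    return [(node, top) for node in range(m) for top in range(n)]


links = fat_tree_links(16, 4)
up_links_per_node_switch = Counter(node for node, _ in links)
down_links_per_top_switch = Counter(top for _, top in links)

print(len(links))                               # 64 links, matching 16 x 4 node ports
print(set(up_links_per_node_switch.values()))   # {4}: each node switch uses N up links
print(set(down_links_per_top_switch.values()))  # {16}: each top switch is M-way
```

Under this reading the network has M × N = 64 node ports.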
14. Adaptive Routing in the Top Switch
• If top switch radix ≤ router radix / 2
– i.e. 16 for Elite5, 2048-way networks
• Router provides multiple top switches
– Select which to use based on load
• Example:
– Traffic from A to B via routers 210 and 300 is blocked by traffic between 300 and 200.
– The router providing 300, 301, 302 and 303 can select a different path
15. Adaptive Routing on the Final Hop
• Multiple connections to a node
• Switch can select a free path
• Reduces end-point contention
• Simple case is not optimal
• Spreading the connections
– Improves fault tolerance
– Reduces network contention
• Routing decision is made higher in the network
16. Adaptive routing in the presence of errors
• In a production system with 1000s of links it is not uncommon for a small number to be broken until the next maintenance slot
• Adaptive routing minimises the impact
• Example:
– Link between routers 10 and 20 is broken
– Router 10 dynamically selects paths via 21, 22, 23, spreading the load.
– Reverse case: avoid sending to 10 via 20. Reset 20’s links or update switches 11, 12, 13.
17. Small Packet Support
• Aim to get as close to line rate as possible with small packets
• For example:
– Small put
– 32 byte packet
• Adapter has multiple packet engines
• Adapters support up to 64 outstanding packets per link
– Doubles if we use both links
• Switches provide 32 virtual channels per output link
• Prioritisation – buffering on input to the router
18. Barrier & Broadcast Support
• Switches broadcast over a range of output links
• Combine Acks / Nacks
• Contiguous in QsNetII
• Sparse in QsNetIII
• Barrier implementation
– Network conditional
– Broadcast release
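One way to picture a barrier built from a network conditional and a broadcast release is sketched below. It is a software model only, under the assumption that each node answers the conditional with an Ack once it has reached the barrier and that the switches combine the answers so a single Nack fails the round. The function name and the dictionary of flags are hypothetical.

```python
def barrier_round(arrived: dict, group: set) -> bool:
    """Run one network-conditional test over a (possibly sparse) group of nodes."""
    # Each node in the group answers Ack (True) if it has reached the barrier.
    responses = [arrived[node] for node in sorted(group)]
    # Combining Acks / Nacks: the round succeeds only if every response is an Ack.
    all_acked = all(responses)
    if all_acked:
        # Broadcast release: every node in the group is released and its flag reset.
        for node in group:
            arrived[node] = False
    return all_acked


arrived = {0: True, 1: False, 2: True, 5: True}
print(barrier_round(arrived, {0, 2, 5}))  # sparse group, all arrived
print(barrier_round(arrived, {0, 1}))     # node 1 has not arrived
```

The sparse group {0, 2, 5} corresponds to the QsNetIII case; a contiguous range would model QsNetII.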
19. Elan5 – Device Overview
• 2 × QsNetIII links (CX4)
– 20Gbit/s/direction after protocol
• PCIe, PCIe2 host interface
• Multiple packet engines
• 512KB of high bandwidth on-chip local memory
• SDRAM interface to optional local memory
• Buffer manager, object cache
• Details in ISC Dresden paper
[Block diagram: Elan5 adapter. Labels: two CX4/QsNetIII links; seven packet engines, each with 16K inst cache and 9K data buffers; fabric; x8; bridge; host I/F; local memory; local functions; object cache tags; TLB; buffer manager; external cache; cmd launch; SDRAM i/f; ext i/f; free list; PCIe; 16K x 8 x 8 banks = 1MB ECC RAM; PLL; SERDES; external EEPROM; clocks; DDRII; 16 lanes]
20. Elite5 – Device Overview
• 64 × 32 crosspoint router
– Direct & buffered input from each link
– 8K of input buffering per link
• 32 virtual channels per link
• Physical layer DDR XAUI (6.25GHz)
• Adaptive routing
• Hardware barrier and broadcast
• Memory mapped stats & error counters accessed out-of-band
22. QsNetIII Implementation
• Node switch chassis
– 128 links down to the nodes
– 128 links up to the top switches
– Backplane connects 2 sets of cards
• Top switches
– 256 links down to the node switches
– Range of system sizes:
Ports  Radix  Per Chassis
512    4      64
1024   8      32
2048   16     16
4096   32     8
[Figures: QsNetIII switch logical design; QsNetIII switch implementation]
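The rows of the system-size table are consistent with two simple relations: ports = radix × 128 (the number of down links on a node switch chassis) and top switches per chassis = 256 / radix (the number of links on a top switch chassis divided by the links each top switch needs). The slide does not state these formulas, so the sketch below is an inference from the numbers shown; the constant and function names are hypothetical.

```python
NODE_SWITCH_DOWN_LINKS = 128  # links down to the nodes per node switch chassis
TOP_CHASSIS_LINKS = 256       # links down to the node switches per top switch chassis


def system_size(radix: int):
    """Return (ports, top switches per chassis) for a given top switch radix."""
    ports = radix * NODE_SWITCH_DOWN_LINKS
    per_chassis = TOP_CHASSIS_LINKS // radix
    return ports, per_chassis


for radix in (4, 8, 16, 32):
    ports, per_chassis = system_size(radix)
    print(f"{ports:5d} {radix:3d} {per_chassis:3d}")
```

Each printed row reproduces one row of the table, in the same column order.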
23. QsNetIII Network 1024-way
• Fat tree, constructed from 8 × 128-way node switches connected by 128 × 8-way top switches
24. QsNetIII Implementation – Cables
• QSFP connectors throughout
• Copper cables (e.g. Gore) 1-10m
• Active copper cables (e.g. Gore), 8-20m
• Optical cables (e.g. Luxtera), 5-300m
– PVDF Plenum rated
– LSZH available as an option
• No longer Quadrics proprietary
• Likely usage:
– Short copper cables from nodes
– Optical cables between switches
25. QsNetIII Fault Tolerance
• All of the QsNetII Features
– CRCs on every packet
– Automatic retransmission
– Redundant routes
– Adaptive routing avoids failed links
– Redundant, hot-pluggable PSUs and fans
• Plus line rate testing of each link as it comes up
– Switches generate CRPAT, CJPAT or PRBS packets
– Links are only added to the route tables when they (a) are up, (b) connect to the right place, and (c) can transfer data at full line rate without error.
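The three admission conditions can be read as a single predicate applied to each link before it enters the route tables. The sketch below is illustrative only: the `LinkTest` record, its field names and the idea of expressing throughput as a fraction of line rate are assumptions, not the switch firmware's interface.

```python
from dataclasses import dataclass


@dataclass
class LinkTest:
    is_up: bool
    peer: str            # device found at the far end of the link
    expected_peer: str   # device the topology says should be there
    error_count: int     # errors seen while sending test-pattern packets
    rate_fraction: float # measured throughput as a fraction of line rate


def qualifies(test: LinkTest) -> bool:
    """True when the link is (a) up, (b) wired correctly, (c) error free at line rate."""
    return (
        test.is_up
        and test.peer == test.expected_peer
        and test.error_count == 0
        and test.rate_fraction >= 1.0
    )


def usable_links(results: dict) -> list:
    """Names of the links that may be added to the route tables."""
    return sorted(name for name, test in results.items() if qualifies(test))


results = {
    "link0": LinkTest(True, "top3", "top3", 0, 1.0),
    "link1": LinkTest(True, "top7", "top4", 0, 1.0),  # miscabled
    "link2": LinkTest(True, "top5", "top5", 2, 1.0),  # errors under load
}
print(usable_links(results))
```

Only `link0` passes all three checks, so the other two stay out of the route tables until they are fixed.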
26. QsNetIII Implementation – HP BladeSystem
• Elan5 mezzanine adapter
– 2 QsNet links, PCI-E x8 Gen2
– 128 MB of memory
• Elite5 switch module
– Full bandwidth
– 16 links to the blades (via backplane)
– 16 links to back of the module
27. Current Status
• Elite5 silicon in Bristol
• Elan5 at TSMC, first parts expected in 3-4 weeks
• Switch PCBs, chassis, backplane, controllers are working
• First adapter PCBs are ready
– PCI-Express x16, HP Blade, ExpressModule (Sun Blade)
• We are porting the QsNetII software
• Components at SC08 in Austin
• First customer shipment in Q1 of 2009
28. Future Work
• QsNetIII hardware
– Low cost 32-way switch
– 1024-way single chassis switch
• QsNetIII Software
– General framework for optimised collectives
– Support for “multiport” networks: “fat” nodes have multiple connections to the same rail
– Ethernet firmware for the network adapter
29. Conclusions
• Adaptive routing underwrites the scalability of HPC systems designed to run a single large application
• Adaptive routing has been a feature of QsNet systems since 2000
• QsNetIII offers significant enhancements over both QsNetII and competing products
32. Packet Format
• Packet size of up to 4K, made up of a 256 byte packet segment and continuations; 8 byte ACK
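As a worked example of the sizes quoted here, the sketch below counts how many 256 byte segments a packet needs. It assumes that "4K" means 4096 bytes, that continuations are also 256 bytes, and that header overhead can be ignored; none of this is stated on the slide, and the names are hypothetical.

```python
MAX_PACKET_BYTES = 4096  # assumed meaning of "4K"
SEGMENT_BYTES = 256      # first segment and each continuation (assumed equal size)


def segment_count(payload_bytes: int) -> int:
    """Number of 256 byte segments needed for one packet, ignoring headers."""
    if not 0 < payload_bytes <= MAX_PACKET_BYTES:
        raise ValueError("payload must be between 1 and 4096 bytes")
    return -(-payload_bytes // SEGMENT_BYTES)  # ceiling division


print(segment_count(32))    # a small put fits in the first segment
print(segment_count(4096))  # a full packet: first segment plus 15 continuations
```

Under these assumptions a maximum-size packet is 16 segments long.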
33. Impact of static routing on latency
Data from Thunderbird cluster, Sandia National Lab
Big increases in worst case latency with number of nodes
34. Impact of static routing on latency
Data from Thunderbird cluster, Sandia National Lab
Big variation in worst case latency across a large job
35. Software Model – Firmware & Drivers
• Base firmware in the ROMs
• Firmware modules loadable with the device driver
– Elan, OpenFabrics, 10GE Ethernet, …
• Kernel modules
– elan5, elan, rms
• Device dependent library (libelan5)
• Device independent library (libelan)
• User libraries
36. Software Model – Elan Libraries
• Point-to-point message passing
• One-sided put/get
• Transparent rail striping
• Optimised collectives
• Locks and atomic ops
• Global memory allocation
37. QsNetIII Performance Summary
• Similar latencies to QsNetII
– The 1.3 to 2 microseconds of latency is mostly in the host PCI and memory system
• Higher issue rates
– Improved link utilisation on small transfers
• Higher bandwidths
– 1.5 to 2.25 GB/sec/link depending on host interface
• Bi-directional host interface
– 2 x improvement over QsNetII
• Broadcast and barrier in hardware
• Continued development of adaptive routing underwrites scaling to high node counts