Part I: Parallel Computer System Architectures

ASCI Option-Red System

[Figure: block diagram of the ASCI Option-Red system. Labeled partitions: I/O Nodes, Service Nodes, Computation Nodes, I/O Nodes, System Nodes. Other labels: Boot Node, Boot RAID, Operator Station, PCI Nodes, Ethernet, ATM, HiPPI, Disks, Tapes.]

NHPCC (Hefei) • USTC • CHINA    glchen@ustc.edu.cn
High-Performance CPU Chips for MPP

| Attribute | Pentium Pro | PowerPC 620 | Alpha 21164A | UltraSPARC II | MIPS R10000 |
|---|---|---|---|---|---|
| Technology | BiCMOS | CMOS | CMOS | CMOS | CMOS |
| Transistors | 5.5M/15.5M | 7M | 9.6M | 5.4M | 6.8M |
| Clock rate | 150 MHz | 133 MHz | 417 MHz | 200 MHz | 200 MHz |
| Voltage | 2.9 V | 3.3 V | 2.2 V | 2.5 V | 3.3 V |
| Power | 20 W | 30 W | ? | ? | 30 W |
| Word length | 32 bits | 64 bits | 64 bits | 64 bits | 64 bits |
| I/D cache | 8 KB/8 KB | 32 KB/32 KB | 8 KB/8 KB | 16 KB/16 KB | 32 KB/32 KB |
| L2 cache | 256 KB (multi-chip module) | 1-128 MB (off-chip) | 96 KB (on-chip) | 16 MB (off-chip) | 16 MB (off-chip) |
| Execution units | 5 units | 6 units | ? | ? | ? |
| Superscalar | ? | 4 way | 4 way | 4 way | 4 way |
| Pipeline depth | 14 stages | 4-8 stages | 7-9 stages | 9 stages | 5-7 stages |
| SPECint92 | 366 | 225 | 500 | 350 | 300 |
| SPECfp92 | 283 | 300 | >750 | ? | 600 |
| SPECint95 | 8.09 | ? | 11 | N/A | 7.4 |
| SPECfp95 | 6.70 | ? | >17 | N/A | 15 |
| Other features | CISC/RISC hybrid, 2-level speculative | Short pipelines, large L1 caches | Highest clock rate and density with on-chip L2 cache | Multimedia and graphic instructions | MP cluster bus supports up to … |
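The table mixes chips with very different clock rates, so one simple way to compare them is integer performance per clock cycle. The sketch below divides each SPECint92 score by the clock rate in MHz, using only the two rows that are fully legible in the table. The name of the fourth chip (UltraSPARC II), the Alpha score of 500 (possibly a lower bound), and the helper name `specint92_per_mhz` are assumptions made for illustration.

```python
# Clock rate (MHz) and SPECint92 score, as read from the table above.
# "UltraSPARC II" is an assumed name for the fourth column.
chips = {
    "Pentium Pro": (150, 366),
    "PowerPC 620": (133, 225),
    "Alpha 21164A": (417, 500),
    "UltraSPARC II": (200, 350),
    "MIPS R10000": (200, 300),
}


def specint92_per_mhz(table):
    """Return SPECint92 divided by clock rate (MHz) for each chip."""
    return {name: score / mhz for name, (mhz, score) in table.items()}


ratios = specint92_per_mhz(chips)

# Print chips from highest to lowest SPECint92 per MHz.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:14s} {ratio:.2f}")
```

On these figures the chip with the highest clock rate (Alpha 21164A) has the lowest score per MHz, and the Pentium Pro has the highest. The ratio says nothing about floating-point performance or memory behavior.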
Microprocessor Families and Representative CPU Chips

Microprocessors
  General Purpose
    CISC
      Intel x86 Series: 86, 286, 386, 486, Pentium, Pentium Pro
      Motorola Series: M 68x0 and 680x0
      Digital: VAX (VLSI version)
    RISC
      Digital Alpha Series: 21064, 21164, 21264
      MIPS Series: R2000, R3000, R4000, R5000, R8000, R10000
      HP/PA-RISC Series: PA 7300 and PA 8000
      Sun SPARC Series: SPARC, MicroSPARC, SuperSPARC and UltraSPARC
      PowerPC Series: 601, 603, 604e, 620, 630
  Embedded
    DSP Chips: Digital SA-110, Motorola 68EC040
    Microcontrollers: Intel i960, IBM PowerPC 403GA
    Media Processors: Hitachi SuperH, NEC R4300
DSM: Distributed Shared-Memory

+ MIMD, NUMA, NORMA, large grain.
+ Memory is physically distributed, but system hardware and software support a single address space for application users.
+ A cache directory (DIR) is used to support distributed coherent caches.
+ A custom-designed communication network.
+ Shared-memory programming style.
+ Examples: Stanford DASH, Cray T3D, etc.
+ Typical structure:

[Figure: two nodes, each containing P/C, LM, DIR, NIC, and MB, connected through a custom-designed network.]
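The two central ideas of this slide, a single address space over physically distributed memory and a directory that keeps caches coherent, can be illustrated with a toy model. The sketch below is a deliberately simplified write-through, invalidate-on-write scheme. It is not the protocol of Stanford DASH or of any real machine, and the names `DSMSystem`, `home`, `read`, and `write` are hypothetical.

```python
class DSMSystem:
    """Toy DSM: each node owns one slice of a single global address space."""

    def __init__(self, num_nodes, words_per_node):
        self.words_per_node = words_per_node
        # LM: one local memory per node.
        self.local_memory = [[0] * words_per_node for _ in range(num_nodes)]
        # P/C: one private cache per node, mapping global address -> value.
        self.caches = [{} for _ in range(num_nodes)]
        # DIR: per home node, global address -> set of nodes caching it.
        self.directory = [{} for _ in range(num_nodes)]

    def home(self, addr):
        """Map a global address to (home node, offset in its local memory)."""
        return addr // self.words_per_node, addr % self.words_per_node

    def read(self, node, addr):
        if addr in self.caches[node]:          # cache hit
            return self.caches[node][addr]
        home, offset = self.home(addr)         # miss: fetch from home node
        value = self.local_memory[home][offset]
        self.caches[node][addr] = value
        self.directory[home].setdefault(addr, set()).add(node)
        return value

    def write(self, node, addr, value):
        home, offset = self.home(addr)
        sharers = self.directory[home].setdefault(addr, set())
        for other in sharers - {node}:         # invalidate other cached copies
            self.caches[other].pop(addr, None)
        self.local_memory[home][offset] = value  # write through to home memory
        self.caches[node][addr] = value
        sharers.clear()
        sharers.add(node)


dsm = DSMSystem(num_nodes=2, words_per_node=4)
dsm.write(0, 5, 42)        # address 5 lives in node 1's local memory
print(dsm.read(1, 5))      # prints 42
dsm.write(1, 5, 7)         # invalidates node 0's cached copy
print(dsm.read(0, 5))      # prints 7
```

Callers use plain global addresses and never name the node that holds the data, which is the shared-memory programming style the slide refers to. The directory lookup on each write stands in for the messages a real machine would send over its custom network.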
COW: Cluster of Workstations

+ MIMD, NUMA, coarse grain.
+ Distributed memory.
+ Each node of a COW is a complete computer (SMP or PC), sometimes called a headless workstation.
+ A low-cost commodity network.
+ There is always a local disk.
+ A complete OS resides on each node, whereas on an MPP only a microkernel exists.
+ Examples: Berkeley NOW, Alpha Farm, FXCOW, etc.
+ Typical structure:

[Figure: two nodes, each containing P/C, M, MB, Bridge, IOB, LD, and NIC, connected through commercial networks (Ethernet, ATM, etc.).]