Part I: Parallel Computer System Architectures
NHPCC (Hefei) · USTC · China · glchen@ustc.edu.cn

PVP: Parallel Vector Processors
+ MIMD, UMA, large grain.
+ A small number of powerful custom-designed vector processors (VP), each ≥ 1 Gflop/s.
+ A custom-designed high-bandwidth crossbar switch.
+ A number of shared-memory (SM) modules.
+ A large number of vector registers and an instruction buffer, normally without caches.
+ Examples: Cray C-90/T-90, NEC SX-4, Galaxy-1, etc.
+ Typical structure (figure): several VPs connected through the crossbar switch to several SM modules.

SMP: Symmetric Multiprocessors
+ MIMD, UMA, medium grain, higher DOP (Degree of Parallelism).
+ Commodity microprocessors with on-chip and off-chip caches.
+ A high-speed snoopy bus or crossbar switch.
+ Central shared memory.
+ Symmetric: each processor has equal access to SM (shared memory), I/O, and OS services.
+ Unscalable due to the shared memory and the bus.
+ Examples: SGI Power Challenge, DEC Alpha server 8400, Dawning-1, etc.
+ Typical structure (figure): several processor/cache (P/C) units connected through a bus or crossbar switch to the SM modules and I/O.

Comparison of Five Commercial SMP Systems

| System characteristics | DEC AlphaServer 8400 5/440 | HP 9000/T600 | IBM RS6000/R40 | Sun Ultra Enterprise 6000 | SGI Power Challenge XL |
|---|---|---|---|---|---|
| No. of processors | 12 | 12 | (illegible) | 30 | 36 |
| Processor type | 437 MHz Alpha 21164 | 180 MHz PA 8000 | 112 MHz PowerPC 604 | 167 MHz UltraSPARC | 195 MHz MIPS R10000 |
| Off-chip cache per processor | 4 MB | (illegible) | 1 MB | 512 KB | 4 MB |
| Max memory | 28 GB | (illegible) | 2 GB | 30 GB | 16 GB |
| Interconnect | Bus | Bus | Bus + Xbar | Bus + Xbar | Bus |
| Interconnect bandwidth | 2.1 GB/s | 960 MB/s | 1.8 GB/s | 2.6 GB/s | 1.2 GB/s |
| Internal disk | 192 GB | 168 GB | 38 GB | 63 GB | 144 GB |
| I/O channels | 12 PCI buses, each 133 MB/s | N/A | 2 MCA, each 160 MB/s | 30 Sbus, each 200 MB/s | 6 Power Channel-2 HIO, each 320 MB/s |
| I/O slots | 144 PCI slots | 112 HP-PB slots | 15 MCA slots | 45 Sbus slots | 12 HIO slots |
| I/O bandwidth | 1.2 GB/s | 1 GB/s | 320 MB/s | 2.6 GB/s | 320 MB/s per HIO slot |
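
The "unscalable due to SM and bus" point can be made concrete with a rough calculation: when every processor contends for the same interconnect, the bandwidth left for each one shrinks as processors are added. The following is a minimal sketch, assuming the interconnect bandwidth is divided evenly among the maximum number of processors and that 1 GB/s equals 1000 MB/s. Both are simplifications for illustration (caches and the crossbar part of the Bus + Xbar designs are ignored), and the names `SYSTEMS` and `per_processor_share` are hypothetical. The IBM system is left out because its processor count does not appear in the table as reproduced here.

```python
# Rough per-processor share of interconnect bandwidth for four of the SMP
# systems in the table. Even division is an assumption for illustration only.

SYSTEMS = {
    # name: (max processors, interconnect bandwidth in MB/s)
    "DEC AlphaServer 8400": (12, 2100),
    "HP 9000/T600": (12, 960),
    "Sun Ultra Enterprise 6000": (30, 2600),
    "SGI Power Challenge": (36, 1200),
}


def per_processor_share(processors, bandwidth_mb_s):
    """Return the bandwidth each processor gets if it is split evenly."""
    return bandwidth_mb_s / processors


if __name__ == "__main__":
    for name, (n, bw) in SYSTEMS.items():
        share = per_processor_share(n, bw)
        print(f"{name}: {share:.1f} MB/s per processor")
```

Under these assumptions the SGI system, with the most processors on a 1.2 GB/s bus, leaves each processor roughly 33 MB/s, while the 12-processor DEC system leaves 175 MB/s.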

MPP: Massively Parallel Processors
+ MIMD, NUMA, medium/large grain.
+ A large number of commodity microprocessors.
+ A custom-designed high-bandwidth, low-latency communication network.
+ Physically distributed memory (shared or not).
+ May or may not have local disk.
+ Synchronized through blocking message-passing operations.
+ Examples: Intel Paragon, IBM SP2, Dawning-1000, etc.
+ Typical structure (figure): nodes, each with a processor/cache (P/C), local memory (LM), and a network interface (NIC) on a memory bus (MB), connected by the custom-designed network.
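
Synchronization through blocking message passing can be illustrated on a single machine. The sketch below is not MPP code: it assumes Python threads as stand-ins for nodes and `queue.Queue` objects as stand-ins for network links, and the names `node` and `run_pair` are hypothetical. On a real MPP the same pattern would be written with a message-passing library. Each simulated node keeps its data private and can only learn its partner's value by receiving a message, and the blocking receive is what synchronizes the two.

```python
import queue
import threading


def node(rank, inbox, outbox, results):
    """One simulated node: private local data, communication only by messages."""
    local_value = (rank + 1) * 10   # stands in for data held in local memory (LM)
    outbox.put(local_value)         # send the value to the partner node
    partner_value = inbox.get()     # blocking receive: waits until the partner sends
    results[rank] = local_value + partner_value


def run_pair():
    """Run two nodes that exchange one value each and return their results."""
    # One mailbox per direction; maxsize=1 bounds each link to a single message.
    to_node0 = queue.Queue(maxsize=1)
    to_node1 = queue.Queue(maxsize=1)
    results = {}
    threads = [
        threading.Thread(target=node, args=(0, to_node0, to_node1, results)),
        threading.Thread(target=node, args=(1, to_node1, to_node0, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pair())
```

Neither node can finish before its partner has sent, so the exchange doubles as a synchronization point without any shared variable holding the data being exchanged.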

Comparison of Three MPP Systems

| MPP models | Intel/Sandia ASCI Option Red | IBM SP2 | SGI/Cray Origin2000 |
|---|---|---|---|
| A large sample configuration | 9072 processors, 1.8 Tflop/s at SNL | 400 processors, 100 Gflop/s at MHPCC | 128 processors, 51 Gflop/s at NCSA |
| Available date | December 1996 | September 1994 | October 1996 |
| Processor type | 200 MHz, 200 Mflop/s Pentium Pro | 67 MHz, 267 Mflop/s POWER2 | 200 MHz, 400 Mflop/s MIPS R10000 |
| Node architecture and data storage | 2 processors, 32 to 256 MB of memory, shared disk | 1 processor, 64 MB to 2 GB local memory, 1 to 4.5 GB local disk | 2 processors, 64 MB to 256 GB of DSM and shared disk |
| Interconnect and memory model | Split 2D mesh, NORMA | Multistage network, NORMA | Fat hypercube, CC-NUMA |
| Node operating system | Light-weighted kernel (LWK) | Complete AIX (IBM Unix) | Microkernel Cellular IRIX |
| Native programming mechanism | MPI based on PUMA Portals | MPI and PVM | Power C, Power Fortran |
| Other programming models | Nx, PVM, HPF | HPF, Linda | MPI, PVM |
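
The sample-configuration figures in the table can be cross-checked against the per-processor ratings: aggregate peak performance is simply the number of processors times the peak rate of each. A minimal sketch of that arithmetic follows, assuming 1 Gflop/s equals 1000 Mflop/s; the names `CONFIGS` and `aggregate_peak_gflops` are hypothetical.

```python
# (processors, per-processor peak in Mflop/s) as listed in the table
CONFIGS = {
    "Intel/Sandia ASCI Option Red": (9072, 200),
    "IBM SP2": (400, 267),
    "SGI/Cray Origin2000": (128, 400),
}


def aggregate_peak_gflops(processors, mflops_each):
    """Aggregate peak in Gflop/s: processor count times per-processor peak."""
    return processors * mflops_each / 1000.0


if __name__ == "__main__":
    for name, (n, rate) in CONFIGS.items():
        print(f"{name}: {aggregate_peak_gflops(n, rate):.1f} Gflop/s")
```

The products come to about 1814 Gflop/s (1.8 Tflop/s), 106.8 Gflop/s, and 51.2 Gflop/s. The first and third match the table; the SP2 product is slightly above the quoted 100 Gflop/s, which may simply be a rounded figure.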