Computer Architecture

Caching Basics: Exploit Spatial Locality

• Idea: Store addresses adjacent to the recently accessed one in automatically managed fast memory
  – Logically divide memory into equal size blocks
  – Fetch to cache the accessed block in its entirety
• Anticipation: nearby data will be accessed soon
• Spatial locality principle
  – Nearby data in memory will be accessed in the near future
    • E.g., sequential instruction access, array traversal
  – This is what IBM 360/85 implemented
    • 16 Kbyte cache with 64 byte blocks
    • Liptay, "Structural aspects of the System/360 Model 85 II: the cache," IBM Systems Journal, 1968.
The Bookshelf Analogy

• Book in your hand
• Desk
• Bookshelf
• Boxes at home
• Boxes in storage
• Recently-used books tend to stay on desk
  – Comp Arch books, books for classes you are currently taking
  – Until the desk gets full
• Adjacent books in the shelf needed around the same time
  – If I have organized/categorized my books well in the shelf
Caching in a Pipelined Design

• The cache needs to be tightly integrated into the pipeline
  – Ideally, access in 1-cycle so that dependent operations do not stall
• High frequency pipeline → Cannot make the cache large
  – But, we want a large cache AND a pipelined design
• Idea: Cache hierarchy

[Figure: CPU and register file (RF) backed by a Level 1 cache, a Level 2 cache, and main memory (DRAM)]
A Note on Manual vs. Automatic Management

• Manual: Programmer manages data movement across levels
  -- too painful for programmers on substantial programs
    o "core" vs "drum" memory in the 50's
    o still done in some embedded processors (on-chip scratch pad SRAM in lieu of a cache)
• Automatic: Hardware manages data movement across levels, transparently to the programmer
  ++ programmer's life is easier
    o simple heuristic: keep most recently used items in cache
    o the average programmer doesn't need to know about it
    o You don't need to know how big the cache is and how it works to write a "correct" program! (What if you want a "fast" program?)
Automatic Management in Memory Hierarchy

• Wilkes, "Slave Memories and Dynamic Storage Allocation," IEEE Trans. On Electronic Computers, 1965.
  – Summary: "The use is discussed of a fast core memory of, say, 32000 words as a slave to a slower core memory of, say, one million words in such a way that in practical cases the effective access time is nearer that of the fast memory than that of the slow memory."
• "By a slave memory I mean one which automatically accumulates to itself words that come from a slower main memory, and keeps them available for subsequent use without it being necessary for the penalty of main memory access to be incurred again."