Comparing Intel's Core with AMD's K8 Microarchitecture
IS 3313, December 14th
Why is the Core Better at Prefetching and Caching?
Each core has 3 prefetchers (2 data, 1 instruction), and 2 more prefetchers serve the shared L2 cache. A dual-core Core 2 Duo CPU therefore has eight prefetchers active: 3 per core x 2 cores + 2 for the shared L2 cache = 8.
The K8 (Athlon 64) can only move loads ahead of independent ALU operations (ADD etc.). As a result, loads cannot be moved far enough ahead to minimize the effect of a cache miss, and other loads cannot be used to keep the CPU busy while a load has to wait for a store to finish. The K8 does have some load/store reordering, but it happens much later in the pipeline and is less flexible than in the Core architecture.
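The reordering limit described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not a description of either pipeline: the function name hoist_limit, the instruction labels "alu", "store" and "load", and the can_pass_stores switch are all hypothetical, and every ALU operation is assumed to be independent of the load.

```python
def hoist_limit(program, load_index, can_pass_stores=False):
    """Return the earliest slot a load could be moved up to (toy model).

    program         -- list of labels: "alu", "store" or "load" (hypothetical)
    load_index      -- position of the load we would like to issue earlier
    can_pass_stores -- False models the K8-style rule from the text (a load
                       only moves ahead of independent ALU operations);
                       True models a design that lets loads pass stores
    """
    slot = load_index
    while slot > 0:
        previous = program[slot - 1]
        if previous == "alu":
            slot -= 1  # assumed independent, so the load may pass it
        elif previous == "store" and can_pass_stores:
            slot -= 1  # only allowed when loads may move ahead of stores
        else:
            break      # anything else blocks further hoisting
    return slot


program = ["alu", "store", "alu", "alu", "load"]
print(hoist_limit(program, 4))                        # stops right behind the store
print(hoist_limit(program, 4, can_pass_stores=True))  # reaches the top of the list
```

In this model the store acts as a wall: the load moves past the two ALU operations and no further unless the switch is set, which is the flexibility gap the slide points at.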
Core's approach to determining whether a load and a store share the same address is called memory disambiguation. The P8 can therefore let loads move ahead of stores, which gives a big performance boost. Intel claims up to a 40% performance boost in some instances; however, a 10-20% increase in performance is possible using the fast L1 and L2 caches.
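One way to picture memory disambiguation is as a prediction made per load: if a load has not clashed with an earlier store recently, let it go ahead, and fall back to waiting after a clash. The sketch below is a toy model of that idea only. The class name, the saturating-counter scheme, the threshold of 3 and the trace format are assumptions for illustration, not Intel's actual implementation.

```python
class DisambiguationPredictor:
    """Toy per-load saturating counter (hypothetical, for illustration)."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counters = {}  # load address (PC) -> count of clash-free runs

    def may_hoist(self, load_pc):
        # Let the load pass the store only after enough clash-free runs.
        return self.counters.get(load_pc, 0) >= self.threshold

    def update(self, load_pc, aliased):
        if aliased:
            self.counters[load_pc] = 0  # a clash resets the confidence
        else:
            count = self.counters.get(load_pc, 0) + 1
            self.counters[load_pc] = min(self.threshold, count)


def run(trace, predictor):
    """Replay (load_pc, store_addr, load_addr) tuples; return (hoisted, replays)."""
    hoisted = 0
    replays = 0
    for load_pc, store_addr, load_addr in trace:
        aliased = store_addr == load_addr
        if predictor.may_hoist(load_pc):
            if aliased:
                replays += 1  # wrong guess: the load must be executed again
            else:
                hoisted += 1  # right guess: the load ran ahead of the store
        predictor.update(load_pc, aliased)
    return hoisted, replays


# Five clash-free runs, one clash, then four more clash-free runs.
trace = [(0x40, 0x1000, 0x2000)] * 5 + [(0x40, 0x3000, 0x3000)] + [(0x40, 0x1000, 0x2000)] * 4
print(run(trace, DisambiguationPredictor()))
```

The design choice worth noticing is the asymmetry: confidence builds slowly and is lost at once, so a load that really does depend on a store costs one replay and is then held back until it has proven itself again.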