Memory Overview

thetis

Presentation Transcript


  1. Memory Overview: Chapter 7.5, 7.7, 7.8

  2. Outline • Smart memory allocation • Memory overview • Blocking

  3. Memory Overview • Cache configurations • Cache miss categorization • Pitfalls

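  4. Where can a block be placed?
     Scheme            | # of sets                                  | Blocks/set
     Direct-Mapped     | Number of blocks in cache                  | 1
     Set associative   | Number of blocks in cache / associativity  | Associativity
     Fully associative | 1                                          | Number of blocks in cache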

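  5. How is a block found?
     Scheme                 | Location method                        | # parallel comparisons
     Direct-Mapped          | Index                                  | 1
     Set associative        | Index the set, then search its blocks  | Degree of associativity
     Fully associative      | Search all cache entries               | Number of blocks in cache
     VM (fully associative) | Separate lookup table (page table)     | 0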
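The placement and lookup rules from the two tables above can be sketched in a few lines of C++. This is only an illustration: `CacheConfig`, `NumSets`, `SetIndex`, and `ParallelComparisons` are hypothetical names that do not appear in the slides, and the sketch assumes the number of blocks is a multiple of the associativity.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a cache; the names are illustrative only.
struct CacheConfig {
    std::uint32_t numBlocks;      // total number of blocks in the cache
    std::uint32_t associativity;  // blocks per set (1 = direct-mapped)
};

// Number of sets = blocks in cache / associativity.
std::uint32_t NumSets(const CacheConfig &c) {
    assert(c.associativity > 0 && c.numBlocks % c.associativity == 0);
    return c.numBlocks / c.associativity;
}

// The one set in which a given memory block may be placed.
std::uint32_t SetIndex(const CacheConfig &c, std::uint32_t blockAddress) {
    return blockAddress % NumSets(c);
}

// Tags compared in parallel on a lookup: one per block in the selected set.
std::uint32_t ParallelComparisons(const CacheConfig &c) {
    return c.associativity;
}
```

For an 8-block cache and memory block 12, `SetIndex` gives set 4 when direct-mapped (`{8, 1}`), set 0 when two-way set associative (`{8, 2}`), and set 0, the only set, when fully associative (`{8, 8}`). The comparison counts are 1, 2, and 8 respectively.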

  6. Pitfalls • Forgetting about word-addressing vs byte-addressing • Using miss rate as the only metric for evaluating a cache • Ignoring the memory system when writing code.
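One way to see why miss rate alone can mislead is to compare average memory access time (AMAT = hit time + miss rate × miss penalty). The helper below is a minimal sketch; the function name and the cycle counts in the usage note are made-up assumptions, not figures from the slides.

```cpp
#include <cassert>

// Average memory access time, in cycles:
//   AMAT = hit time + miss rate * miss penalty
// missRate is a fraction between 0 and 1.
double AverageMemoryAccessTime(double hitTime, double missRate, double missPenalty) {
    assert(missRate >= 0.0 && missRate <= 1.0);
    return hitTime + missRate * missPenalty;
}
```

With a hypothetical 100-cycle miss penalty, a cache with a 1-cycle hit time and a 5% miss rate averages about 6 cycles per access. A cache with a 4-cycle hit time and a 3% miss rate averages about 7 cycles. The second cache has the lower miss rate but is slower overall.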

  7. Linked List
     class Link {
       friend class LinkedList;  // LinkedList reads and writes next directly
     private:
       Link *next;
       void *data;
     public:
       Link() { next = NULL; data = NULL; }
       inline void SetData(void *d) { data = d; }
       inline void *GetData() { return data; }
       inline void SetNext(Link *n) { next = n; }
       inline Link *GetNext() { return next; }
     };

  8. Linked List
     class LinkedList {
     private:
       Link *head;
       Link *tail;
       Link *freelist;
       Link *NewLink();
       void DeleteLink(Link *);
     public:
       LinkedList();
       ~LinkedList();
       void InsertHead(void *data);
       void InsertTail(void *data);
       void *RemoveHead();
       int Find(void *d);
       // … whatever other functions you want
     };

  9. Data structures with links • Links are often accessed in sequence • If allocated separately, they may end up far apart in memory • Allocate many links at the same time in the same space • Never deallocate a link; just put it on a free list.

  10. Linked List
      inline Link *LinkedList::NewLink() {
        // if there are free links, remove and return the head
        if (freelist != NULL) {
          Link *retlink = freelist;
          freelist = freelist->next;
          return retlink;
        } else {
          // allocate a new array of links: return the first,
          // and put the remaining 99 on the free list
          int I;
          Link *retlink = new Link[100];
          freelist = retlink + 1;
          // link the array together as a list; the last link's
          // next is already NULL from the constructor
          for (I = 0; I < 98; I++)
            freelist[I].next = &(freelist[I+1]);
          return retlink;
        }
      }

  11. Linked List
      inline void LinkedList::DeleteLink(Link *link) {
        // rather than calling delete,
        // we add this link to the free list at the head
        // so that we reuse it first
        link->next = freelist;
        freelist = link;
      }

  12. Savings • Avoids repeated calls to new and delete • Reuses links • Places links together in memory
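The free-list idea can also be packaged as a small standalone pool. The sketch below is one possible arrangement, not the slides' implementation: `LinkPool`, `kChunk`, and `Refill` are hypothetical names, `Link` is reduced to a plain struct, and the pool keeps ownership of each chunk so the memory is released when the pool itself is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical node type, reduced from the slides' Link class.
struct Link {
    Link *next = nullptr;
    void *data = nullptr;
};

// Pool sketch: hands out Links from contiguous chunks and recycles released
// nodes through a free list instead of calling delete on each one.
class LinkPool {
public:
    static constexpr std::size_t kChunk = 100;  // assumed chunk size

    Link *NewLink() {
        if (freelist == nullptr) Refill();
        Link *link = freelist;       // pop the head of the free list
        freelist = freelist->next;
        link->next = nullptr;
        return link;
    }

    void DeleteLink(Link *link) {
        assert(link != nullptr);
        link->data = nullptr;
        link->next = freelist;       // push on the head so it is reused first
        freelist = link;
    }

private:
    void Refill() {
        // One allocation yields kChunk nodes that sit next to each other.
        chunks.push_back(std::make_unique<Link[]>(kChunk));
        Link *chunk = chunks.back().get();
        for (std::size_t i = 0; i + 1 < kChunk; ++i)
            chunk[i].next = &chunk[i + 1];   // chain node i to node i + 1
        // The last node's next is already nullptr and ends the list.
        freelist = chunk;
    }

    std::vector<std::unique_ptr<Link[]>> chunks;  // owns every chunk
    Link *freelist = nullptr;
};
```

Two consecutive calls to `NewLink()` on a fresh pool return adjacent nodes from the same chunk. A node passed to `DeleteLink()` is the next one handed out, which keeps recently used memory in play.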

  13. Blocking: multiply a matrix, X = Y * Z, where $X_{I,J} = \sum_{K=0}^{n-1} Y_{I,K} \, Z_{K,J}$

  14. Blocking: multiply a matrix, X = Y * Z
      for (I = 0; I < n; I++)
        for (J = 0; J < n; J++)
          for (K = 0; K < n; K++)
            X[I][J] = X[I][J] + Y[I][K] * Z[K][J];

  15. Blocking: multiply a matrix, X = Y * Z
      for (I = 0; I < n; I++)
        for (J = 0; J < n; J++)
          for (K = 0; K < n; K++)
            X[I][J] = X[I][J] + Y[I][K] * Z[K][J];
      If Y and Z do not both fit in the cache at once, the miss rate is high.

  16. Blocking: multiply a matrix, X = Y * Z. The outermost loops step over the BxB blocks of Z. For each block, do all combinations of X and Y that use that block of Z, then move on.

  19. Blocking: multiply a matrix, X = Y * Z
      for (jj = 0; jj < N; jj += B)
        for (kk = 0; kk < N; kk += B)   // for each BxB block of Z

  20. Blocking: multiply a matrix, X = Y * Z
      for (jj = 0; jj < N; jj += B)
        for (kk = 0; kk < N; kk += B)       // for each BxB block of Z
          for (I = 0; I < N; I++)           // for each row of X & Y
            for (J = jj; J < jj + B; J++)   // for each element in X row

  21. Blocking: multiply a matrix, X = Y * Z
      // assumes N is a multiple of B and X starts out as all zeros
      for (jj = 0; jj < N; jj += B)
        for (kk = 0; kk < N; kk += B)       // for each BxB block of Z
          for (I = 0; I < N; I++)           // for each row of X & Y
            for (J = jj; J < jj + B; J++)   // for each element in X row
            {
              tmp = 0;
              for (K = kk; K < kk + B; K++) // for each Y row, Z col
                tmp += Y[I][K] * Z[K][J];
              X[I][J] += tmp;
            }
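The loop nest above can be wrapped into a self-contained function. The version below is a sketch under a few assumptions that are not in the slides: matrices are stored as `std::vector<std::vector<double>>`, the names `Matrix` and `BlockedMultiply` are hypothetical, and `std::min` clamps the block edges so that N does not have to be a multiple of B.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Blocked multiply: accumulates Y * Z into X for square n x n matrices,
// visiting Z one b x b block at a time. The caller must zero X beforehand
// to obtain X = Y * Z.
void BlockedMultiply(Matrix &X, const Matrix &Y, const Matrix &Z, std::size_t b) {
    const std::size_t n = X.size();
    assert(b > 0 && Y.size() == n && Z.size() == n);
    for (std::size_t jj = 0; jj < n; jj += b)
        for (std::size_t kk = 0; kk < n; kk += b)        // each b x b block of Z
            for (std::size_t i = 0; i < n; ++i)          // each row of X and Y
                for (std::size_t j = jj; j < std::min(jj + b, n); ++j) {
                    double tmp = 0.0;
                    for (std::size_t k = kk; k < std::min(kk + b, n); ++k)
                        tmp += Y[i][k] * Z[k][j];        // partial dot product
                    X[i][j] += tmp;                      // add this block's share
                }
}
```

Each element of X receives one partial sum per block of Z along K, so the result matches the unblocked triple loop. The block size would normally be chosen so that a b x b block of Z, together with the rows of X and Y being touched, fits in the cache.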
