Principles: Performance

  1. CS569 Selected Topics in Software Engineering, Spring 2012. Principles: Performance

  2. Key performance principles
     • Use indexes
     • Minimize traffic
     • Minimize locks
     • Parallelize

  3. GAE Indexes
     • An index is a list of pointers
     • Each pointer indicates the location of one entity
     • The list is sorted according to selected entity member variables
     • Multi-value queries are handled by binary search + direct scan (single-value by hashtable)
     • Example: A list of pointers to Course entities could be sorted by department, then number
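
As a rough illustration of the sorted-index idea on this slide, the sketch below keeps a list of entity keys sorted by department and then number, and answers a range query with a binary search followed by a direct scan. The names `CourseIndexSketch` and `Entry` and the numeric entity keys are hypothetical. This is only an in-memory analogy and makes no claim about how GAE stores its indexes internally.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** In-memory analogy of a sorted index; all names here are hypothetical. */
final class CourseIndexSketch {
    /** One index row: the sort fields plus a "pointer" (the entity key). */
    static final class Entry {
        final String dept;
        final int number;
        final long entityKey;

        Entry(String dept, int number, long entityKey) {
            this.dept = dept;
            this.number = number;
            this.entityKey = entityKey;
        }
    }

    private final List<Entry> rows = new ArrayList<>();

    CourseIndexSketch(List<Entry> entries) {
        rows.addAll(entries);
        // Sort by department, then by course number.
        rows.sort(Comparator.comparing((Entry e) -> e.dept)
                            .thenComparingInt(e -> e.number));
    }

    /** Returns keys of courses in dept whose number is >= minNumber. */
    List<Long> query(String dept, int minNumber) {
        // Binary search for the first row that is >= (dept, minNumber).
        int lo = 0;
        int hi = rows.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            Entry e = rows.get(mid);
            int cmp = e.dept.compareTo(dept);
            if (cmp == 0) {
                cmp = Integer.compare(e.number, minNumber);
            }
            if (cmp < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // Direct scan while rows still belong to the requested department.
        List<Long> keys = new ArrayList<>();
        for (int i = lo; i < rows.size() && rows.get(i).dept.equals(dept); i++) {
            keys.add(rows.get(i).entityKey);
        }
        return keys;
    }
}
```

Because the rows are sorted by department first, every match for one department sits in a contiguous run, so the scan can stop at the first row from another department.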

  4. GAE Index Creation
     • When entities of a certain kind are saved, GAE will (generally) create an index for each simple member variable
     • When a query comes in with more than one member variable, GAE will (generally) create a new index automatically
     • For exceptions to the rules, refer to GAE book pages 166-169

  5. Example
     • Many Course entities
       • PK is automatically generated
       • The entities are scattered around the datastore
     • Index based on department
     • Index based on coursenum
     • How to query for
       • Courses in the “BIO” department
       • Courses with numbers >= 700

  6. Let’s walk through it…
     How can these be supported using indexes?
     • Filters combined with &&
     • Filters combined with || if they operate on the same member variable
     • Filters that use !=
     • Filters that match == on a set of values, meaning “if any value matches”
       • Cannot use != on a set of values
     • Sorting, by calling setOrdering() on the query before you invoke execute()

  7. A few notes about indexes in GAE
     • GAE disallows joins on JDO queries
       • Instead, you have to do the join in the application
     • Indexes will not be used for filters combined with || on different member variables
       • A query like this will (usually) throw an exception
     • Try not to use any query that filters based on more than one multi-valued member variable
       • Because the resulting index is a big space-hog
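
Doing the join in the application might look roughly like the sketch below: run two separate queries, then combine their results in memory with a hash lookup. The `Course` and `Professor` classes, their fields, and the output format are hypothetical, and the two input lists stand in for the results of two datastore queries.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Application-side join sketch; entity shapes are hypothetical. */
final class AppJoinSketch {
    static final class Course {
        final String title;
        final String profId;

        Course(String title, String profId) {
            this.title = title;
            this.profId = profId;
        }
    }

    static final class Professor {
        final String id;
        final String name;

        Professor(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Returns "title (professor name)" rows by joining the two lists in memory. */
    static List<String> join(List<Course> courses, List<Professor> professors) {
        // Build a lookup table from the second query's results.
        Map<String, String> nameById = new HashMap<>();
        for (Professor p : professors) {
            nameById.put(p.id, p.name);
        }
        // Probe the table once per course (a simple hash join).
        List<String> rows = new ArrayList<>();
        for (Course c : courses) {
            String name = nameById.get(c.profId);
            if (name != null) {
                rows.add(c.title + " (" + name + ")");
            }
        }
        return rows;
    }
}
```

Courses whose professor is missing from the second list are dropped here, which mirrors an inner join. Whether that is the right behavior depends on the application.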

  8. Minimize traffic
     • Traffic bogs down the server and the client
       • And costs money on the server
       • And wastes battery on the client
     • Keys to minimizing traffic
       • Minimal messages
       • Minimal roundtrips
       • Aggressive caching
       • Local computation

  9. Minimal messages
     • When client and server communicate…
       • Only send data needed at that moment
       • Use a concise data format (e.g., JSON)
     • For example, suppose that an app needed to retrieve a list of courses in response to a query in order to show a list of links
       • http://www.myserver.com/info.jsp?prof=cscaffid

  10. Option #1: 565 bytes
      <?xml version="1.0"?>
      <courses>
      <course><dept>CS</dept><num>361</num><prof>cscaffid</prof><title>Intro to SE</title><description>Blah blah blah blah blah blah blah blah blah</description></course>
      <course><dept>CS</dept><num>494</num><prof>cscaffid</prof><title>Web development</title><description>Blah blah blah blah blah blah blah blah blah</description></course>
      <course><dept>CS</dept><num>496</num><prof>cscaffid</prof><title>Cloud+Mobile development</title><description>Blah blah blah blah blah blah blah blah blah</description></course>
      </courses>

  11. Option #2: 108 bytes
      [{n:"CS361",t:"Intro to SE"},
      {n:"CS494",t:"Web development"},
      {n:"CS496",t:"Cloud+Mobile development"}]
      • Combine fields if appropriate (e.g., dept and number)
      • Omit fields if not needed (e.g., description)
      • Shorten field names if appropriate (e.g., n and t)
      • Use JSON if feasible
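
A server-side helper that produces the compact form could be sketched as below. The `Course` class and the `toCompactJson` and `escape` names are hypothetical. Unlike the slide's shorthand, this sketch quotes the keys so the output is strict JSON, which costs a few extra bytes per record. The escaping handles only backslashes and double quotes, so real code would more likely use a JSON library.

```java
import java.util.List;

/** Builds the compact course list; class and method names are hypothetical. */
final class CompactCourseJson {
    static final class Course {
        final String dept;
        final int num;
        final String title;
        final String description; // deliberately not sent to the client

        Course(String dept, int num, String title, String description) {
            this.dept = dept;
            this.num = num;
            this.title = title;
            this.description = description;
        }
    }

    /** Emits [{"n":"CS361","t":"Intro to SE"},...] with combined and shortened fields. */
    static String toCompactJson(List<Course> courses) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < courses.size(); i++) {
            Course c = courses.get(i);
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"n\":\"").append(escape(c.dept + c.num))
              .append("\",\"t\":\"").append(escape(c.title))
              .append("\"}");
        }
        return sb.append(']').toString();
    }

    /** Minimal escaping (backslash and double quote only); an assumption for brevity. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```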

  12. Minimal roundtrips
      • Eliminate unnecessary messages
        • E.g., cache images on the client so that these do not need to be repeatedly downloaded
      • Combine messages if feasible
        • E.g., if you need to query CS and MA courses, design the server to handle both queries at once
      • Defer messages if feasible
        • E.g., give the user the option to defer logging in until it’s absolutely necessary

  13. Aggressive caching
      • If a computation or transmission is expensive, then do not repeat it unnecessarily
        • Cache images on the client
        • Cache expensive computation results on server
      • Options for caching on server
        • Write to the datastore
        • Write to memcache (might disappear and need recomputing)

  14. Pseudocode for caching – an example of computing rainfall
      String location = read from client, e.g., “Albany, OR”
      String rainfall = memcache[location]
      if (rainfall is null) {
          latlon = convert location to latitude/longitude
          map = load weather map from datastore
          pixelcolor = color of pixel for latlon in map
          rainfall = convert pixelcolor to inches of rain
          memcache[location] = rainfall
      }
      return rainfall as JSON to client
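
The caching pseudocode on this slide could be made concrete roughly as follows. This is a minimal sketch under stated assumptions: a `ConcurrentHashMap` stands in for memcache, and the expensive geocoding and weather-map steps are collapsed into one injected function. `RainfallCacheSketch`, `rainfallJson`, and the JSON shape are hypothetical names, not part of any GAE API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Cache-aside sketch; a map stands in for memcache (an assumption). */
final class RainfallCacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Hypothetical stand-in for: geocode, load weather map, read pixel, convert to inches.
    private final Function<String, String> expensiveLookup;

    RainfallCacheSketch(Function<String, String> expensiveLookup) {
        this.expensiveLookup = expensiveLookup;
    }

    /** Returns the rainfall for a location as a small JSON object. */
    String rainfallJson(String location) {
        String rainfall = cache.get(location);
        if (rainfall == null) {
            // Cache miss: do the expensive work once, then remember the result.
            rainfall = expensiveLookup.apply(location);
            cache.put(location, rainfall);
        }
        return "{\"rainfall\":\"" + rainfall + "\"}";
    }
}
```

A real memcache entry can be evicted at any time, so the miss path must always be able to recompute the value, as the pseudocode already assumes.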

  15. Local computation
      • If a computation uses a very large amount of data, then move the computation to the data, instead of the data to the computation.
      • Example: Find city with maximal rainfall in US
      • Option #1:
        • Server sends rainfall for 4500 cities to client
        • Client loops through cities to choose maximum
      • Option #2:
        • Server loops through cities to choose maximum
        • Server sends just the maximum to the client
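
Option #2 can be sketched as a server-side helper that scans the data where it lives and returns only the single result. The `MaxRainfallSketch` class, the map of city names to inches, and the JSON shape are all hypothetical.

```java
import java.util.Map;

/** Server-side reduction sketch: send one result instead of every city. */
final class MaxRainfallSketch {
    /** Returns a small JSON object naming the city with the most rainfall. */
    static String wettestCityJson(Map<String, Double> inchesByCity) {
        if (inchesByCity.isEmpty()) {
            throw new IllegalArgumentException("no cities to compare");
        }
        String bestCity = null;
        double bestInches = Double.NEGATIVE_INFINITY;
        // The loop runs next to the data; only the maximum leaves the server.
        for (Map.Entry<String, Double> e : inchesByCity.entrySet()) {
            if (bestCity == null || e.getValue() > bestInches) {
                bestCity = e.getKey();
                bestInches = e.getValue();
            }
        }
        return "{\"city\":\"" + bestCity + "\",\"inches\":" + bestInches + "}";
    }
}
```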

  16. Local computation: another example
      • Example: Exercise app
        • Every user’s cellphone logs activity during the day (every 1 minute, logs accelerometer)
        • Need to have a “winner board”
      • Option #1:
        • Every client sends every minute’s data to server
        • Server computes each user’s total for the day
        • Server picks winner (person with most exercise)
      • Option #2:
        • Every client computes that user’s total for day
        • Client sends that user’s total to the server
        • Server picks winner (person with most exercise)

  17. Lock only when necessary
      • If entities need to be modified by different people at the same time, then put the entities in different entity groups.
      • Clean up any inconsistency problems
        • Using transactional tasks
        • Or just before reading the data

  18. Example: An application that tracks college revenue
      • Suppose that there are N Course entities, each with a “cost” and a “num_students” member.
      • Suppose there is also a Projections entity with “total_num_students” and “total_revenue”.
      • Should we make all of the entities be in the same entity group?

  19. Example: An application that tracks college revenue
      • Option #1:
        • Make all of the Course entities JDO children of the Projections entity
        • When a student registers for a course, lock the Course and the Projections, update both
      • Option #2:
        • Put each Course in its own entity group
        • When a student registers, only update the Course
        • Schedule a transactional task to update the Projections
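
The shape of Option #2 can be sketched in plain Java as below. Registration touches only the Course, and the Projections update is queued to run later. This is an in-memory analogy under loud assumptions: an `ArrayDeque` of `Runnable`s stands in for a task queue, nothing here is actually transactional, and `RevenueSketch`, `register`, and `runPendingTasks` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Deferred-update sketch; the queue stands in for transactional tasks. */
final class RevenueSketch {
    static final class Course {
        final double cost;
        int numStudents;

        Course(double cost) {
            this.cost = cost;
        }
    }

    static final class Projections {
        int totalNumStudents;
        double totalRevenue;
    }

    private final Projections projections = new Projections();
    private final Queue<Runnable> taskQueue = new ArrayDeque<>();

    /** Updates only the Course now; the shared Projections update is deferred. */
    void register(Course course) {
        course.numStudents++;
        taskQueue.add(() -> {
            projections.totalNumStudents++;
            projections.totalRevenue += course.cost;
        });
    }

    /** Drains the queue, bringing Projections back in sync with the Courses. */
    void runPendingTasks() {
        Runnable task;
        while ((task = taskQueue.poll()) != null) {
            task.run();
        }
    }

    Projections projections() {
        return projections;
    }
}
```

Between `register` and `runPendingTasks` the totals are stale. That is the temporary inconsistency this design accepts in exchange for not locking the shared Projections entity on every registration.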

  20. Parallelize work when possible
      • When you need to update many, many entities, divide the work
        • Assign 1/N of the work to each of N tasks
      • Also useful when you have a complex computation that can be divided (even if there is no “update” involved)

  21. Example: Computing “best student” award
      • For each student, we shall compute a score based on that student’s grades in all courses
      • But it isn’t just GPA
        • We are also going to take into account the difficulty of different courses, and weight different courses differently
        • The computation will also take into account other data, such as numbers of papers published and time required to graduate.
      • It is a very detailed, complicated computation

  22. Example: Computing “best student” award
      • Option #1
        • In a single task, we loop over all students, and for each student, we compute the score; then we take the maximum to select the winner.
      • Option #2
        • If we have N students, we launch N tasks, each to compute (and store) the score for 1 student
        • We have one additional task that periodically runs: it checks to see if N scores have been stored, and if so, it selects the winner
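
The fan-out in Option #2 can be approximated on a single machine with a thread pool, as in the sketch below. It differs from the slide in two labeled ways: threads stand in for GAE tasks, and `invokeAll` waits for all scores instead of a periodic checker task polling for them. `BestStudentSketch`, `pickWinner`, and the scoring function parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToDoubleFunction;

/** One scoring job per student, run in parallel; names are hypothetical. */
final class BestStudentSketch {
    static String pickWinner(List<String> students,
                             ToDoubleFunction<String> score,
                             int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Launch one independent job per student.
            List<Callable<Double>> jobs = new ArrayList<>();
            for (String s : students) {
                jobs.add(() -> score.applyAsDouble(s));
            }
            // invokeAll blocks until every job finishes; results keep job order.
            List<Future<Double>> results = pool.invokeAll(jobs);
            String winner = null;
            double best = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < students.size(); i++) {
                double value = results.get(i).get();
                if (winner == null || value > best) {
                    best = value;
                    winner = students.get(i);
                }
            }
            return winner;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("scoring interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a scoring job failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The final selection is still a single sequential pass, but it is cheap compared with the per-student scoring, which is the part worth dividing.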

  23. Key performance principles
      • Use indexes
        • Automatically created but limited
      • Minimize traffic
        • Minimal messages
        • Minimal roundtrips
        • Aggressive caching
        • Local computation
      • Minimize locks
        • Accept inconsistency and fix it in a transactional task
      • Parallelize

  24. Extra credit opportunity (1XC)
      • Find 3 ways that the PSS application could be improved by applying the performance principles on the previous slide
      • For each of the 3 ways, write 3 sentences (a total of 9 sentences):
        • Principle: What principle would you apply?
        • Violation: What areas of the PSS code violate the principle?
        • Modification: How would you modify the PSS code?
      • Add a 10th sentence stating that you did not discuss this with any classmates and that you worked on it alone
      • Create a PDF with your 10 sentences (it should be less than 1 page long) and upload it to Blackboard (under XC uploads)
