1. A predictive model for frequently viewed tiles in a Web map Sterling Quinn, MGIS Candidate, ESRI ArcGIS Server Product Engineer
Mark Gahegan, Faculty Advisor
2. Introduction This project presents a model for predicting high-traffic areas of a Web map
Model output indicates where server-side cache of map tiles should be created
3. Project objectives Describe server-side caching of map tiles
Describe the need for selective caching
Present a predictive model for popular areas of the map
Describe ways the model could be used and evaluated
4. Web map optimization and the advent of server-side caching
5. Organizing large maps in manageable “tiles” is not new Large paper map series are indexed in organized grids
CGIS, a pioneering GIS, used “frames” to organize data (right)
6. Other techniques for organizing maps in tiles or grid systems Pyramid technique successively generalizes rasters in groups of four cells (right)
Quadtree structures index datasets in a hierarchy of quadrants
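A minimal sketch of the "groups of four cells" idea from slide 6, assuming each coarser cell holds the mean of its 2 x 2 block. The aggregation rule and the function names `generalize_once` and `build_pyramid` are illustrative assumptions, not details taken from the PYRAMID system itself.

```python
def generalize_once(grid):
    """Collapse each 2 x 2 block of cells into one cell holding the block mean."""
    rows, cols = len(grid), len(grid[0])
    if rows % 2 or cols % 2:
        raise ValueError("grid dimensions must be even")
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def build_pyramid(grid):
    """Return every level from the full-resolution grid down to a single cell.

    Assumes a square grid whose side length is a power of two; any other
    size eventually hits an odd dimension and raises ValueError.
    """
    levels = [grid]
    while len(levels[-1]) > 1:
        levels.append(generalize_once(levels[-1]))
    return levels


raster = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
pyramid = build_pyramid(raster)  # levels of size 4 x 4, 2 x 2, and 1 x 1
```

Each level has one quarter as many cells as the level below it, which is the same 4-to-1 relationship that a quadtree index uses between a quadrant and its children.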
7. The modern map tile JPG or PNG image
Standard square dimensions (256 x 256 or 512 x 512)
Stored in large “caches” on the server at multiple scales
8. Server-side caching of map tiles is new Traditional map servers (ArcIMS, WMS) draw the image on the fly
Can take a while if the map is complex
Cached map tiles give extremely fast performance
Tiled maps allow users to retrieve just the needed pieces of the map
9. Advent of tiled maps and server-side caching Microsoft TerraServer was an early deployment of massive amounts of cached imagery tiles
Google Maps serves cached map tiles with AJAX techniques to create a “seamless” Web mapping experience
12. Caching options
13. Current caching options Current GIS software allows analysts to create tile caches for their own maps
ESRI’s ArcGIS Server
Mapnik
Microsoft MapCruncher
14. Caching can require enormous resources on the server Caches covering big areas at large scales can include millions of tiles
Many gigabytes, or even terabytes of storage
Days, weeks, or sometimes months to generate
Many GIS shops lack resources to maintain large caches
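The scale of the problem on slide 14 can be estimated with simple arithmetic. The sketch below counts the tiles needed to cover a rectangular extent at several map resolutions and multiplies by an average tile size. The function names, the list of resolutions, and the default of 15,000 bytes per tile are hypothetical values chosen for illustration; real figures depend on the tiling scheme and image format.

```python
import math


def tiles_for_extent(width_m, height_m, resolution_m_per_px, tile_px=256):
    """Number of square tiles needed to cover an extent at one resolution."""
    tile_span_m = resolution_m_per_px * tile_px  # ground distance covered by one tile side
    return math.ceil(width_m / tile_span_m) * math.ceil(height_m / tile_span_m)


def cache_estimate(width_m, height_m, resolutions, tile_px=256, avg_tile_bytes=15_000):
    """Return (total tile count, total bytes) summed across all resolutions.

    avg_tile_bytes is an assumed average, not a measured value.
    """
    total_tiles = sum(
        tiles_for_extent(width_m, height_m, res, tile_px) for res in resolutions
    )
    return total_tiles, total_tiles * avg_tile_bytes


# Hypothetical extent of 1,000 km x 1,200 km cached at four resolutions (metres per pixel).
tiles, size_bytes = cache_estimate(1_000_000, 1_200_000, [16, 8, 4, 2])
print(f"{tiles:,} tiles, about {size_bytes / 1e9:.1f} GB")
```

Because halving the resolution value roughly quadruples the tile count, the largest scales dominate both storage and generation time.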
15. Selective caching as a strategy for saving resources Administrator can cache only the areas anticipated to be most visited
Remaining areas can be:
Added to the cache “on-demand” when first user navigates there
Filled with a “Data not available” tile
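The two fallback behaviors on slide 15 can be sketched as a small in-memory cache. This is an illustration under stated assumptions, not how any particular map server implements it: the class `SelectiveTileCache`, the `render_tile` callback, and the placeholder bytes are all hypothetical.

```python
class SelectiveTileCache:
    """Tile cache that pre-renders chosen tiles and handles the rest by policy."""

    def __init__(self, render_tile, on_demand=True, placeholder=b"DATA NOT AVAILABLE"):
        self._render = render_tile       # hypothetical callback: tile key -> image bytes
        self._on_demand = on_demand      # True: draw and store missing tiles on first request
        self._placeholder = placeholder  # returned for missing tiles when on_demand is False
        self._store = {}

    def precache(self, tile_keys):
        """Render and store the tiles the administrator expects to be popular."""
        for key in tile_keys:
            self._store[key] = self._render(key)

    def get(self, key):
        """Return a cached tile, or fall back to the configured policy."""
        if key in self._store:
            return self._store[key]
        if self._on_demand:
            tile = self._render(key)  # the first visitor pays the drawing cost
            self._store[key] = tile   # later visitors get the cached copy
            return tile
        return self._placeholder


render_calls = []


def fake_render(key):
    """Stand-in renderer that records which tiles were drawn."""
    render_calls.append(key)
    return f"tile-{key}".encode()


cache = SelectiveTileCache(fake_render, on_demand=True)
cache.precache([(10, 3, 5)])        # keys here are (level, row, column) tuples
first = cache.get((10, 9, 9))       # not pre-cached, so rendered on demand
second = cache.get((10, 9, 9))      # now served from the store
```

With `on_demand=False`, the same request for an uncached tile would return the placeholder and nothing would be rendered or stored.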
16. Benefits of selective caching Wise because some tiles (ocean, desert) will rarely, if ever, be accessed
Saves time
Saves disk space
17. Implications of selective caching Requires an admission that some areas are more important than others
Poses challenge of predicting popular areas before the map is released
18. The need for a predictive model
19. Project presents a predictive model for where to pre-cache tiles “Which places are most interesting?”
Inputs are datasets readily available to GIS analyst
Output vector features serve as a template for where to pre-cache tiles
20. Purpose of the model Help majority of users see a fast Web map while minimizing cache creation time and storage space
21. Not a descriptive model Descriptive model shows where users have already viewed
Microsoft Hotmap good example of a descriptive tool (right)
Descriptive models useful for deriving and validating predictive models
22. Advantages of a predictive model Doesn’t require the map to be deployed already
Can include fixed and varying geographic phenomena
Has applications far beyond map caching
23. Proposed methods
24. Study area and conditions Model predicts frequently viewed places for a general base map
May create models for thematic maps if time allows
Study area of California
25. Input datasets Populated / developed areas
Road networks
Coastlines
Points of interest
26. Populated / developed areas Human Influence Index grid by the Socioeconomic Data and Applications Center (SEDAC) at Columbia University
Model selects all grid cells over a certain value
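The selection step on slide 26 amounts to a threshold test on each grid cell. A minimal sketch, assuming the grid is held as a list of rows and that "over a certain value" means strictly greater than; the sample values and the threshold of 30 are hypothetical, since the slides do not state the cutoff used.

```python
def select_cells(grid, threshold):
    """Return the (row, col) positions of all cells whose value exceeds threshold."""
    return {
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value > threshold
    }


# Hypothetical index values; higher means more human influence.
influence = [
    [5, 12, 40],
    [8, 35, 60],
    [2, 10, 31],
]
developed = select_cells(influence, threshold=30)
```

Raising the threshold shrinks the selected area, which reduces cache size at the cost of leaving more moderately developed places uncached.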
27. Road networks Major roads buffered by a given distance
All roads within national parks, monuments, historical sites, and recreation areas, buffered by a given distance
28. Coastlines All coastlines buffered by a given distance (wider buffer on inland side)
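The asymmetric coastline buffer on slide 28 can be sketched on a raster of cells. Note that this swaps in a cell-based approximation for the vector buffers the slides describe: distances are measured in cell units, and the names `buffer_cells` and `coastal_buffer` are hypothetical. The same `buffer_cells` helper would also apply to the road buffers on slide 27.

```python
import math


def buffer_cells(sources, distance, shape):
    """Return all in-grid cells within `distance` (in cell units) of any source cell."""
    rows, cols = shape
    reach = math.floor(distance)
    result = set()
    for r, c in sources:
        for dr in range(-reach, reach + 1):
            for dc in range(-reach, reach + 1):
                if dr * dr + dc * dc <= distance * distance:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        result.add((rr, cc))
    return result


def coastal_buffer(coast, land, inland_distance, seaward_distance, shape):
    """Buffer the coast with a wider reach over land than over water.

    Assumes inland_distance >= seaward_distance.
    """
    narrow = buffer_cells(coast, seaward_distance, shape)  # applies on both sides
    wide = buffer_cells(coast, inland_distance, shape)     # kept only where it falls on land
    return narrow | (wide & land)


# One-row example: water at columns 0-2, coast at column 3, land at columns 4-6.
coast = {(0, 3)}
land = {(0, 4), (0, 5), (0, 6)}
zone = coastal_buffer(coast, land, inland_distance=2, seaward_distance=1, shape=(1, 7))
```

In this example the zone extends two cells inland but only one cell out to sea.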
29. Points of interest Set of 60 interesting points chosen by model author
Mountain peaks
Theme parks
Sports arenas
Etc.
Represents a flexible layer that could be tailored to local needs
30. Deriving the output Merge all layers together
Clip to California outline (with small buffer)
Remove small holes and polygons
Dissolve into one multipart feature
Simplify to remove unneeded vertices
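The derivation steps on slide 30 can be approximated with set operations on grid cells. This is a simplified raster analogue, not the vector workflow the slides describe: merging is a set union, clipping is an intersection, and the union already behaves like a dissolved feature, so there is no separate dissolve or vertex-simplification step. The function names and the `min_cells` size limit are hypothetical.

```python
from collections import deque


def neighbors(cell):
    """Return the four edge-adjacent neighbors of a (row, col) cell."""
    r, c = cell
    return ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))


def components(cells):
    """Split a set of cells into 4-connected groups."""
    remaining = set(cells)
    groups = []
    while remaining:
        start = remaining.pop()
        group, queue = {start}, deque([start])
        while queue:
            for nb in neighbors(queue.popleft()):
                if nb in remaining:
                    remaining.remove(nb)
                    group.add(nb)
                    queue.append(nb)
        groups.append(group)
    return groups


def derive_output(layers, study_area, min_cells):
    """Merge layers, clip to the study area, then drop small holes and small patches."""
    merged = set().union(*layers)      # merge all layers together
    covered = merged & study_area      # clip to the study area
    # Fill holes: uncovered groups smaller than min_cells that are fully
    # surrounded by covered cells.
    for gap in components(study_area - covered):
        enclosed = all(
            nb in gap or nb in covered for cell in gap for nb in neighbors(cell)
        )
        if enclosed and len(gap) < min_cells:
            covered |= gap
    # Remove isolated patches smaller than min_cells.
    kept = set()
    for patch in components(covered):
        if len(patch) >= min_cells:
            kept |= patch
    return kept


study_area = {(r, c) for r in range(5) for c in range(5)}
ring = {(r, c) for r in range(1, 4) for c in range(1, 4)} - {(2, 2)}  # one-cell hole
stray = {(0, 4), (9, 9)}  # a tiny patch, plus a cell outside the study area
template = derive_output([ring, stray], study_area, min_cells=2)
```

Here the one-cell hole is filled, the out-of-area cell is clipped away, and the single-cell patch is dropped, leaving a solid 3 x 3 block.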
31. Using the model output Output is a vector dataset that can be used as a template for creating cached tiles
Compare model output area with total area to understand percent coverage
Compare model output with actual usage over time
Refine if necessary
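The two comparisons on slide 31 reduce to simple ratios. The sketch below assumes areas are in the same units and that usage is available as a list of requested tile keys, for example from server logs; the function names and sample numbers are hypothetical.

```python
def percent_coverage(model_area, total_area):
    """Share of the study area covered by the model output, as a percentage."""
    return 100.0 * model_area / total_area


def hit_rate(requested_tiles, precached_tiles):
    """Fraction of tile requests that fell inside the pre-cached set."""
    if not requested_tiles:
        return 0.0
    hits = sum(1 for tile in requested_tiles if tile in precached_tiles)
    return hits / len(requested_tiles)


# Hypothetical numbers: the model covers 25 of 100 area units,
# and two of four logged requests land on pre-cached tiles.
coverage = percent_coverage(25, 100)
rate = hit_rate([(0, 0), (0, 0), (1, 1), (2, 2)], {(0, 0)})
```

A hit rate well above the coverage percentage would suggest the model is concentrating the cache where users actually look; a hit rate near the coverage percentage would suggest it performs no better than caching at random, and the inputs need refining.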
32. Limitations Models of world scope should account for Internet connectivity
Input datasets have varying collection dates
Input datasets vary in resolution and precision
Maps with many scales might require multiple iterations and variations of the model
33. Questions?
34. References De Cola, L. & Montagne, N. (1993). The PYRAMID system for multiscale raster analysis. Computers & Geosciences, 19(10), 1393–1404.
Tomlinson, R. L., Calkins, H. W., & Marble, D. F. (1976). Computer Handling of Geographical Data. Paris: Unesco.