
Data Structures & Algorithms Overview: Recursion, Lists, Trees, Hashing, GIS

Learn about image manipulation, linked lists, stacks, queues, binary search trees, hashing, GIS, memory storage, and more in this comprehensive introduction to data structures and algorithms.

claire

Presentation Transcript


  1. CS 315 Lec 2, Jan 29 • Goals: • Finish the course overview • Introduction to recursion • Reading for the week: • Chapter 1: sections 1.1, 1.2 and 1.3

  2. Images stored in 2-dim arrays • We will work on 2-d arrays by manipulating images: • Each pixel is represented by a blue value, a red value and a green value (any color is a combination of these three). In (red, green, blue) order, (255, 255, 255) represents white, (255, 0, 0) represents red, etc. • pic(i, j)->Blue is the blue component of the pixel in the i-th row and j-th column of pic, and so on. • Some basic operations on images: • open, read, write • rotate, copy a sub-image • filter (remove blemishes) • extract features (identify where buildings are in an aerial photograph)
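The two operations "copy a sub-image" and "rotate" can be sketched on a plain 2-d array. This is a minimal illustration, assuming an image is a list of rows; the function names `copy_subimage` and `rotate_clockwise` are hypothetical and are not the course library's API.

```python
# Hypothetical sketch: an image is a list of rows; each entry is one pixel
# (an (r, g, b) tuple in a real image, plain integers in the test data).

def copy_subimage(pic, top, left, height, width):
    """Return a new 2-d array holding the height x width block at (top, left)."""
    return [row[left:left + width] for row in pic[top:top + height]]

def rotate_clockwise(pic):
    """Rotate by 90 degrees: reverse the row order, then transpose."""
    return [list(col) for col in zip(*pic[::-1])]
```

Both functions build a new array and leave `pic` unchanged, which keeps the sketch easy to reason about at the cost of extra memory.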

  3. Linked lists: order is important • Linked lists: • Store a sequence of items in non-consecutive memory locations. • Searching for a key is slow: the list must be scanned sequentially, even if it is sorted. • Inserting next to a given item is easy. • In a doubly linked list, inserting before or after a given item is easy. • The number of items need not be known in advance (dynamic).
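A minimal sketch of a singly linked list shows why inserting next to a given item is cheap while searching is sequential. The names `Node`, `insert_after`, `search` and `to_list` are illustrative assumptions, not taken from the slides.

```python
class Node:
    """One list cell: a key plus a reference to the next cell."""
    def __init__(self, key, next=None):
        self.key = key
        self.next = next

def insert_after(node, key):
    """Insert a new key right after `node`; only two references change."""
    node.next = Node(key, node.next)
    return node.next

def search(head, key):
    """Walk the list cell by cell; returns the node or None."""
    while head is not None and head.key != key:
        head = head.next
    return head

def to_list(head):
    """Collect the keys in order (helper for inspection)."""
    keys = []
    while head is not None:
        keys.append(head.key)
        head = head.next
    return keys
```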

  4. Stacks and queues • Stacks: • insert and delete at the same end. • equivalently, the last element inserted is the first one to be deleted (LIFO). • very useful for solving many problems, e.g. processing arithmetic expressions • Queues: • insert at one end, delete at the other end. • equivalently, the first element inserted is the first one to be deleted (FIFO).
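As one example of processing arithmetic expressions with a stack, the sketch below evaluates a postfix expression; the choice of postfix notation and the name `eval_postfix` are assumptions made for illustration. The last lines show the first-in, first-out behaviour of a queue using `collections.deque`.

```python
import operator
from collections import deque

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_postfix(tokens):
    """Evaluate a postfix expression given as a list of string tokens."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()          # last pushed operand comes off first
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    return stack.pop()

# Queue: insert at one end (append), delete at the other (popleft).
queue = deque()
queue.append("a")
queue.append("b")
first_out = queue.popleft()              # "a", the first element inserted
```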

  5. Non-linear data structures • Trees • Binary search trees, expression trees • Quad-trees • [Figure: a tree node with fields Lptr, key, Rptr; example key 15] • Main purpose of a binary search tree: it supports dictionary operations efficiently
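The dictionary operations on a binary search tree can be sketched as follows. The fields `left`, `key`, `right` mirror the Lptr, key, Rptr layout named on the slide; the function names are hypothetical, and the tree is not balanced, so operations are only fast when the tree stays shallow.

```python
class TreeNode:
    def __init__(self, key):
        self.left = None     # Lptr
        self.key = key
        self.right = None    # Rptr

def insert(root, key):
    """Insert key and return the (possibly new) root; duplicates are ignored."""
    if root is None:
        return TreeNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def contains(root, key):
    """Follow one root-to-leaf path, going left or right at each node."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False

def inorder(root):
    """Keys in sorted order."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)
```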

  6. Priority queue • The key with the maximum priority is the one that gets deleted next. • Equivalently, it supports the following operations: • insert • deleteMin (in a min-priority queue, the smallest key has the highest priority) • Useful in solving many problems: • fast sorting (heap sort) • shortest paths, minimum spanning trees, scheduling, etc.
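The insert and deleteMin operations map directly onto Python's standard `heapq` module (`heappush` and `heappop`). The sketch below uses them for heap sorting; the function name `heap_sort` is an illustrative choice.

```python
import heapq

def heap_sort(items):
    """Sort by building a min-heap and repeatedly deleting the minimum."""
    heap = list(items)
    heapq.heapify(heap)                       # build the heap in linear time
    return [heapq.heappop(heap) for _ in range(len(heap))]

pq = []
heapq.heappush(pq, 5)                         # insert
heapq.heappush(pq, 1)
heapq.heappush(pq, 3)
smallest = heapq.heappop(pq)                  # deleteMin
```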

  7. Hashing • Supports dictionary operations very efficiently (most of the time). • Main advantages: • simple design, easy to implement • on average very fast • Main drawback: • not good in the worst case
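A minimal sketch of a hash table with chaining illustrates both points: the design is short, and lookups are fast as long as keys spread over the buckets. The class name `ChainedHashTable`, the fixed bucket count and the use of Python's built-in `hash` are assumptions for illustration; a practical table would also resize as it fills.

```python
class ChainedHashTable:
    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # The hash value picks the bucket; colliding keys share a list.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)      # overwrite an existing key
                return
        bucket.append((key, value))

    def find(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None
```

In the worst case every key lands in the same bucket and `find` degrades to a sequential scan of one long list.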

  8. What data structure to use? Example 1: There are many billions of web pages accessible to a search engine. When you type a query into the Google search page [Figure: example search query], you get an almost instantaneous response. What kind of data structure is used here? • The details are quite complicated, but the main data structure used is quite simple.

  9. Data structure used: inverted index • Array of lists: each array entry contains a word and a pointer to a list of all the web pages that contain that word. • [Figure: the entry for "Data structure" with page numbers 38, 97, 297, 145, 876] • Question: How do we get the array index from the keyword? Hashing is used.
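A small sketch of an inverted index, assuming pages are given as a mapping from a page number to its text. A Python `dict` is itself a hash table, so the word-to-index step is handled by hashing; the function name and the sample pages are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(pages):
    """Map each word to the sorted list of page ids whose text contains it."""
    index = defaultdict(list)
    for page_id in sorted(pages):
        for word in set(pages[page_id].lower().split()):
            index[word].append(page_id)
    return index

# Hypothetical sample data, not from the slides.
pages = {38: "data structure basics", 97: "data mining", 145: "tree structure"}
index = build_inverted_index(pages)
```

Answering a one-word query is then a single lookup such as `index["structure"]`.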

  10. Example 2: The entire landscape of the world is being digitized (a whole new field, Geographic Information Systems or GIS, combines information technology and geography). What kind of data structure should be used to store all this information? [Figure: snapshot from Google Earth]

  11. Some general issues related to GIS • How much memory do we need? Can this be stored in one computer? • Building the database is done in the background (off-line processing). • How fast can the queries be answered? • Answering queries is called on-line processing. • Suppose each square mile is represented by a 1024 by 1024 pixel image. How much storage do we need to store the map of the United States?

  12. Calculate the memory needed • Very rough estimate of the memory needed: • Area of the USA is roughly 4 x 10^6 sq miles • Each square mile needs roughly 10^6 pixels (1024 x 1024) • Each pixel usually requires 32 bits • Thus the total memory needed = 4 x 10^6 x 10^6 x 32 = 128 x 10^12 bits = 128,000 gigabits • (A standard desktop has ~200 gigabits of memory.) • Need about 640 such computers to store the data
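The estimate can be checked by multiplying the three figures directly. The sketch uses the slide's rounded inputs (4 x 10^6 square miles, 10^6 pixels per square mile, 32 bits per pixel, 200 gigabits per desktop); the variable names are illustrative.

```python
area_sq_miles = 4 * 10**6          # rough area of the USA
pixels_per_sq_mile = 10**6         # 1024 x 1024, rounded down
bits_per_pixel = 32

total_bits = area_sq_miles * pixels_per_sq_mile * bits_per_pixel
total_gigabits = total_bits // 10**9

desktop_gigabits = 200             # capacity assumed on the slide
computers_needed = total_gigabits / desktop_gigabits

print(total_bits, total_gigabits, computers_needed)
```

With these inputs the product is 128 x 10^12 bits, that is 128,000 gigabits, or 640 desktops of 200 gigabits each.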

  13. What data structure to store the images? • Each 1024 x 1024 image can be stored in a two-dimensional array (the standard way to store all kinds of images: bmp, jpg, png, etc.). The actual images are stored in secondary storage (hard disks on several servers, either in a central location or distributed). • The number of images would be roughly 4 x 10^6. A set of pointers to these images can be stored in a 1- (or 2-) dimensional array. • When you click on a point on the map, its index in the array is calculated. • From that index, the image is accessed and sent over the network to the requesting client.
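The index calculation can be sketched under some stated assumptions: the one-square-mile tiles form a rectangular grid stored row by row in a 1-dimensional array, and a click is given as a distance in miles from the top-left corner of the map. The function names and the grid width are hypothetical; a real system would work from latitude and longitude.

```python
def tile_index(x_miles, y_miles, cols):
    """Row-major index of the tile containing the clicked point."""
    col = int(x_miles)             # each tile is one mile wide
    row = int(y_miles)             # and one mile tall
    return row * cols + col

def pixel_in_tile(x_miles, y_miles, size=1024):
    """(row, column) of the clicked pixel inside its size x size tile."""
    return int((y_miles % 1) * size), int((x_miles % 1) * size)
```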

  14. Overview of the projects 1) Generate all the poker hands • More generally, given a set of N items and a number k <= N, generate all possible combinations or permutations of k items. • (concept: recursion, arrays, lists)
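A minimal recursive sketch for generating all combinations of k items: each combination either contains the first item (then choose k - 1 from the rest) or it does not (then choose k from the rest). The function name `combinations` is illustrative and this is not necessarily the approach expected in the project.

```python
def combinations(items, k):
    """Return every k-item combination of `items` as a list of lists."""
    if k == 0:
        return [[]]                # exactly one way to choose nothing
    if len(items) < k:
        return []                  # not enough items left
    first, rest = items[0], items[1:]
    with_first = [[first] + c for c in combinations(rest, k - 1)]
    without_first = combinations(rest, k)
    return with_first + without_first
```

For poker, `items` would be the 52 cards and k would be 5, which gives 2,598,960 hands.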

  15. Overview of the projects 2) Image manipulation (concept: arrays, library, analysis of algorithms) (a) Filtering [Figure: image before and after filtering]

  16. 2b) Labeling an image

  17. (2c) Recursive image generation:

  18. 3) Bounding box construction: Optical character recognition (OCR) is one of the early success stories in software applications: scan a printed page and recognize the characters in it. First step: bounding box construction.

  19. 4) Spelling checker: Given a text file T, identify all the misspelled words in T. Idea: build a hash table H of all the words in a dictionary, and search for each word of the text T in the table H. (hashing, string processing)
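The idea can be sketched with Python's built-in `set`, which is a hash table. The function name, the regular expression used to split the text into words, and the tiny sample dictionary are assumptions for illustration.

```python
import re

def misspelled_words(text, dictionary_words):
    """Return the words of `text` that are not found in the dictionary."""
    table = {w.lower() for w in dictionary_words}     # hash table H
    words = re.findall(r"[A-Za-z']+", text)           # crude tokenizer
    return [w for w in words if w.lower() not in table]

# Hypothetical sample data, not from the slides.
dictionary = ["the", "quick", "brown", "fox"]
result = misspelled_words("The quick brwn fox.", dictionary)
```

Each lookup takes constant time on average, so checking T is roughly linear in the number of words it contains.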

  20. Last semester, peg-solitaire was the project that used hashing.

  21. 5) Image compression/decompression • Run-length coding: 1111111111100000001110111111111 can be coded as 101121112112121001 (the run lengths 11, 7, 3, 1, 9 written in binary, with the digit 2 as a separator). • A similar idea can be used in an image: if all the pixels of a subimage are the same, we can store that subimage using a single pixel. • Divide the image into quadrants and recursively apply this idea.
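Both ideas can be sketched briefly. The run-length functions assume the coding reads as binary run lengths separated by the digit 2, and that the decoder is told the first bit. The quadrant function assumes a square image whose side is a power of two and returns either a single pixel value or a list of four sub-results in the order NW, NE, SW, SE. All names are hypothetical.

```python
def rle_encode(bits):
    """Encode a non-empty bit string as binary run lengths joined by '2'."""
    runs, count = [], 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return "2".join(format(n, "b") for n in runs)

def rle_decode(code, first_bit="1"):
    """Invert rle_encode; the value of the first bit must be supplied."""
    out, bit = [], first_bit
    for part in code.split("2"):
        out.append(bit * int(part, 2))
        bit = "0" if bit == "1" else "1"
    return "".join(out)

def quadtree(pic, top=0, left=0, size=None):
    """Store a uniform block as one pixel, else recurse into four quadrants."""
    if size is None:
        size = len(pic)
    first = pic[top][left]
    if all(pic[r][c] == first
           for r in range(top, top + size)
           for c in range(left, left + size)):
        return first
    half = size // 2
    return [quadtree(pic, top, left, half),
            quadtree(pic, top, left + half, half),
            quadtree(pic, top + half, left, half),
            quadtree(pic, top + half, left + half, half)]
```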

  22. 6) Geometric computation problem – given a set of rectangles, determine the total area covered by them. Also draw the contour. Concept: binary search tree.
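The area part of the problem can be sketched with a simple grid method instead of the binary search tree approach the slide names: cut the plane along every rectangle edge and add up the cells covered by at least one rectangle. This is a brute-force illustration, assuming axis-aligned rectangles given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2; it does not draw the contour and is far slower than a tree-based sweep.

```python
def union_area(rects):
    """Total area covered by axis-aligned rectangles (x1, y1, x2, y2)."""
    xs = sorted({x for x1, _, x2, _ in rects for x in (x1, x2)})
    ys = sorted({y for _, y1, _, y2 in rects for y in (y1, y2)})
    area = 0
    for i in range(len(xs) - 1):
        for j in range(len(ys) - 1):
            # A grid cell is either fully inside a rectangle or fully outside.
            covered = any(x1 <= xs[i] and xs[i + 1] <= x2 and
                          y1 <= ys[j] and ys[j + 1] <= y2
                          for x1, y1, x2, y2 in rects)
            if covered:
                area += (xs[i + 1] - xs[i]) * (ys[j + 1] - ys[j])
    return area
```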
