
Searching the Web


cais


  1. Searching the Web
     • The web can be considered a graph: "the web graph"
     • Web pages are the graph nodes
     • Hyperlinks on pages are graph edges
     • The web graph is huge (way over 8 billion nodes), even infinite (pages are created on the fly)
     • For the web graph, DFS is not a good strategy (you get lost quickly)
     • You need to search your neighborhood before going deeper
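To illustrate why DFS "gets lost" on a graph whose pages are created on the fly, here is a minimal sketch. The `links` function is a hypothetical stand-in for a site that generates two new pages for every page visited; it is not part of the original slides. With a budget of seven page visits, the breadth-first crawl stays within two clicks of the start page, while the depth-first crawl ends up six clicks away.

```python
from collections import deque

def links(page):
    """Hypothetical page generator: every page links to two freshly created pages."""
    return [page + "/a", page + "/b"]

def crawl(start, limit, breadth_first):
    """Visit at most `limit` pages, taking the next page from the front of the
    frontier (queue, BFS) or from the back (stack, DFS)."""
    frontier = deque([start])
    visited = []
    while frontier and len(visited) < limit:
        page = frontier.popleft() if breadth_first else frontier.pop()
        visited.append(page)
        # No visited-set is needed here because every generated page is new.
        frontier.extend(links(page))
    return visited

def depth(page):
    """Number of clicks from the start page (one '/' per click in this toy model)."""
    return page.count("/")

bfs_pages = crawl("home", 7, breadth_first=True)
dfs_pages = crawl("home", 7, breadth_first=False)
print(max(depth(p) for p in bfs_pages))  # 2
print(max(depth(p) for p in dfs_pages))  # 6
```

The only difference between the two crawls is which end of the frontier the next page is taken from.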

  2. The Web Graph - Starting at CS230 Home
     [Figure: web graph rooted at the CS230 home page; visible node labels include Syllabus, Assignments, Documentation, CS Dept., Brian, Text, PLTC, ...]

  3. Breadth-First Search (BFS): Neighbors First
     [Figure: example graph with nodes S, A, H, I, J, C, B, D, E, G, F]
     • In DFS you keep the nodes you visited on a stack so that you can find your way back
     • In BFS you keep the nodes* you visited on a queue so that you can explore them in the order you found them
     • At every visited node you also keep track, in a list, of the direct path that took you there from S
     * actually the paths to the nodes, not just the nodes
     Initialization (the queue holds only the path S)
     While you have not reached G
       remove a path from the front of the BFS queue and look at the last node L in the path
       extend the path to the unvisited neighbors of L
       and add the extended paths to the back of the queue
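The loop above can be sketched in Python as follows. This is an illustration under stated assumptions, not the course's reference code: the adjacency lists in `GRAPH` are inferred from the path labels shown on the following slide (S-A, S-H, A-B, A-C, H-I, H-J, C-D, C-E, J-G), the figure may contain further edges, and F's edges are unknown, so F is left unconnected here. The names `GRAPH` and `bfs_path` are made up for this example.

```python
from collections import deque

# Assumed edges, inferred from the path labels on the next slide.
# F's neighbors are not recoverable from the transcript, so it has none here.
GRAPH = {
    "S": ["A", "H"],
    "A": ["S", "B", "C"],
    "H": ["S", "I", "J"],
    "B": ["A"],
    "C": ["A", "D", "E"],
    "I": ["H"],
    "J": ["H", "G"],
    "D": ["C"],
    "E": ["C"],
    "G": ["J"],
    "F": [],
}

def bfs_path(graph, start, goal):
    """Return the path from start to goal found by BFS, or None if unreachable."""
    queue = deque([[start]])   # queue of paths, not just nodes
    visited = {start}          # nodes that already have a path on (or off) the queue
    while queue:
        path = queue.popleft()             # remove path from the front of the queue
        last = path[-1]                    # the last node L in the path
        if last == goal:
            return path
        for neighbor in graph[last]:       # extend to unvisited neighbors of L
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])  # add to the back of the queue
    return None

print("".join(bfs_path(GRAPH, "S", "G")))  # SHJG
```

Marking a node as visited when its path is enqueued (rather than when it is dequeued) keeps the queue free of duplicate paths to the same node.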

  4. BFS: Queue of Paths
     [Figure: the example graph, each node labeled with the path that reached it: S [S], A [SA], H [SH], I [SHI], J [SHJ], C [SAC], B [SAB], D [SACD], E [SACE], G [SHJG]; F (no path shown)]
     Initialization
     While you have not reached G
       remove path from BFS queue and check at the last node L in the path
       extend the path to unvisited neighbors of L
       and add extended paths to back of queue
     Queue contents after each step (front of the queue on the left):
       S
       SA SH
       SH SAB SAC
       SAB SAC SHI SHJ
       SAC SHI SHJ
       SHI SHJ SACD SACE
       SHJ SACD SACE
       SACD SACE SHJG
       SACE SHJG
     … in two more steps it will reach the goal
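The queue trace on this slide can be replayed with a short sketch. As an assumption, the edges in `EDGES` are only those implied by the path labels in the figure (the real figure may contain more, and F's edges are unknown); `EDGES` and `queue_states` are hypothetical names introduced for this example. Paths are stored as strings such as "SAC" to match the slide's notation.

```python
from collections import deque

# Assumed adjacency lists, inferred from the path labels on the slide.
EDGES = {
    "S": "AH", "A": "SBC", "H": "SIJ", "B": "A", "C": "ADE",
    "I": "H", "J": "HG", "D": "C", "E": "C", "G": "J", "F": "",
}

def queue_states(graph, start, goal):
    """Run BFS with a queue of paths and record the queue after every step."""
    queue = deque([start])
    visited = {start}
    states = [" ".join(queue)]        # state after initialization
    while queue:
        path = queue.popleft()
        if path[-1] == goal:          # goal is checked when its path is removed
            break
        for neighbor in graph[path[-1]]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + neighbor)
        states.append(" ".join(queue))
    return states

for state in queue_states(EDGES, "S", "G"):
    print(state)
```

Under these assumptions the first nine printed lines match the queue contents listed on the slide, and one further line, `SHJG`, is printed before the goal path is removed from the queue.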
