Outline.
Introduction
Crawler architectures
Increasing the throughput
What pages we do not want to fetch
Spider traps
Duplicates
Mirrors

Introduction.
Job of a crawler (or spider): fetching the Web pages to a computer where they will be analyzed. The algorithm is conceptually simple, but