Search Engines

Thomas Haidlas


Introduction

  • Directories, meta search engines

  • How search engines work

  • What influences the ranking

Directories


  • Hand-constructed hierarchy of topics (e.g. Yahoo!)

  • Human editors select, index, and classify the pages

  • Cover only a small part of the web

  • Updated infrequently

  • No ranking of results

Directories II

  • No search over a full-text index

  • Search covers only the editors' reviews

  • Sometimes partner with search engines to increase coverage

Meta search engine

  • Queries for rare keywords often require more than one web search engine

  • Submits the same query to many engines in parallel

  • Duplicate entries are eliminated

  • Results are shown in a uniform format

  • Does no harvesting or indexing of its own
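
The steps listed for a meta search engine (parallel submission, duplicate elimination, uniform output) can be sketched as follows. This is only an illustration: `engine_a` and `engine_b` are hypothetical stand-ins for real search engines, and treating two hits as duplicates when their URLs match is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real engines: each takes a query and returns
# (url, title) pairs. A real meta search engine would call remote services.
def engine_a(query):
    return [("http://example.org/a", "Page A"), ("http://example.org/b", "Page B")]

def engine_b(query):
    return [("http://example.org/b", "Page B"), ("http://example.org/c", "Page C")]

def meta_search(query, engines):
    # Submit the same query to every engine in parallel.
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    # Eliminate duplicates (assumed: same URL) and emit one uniform format.
    seen = set()
    merged = []
    for results in result_lists:
        for url, title in results:
            if url not in seen:
                seen.add(url)
                merged.append({"url": url, "title": title})
    return merged

hits = meta_search("rare keyword", [engine_a, engine_b])
```

Note that nothing is harvested or indexed here; the meta search engine only forwards the query and merges what comes back.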

How search engines work

  • Harvesting

  • Indexing

  • Analyzing Requests

  • Ranking

Harvesting


  • Programs (robots, gatherers, or crawlers) visit web sites and gather the web pages for indexing

  • Start with an initial page

  • Follow hyperlinks (<a href=…>)

  • Sometimes more than two sub-levels are visited

  • These programs are started periodically
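
A minimal sketch of such a robot, assuming a small in-memory "web" (the hypothetical `PAGES` dictionary) in place of real HTTP requests: it starts at an initial page, follows `<a href>` links, and stops expanding once a chosen number of sub-levels is reached.

```python
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """Collects the href targets of <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical pages so that the sketch runs without network access.
PAGES = {
    "/index.html": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "/a.html": '<a href="/c.html">C</a>',
    "/b.html": '<a href="/index.html">home</a>',
    "/c.html": '<a href="/d.html">D</a>',
    "/d.html": "no links",
}

def harvest(start, max_depth):
    gathered = {}
    frontier = [(start, 0)]          # (url, sub-level) pairs still to visit
    while frontier:
        url, depth = frontier.pop(0)
        if url in gathered or url not in PAGES:
            continue                 # already visited, or page does not exist
        gathered[url] = PAGES[url]
        if depth < max_depth:        # only follow links above the depth limit
            parser = LinkParser()
            parser.feed(PAGES[url])
            frontier.extend((link, depth + 1) for link in parser.links)
    return gathered

harvested = harvest("/index.html", max_depth=2)
```

With `max_depth=2`, the page `/d.html` sits three levels below the start page and is therefore never gathered.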

Harvesting II

  • Problems:

    • Links are not found in

      • Frames

      • Image maps

    • A search engine starts many robots

      => high traffic

Robot Exclusion

  • Two methods:

    • Meta tags:

      <meta name="robots" content="noindex,nofollow">

    • robots.txt:

      User-agent: Scooter
      Disallow: /privat/geht_dich_gar_nix_an.html
      Allow: /allesOffen

Robot Exclusion II

  • robots.txt (example 2):

    User-agent: *
    Allow: /allesOffen
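
How a well-behaved robot might evaluate such rules can be sketched with Python's standard `urllib.robotparser` module. The rules are the ones from the first robots.txt example; the host `example.org` is an assumption made only for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the first robots.txt example.
ROBOTS_TXT = """\
User-agent: Scooter
Disallow: /privat/geht_dich_gar_nix_an.html
Allow: /allesOffen
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# example.org is a placeholder host, not taken from the slides.
blocked = parser.can_fetch(
    "Scooter", "http://example.org/privat/geht_dich_gar_nix_an.html")
allowed = parser.can_fetch(
    "Scooter", "http://example.org/allesOffen/index.html")
```

Robot exclusion is voluntary: the file only tells a robot what it should skip, and the robot has to check it before fetching.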

Indexing


  • The index table receives the harvesting results

  • The index table contains keywords

  • The table is held in main memory => fast access
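
One common way to realize such an index table is an inverted index that maps each keyword to the pages containing it. The sketch below assumes that layout and a naive whitespace tokenizer; the slides do not specify either, and the sample pages are made up.

```python
from collections import defaultdict

def build_index(pages):
    """Map every keyword to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():   # naive tokenizer (assumption)
            index[word].add(url)
    return index

# Hypothetical harvesting results: URL -> page text.
harvested_pages = {
    "/a.html": "search engines rank pages",
    "/b.html": "directories list pages",
}
index = build_index(harvested_pages)
```

Because the table is an ordinary in-memory dictionary, looking up a keyword is a single hash access, which matches the "fast access" point above.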

Analysing Requests

  • The search string is compared with the index table

  • The search string consists of a single word:

    => easy processing

  • The search string contains truncation or Boolean operators:

    => complex processing

  • If the search string is found in the index, the page is added to the hit list
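
The difference between the easy and the complex case can be sketched as follows. The query syntax (`*` for truncation, a single `AND` or `OR` between two terms) and the sample index are assumptions chosen for illustration; real engines support richer grammars.

```python
def lookup(index, term):
    """Return the pages for one term; a trailing '*' means truncation."""
    if term.endswith("*"):
        prefix = term[:-1]
        hits = set()
        for keyword, pages in index.items():   # must scan all keywords
            if keyword.startswith(prefix):
                hits |= pages
        return hits
    return set(index.get(term, set()))         # single word: one lookup

def search(index, query):
    """Evaluate 'a', 'a AND b' or 'a OR b' (one operator, no nesting)."""
    tokens = query.lower().split()
    if len(tokens) == 1:
        return lookup(index, tokens[0])
    if len(tokens) != 3:
        raise ValueError("unsupported query: " + query)
    left, operator, right = tokens
    if operator == "and":
        return lookup(index, left) & lookup(index, right)
    if operator == "or":
        return lookup(index, left) | lookup(index, right)
    raise ValueError("unsupported operator: " + operator)

# Hypothetical index table: keyword -> pages.
INDEX = {
    "engine": {"/a.html"},
    "engines": {"/a.html", "/b.html"},
    "ranking": {"/b.html", "/c.html"},
}
hit_list = search(INDEX, "engin* AND ranking")
```

A single word needs one dictionary access, whereas truncation scans the keyword list and Boolean operators combine several intermediate hit lists.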

Ranking


  • Influences on the ranking:

    • Number of keywords found

    • Keyword frequency

    • Keyword position:

      • Domain/URL

      • Document name

Ranking II

  • Headline

  • Early in the text

  • Meta-Tags

  • Ranking for cash

  • Page Rank

  • Click frequency / Hit Popularity Engine
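
How keyword frequency and keyword position might be combined into one score can be sketched as a weighted count. The field names and all weights below are assumptions; the slides name the positions but give no numbers, and the "early in the text" factor is not modelled here.

```python
# Assumed weights per keyword position (not taken from the slides).
POSITION_WEIGHTS = {
    "url": 5.0,
    "document_name": 4.0,
    "headline": 3.0,
    "meta": 2.0,
    "body": 1.0,
}

def score(page, keywords):
    """Sum weight * frequency over every position and every keyword."""
    total = 0.0
    for field, weight in POSITION_WEIGHTS.items():
        words = page.get(field, "").lower().split()
        for keyword in keywords:
            total += weight * words.count(keyword.lower())
    return total

# Hypothetical pages, already split into the fields named above.
page_a = {"document_name": "search engines", "body": "how search engines work"}
page_b = {"body": "a page that mentions search once"}

score_a = score(page_a, ["search", "engines"])
score_b = score(page_b, ["search", "engines"])
```

Here `page_a` scores higher because both keywords occur, and they occur in a heavily weighted position as well as in the body.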

    Ranking for cash

    • Capitalist principle

    • Payment => high ranking

    • Content is not relevant

    • Additional income

    Ranking for cash II

    • Not used on its own

    • Mostly used by e-commerce companies

    • Second method:

      • Pay for faster indexing

    Page Rank (Google)

    • Evaluation by the internet community (web admins)

    • Relation between the quality of a page and the number of links that point to it

    • Links from popular web sites count for more
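
The idea that a link from a popular page counts for more can be illustrated with a simplified iterative computation in the spirit of PageRank. This is a sketch, not Google's actual implementation: the damping factor of 0.85, the fixed iteration count, and the tiny link graph are assumptions.

```python
def page_rank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page, targets in links.items():
            if targets:
                # A page passes its rank on in equal shares to its targets.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / len(pages)
        rank = new_rank
    return rank

# Hypothetical graph: "x" and "y" each have exactly one incoming link,
# but x is linked from the popular "hub", y from the unlinked "lonely".
LINKS = {
    "hub": ["x"],
    "x": ["hub"],
    "p1": ["hub"],
    "p2": ["hub"],
    "lonely": ["y"],
    "y": ["hub"],
}
ranks = page_rank(LINKS)
```

Both `x` and `y` receive a single link, yet `x` ends up with the higher rank because its link comes from a page that is itself linked often.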

    Page Rank (Google) II

    • Disadvantages:

      • New web sites rank poorly

      • Queries with many Boolean operators and keywords are not easy to process

    Hit Popularity Engine

    • The index already exists and is pre-sorted

    • A click on a link counts as a vote for the site concerned => the "click" is recorded in the database

    • Pages with many "clicks" are more popular

    • Developed by "Direct Hit"
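
The click-voting idea can be sketched as a counter that re-orders an existing, pre-sorted hit list. The in-memory `Counter` stands in for the click database, and the function names are hypothetical; how Direct Hit actually stored or weighted clicks is not described in the slides.

```python
from collections import Counter

clicks = Counter()   # stands in for the click database

def record_click(url):
    """Each click on a result counts as one vote for that page."""
    clicks[url] += 1

def rerank(hit_list):
    # sorted() is stable, so pages with equal click counts keep the
    # order of the pre-sorted hit list.
    return sorted(hit_list, key=lambda url: clicks[url], reverse=True)

pre_sorted = ["/a.html", "/b.html", "/c.html"]
record_click("/c.html")
record_click("/c.html")
record_click("/b.html")
popular_first = rerank(pre_sorted)
```

A page that has never been clicked keeps a count of zero, which is why a brand-new page starts at the bottom under this scheme.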

    Hit Popularity Engine II

    • This method is usually combined with others

    • Disadvantage:

      • New web sites rank poorly


    Ranking manipulation

    • Why?

      • Commercial interest

    • Done by:

      • Search engine optimizers (SEOs)

    • Purpose:

      • To boost the page rank



    • Many domains are registered

    • Programs generate thousands of pages that link to one another

    • Each page contains keywords

    • Some of these sites are even elaborately structured



    • An intermediate page contains the searched-for terms

    • Forwarding through HTML meta tags or simple JavaScript can be recognized

    • SEOs complicate the forwarding instructions => they are no longer recognized
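
Why forwarding through a meta tag is easy to recognize can be sketched with a small detector for `<meta http-equiv="refresh">`. The class and function names are hypothetical, and only this one simple forwarding form is covered; obfuscated JavaScript forwarding would slip past a check like this, which is the point of the last bullet.

```python
from html.parser import HTMLParser

class RefreshDetector(HTMLParser):
    """Remembers the target of a <meta http-equiv="refresh"> tag."""
    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        http_equiv = (attributes.get("http-equiv") or "").lower()
        if tag == "meta" and http_equiv == "refresh":
            # content looks like "0; url=http://..."
            content = attributes.get("content") or ""
            _, _, rest = content.partition(";")
            key, _, value = rest.strip().partition("=")
            if key.strip().lower() == "url":
                self.target = value.strip()

def forwarding_target(html):
    """Return the forwarding URL of a page, or None if none is found."""
    detector = RefreshDetector()
    detector.feed(html)
    return detector.target

# example.org is a placeholder host.
page = '<meta http-equiv="refresh" content="0; url=http://example.org/shop">'
target = forwarding_target(page)
```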

    IP Delivery

    • A normal site is indexed by the robots

    • Afterwards, the contents of the site are swapped

    IP Cloaking

    • Server programs determine who sent the request

    • Robot requests: "cloaked" content, designed to influence the ranking, is delivered

    • Human visitors: do not see the "cloaked" content
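
The mechanism, and one way a search engine might notice it, can be sketched as follows. Everything here is an assumption for illustration: the server in this sketch decides by the User-Agent string (a real setup may decide by IP address instead), and `KNOWN_ROBOTS`, `serve`, and `looks_cloaked` are hypothetical names.

```python
# Assumed list of robot names; real servers may match IP addresses instead.
KNOWN_ROBOTS = ("scooter", "googlebot")

def is_robot(user_agent):
    agent = user_agent.lower()
    return any(name in agent for name in KNOWN_ROBOTS)

def serve(user_agent):
    """A cloaking server: robots and humans get different content."""
    if is_robot(user_agent):
        return "cheap flights cheap flights cheap flights"
    return "<h1>Welcome to our shop</h1>"

def looks_cloaked(fetch):
    """Fetch the same page as a robot and as a browser and compare."""
    return fetch("Scooter/3.2") != fetch("Mozilla/5.0")

cloaked = looks_cloaked(serve)
honest = looks_cloaked(lambda user_agent: "<h1>Same for everyone</h1>")
```

The comparison only works if the second request is not itself recognized as coming from the search engine, which is why deciding by IP address makes cloaking harder to detect.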

    Other simple tricks

    • Links in guestbooks

      • Particularly effective with high-ranking guestbooks

    • "Blind text"

      • Text in the background color

    Trading web links

    • Paying for links

    • Partnership => commission



    • Select suitable tools

    • The WWW is dynamic =>

      keep up with new developments

    • Assess the ranking correctly

    Thank You!
