
Hidden Universes of Information on the Internet

Russ Haynal. Internet Instructor, Speaker, and Paradigm Shaker. 21015 Forest Highlands Ct Ashburn, VA 20147. Phone : 703-729-1757 russ@navigators.com. http://navigators.com. Hidden Universes of Information on the Internet. Part 2. Rev. 12/2012.


Presentation Transcript


  1. Russ Haynal. Internet Instructor, Speaker, and Paradigm Shaker. 21015 Forest Highlands Ct, Ashburn, VA 20147. Phone: 703-729-1757. russ@navigators.com. http://navigators.com. Hidden Universes of Information on the Internet, Part 2. Rev. 12/2012. Note: If you send me an email, put “internet training” in the e-mail's subject line.

  2. Course Outline specific_page.html Session Goal: Strengthen your Internet skills. (focused, accurate, successful) Course Outline: • Advanced Search Tools and Techniques • Country-Specific Content • Country-Specific Infrastructure • Advanced Tools and Traceroute Details • Paradigm Shake-up Online Web page = http://navigators.com/opensource.html

  3. Internet’s Growth Continues stats.html www.glreach.com

  4. Which have you bookmarked? search_tools.html Basic search or Advanced Search • The advanced search page can be used just as easily as the basic search page. • Seeing the options might remind you to use them. Key Tip: Limit your searches to PDF or PowerPoint files to quickly locate detailed content from great web sites. www.google.com/advanced_search and www.google.com/preferences
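
The key tip above can be sketched in a few lines of Python. This is only an illustration: it assumes Google's `filetype:` and `site:` query operators, and the function names `filetype_query` and `search_url` are made up for this example.

```python
from typing import Optional
from urllib.parse import urlencode


def filetype_query(terms: str, filetype: str, site: Optional[str] = None) -> str:
    """Return a query limited to one file type and, optionally, one site."""
    parts = [terms, f"filetype:{filetype}"]
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


def search_url(query: str) -> str:
    """Encode the query into a Google search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Example: look for PDF documents about Internet exchange points.
print(search_url(filetype_query("internet exchange points", "pdf")))
```

Swapping `"pdf"` for `"ppt"` gives the PowerPoint variant of the same search.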

  5. search.yippy.com search_tools.html • Yippy examines the first couple hundred hits, and groups them together into “clouds” • View the 10-15 hits you really want without reading through 200 mixed search results. • Saves you lots of time. • Ixquick.com - searches 10 other search tools • Stars show number of search engines that gave site a top 10 ranking. • Illustrates that search engines each have their own ranking algorithm

  6. Alexa.com search_tools.html • Like most toolbars, it “spies” on its users • Most of the information collected via the tool bar is available at alexa.com • Click on “site info” • Enter a Domain name • Click “get details” and then “related links” & “traffic stats” • This is a great way to discover new websites based on the traffic patterns of millions of Alexa Toolbar users

  7. Search Engine Comparison search_tool_details.html • There are several comprehensive online guides to search engines • Read them & Compare :-) • Some sites include PDF versions for easy print-out/ reference

  8. www.archive.org persona_example.html • Surf through previous copies of a web site. • Deleting sensitive information from today’s web server does not remove it from archive.org • The archive.org robot collects web pages like other search engines • Previous web page copies are not deleted • “Document not found”? Paste the address into archive.org • Viewing archived web pages will cause hits to the live target website [Diagram labels: User PC, User Interface, Robot, Web Servers, copied web page, recent copy, archive copies]
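
One way to check for archived copies from a script is the Wayback Machine availability endpoint. The sketch below only builds the request URL and reads a response; the sample response text is invented for illustration, and the helper names are assumptions, not part of the course material.

```python
import json
from typing import Optional
from urllib.parse import urlencode


def availability_url(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build a Wayback Machine availability query (timestamp is YYYYMMDD)."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(response_text: str) -> Optional[str]:
    """Return the URL of the closest archived copy, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Illustrative response only; a real one would come from fetching the URL above.
SAMPLE_RESPONSE = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20121201000000/http://example.com/",
            "timestamp": "20121201000000",
            "status": "200",
        }
    }
})

print(availability_url("example.com", "20121201"))
print(closest_snapshot(SAMPLE_RESPONSE))
```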

  9. Archives of your data. How much of your information remains online in archives like the “wayback machine”?

  10. Surfing Upstream vs. Downstream search_upstream.html • #1 Most researchers follow the links “downstream” from an interesting page • #2 Shows pages that link towards the target (= upstream). This is an indication of the page’s “popularity” (= who knows about target.com) • #3 Shows pages that have links to both target sites … will show “user pages” for that topic [Diagram labels: “Joe’s guide to MANY targets”, “Upstream”, Target.com, Target2.net]

  11. Be Creative When Surfing Upstream. Example: Washington DC Tourist Sites search_upstream.html • Any combination of these target pages will lead you to “DC Tourism” pages, but certain pairings may also lead you to subject-specific pages (Museums / Educational, Theatre links, DC Tourism) • Target pages: www.nasm.si.edu (air & space museum), www.fordstheatre.org, www.spymuseum.org, www.kennedy-center.org

  12. Surfing Upstream Details search_upstream.html • You need to decide which scenario makes more sense; Row #1 or Row #2 • A 3rd and 4th site can be added if they are popular enough

  13. Searching within a site or domain name search_upstream.html • This technique can save you weeks of search time • Much faster than reading through thousands of web pages from a large website. • “use your imagination” to focus these searches.

  14. Who knows about your topic? (Google search terms follow each “→”) search_upstream.html Example: Iranian cell phone company (Irancell-MTN)
      • Topic’s own website: marketing information, press announcements → site:irancell.ir
      • Equipment vendor: phones, networks, press announcements → site:nokia.com iran
      • Government: regulations, license → site:gov.ir irancell
      • Industry magazine: news, vendors, maps, management interviews → site:gsmworld.com iran
      • Employees: resumes, job postings → resume irancell; site:linkedin.com irancell
      • Construction vendor: towers, networks → site:vendorsname.com iran
      • Customers: service issues, technology insights → irancell forum post; site:mob.ir irancell
      • Investors: ownership, disclosures
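
The “who knows about your topic” checklist lends itself to a small query generator. The sketch below assumes the `site:` operator shown in the slide; the function names and the `SOURCES` table are illustrative choices, using three of the domains from the example.

```python
from typing import Dict


def site_query(domain: str, *keywords: str) -> str:
    """Restrict a search to one site or domain, e.g. 'site:gov.ir irancell'."""
    return " ".join((f"site:{domain}",) + keywords)


def who_knows(topic: str, sources: Dict[str, str]) -> Dict[str, str]:
    """Map each kind of source to a site-restricted query for the topic."""
    return {label: site_query(domain, topic) for label, domain in sources.items()}


# A few of the source types from the slide, paired with their domains.
SOURCES = {"Government": "gov.ir", "Employees": "linkedin.com", "Customers": "mob.ir"}

for label, query in who_knows("irancell", SOURCES).items():
    print(f"{label:12} {query}")
```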

  15. Business Databases Can be Quite Useful: The EDGAR Database search_tool_specialized.html • Most publicly held companies are required to file financial statements with the Securities and Exchange Commission (SEC) • These filings are accessible to the public through the SEC’s EDGAR Database • READ forms 10-Q (quarterly report) and 10-K (annual report). These are very detailed reports about the company’s activities, plans, sales, etc. • Seek out other business databases: financial, investment, government regulatory, etc.

  16. Industry Specific Resources • There is a wealth of online information related to specific industries (technical, regulatory, marketing, professional societies etc.)

  17. Course Outline Session Goal: Strengthen your Internet skills. (focused, accurate, successful) Course Outline: • Advanced Search Tools and Techniques • Country-Specific Content • Country-Specific Infrastructure • Advanced Tools and Traceroute Details • Paradigm Shake-up

  18. Many country resources are online country_specific_content.html Phone books

  19. Comparing Two Russian Search Tools... (query format is for www.blekko.com) country_specific_content.html • Compare how many web pages link toward each search tool: link:http://www.aport.ru (2000 pages found) vs. link:http://www.bigmax.ru (400 pages found) • You could also test the popularity of these sites at alexa.com (under “site info”) • Surf upstream to find another list of Russian search tools

  20. Many countries sell their domains domain_name.html • These were just some of the country domains available for sale • “All Domains” happens to be a licensed “registrar” for these countries • There are many additional countries who will sell their domain names to “anyone”

  21. Learn about the 2-letter code domain_name.html • Visit that country’s domain name registrar • www.iana.org/domains/root/db OR • www.norid.no/domenenavnbaser/domreg.html • What is the policy for getting a domain name? (citizenship, trademark, local presence, money) • What is the cost to register a domain name? • Are there any censorship clauses? • Does the Registrar require any proof of identity? (drivers license, passport, business license) • Is there a whois service? (make a bookmark)
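
As a starting point for the checklist above, a script can pull the 2-letter code out of a web address and point at the matching IANA root zone database page. This is a minimal sketch; the function names are made up, and the URL pattern is the one used under www.iana.org/domains/root/db.

```python
from typing import Optional
from urllib.parse import urlparse


def country_code(url_or_host: str) -> Optional[str]:
    """Return the 2-letter country-code TLD of a URL or hostname, if any."""
    text = url_or_host if "://" in url_or_host else "http://" + url_or_host
    host = urlparse(text).hostname or ""
    tld = host.rstrip(".").rsplit(".", 1)[-1]
    # Generic TLDs (com, net, org) and bare IP addresses are not country codes.
    return tld if len(tld) == 2 and tld.isalpha() else None


def iana_record_url(tld: str) -> str:
    """Page in the IANA root zone database describing who manages the TLD."""
    return f"https://www.iana.org/domains/root/db/{tld}.html"


code = country_code("http://www.aport.ru")
print(code, iana_record_url(code))
```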

  22. http://www.norid.no/regelverk/rammer/regelverksmodeller.en.html domain_name.html • An analysis of domain name policies. Notice: Most countries sell their domain names to “anybody”

  23. Example of Domains for Sale domain_name.html • Less than one third of Haiti domain names are registered to people with an address in Haiti • Almost half of Haiti’s domain names are registered to U.S. addresses. • When you see a .ht website… is it necessarily foreign? • Notice the spike in foreign-owned domain names during Iceland’s 2008 financial crisis [Chart labels: .HT Domain Owners (Haiti, United States); Iceland, Fall 2008]

  24. Course Outline Session Goal: Strengthen your Internet skills. (focused, accurate, successful) Course Outline: • Advanced Search Tools and Techniques • Country-Specific Content • Country-Specific Infrastructure • Advanced Tools and Traceroute Details • Paradigm Shake-up

  25. Country-Specific Infrastructure country_specific_infrastructure.html • There are many different scenarios for a country’s connectivity • A “clear picture” will emerge after many inter-country traceroutes • Initiate traceroutes in both directions (inbound & outbound) • Online web page has a detailed narration of many scenarios • Example: how ISP “D” communicates with ISP “J” [Diagram legend: each letter (A through N) labels 1 ISP; IX = Internet Exchange; ISPs are either global or local / regional; regions shown are USA and Country 1 through Country 4]

  26. US-Centric Traceroutes traceroute.html • 2000 CAIDA Study: “Measurements of the Internet topology in the Asia-Pacific Region” • The U.S. is the major Internet transit intermediary for the rest of the world: 71% of traces that neither start nor end in the U.S. still pass through it. • A mapping of IPv6 shows Europe as the new “center” (as of Aug 2010) [Figure labels: Apr/2005, Apr/2002, Jan/2000]
      Example: South Africa to Russia via US
       3 s7.ten-mr2-csir.uni.net.za
       7 48NA209.sdn.net.za
       8 wash-jhb-fr2.sdn.net.za
       9 sdn-co-za-gw.digex.net
      11 dca6-core2.atlas.digex.net
      13 iad1-core11.atlas.digex.net
      14 icm-mae-e-f0/0.icp.net
      15 icm-bb1-dc-0-1-T3.icp.net
      16 icm-bb10-dc-0-0.icp.net
      17 icm-bb2-dc-1-0-0.icp.net
      19 icm-bb11-pen-9-0.icp.net
      21 usnyk105-tc-p0-0.ebone.net
      22 gblon504-tc-p6-0.ebone.net
      24 bebru203-tc-p2-0.ebone.net
      25 nlams303-tc-p1-0.ebone.net
      26 dedus205-tc-p1-0.ebone.net
      28 sesto501-tb-p4-2.ebone.net
      30 195.158.226.54
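
Router names often hint at where a hop sits. In the example above, the ebone.net hops appear to start with a 2-letter country code followed by a 3-letter city code (usnyk, gblon, bebru, ...). That naming convention is inferred from this one trace, not documented here, so treat the following Python sketch as a heuristic.

```python
import re
from typing import List, Optional

# Assumed pattern: <country:2><city:3><digits>-...ebone.net (inferred, not official).
EBONE_HOP = re.compile(r"^([a-z]{2})([a-z]{3})\d+-.*\.ebone\.net$")


def ebone_country(hostname: str) -> Optional[str]:
    """Return the leading 2-letter country code of an ebone.net router name."""
    match = EBONE_HOP.match(hostname.lower())
    return match.group(1) if match else None


def countries_in_order(hops: List[str]) -> List[str]:
    """List the countries a trace passes through, collapsing consecutive repeats."""
    seen: List[str] = []
    for hop in hops:
        code = ebone_country(hop)
        if code and (not seen or seen[-1] != code):
            seen.append(code)
    return seen


EBONE_HOPS = [
    "usnyk105-tc-p0-0.ebone.net",
    "gblon504-tc-p6-0.ebone.net",
    "bebru203-tc-p2-0.ebone.net",
    "nlams303-tc-p1-0.ebone.net",
    "dedus205-tc-p1-0.ebone.net",
    "sesto501-tb-p4-2.ebone.net",
]

print(countries_in_order(EBONE_HOPS))
```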

  27. How Does it Work? traceroute.html • The Internet started as “Packet Switching Networks” using TCP/IP (Transmission Control Protocol / Internet Protocol) • Every Internet connection has a unique IP address consisting of 4 numbers; each number has a range of 0-255 (e.g. 198.211.16.134) • Internet IP numbers are allocated through a hierarchy • IANA --> ARIN/RIPE/APNIC/LACNIC/AFRINIC --> ISPs / Company / Country • Routers direct your packets of information along the “preferred” path. Note: The next generation of IP address space (IPv6) is quite LARGE: 3,911,873,538,269,506,102 IP #’s per square meter of the Earth's surface; 4,500,000,000,000,000 IP #’s for every observable star in the known universe [Diagram: a chain of routers]
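
The “4 numbers, each 0-255” rule and the difference in size between the two address spaces can be checked with Python's standard `ipaddress` module. The helper name `is_valid_ipv4` is just for this sketch.

```python
import ipaddress


def is_valid_ipv4(text: str) -> bool:
    """True if text is four dot-separated numbers, each in the range 0-255."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


# The example address from the slide, split into its four numbers.
example = ipaddress.ip_address("198.211.16.134")
octets = list(example.packed)

# IPv4 addresses are 32 bits long, IPv6 addresses are 128 bits long.
IPV4_TOTAL = 2 ** 32
IPV6_TOTAL = 2 ** 128

print(octets)
print(f"IPv4 addresses: {IPV4_TOTAL:,}")
print(f"IPv6 addresses: {IPV6_TOTAL:,}")
```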

  28. Asymmetric Routing www.helios.de ↔ registro.fapesp.br traceroute.html • Germany → Brasil: via New York, Atlanta, Miami • Brasil → Germany: via Miami, Washington, London • You can try this with any two sites listed at traceroute.org
      Brasil → Germany
       1 bb (200.160.2.1) 0.254 ms
       2 gw01 (200.160.0.228) 0.435 ms
       3 200.142.94.157.metrored.net.br 3.991 ms
       4 rbcor2-atm.rjo.metrored.net.br (200.225.76.222) 6 ms
       5 rbcor1-atm.rjo.metrored.net.br (200.225.72.213) 8 ms
       6 BRASIL-STM1-pm2R1-pacorR1.metrored.net 118 ms
       7 bar3-serial4-1-0.Miami.cw.net (208.173.80.201) 118 ms
       8 acr2-loopback.Miami.cw.net (208.172.98.62) 123 ms
       9 -loopback.Washington.cw.net (206.24.226.103) 147 ms
      10 dcr1-so.Washington.cw.net (206.24.238.57) 147 ms
      11 bcr2.Thamesside.cw.net (166.63.210.62) 238 ms
      12 zcr2-loopback.Londonlnt.cw.net (166.63.210.19) 239 ms
      13 oscar.LON.router.COLT.NET (212.74.64.217) 231 ms
      14 ar3.haj.de.colt.net (213.61.232.42) 260 ms
      15 213.61.144.18 (213.61.144.18) 261 ms
      16 pop9.pop-hannover.de (193.98.1.212) 263 ms
      17 cishelios2.helios.de (193.141.98.1) 268 ms
      18 proxy.helios.de (193.141.98.37) 268 ms
      Germany → Brasil
       1 cishelios2.helios.de (193.141.98.1) 3 ms
       2 pop9.pop-hannover.de (193.98.1.212) 6 ms
       3 popcore.pop-hannover.de (193.98.1.213) 7 ms
       4 ar3.haj.de.colt.net (213.61.144.17) 6 ms
       5 213.61.232.45 (213.61.232.45) 6 ms
       6 pos1-kyle.NYC.router.COLT.NET (212.74.74.169) 107 ms
       7 so-4-1.nycmny1-hcr3.bbnplanet.net (4.25.133.37) 107 ms
       8 nyc4so.xnycmny4-uunet.bbnplanet.net (4.0.2.42) 108 ms
       9 0.so-6-0-0.XL2.NYC4.ALTER.NET (152.63.21.82) 108 ms
      10 0.so-2-0-0.TL2.NYC8.ALTER.NET (152.63.0.185) 112 ms
      11 0.so-7-0-0.TL2.ATL5.ALTER.NET (152.63.146.41) 126 ms
      12 0.so-7-0-0.XL2.MIA4.ALTER.NET (152.63.86.193) 142 ms
      13 POS7-0.GW7.MIA4.ALTER.NET (152.63.85.29) 142 ms
      14 POS2-1.IG3.MIA4.ALTER.NET (65.208.80.142) 147 ms
      15 MIAMI-STM1.metrored.net (200.49.77.14) 148 ms
      16 BRASIL-STM1-.metrored.net (200.49.77.6) 259 ms
      17 rjo.metrored.net.br (200.225.72.214) 261 ms
      18 spo.metrored.net.br (200.225.76.221) 268 ms
      19 .metrored.net.br (200.142.94.158) 267 ms
      20 bb.registro.br (200.160.0.226) 268 ms
      21 registro.br (200.160.2.3) 267 ms
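
A quick way to read traces like these is to look for the largest jump in round-trip time between consecutive hops, which often (though not always) marks a long-haul link such as an ocean crossing. The Python sketch below parses lines in the "hop host (ip) time ms" layout shown on the slide; the regular expression and the names `Hop`, `parse_hop` and `biggest_jump` are assumptions made for this example.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# hop number, host, optional "(ip)", round-trip time in ms
HOP_LINE = re.compile(r"^\s*(\d+)\s+(\S+)(?:\s+\(([\d.]+)\))?\s*([\d.]+)\s*ms\s*$")


class Hop(NamedTuple):
    number: int
    host: str
    ip: Optional[str]
    rtt_ms: float


def parse_hop(line: str) -> Hop:
    """Parse one traceroute line such as '7 host.cw.net (208.173.80.201) 118 ms'."""
    match = HOP_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognized traceroute line: {line!r}")
    number, host, ip, rtt = match.groups()
    return Hop(int(number), host, ip, float(rtt))


def biggest_jump(hops: List[Hop]) -> Tuple[Hop, Hop, float]:
    """Return the pair of consecutive hops with the largest latency increase."""
    pairs = zip(hops, hops[1:])
    before, after = max(pairs, key=lambda p: p[1].rtt_ms - p[0].rtt_ms)
    return before, after, after.rtt_ms - before.rtt_ms


# Hops 4 through 8 of the Brasil-to-Germany trace from the slide.
SAMPLE = [
    "4 rbcor2-atm.rjo.metrored.net.br (200.225.76.222) 6 ms",
    "5 rbcor1-atm.rjo.metrored.net.br (200.225.72.213) 8 ms",
    "6 BRASIL-STM1-pm2R1-pacorR1.metrored.net 118 ms",
    "7 bar3-serial4-1-0.Miami.cw.net (208.173.80.201) 118 ms",
    "8 acr2-loopback.Miami.cw.net (208.172.98.62) 123 ms",
]

hops = [parse_hop(line) for line in SAMPLE]
before, after, delta = biggest_jump(hops)
print(f"Largest jump: hop {before.number} -> hop {after.number} (+{delta} ms)")
```

In this sample the jump falls between the last router in Rio de Janeiro and the link that lands in Miami.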

  29. Some Definitions... • Telco - Company that owns physical networking infrastructure (fiber in the ground, switches, etc.). A telco is often regulated by its country’s government • “Real ISP” - ISPs that directly operate an IP network (routers, data circuits). Data circuits may be obtained (long-term lease) from the local telco • “Misc ISP” - ISPs that depend on the “real ISPs” for their existence. A “misc ISP” may be a very small localized ISP that depends on a “real ISP” for connectivity to the rest of the Internet. A “misc ISP” may also be a reseller of the “real ISP’s” services • Note: Many telcos and “real ISPs” are part of the same company. Sometimes referred to as a “facilities-based ISP”

  30. "Country-Specific Infrastructure" country_specific_infrastructure.html A top-down approach… • Identify exchange points which serve that country or regional area • Exchange points may list connected ISPs • Exchange points may also mention telco providers, which provide infrastructure (fiber) to the ISPs • Identify the ISPs which provide service in that country • Examine the ISPs’ backbone maps • Watch for upstream providers, peering partners, and exchange points • Initiate multiple traceroutes in/out of target country

  31. Exchange Points country_specific_infrastructure.html • Visit the home page for each exchange point in your area. • Who operates the exchange point? • Look for the address of the exchange point • Look for a list of telcos which provide circuits to the exchange point • May be described under FAQs or “how to connect” • Look for a list of ISPs that are connected • Do they provide a traceroute or a “looking glass” page? • Look under “tools” or “support”

  32. An Exchange Point can contain a wealth of information country_specific_infrastructure.html

  33. Global Carriers country_specific_infrastructure.html • Carriers such as AT&T, Sprint, Tata Communications, etc. are part of a core set of telcos which partner to build infrastructure • Many ISPs will get their international connections through this handful of carriers

  34. Who Owns the Company? country_specific_infrastructure.html • An ISP’s map usually represents data circuits which must eventually be mapped to a physical circuit • UUNET happens to be owned by Verizon

  35. Regional ISPs country_specific_infrastructure.html • Notice how these regional ISPs inter-connect with many Exchange Points • You would expect intra-country traffic to not criss-cross the Atlantic through the U.S. [Map shown: Carrier1]

  36. European Backbone country_specific_infrastructure.html • In a traceroute, does “Penn” = Pennsylvania? No: Penn = Pennsauken, NJ • This regional backbone extends across to the U.S. • Ebone shows that they have U.S. connections at the Sprint NAP, and also with GTE, Sprint

  37. Country-Specific Backbone country_specific_infrastructure.html • Shows the ISP’s network within one country • Note the links outward to numerous peering points • Note the “uplinks” outward to C&W, Ebone, DFN

  38. City-Specific Infrastructure country_specific_infrastructure.html • City-wide map of fiber network in Moscow → close-ups reveal access points [Other map labels: Dulles, Washington DC]

  39. Vendors Reveal Details... country_specific_infrastructure.html A telco’s press announcements may tell you which vendors helped build their infrastructure

  40. Still looking for ISPs? country_specific_infrastructure.html • Use a country-specific search tool • http://www.iranyellowpages.net/ • Traceroute towards websites hosted within the country • Sites within country = homepages of universities, governments, Exchange Points, traceroute servers at traceroute.org • Surf upstream from several country ISPs

  41. Web Hosting “Farms” country_specific_infrastructure.html • Hosting environment that is secure, server-friendly and well connected to the Major ISP’s • May provide complete services including content development, server management. • Others offer “rack-space” and utilities for co-location of User-supplied equipment see: datacentermap.com

  42. Third-party Sites Filled with Resources country_specific_infrastructure.html

  43. Course Outline Session Goal: Strengthen your Internet skills. (focused, accurate, successful) Course Outline: • Advanced Search Tools and Techniques • Country-Specific Content • Country-Specific Infrastructure • Advanced Tools and Traceroute Details • Paradigm Shake-up

  44. Connection Detective: Three Online Scenarios connection_detective.html • #1: All servers are hosted at target location • #2: All servers are hosted with an online provider. Note: servers can be with different providers • #3: Only one server is hosted at target location • Target may put servers at local/remote locations based on bandwidth issues, costs, security, local skills, etc. • Trace to www.target.com does NOT always = target’s access provider [Diagram labels: target, Internet, Web server, Mail server]

  45. Finding a domain’s servers... connection_detective.html • DNS records show names of web servers and mail servers • Now run traceroute to the web server and mail servers
      Results for: fool.com
      primary name server = ns1.fool.com
      responsible mail addr = dns.fool.com
      MX preference = 100, mail exchanger = mail.uu.net
      MX preference = 5, mail exchanger = spot.fool.com
      MX preference = 10, mail exchanger = fido.fool.com
      MX preference = 50, mail exchanger = mail13.motleyfool.com
      fool.com internet address = 208.51.76.1
      ns1.fool.com internet address = 207.138.33.118
      ns2.fool.com internet address = 208.51.76.222
      mail.uu.net internet address = 199.171.54.106
      spot.fool.com internet address = 208.241.66.9
      fido.fool.com internet address = 198.83.60.55
      mail13.motleyfool.com internet address = 208.241.66.23
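
In MX records, a lower preference number means the mail exchanger is tried first. The short Python sketch below sorts the four MX records from the fool.com example and also lists the exchangers that sit outside the fool.com domain, which are natural extra traceroute targets. The function names are illustrative.

```python
from typing import List, Tuple

# (preference, mail exchanger) pairs copied from the fool.com results.
MX_RECORDS: List[Tuple[int, str]] = [
    (100, "mail.uu.net"),
    (5, "spot.fool.com"),
    (10, "fido.fool.com"),
    (50, "mail13.motleyfool.com"),
]


def delivery_order(records: List[Tuple[int, str]]) -> List[str]:
    """Mail exchangers in the order a sending server would try them."""
    return [host for _, host in sorted(records)]


def external_exchangers(records: List[Tuple[int, str]], domain: str) -> List[str]:
    """Mail exchangers whose names are not inside the given domain."""
    suffix = "." + domain
    return [host for _, host in records if not host.endswith(suffix)]


print(delivery_order(MX_RECORDS))
print(external_exchangers(MX_RECORDS, "fool.com"))
```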

  46. Use a Swiss Army Knife... connection_detective.html • There are many websites that enable you to do traceroute, whois, DNS look-up, etc. • Some websites limit how many queries you can do (from a single IP address) • Example: robtex.com

  47. Looking in the neighborhood connection_detective.html • Reverse DNS lookup: enter an IP #, and it identifies the associated domain name (if defined) • Some tools will do this for a series of IP numbers • Example search for neighbors of www.gov.ru:
      194.226.80.66   ipaccess.gov.ru
      194.226.80.77   ns.gov.ru
      194.226.80.78   www.council.gov.ru
      194.226.80.88   apparat.gov.ru
      194.226.80.129  apollo.gov.ru
      194.226.80.145  ns.vpk.gov.ru
      194.226.80.146  ts.vpk.gov.ru
      194.226.80.147  ia.vpk.gov.ru
      194.226.80.159  president.kremlin.ru
      194.226.80.160  www.gov.ru
      194.226.80.162  council.gov.ru
      194.226.80.163  award.adm.gov.ru
      194.226.80.164  kazak.adm.gov.ru
      194.226.80.165  lib.adm.gov.ru
      194.226.80.166  orgdiv.adm.gov.ru
      194.226.80.167  protocol.adm.gov.ru
      194.226.80.168  udprf.gov.ru
      194.226.80.169  Msu.gov.ru
      194.226.80.170  www.government.ru
      194.226.80.171  www.youth.gov.ru
      194.226.80.172  www2.scrf.gov.ru
      194.226.80.173  www.vneshpol.gov.ru
      194.226.80.177  time.gov.ru
      194.226.80.187  mylex.gov.ru
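
A neighborhood scan of this kind can be sketched with Python's standard library: generate the addresses on either side of a starting IP, then ask for the reverse DNS name of each. The function names are made up for this example, and `reverse_lookup` is defined but not called here because it needs live network access and depends on what the remote DNS currently publishes.

```python
import ipaddress
import socket
from typing import List, Optional


def neighbor_addresses(ip: str, radius: int) -> List[str]:
    """Addresses from ip - radius to ip + radius, inclusive."""
    center = ipaddress.IPv4Address(ip)
    return [str(center + offset) for offset in range(-radius, radius + 1)]


def reverse_lookup(ip: str) -> Optional[str]:
    """Reverse DNS name for ip, or None if no name is defined or the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


# Addresses around 194.226.80.160 (listed as www.gov.ru on the slide).
for address in neighbor_addresses("194.226.80.160", 2):
    print(address)  # pass each address to reverse_lookup() when online
```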

  48. Autonomous System Numbers (ASNs) connection_detective.html • Most Internet providers have an Autonomous System Number • ASNs are part of the announcement of “routing policies” between ISPs (BGP = Border Gateway Protocol) • Global Internet routing tables contain “all” such announcements • “Traceroute = a path at individual router level” • “AS Mapping = paths at ISP level of detail” [Diagram: a chain of routers grouped into AS14312, AS701 and AS512]
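
The difference between the router-level view and the AS-level view can be shown by collapsing a list of router hops into the sequence of autonomous systems they belong to. In the Python sketch below, the router names and the router-to-ASN table are hypothetical; only the three AS numbers come from the slide.

```python
from typing import Dict, List


def as_path(router_hops: List[str], asn_of: Dict[str, str]) -> List[str]:
    """Collapse a router-level path into an AS-level path."""
    path: List[str] = []
    for hop in router_hops:
        asn = asn_of.get(hop)
        # Skip unknown routers and consecutive routers inside the same AS.
        if asn is not None and (not path or path[-1] != asn):
            path.append(asn)
    return path


# Hypothetical routers; the assignment to AS numbers is for illustration only.
ASN_OF = {
    "r1.example": "AS14312",
    "r2.example": "AS14312",
    "r3.example": "AS701",
    "r4.example": "AS701",
    "r5.example": "AS701",
    "r6.example": "AS512",
}

ROUTER_PATH = ["r1.example", "r2.example", "r3.example",
               "r4.example", "r5.example", "r6.example"]

print(len(ROUTER_PATH), "router hops ->", as_path(ROUTER_PATH, ASN_OF))
```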

  49. Mappings of Autonomous Systems connection_detective.html • Maps the routes towards a specific AS (source: robtex.com) • Some of these websites require Java • Fixedorbit.com has a text interface • Animation shows routing fluctuations (source: http://bgplay.routeviews.org/)

  50. Traceroute Tips • Initiate traceroutes towards websites hosted within your country (target pages = ISPs, exchange points, country registrar, other traceroute servers, etc.) • Initiate traceroutes from diverse geographic starting points. • If possible, initiate traceroutes from within the country heading outward (try traceroute.org, or search for: traceroute domain:ru) • Are there key ISPs / exchange points that appear in most of these traceroutes?
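
The last tip can be turned into a small tally: collect several traceroutes, reduce each hop to a provider name, and count how many traces each provider shows up in. The Python sketch below uses a naive rule (the last two labels of the hostname) that works for names like ebone.net but not for names under net.za or similar, so treat it as a rough first pass. The hostnames are taken from the example traces in this deck; the function names are assumptions.

```python
from collections import Counter
from typing import Iterable, List


def provider(hostname: str) -> str:
    """Naive provider name: the last two labels of the hostname, lowercased."""
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def key_providers(traces: Iterable[List[str]], min_share: float = 0.5) -> List[str]:
    """Providers that appear in at least min_share of the traces."""
    traces = list(traces)
    counts: Counter = Counter()
    for hops in traces:
        counts.update({provider(hop) for hop in hops})  # count once per trace
    return sorted(p for p, n in counts.items() if n / len(traces) >= min_share)


TRACES = [
    ["sdn-co-za-gw.digex.net", "icm-bb1-dc-0-1-T3.icp.net", "usnyk105-tc-p0-0.ebone.net"],
    ["bar3-serial4-1-0.Miami.cw.net", "oscar.LON.router.COLT.NET", "ar3.haj.de.colt.net"],
    ["ar3.haj.de.colt.net", "pos1-kyle.NYC.router.COLT.NET", "0.so-6-0-0.XL2.NYC4.ALTER.NET"],
]

print(key_providers(TRACES, min_share=0.6))
```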
