Introduction to Web Science


kevork


Presentation Transcript


  1. Introduction to Web Science Web 1.0

  2. Introducing Web 1.0 • Packet switching network • IP Addressing • Internet Applications • The WWW and markup • Searching the WWW • Intelligent Agents • Internet Governance

  3. Packet-Switched Networks (1) • Local area network (LAN) • Network of computers located close together • Wide area networks (WANs) • Networks of computers connected over greater distances • Circuit • Combination of telephone lines and closed switches that connect them to each other

  4. Packet-Switched Networks (2) • Circuit switching is used in telephone communication • The Internet uses packet switching • Packet switching needs computers called ‘routers’ and the programs called ‘routing algorithms’

  5. Packet-Switched Networks (3) • Information is divided into packets • It is passed from node to node • It is recomposed as one chunk on the destination server

  6. Routing Packets • Routing computers • Computers that decide how best to forward packets • Routing algorithms • Rules contained in programs on router computers that determine the best path on which to send packets • Programs apply their routing algorithms to information they have stored in routing tables
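
As an illustration of how a routing algorithm can turn link information into a routing table, here is a minimal sketch using Dijkstra's shortest-path algorithm. The slides do not name a specific algorithm, so the choice of Dijkstra, the function name `build_routing_table`, and the four-router topology `LINKS` are all assumptions made for the example.

```python
import heapq

def build_routing_table(links, source):
    """Return {destination: (next_hop, total_cost)} as seen from `source`.

    `links` maps each router to its neighbours and the cost of each link.
    """
    table = {}
    visited = set()
    # Each queue entry: (cost so far, router, first hop taken from the source)
    queue = [(0, source, None)]
    while queue:
        cost, node, first_hop = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node != source:
            table[node] = (first_hop, cost)
        for neighbour, link_cost in links.get(node, {}).items():
            if neighbour not in visited:
                # Leaving the source, the neighbour itself is the first hop.
                hop = neighbour if node == source else first_hop
                heapq.heappush(queue, (cost + link_cost, neighbour, hop))
    return table

# Hypothetical topology: routers A-D with symmetric link costs.
LINKS = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

table = build_routing_table(LINKS, "A")
for destination, (next_hop, cost) in sorted(table.items()):
    print(f"to {destination}: forward to {next_hop} (cost {cost})")
```

Router A only needs to remember the next hop for each destination; the rest of the path is decided by the routers further along.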

  7. TCP/IP • Communications protocol suite • Packet switched protocol • No end-to-end connection is required • Each message broken down into small pieces called packets • Packets possibly routed to destination over different paths • Transmission Control Protocol (TCP) • Breaks messages into packets • Numbers packets in order • Reorders packets at the destination • Internet Protocol (IP) • Routes packets to the proper destination
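
The three TCP tasks listed above (break into packets, number them, reorder at the destination) can be sketched in a few lines. This is a toy model only: real TCP also handles acknowledgements, retransmission and flow control, and the helper names `to_packets` and `reassemble` are made up for the example.

```python
import random

def to_packets(message, size):
    """Split a message into (sequence_number, chunk) pairs."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]

def reassemble(packets):
    """Sort packets by sequence number and join the chunks."""
    return "".join(chunk for _, chunk in sorted(packets))

message = "Packets may travel over different paths."
packets = to_packets(message, 8)

# Simulate packets arriving out of order after taking different routes.
random.shuffle(packets)

restored = reassemble(packets)
print(restored == message)
```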

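  8. Open Systems Interconnection (OSI) Model • A seven-layer reference model, often compared with (but distinct from) the TCP/IP protocol suite • Layers, from the highest to the lowest: Application, Presentation, Session, Transport, Network, Data Link, Physical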

  9. IP Address • Internet addresses are based on a 32-bit number called an IP address • IP addresses are written as four decimal numbers (each from 0 to 255) separated by periods • An address such as 126.204.89.56 uniquely identifies a computer connected to the Internet • IP subnetting conceptually divides a large network into smaller sub-networks
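
To make the "32-bit number" concrete, the sketch below converts the slide's example address to a single integer, once with Python's standard `ipaddress` module and once by hand with bit shifts. The variable names are illustrative only.

```python
import ipaddress

addr = ipaddress.IPv4Address("126.204.89.56")

# The library view: the address as one 32-bit integer.
as_int = int(addr)

# The manual view: each dotted number is one byte of that integer.
octets = [int(part) for part in str(addr).split(".")]
manual = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]

print(as_int == manual)
print(format(as_int, "032b"))  # the same address as 32 binary digits
```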

  10. IP Classes (1)

  11. IP Classes (2)

  12. Subnetting

  13. Without subnetting … • Explosion in size of IP routing tables. • Every time more address space was needed, the administrator would have to apply for a new block of addresses. • Any changes to the internal structure of a company's network would potentially affect devices and sites outside the organization. • Keeping track of all those different Class C networks would be a bit of a headache in its own right.

  14. Benefits of Subnetting • Better Match to Physical Network Structure • Flexibility • Invisibility To Public Internet • No Need To Request New IP Addresses • No Routing Table Entry Proliferation
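
A small sketch of subnetting with Python's standard `ipaddress` module follows. The block `192.168.0.0/24` and the decision to split it into four parts are assumptions chosen for illustration; they do not come from the slides.

```python
import ipaddress

# Hypothetical address block assigned to an organisation.
network = ipaddress.ip_network("192.168.0.0/24")

# Borrow 2 host bits for the subnet number: 2**2 = 4 subnets.
subnets = list(network.subnets(prefixlen_diff=2))

for subnet in subnets:
    print(subnet, "holds", subnet.num_addresses, "addresses")

# Outside routers still see only the single /24 entry.
print(all(subnet.subnet_of(network) for subnet in subnets))
```

The internal split is invisible to the public Internet, which is why no new addresses or routing table entries are needed.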

  15. IPv6 (or IP Next Generation) • Network layer • Developed in 1994 • Will replace the IPv4 standard • IPv4 supports only 4,294,967,296 addresses (32 bits) • This limit will eventually lead to exhaustion of available addresses (by 2023) • Improvements include • Providing future cell phones and mobile devices their own unique and permanent addresses • Supports about 3.4 × 10^38 addresses (128 bits)
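
The two address counts on this slide follow directly from the address widths, as this short check shows. The variable names are illustrative.

```python
# Number of distinct addresses for a given address width in bits.
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128

print(f"{ipv4_total:,}")    # exact IPv4 count with thousands separators
print(f"{ipv6_total:.1e}")  # IPv6 count in scientific notation

# Every single IPv4 address could be replaced by 2**96 IPv6 addresses.
ratio = ipv6_total // ipv4_total
print(ratio == 2 ** 96)
```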

  16. Domain Names • A Uniform Resource Locator (URL) consists of names and abbreviations that are much easier to remember than IP addresses • The HTTP protocol defines how an Internet resource is accessed • An address such as www.microsoft.com is called a domain name • Domain Name System (DNS) • A database of Internet names • DNS Servers convert Internet names to IP addresses • Top level domains
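
The relationship between a URL, its domain name and an IP address can be sketched as follows. The lookup table `HOSTS` stands in for a DNS server and is entirely hypothetical (it uses the reserved example domain and a documentation-only address); a real program would ask the operating system's resolver instead.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for a DNS database: name -> IP address.
HOSTS = {"www.example.com": "192.0.2.10"}

def resolve(url, table=HOSTS):
    """Return (scheme, domain name, IP address or None) for a URL."""
    parts = urlparse(url)
    return parts.scheme, parts.hostname, table.get(parts.hostname)

scheme, host, ip = resolve("http://www.example.com/index.html")
print(scheme, host, ip)
```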

  17. Top-Level Domain Names • Internet Corporation for Assigned Names and Numbers (ICANN) • Responsible for managing domain names and coordinating them with IP address registrars

  18. Domain Name case study • The web was not an ‘open’ place • Only one company, Network Solutions, sold .com, .net or .org domains • Price of 100 dollars with a two-year minimum • Back then, there was a good chance you could still buy a dictionary word as a .com • In 2000, it lost its monopoly position and domain prices dropped over 95% • Since then innovation halted and Network Solutions became one of thousands of anonymous domain registrars

  19. Internet Applications • E-Mail • File transfers • Instant messaging (IM) • Newsgroups • Streaming audio and video • Internet telephony • World Wide Web (WWW)

  20. E-Mail • Most popular and widely used Internet application • 30 billion e-mails sent every day • Spam – junk e-mail messages • Spam costs corporate America $9 billion per year • Every e-mail message contains a header that describes the source and destination of the message • E-mail messages are text, but may have attachments of many types of digital data • Viruses are often transmitted via e-mail
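
The header and attachment structure described above can be seen by building a message with Python's standard `email` package. The addresses, subject and attachment bytes below are placeholders invented for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()

# Header fields describe the source and destination of the message.
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.com"
msg["Subject"] = "Lecture notes"

# The body is plain text...
msg.set_content("The slides are attached.")

# ...while binary data travels as an attachment (placeholder bytes here).
msg.add_attachment(
    b"placeholder bytes standing in for a real file",
    maintype="application",
    subtype="octet-stream",
    filename="slides.bin",
)

print(msg["From"], "->", msg["To"])
print(msg.get_content_type())
print([part.get_filename() for part in msg.iter_attachments()])
```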

  21. SMTP, POP, and IMAP (1) • E-mail sent across the Internet is managed and stored by mail servers • Simple Mail Transfer Protocol (SMTP) is the standard for sending mail to the server • Post Office Protocol (POP) is the standard for retrieving mail from the server • The Internet Message Access Protocol (IMAP, originally the Interactive Mail Access Protocol) is a newer e-mail retrieval protocol

  22. SMTP, POP, and IMAP (2)

  23. Controlling Spam • Use complex email addresses rather than name and surname combinations • Why? Bots? Name directories? • Control exposure of your email address • How? JavaScript? JPEG? • Use multiple email addresses for different purposes • On what occasions? • Use content-filtering software • Black list spam filter • White list spam filter • Challenge-response using graphical challenges?
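
The three filtering approaches on this slide can be combined into one decision function, sketched below. The function name `classify`, the order of the checks and the sample lists are all assumptions; real filtering software is considerably more elaborate.

```python
def classify(sender, subject, whitelist, blacklist, spam_words):
    """Return 'accept', 'reject' or 'challenge' for an incoming message."""
    sender = sender.lower()
    if sender in whitelist:        # white list: known good senders
        return "accept"
    if sender in blacklist:        # black list: known spammers
        return "reject"
    if any(word in subject.lower() for word in spam_words):
        return "reject"            # simple content filter
    return "challenge"             # unknown sender: ask for a challenge-response

# Hypothetical lists for illustration.
WHITELIST = {"friend@example.com"}
BLACKLIST = {"spammer@example.net"}
SPAM_WORDS = {"lottery", "free money"}

print(classify("friend@example.com", "Lunch?", WHITELIST, BLACKLIST, SPAM_WORDS))
print(classify("new@example.org", "You won the LOTTERY", WHITELIST, BLACKLIST, SPAM_WORDS))
print(classify("new@example.org", "Project question", WHITELIST, BLACKLIST, SPAM_WORDS))
```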

  24. E-Mail Case Study • Hotmail (1995) • First place to get a free email address, disconnected from an ISP • 4 years later, 30 million people worldwide were exchanging @hotmail email addresses • Bought by Microsoft in 1998 for just 400 million dollars • 2007: the end of Hotmail • Transformation to “Live” mail to become an integrated part of Microsoft’s “Live” family

  25. File Transfers • File transfer protocol (FTP) • Protocol providing for transmission of a file between an Internet server and a user’s computer • Peer-to-peer (P2P) file sharing • Share data from one computer to another • Every user can be a server • Napster • Kazaa • Gnutella • Torrent • With P2P, every user on the network can make data available to every other user on the network

  26. Instant Messaging • Allows user to create a private chat session with another user • IM started with AOL • IM sneaking into corporate networks • Many Web-based companies use IM technology for customer service • eBay

  27. ICQ case study • ICQ is a play on “I seek you” • 1996: first easy-to-use instant messenger program where you could add friends to your list and see if they were online • Back then it was revolutionary for the masses and it became the ‘application’ everybody had installed • Acquired by AOL in June 1998 for a whopping $287 million • Eventually the program got too many additional features that made the application heavy and unorganized • Competition from AOL IM, Yahoo IM, and MSN Messenger increased, and friends on your ICQ list left the application, eventually resulting in a mass abandonment of the network

  28. Usenet Newsgroups • Online, bulletin board discussion forums • Users post and read messages • More than 100,000 newsgroups • Millions of newsgroup readers • Important information resource, especially for technical issues and products • Newsgroup messages distributed using open standard • Many are uncensored

  29. Streaming Audio and Video • Creating and sending audio and video files • Sports • Basketball at sports.yahoo.com • Major league baseball • News • Fox News • CNN radio • Business • ZDNet • Education • Warriors of the Net

  30. Internet Telephony • Voice over Internet Protocol (VoIP) • Use your computer like a telephone • Software connects computers via the Internet and transmits voice data • Savings come from eliminating toll charges between locations

  31. Internet TV

  32. The World Wide Web • Collection of hyperlinked computer files on the Internet • Client-server application • Web servers • Web browsers as clients • WWW standards • Hypertext markup language (HTML) • Current standard for writing Web pages • Tags in HTML instruct the client browser how to format and display the Web page content • Hypertext transfer protocol (HTTP) • Establishes a connection between Web server and client • Extensible markup language (XML) • A meta-markup language • Gives meaning to the data enclosed within XML tags

  33. Website case study • GeoCities: create your own free homepage on the web • 1997: fifth most popular website, with over 500,000 homepages created • Yahoo bought GeoCities two years later for $3.57 billion and started to actively commercialize the homepages with various advertising types, which resulted in their death sentence • With ‘real’ web hosting becoming affordable for anybody, the need for free homepages in this form vanished

  34. Overview of Markup Languages • SGML is a rich meta language that is useful for defining markup languages • HTML is particularly useful for displaying Web pages • XML defines data structures for electronic commerce (and much more …)

  35. http://www.w3.org/ Development of Markup Languages

  36. Standard Generalized Markup Language • The ISO adopted SGML standard in 1986 • SGML is nonproprietary and platform-independent • SGML supports user-defined tags and architecture to complement the required richness of documents

  37. Extensible Markup Language • XML is a descendant of SGML • XML allows designers to easily describe and deliver structured data from any application in a standard, consistent way • XML can be embedded within an HTML document • XML allows you to create your own customized markup language.

  38. Learn XML in a slide • Tag – a piece of markup • An opening tag <name> • A closing tag </name> • Element – well-formed usage of tags • <name>Alexiei</name> • Attribute – properties • <name length="7">Alexiei</name> • Rules to keep XML well formed • Can be nested but not overlapping • Case sensitivity • Quoted attributes • Required end tag • Shorthand • <abc></abc> is equivalent to <abc/>

  39. Some XML examples <book>E-Commerce</booK> <book pages=100>E-Commerce</book> <book pages="100"><title>E-Commerce</book></title> <book pages="100"><title>E-Commerce</title></book> <book pages="100"> <title>E-Commerce</title> <author> <name>Gary</name> <surname>Schneider</surname> </author> </book>

  40. Some XML examples <book>E-Commerce</booK> <book pages=100>E-Commerce</book> <book pages="100"><title>E-Commerce</book></title> <book pages="100"><title>E-Commerce</title></book> <book pages="100"> <title>E-Commerce</title> <author> <name>Gary</name> <surname>Schneider</surname> </author> </book>
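
One way to check which of the examples above follow the well-formedness rules is to hand each one to an XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` is made up, and the examples are written with plain straight quotes.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

EXAMPLES = [
    '<book>E-Commerce</booK>',                             # end tag differs in case
    '<book pages=100>E-Commerce</book>',                   # attribute value not quoted
    '<book pages="100"><title>E-Commerce</book></title>',  # overlapping elements
    '<book pages="100"><title>E-Commerce</title></book>',  # properly nested
    '<book pages="100"><title>E-Commerce</title>'
    '<author><name>Gary</name><surname>Schneider</surname></author></book>',
]

results = [is_well_formed(example) for example in EXAMPLES]
for example, ok in zip(EXAMPLES, results):
    print("well formed" if ok else "NOT well formed", "->", example[:45])
```

The first three examples each break one rule from the previous slide (case sensitivity, quoted attributes, no overlapping), while the last two are accepted.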

  41. Processing a Request for an XML Page • Why go through all this hassle? • How would you go about displaying HTML on a • PC • Handheld • Mobile

  42. Hypertext Markup Language • Tim Berners-Lee invented HTML • HTML is a document production language that includes a set of tags that define the format and style of a document • HTML is based on SGML • HTML is an application of SGML: its tags are defined by an SGML Document Type Definition (DTD)

  43. HTML Tags • An HTML document contains both document content and tags • The tags are the HTML codes inserted in a document to specify the format on screen • Each tag is enclosed in angle brackets (< >) • Most tags are two-sided, with an opening and a closing tag • Well-formed tags, bots, meta tags? Why are they important?

  44. HTML Links • Hyperlinks are bits of text that connect the current document to: • Another location in the same document • Another document on the same host machine • Another document on the Internet • Can they link to a toaster at home? • Hyperlinks are created using the HTML anchor tag • Two popular link structures: • Linear hyperlink structure • Hierarchical hyperlink structure
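
The anchor tag and the three kinds of link targets listed above can be illustrated by parsing a small page with Python's standard `html.parser`. The page content in `PAGE` and the class name `LinkCollector` are invented for the example; only the W3C address appears elsewhere in the slides.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every anchor (<a>) tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical page with one link of each kind.
PAGE = """<html><body>
<a href="#top">Another location in the same document</a>
<a href="notes.html">Another document on the same host</a>
<a href="http://www.w3.org/">Another document on the Internet</a>
</body></html>"""

collector = LinkCollector()
collector.feed(PAGE)
print(collector.links)
```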

  45. HTML Version History • HTML version 1.0 was introduced in 1991 • HTML 2.0 was published as RFC 1866 in Nov. 1995 • HTML 3.2 was introduced in 1997 • HTML 4.0 was released by W3C in Dec. 1997 • HTML 4.01 was released in Dec. 1999 • XHTML 1.0 became a W3C recommendation in Jan. 2000

  46. HTML Editors (1) • Low-end editors display HTML code on the screen and allow you to insert HTML tag pairs by clicking selected buttons • High-end editors are Web site builder programs; they provide a rich environment that displays the Web page, not the HTML code • Microsoft FrontPage and Macromedia Dreamweaver are examples of Web site builders

  47. HTML Editors (2)

  48. Static versus Dynamic Pages • HTML and XML only display and exchange data • No interactivity; no processing of data • Scripting languages • Provides basic interactivity • Rollovers • Crawling text • JavaScript • VBScript • Full-featured Web programming • Java • Client side scripting or browser side scripting • Applets • J2EE • Common Gateway Interface (CGI) • Allows passing of data between a static HTML page and a computer program
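
To contrast a static page with a dynamic one, the sketch below shows the kind of work a CGI-style program does: it receives data (here a query string), processes it, and produces HTML as output. The function name `render_greeting` and the `name` parameter are assumptions made for illustration; a real CGI program would read the query string from its environment and write the page to standard output.

```python
from html import escape
from urllib.parse import parse_qs

def render_greeting(query_string):
    """Build an HTML page whose content depends on the submitted data."""
    params = parse_qs(query_string)
    name = params.get("name", ["visitor"])[0]
    # Escape user input so it is displayed as text, not interpreted as tags.
    return f"<html><body><h1>Hello, {escape(name)}!</h1></body></html>"

print(render_greeting("name=Ada"))
print(render_greeting(""))
```

A static HTML file would show the same greeting to everyone; here the same program yields a different page for each request.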
