
World Wide Web

Presented by Bharath Praveen Swathi. The World Wide Web was created in 1989 by Tim Berners-Lee, working at the European Organization for Nuclear Research (CERN) in Geneva, Switzerland, and released in 1991. Web: a way of accessing information over the Internet.


Presentation Transcript


  1. World Wide Web Presented By Bharath Praveen Swathi

  2. World Wide Web • The World Wide Web was created in 1989 by Tim Berners-Lee, working at the European Organization for Nuclear Research (CERN) in Geneva, Switzerland, and released in 1991 • Web: a way of accessing information over the Internet • It is not the Internet itself, which is a network of networks that also carries other services such as email (SMTP) and file sharing (FTP) • A system of interlinked documents • Viewed through a Web browser • The first Web browser, written by Tim Berners-Lee and introduced in early 1991, ran on NeXT

  3. Architecture

  4. URI, URN & URL • <URI> := <scheme> : <scheme-specific-part> • Difference between URL, URN, and URI: • URL (locates a resource): http://www.tmrf.org/kpr/issue1.htm • URN (names a resource, uses the urn: scheme): urn:ISBN:0-395-36341-1 • URI (the general term covering both; may carry a fragment): http://www.tmrf.org/kpr/issue1.htm#one
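
The `<scheme> : <scheme-specific-part>` grammar on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a full URI parser: `split_uri` and `classify` are hypothetical helper names, and treating every non-`urn:` URI as a URL is a simplifying assumption. The ISBN URN is the example from the later Unicode and URI slide.

```python
from urllib.parse import urlsplit


def split_uri(uri):
    """Split a URI into (scheme, scheme-specific-part) at the first colon."""
    scheme, colon, rest = uri.partition(":")
    if not colon or not scheme:
        raise ValueError(f"not a URI: {uri!r}")
    return scheme.lower(), rest


def classify(uri):
    """Label a URI 'URN' if it uses the urn: scheme, otherwise 'URL'.

    Simplification: real URIs include schemes that are neither.
    """
    scheme, _ = split_uri(uri)
    return "URN" if scheme == "urn" else "URL"


# The standard library splits a URL further, including the fragment.
url_parts = urlsplit("http://www.tmrf.org/kpr/issue1.htm#one")

print(split_uri("urn:ISBN:0-395-36341-1"))
print(classify("http://www.tmrf.org/kpr/issue1.htm"))
print(url_parts.netloc, url_parts.path, url_parts.fragment)
```

The fragment (`#one`) is part of the URI reference but is resolved by the client, which is why `urlsplit` reports it separately from the path.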

  5. Web Protocols • ARP: Address Resolution Protocol • DHCP: Dynamic Host Configuration Protocol • DNS: Domain Name System • FTP: File Transfer Protocol • HTTP: Hypertext Transfer Protocol • IMAP: Internet Message Access Protocol • ICMP: Internet Control Message Protocol • IRDP: ICMP Router Discovery Protocol • IP: Internet Protocol • IRC: Internet Relay Chat Protocol • POP3: Post Office Protocol version 3 • PAR: Positive Acknowledgment and Retransmission • RLOGIN: Remote Login • SMTP: Simple Mail Transfer Protocol • SSL: Secure Sockets Layer • SSH: Secure Shell • TCP: Transmission Control Protocol • TELNET: TCP/IP Terminal Emulation Protocol • UDP: User Datagram Protocol • Related acronyms that are not network protocols: DSN (Data Source Name), UPS (Uninterruptible Power Supply)

  6. HTTP • Hypertext Transfer Protocol • HTTP 1.1 • Persistent connections • Pipelining • Cache validation commands • Request methods: GET, POST, PUT, HEAD, DELETE, TRACE, OPTIONS, CONNECT

  7. Request & Response • Request • GET • POST
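
To make the request and response messages concrete, here is a rough sketch of what a GET and a POST look like on the wire, and how a client might read the status line of a response. `build_request` and `parse_status_line` are hypothetical helpers written for this illustration; the `/search` path, the `www.example.org` host, and the form body are assumptions, and a real client would normally use a library such as `http.client` instead.

```python
def build_request(method, path, host, body=""):
    """Format an HTTP/1.1 request message as text.

    HTTP/1.1 requires a Host header; a body needs a Content-Length.
    """
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        data = body.encode("utf-8")
        lines.append("Content-Type: application/x-www-form-urlencoded")
        lines.append(f"Content-Length: {len(data)}")
    # A blank line separates the headers from the (possibly empty) body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_status_line(response):
    """Return (version, status code, reason) from a raw response."""
    status_line = response.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


get_request = build_request("GET", "/kpr/issue1.htm", "www.tmrf.org")
post_request = build_request("POST", "/search", "www.example.org", "q=web")

print(get_request)
print(post_request)
print(parse_status_line("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"))
```

A GET carries its parameters in the request line, while a POST carries them in the body, which is why only the POST needs `Content-Length`.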

  8. Languages used • Client Side • HTML, CSS, JavaScript, AJAX, Flex 3 • Server Side • .NET (ASP.NET, VB.NET, C#) • Java (JSP, Servlets, plain Java classes) • CGI Perl / PHP • Other Languages • Ada 95, AppleScript, BEF & Dylan (similar to Pascal), CCI (Common Client Interface), CMM, Guile, HyperTalk, Icon, KQML (Knowledge Query and Manipulation Language), Linda, Lingo, Lisp, ML, Modula-3, Obliq, Phantom, Python, REXX, ScriptX, SDI (Software Development Interface), VRML

  9. Web 2.0 • AJAX • Reverse AJAX • Democracy • (Wiki, reddit, digg, youtube) • RIA • SOA • Mashups • Widgets • Feeds, RSS, Web services • Blogging • Tagging

  10. Ajax • Architecture

  11. Ajax • Technologies Associated • XHTML & CSS for presentation • DOM to interact with data • XML & XSLT for interchange and manipulation of data • XMLHttpRequest object for asynchronous communication • JavaScript to integrate all the above technologies • Advantages • Fast: no full page reload, only the affected section of the page is updated • Disadvantages • Actions are not registered in the browser's history • Dynamically loaded content needs an alternate way to be indexed by search engines • JavaScript must be enabled in the browser • Can increase server load

  12. Reverse AJAX • Server pushes data to all connected clients • DWR (Direct Web Remoting)

  13. Mashups • Mixing multiple services together to produce a new one • Types: Data & Enterprise mashups • Tools: Microsoft Popfly, Yahoo Pipes, Google Mashup Editor

  14. Widgets • UWA Universal Widget API from NetVibes

  15. Feeds – RSS, JSON, Atom

  16. Web 3.0 • The Data Web • making data as openly accessible and linkable as Web pages • Querying for data across distributed RDF databases • Semantic web

  17. Open Social • A common API for social applications across multiple websites • Supports interoperability with other social networks that support it • Core Services: People & Friends, Activities, Persistence • Platforms: Google, hi5, MySpace, imeem • HTML, JavaScript, REST, OAuth

  18. Summary • Making the web more social • Current version 0.7 • Orkut, MySpace, hi5, Netlog, imeem, LinkedIn • Easy to get data • Apache Shindig: open source container for hosting OpenSocial applications

  19. Semantic Web • Introduction • History • Architecture • Challenges • Future • Conclusion [Image: logo of the Semantic Web]

  20. What is Semantic Web ? • Meaningful representation of data on the World Wide Web • Processed by humans as well as machines on a global scale

  21. Why do we need Semantic Web ? • Enhanced Search and Discovery • Enhanced System and Data Interoperability • Knowledge Management • Semantic Web Service • Electronic Commerce

  22. History • 1989 – Vision of Tim Berners-Lee • 1994 – Presented at first WWW conference • 2002 – Architecture

  23. Architecture Source: Berners-Lee, T. Semantic Web - XML2000 – Architecture. Retrieved July 11, 2008 from http://www.w3.org/2000/Talks/1206-xml2k-tbl/slide10-0.html

  24. Unicode and URI • Unicode – International standard for encoding text • Ex: UTF-8, UTF-16 • URI – Uniform Resource Identifier • Uniform Resource Locator (URL) • Identifies resources via a representation of their primary access mechanism • Ex: http://seal.ifi.unizh.ch • Uniform Resource Name (URN) • Globally unique and persistent even when the resource ceases to exist or becomes unavailable • Ex: urn:ISBN:0-395-36341-1
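
UTF-8 and UTF-16 are two byte encodings of the same Unicode code points. The short sketch below compares how many bytes each needs for a few sample characters; `encoded_sizes` is a hypothetical helper and the sample characters are chosen only for illustration.

```python
def encoded_sizes(text):
    """Count code points and encoded bytes for a piece of text.

    "utf-16-le" is used so that no byte-order mark is added to the count.
    """
    return {
        "code_points": len(text),
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


samples = {
    "Latin letter A": "A",
    "Euro sign": "\u20ac",
    "Globe symbol (outside the Basic Multilingual Plane)": "\U0001F310",
}

for label, text in samples.items():
    print(label, encoded_sizes(text))

# Decoding with the matching encoding recovers the original text.
assert "\u20ac".encode("utf-8").decode("utf-8") == "\u20ac"
```

ASCII text is smaller in UTF-8, while characters outside the Basic Multilingual Plane take four bytes in both encodings.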

  25. XML and Namespace • eXtensible Markup Language • Stores data in related entities • Provides standard for storage layout and logical structure • Supports syntactic interoperability • Namespace • Elements and attributes have expanded names • Expanded name = Namespace name + Local name • Namespace name – name holding URI • XML Schema
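
The rule "expanded name = namespace name + local name" can be observed directly with Python's standard `xml.etree.ElementTree`, which reports element tags as `{namespace}local`. The two small documents below are made up for illustration; only the `http://www.example.org/terms/` namespace is borrowed from the RDF example later in the deck.

```python
import xml.etree.ElementTree as ET

NS = "http://www.example.org/terms/"

# Two documents that bind the same namespace URI to different prefixes.
DOC_A = """<doc xmlns:exterms="http://www.example.org/terms/">
  <exterms:creator>John Smith</exterms:creator>
</doc>"""

DOC_B = """<doc xmlns:t="http://www.example.org/terms/">
  <t:creator>John Smith</t:creator>
</doc>"""


def expanded_name(namespace, local):
    """Build the {namespace}local form that ElementTree uses for tags."""
    return f"{{{namespace}}}{local}"


child_a = ET.fromstring(DOC_A)[0]
child_b = ET.fromstring(DOC_B)[0]

# The prefix is only a local shorthand; the expanded names are identical.
print(child_a.tag)
print(child_a.tag == child_b.tag == expanded_name(NS, "creator"))
```

Because the prefix is discarded after parsing, two vocabularies can both define a `creator` element without clashing as long as their namespace URIs differ.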

  26. RDF – Resource Description Framework • Language for representing metadata of web resources • Framework for exchange of information between applications without loss of meaning

  27. RDF Model • Resource - Thing being described by RDF expression • Property - Specific aspect, characteristic, attribute, or relation used to describe a resource. • Statement - A specific resource + a named property + the value of that property for that resource • Represented as 3-tuple – Subject, Predicate and Object • Ex: http://www.example.org/index.html has a creator called John Smith
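
The subject, predicate, object model maps naturally onto a tuple. The sketch below is a toy in-memory triple store, not an RDF library: `Triple` and `TripleStore` are hypothetical names, the predicate URI follows the `exterms:creator` term used in the RDF/XML example later in the deck, and the second statement (a creation date) is added only for illustration.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])


class TripleStore:
    """A tiny in-memory set of RDF-style statements."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add(Triple(subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return triples matching the given parts; None is a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.object == obj)
        )


store = TripleStore()
page = "http://www.example.org/index.html"
store.add(page, "http://www.example.org/terms/creator", "John Smith")
# Illustrative extra statement, not taken from the slides.
store.add(page, "http://www.example.org/terms/creation-date", "August 16, 1999")

# "Who is the creator of index.html?"
for triple in store.match(subject=page, predicate="http://www.example.org/terms/creator"):
    print(triple.object)
```

Querying by pattern with wildcards is the same basic idea that RDF query languages build on.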

  28. RDF Model - Example Source: Manola, Miller, McBride (2004, February). The RDF Primer. W3C Recommendations.

  29. RDF Model – Example (Contd…) Source: Manola, Miller, McBride (2004, February). The RDF Primer. W3C Recommendations.

  30. Why RDF and not just XML ? • Many XML trees for single 3-tuple • XML parser cannot distinguish subject, object and property • RDF model – direct, unambiguous and decentralized

  31. Why RDF and not just XML ? (Contd…) • Example • 3-tuple (index.html, author, John Smith) • Relationship: index.html has author John Smith • <?xml version="1.0"?> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:exterms="http://www.example.org/terms/"> <rdf:Description rdf:about="http://www.example.org/index.html"> <exterms:creator>John Smith</exterms:creator> </rdf:Description> </rdf:RDF>

  32. Why RDF and not just XML ? (Contd…) • Possible XML trees • <author> <uri> Index.html </uri> <name>John Smith</name> </author> • <document href=" Index.html "> <author> John Smith </author> </document> • <document> <details> <uri>href=" Index.html "</uri> <author> <name> John Smith </name> </author> </details> </document> or maybe • <document> <author> <uri>href=" Index.html "</uri> <details> <name> John Smith </name> </details> </author> </document>
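
The point that an XML parser cannot tell subject, property, and object apart can be shown by trying to extract the same statement from three of the trees above. In this sketch each tree shape needs its own hand-written extraction function, because the knowledge of which node plays which role lives in the code, not in the XML. The trees are slightly simplified from the slide (whitespace trimmed, `href=` text dropped from `<uri>`), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

TREE_A = "<author><uri>Index.html</uri><name>John Smith</name></author>"
TREE_B = '<document href="Index.html"><author>John Smith</author></document>'
TREE_C = (
    "<document><details><uri>Index.html</uri>"
    "<author><name>John Smith</name></author></details></document>"
)


def triple_from_a(root):
    # Subject and object are both children of <author>.
    return (root.findtext("uri"), "author", root.findtext("name"))


def triple_from_b(root):
    # Subject is an attribute; object is element text.
    return (root.get("href"), "author", root.findtext("author"))


def triple_from_c(root):
    # Both parts are nested one level deeper, under <details>.
    return (root.findtext("details/uri"), "author", root.findtext("details/author/name"))


results = [
    triple_from_a(ET.fromstring(TREE_A)),
    triple_from_b(ET.fromstring(TREE_B)),
    triple_from_c(ET.fromstring(TREE_C)),
]
for result in results:
    print(result)
```

All three calls yield the same 3-tuple, but only because a person wrote a different rule for each layout; RDF removes that step by fixing the statement structure.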

  33. RDF Schema (RDFS) • Collection of classes authored for specific purpose or domain • Classes organized in hierarchy • Describes inheritance hierarchies, class schemas, properties, domain and range and restriction for properties • Supports extensibility and reusability • Multiple views of same metadata

  34. RDFS - Example • <Class ID="Animal"/> • <Class ID="Male"> <subClassOf resource="#Animal"/> </Class> <Class ID="Female"> <subClassOf resource="#Animal"/> <disjointFrom resource="#Male"/> </Class>
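
As a rough illustration of what a schema like this lets software conclude, the sketch below encodes the Animal, Male, and Female classes as plain Python data and checks subclass membership and disjointness. It is a toy model under simplifying assumptions (each class has at most one direct superclass), not an RDFS reasoner, and the function names are hypothetical.

```python
# Direct superclass of each class, taken from the example above.
SUBCLASS_OF = {"Male": "Animal", "Female": "Animal"}

# Pairs of classes declared disjoint (no individual may belong to both).
DISJOINT = {frozenset(("Female", "Male"))}


def ancestors(cls):
    """Return all superclasses of cls, nearest first."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found


def is_subclass_of(cls, other):
    """True if cls is the same class as other or inherits from it."""
    return cls == other or other in ancestors(cls)


def consistent(classes):
    """True if no two of the given classes are declared disjoint."""
    classes = list(classes)
    return not any(
        frozenset((a, b)) in DISJOINT
        for i, a in enumerate(classes)
        for b in classes[i + 1:]
    )


print(is_subclass_of("Male", "Animal"))
print(consistent(["Male", "Animal"]))
print(consistent(["Male", "Female"]))
```

An individual typed as both Male and Female would be flagged as inconsistent, which is the kind of check the `disjointFrom` declaration is meant to enable.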

  35. Web Ontology Language (OWL) • Extends RDFS • Specifies axioms based on the classes of entities, their properties and relationships • Draws inferences based on axioms

  36. OWL (Contd…) Source: Berners-Lee, T. Semantic Web Road map. Retrieved July 11, 2008 from http://www.w3.org/DesignIssues/Semantic.html

  37. Challenges • Standardizing Semantic Web Stack • Developing Ontologies • Converting existing WWW into Semantic Web • Capturing Cultural Semantics • Interoperability Issues

  38. Some News… • SPARQL Protocol • Semantic Search Engines – Google, Yahoo, Intelliseek • Jena Semantic Web Toolkit – HP • Joseki Web API – HP • Wilbur – Nokia

  39. What is Cloud Computing? • An emerging computing paradigm where data and services reside in massively scalable data centers and can be ubiquitously accessed from any connected device over the Internet • Connected devices: Web 2.0-enabled PCs, TVs, etc.; 4+ billion phones by 2010 [Source: Nokia]

  40. Characteristics of Cloud Computing • Virtual – Physical location and underlying infrastructure details are transparent to users • Scalable – Able to break complex workloads into pieces to be served across an incrementally expandable infrastructure • Efficient – Services Oriented Architecture for dynamic provisioning of shared compute resources • Flexible – Can serve a variety of workload types – both consumer and commercial

  41. Cloud Computing Building Blocks • A massively scalable and flexible computing platform of the future, built on IBM and open source software, for hosting Web 2.0 and SOA applications • Enabling Technologies • Open source Linux platform • Xen open source systems virtualization • Automated provisioning of computing resources by Tivoli Provisioning Manager • Systems management and monitoring by IBM Tivoli Monitoring • Parallel computing clusters using Apache Hadoop • Open source Eclipse-based development tools for parallel applications • Business Benefits • Cost efficient model for creating and acquiring information services • Removes or reduces IT management complexity • Increases business responsiveness with real-time capacity reallocation • Powers rich internet applications

  42. Cloud Computing Architecture [Diagram labels: Apache; Virtual Machines; Tivoli Monitoring Agent; Open Source Linux with Xen; Virtualized Infrastructure based on Open Source Linux & Xen; Data Center – System x; Provisioning (Baremetal & Xen VM); Monitoring; IBM Monitoring v.6; DB2; Provisioning Manager v.5.1; WebSphere Application Server; Provisioning Management Stack]

  43. Examples of Cloud Computing Workloads • Web 2.0 applications • Software to scan voluminous Wikipedia edits to identify spam • Organize global news articles by geographic location • Data-intensive workloads based on scalable architectures. • Next generation rich media, such as virtual worlds, streaming videos, etc. • New services can be created and published via a completely integrated Eclipse-based environment

  44. Joint IBM Google Announcement • Train future workforce with next generation computing skills • University initiative to promote open standards and emerging parallel computing model • Jointly provide compute platform of the future including hardware, software, and services to support new parallel computing curricula • Three active “clouds” • IBM Almaden Research • Google • Universities participating in initial pilot • U. of Washington

  45. Web Mining Web mining is the use of data mining techniques to automatically discover and extract information from Web documents/services

  46. Why is Web Information Retrieval Important? • Research • Health/Medicine • Travel • Business • Entertainment • Arts

  47. Why is Web Information Retrieval Difficult? • The Abundance Problem • Hundreds of irrelevant documents returned in response to a search query • Limited Coverage of the Web • Largest crawlers cover less than 18% of Web pages • The Web is extremely dynamic • Lots of pages added, removed and changed every day • Very high dimensionality (thousands of dimensions) • Limited query interface based on keyword-oriented search • Limited customization to individual users

  48. Web Mining Taxonomy [Diagram: Web Mining divides into Web Content Mining, Web Structure Mining, and Web Usage Mining]

  49. Web Mining Taxonomy • Web content mining: focuses on techniques for assisting a user in finding documents that meet a certain criterion (text mining) • Web structure mining: aims at developing techniques to take advantage of the collective judgment of web page quality which is available in the form of hyperlinks • Web usage mining: focuses on techniques to study the user behavior when navigating the web (also known as Web log mining and clickstream analysis)
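
One well-known way to turn hyperlinks into a page-quality score is PageRank. The slides do not name a specific algorithm, so the sketch below is an assumption chosen to illustrate web structure mining: a simplified power-iteration PageRank over a tiny, made-up link graph with hypothetical page names.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by the link structure alone.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: pages a, b and d all link to c; c links back to a.
graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Page `c` ends up with the highest score because three pages link to it, and `a` comes second because the highly ranked `c` links to it: each hyperlink acts as a vote whose weight depends on the voter.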

  50. Web Content Mining • Can be thought of as extending the work performed by basic search engines. • Search engines have crawlers to search the web and gather information, indexing techniques to store the information, and query processing support to provide information to the users • Web Content Mining is: the process of extracting knowledge from web contents
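
The indexing and query-processing stages described above can be sketched with an inverted index, a common structure for keyword search. This is a minimal illustration that assumes the crawler has already fetched the pages; the URLs and page texts are made up, and `tokenize`, `build_index`, and `search` are hypothetical names.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each term to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Stand-in for pages gathered by a crawler (illustrative content).
crawled = {
    "http://example.org/rdf": "RDF represents metadata of web resources",
    "http://example.org/owl": "OWL extends RDF Schema with axioms",
    "http://example.org/cloud": "Cloud computing hosts web applications",
}

index = build_index(crawled)
print(sorted(search(index, "rdf")))
print(sorted(search(index, "web resources")))
```

Web content mining goes beyond this lookup step, for example by clustering or classifying the pages an index like this returns.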
