
Introduction and Analysis of Web 2.0 Technologies


Introduction and Analysis of Web 2.0 Technologies. November 23, 2006. Jaesun Han ([email protected]), Research Fellow / Ph.D, ANLAB, Dept. of EECS, KAIST.


Introduction and Analysis of Web 2.0 Technologies

November 23, 2006

Jaesun Han ([email protected])

Research Fellow / Ph.D


Contact :


  • Introduction of Web 2.0

  • Key Philosophy of Web 2.0

    • Contents Production by User Participation

    • Decentralization of Contents Consumption

    • Contents Sharing by Openness

  • Web as Platform

    • Definitions

    • Case Studies

    • Enabling Technologies

      • Client technologies

      • Server technologies

      • Global Platform technologies

The Origin of Web 2.0

New Conference?

Question: What are the characteristics of the web companies that survived the dot-com collapse?

(Amazon, eBay, Yahoo, Google, etc.)

Seven Principles of Web 2.0

1. The Web as Platform

2. Harnessing Collective Intelligence

3. Data is the Next Intel Inside

4. End of the Software Release Cycle

5. Lightweight Programming Models

6. Software Above the Level of a Single Device

7. Rich User Experiences

-from Tim O’Reilly’s “What is Web 2.0?”

Traditional Contents Distribution




[Figure labels: Information Production; Information Producer; Information Consumer]


Key Philosophy of Web 2.0

[Figure labels: "Web as"; "(web services, RSS)"]








Contents Production by User Participation

UCC (User-Created Contents)





UCC Services






  • photo storing & sharing

  • video storing & sharing

  • blog & wiki

Expansion of User Participation

User-Created Media

User-Created Software

P2P Network → User-Generated Network

WiFi Community for free WiFi access → User-Generated Infrastructure

Collective Intelligence

  • Collective Intelligence

    • The crowd's "collective intelligence" will produce better outcomes than a small group of experts (users add value)

      → Network effects from user contributions

    • Requirements for Collective Intelligence

      • diversity of opinion

      • independence of members from one another

      • decentralization

      • a good method for aggregating opinions

  • Example cases of aggregating methods

    • Google PageRank

    • Yahoo Collaborative Tag Suggestion

    • Amazon Recommendation

Google PageRank

  • The Philosophy of PageRank

    • PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."

  • PageRank Algorithm

    • PR(A) : PageRank of page A

    • L(A) : The number of links going out of page A

    • d : damping factor
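
The formula itself did not survive on this slide, so the following is only a minimal sketch of how PR(A), L(A), and d fit together. It assumes the commonly cited normalized form, in which each page receives (1 - d)/N plus d times the sum of PR(T)/L(T) over every page T linking to it. The function name `pagerank`, the toy graph, and the default d = 0.85 are illustrative assumptions, not taken from the slides.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iteratively estimate PageRank for a small link graph.

    links maps each page to the list of pages it links to.
    Assumption: the normalized variant is used, so all ranks sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}          # start from a uniform distribution
    for _ in range(iterations):
        new = {p: (1.0 - d) / n for p in pages}
        for page in pages:
            out = links.get(page, [])
            if out:
                # page A passes d * PR(A) / L(A) to each page it links to
                share = d * pr[page] / len(out)
                for target in out:
                    new[target] += share
            else:
                # a page without outgoing links spreads its rank evenly
                share = d * pr[page] / n
                for p in pages:
                    new[p] += share
        pr = new
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
toy_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_graph)
```

In this toy graph, page C collects votes from both A and B, so it ends up ranked above B, which only receives half of A's vote.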

Yahoo Collaborative Tag Suggestion

  • The Philosophy of Collaborative Tag Suggestion

    • Selecting good tags

      • High popularity → tag quality

      • High coverage of multiple facets → good recall

      • Least effort → reduce the cost involved in browsing

    • Eliminating tag spam

      → Utilizing collective user tagging behavior and authorities

  • Collaborative Tag Suggestion Algorithm

    S(t’,o) = S(t’,o) + Ps(t’|ti;o) x S(ti,o) – Pa(t’|ti) x S(ti,o)

    • Ps(t’|ti;o) : correlation probability between t’ and ti for the object o

    • Pa(t’|ti) : overlap probability in terms of the concepts between t’ and ti

    • S(t’,o) : goodness measure (score) of the tag t’ to an object o

    • a(u) : authority score of a given user u
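
One possible reading of the update rule above is a greedy loop: pick the tag with the highest score, then adjust every remaining tag t' by rewarding correlation with the chosen tag (Ps) and penalizing conceptual overlap with it (Pa). The sketch below follows that reading only. The function name `suggest_tags`, the probability tables, and the sample scores are hypothetical, and the initial scores are simply given as input rather than derived from the authority scores a(u).

```python
def suggest_tags(scores, ps, pa, k=3):
    """Greedily select up to k tags for one object.

    scores: initial goodness S(t, o) for each candidate tag t
    ps[(t2, t1)]: assumed estimate of Ps(t2 | t1; o)
    pa[(t2, t1)]: assumed estimate of Pa(t2 | t1)
    """
    remaining = dict(scores)
    chosen = []
    while remaining and len(chosen) < k:
        best = max(remaining, key=remaining.get)
        best_score = remaining.pop(best)
        chosen.append(best)
        for t in remaining:
            # S(t',o) = S(t',o) + Ps(t'|ti;o) x S(ti,o) - Pa(t'|ti) x S(ti,o)
            reward = ps.get((t, best), 0.0) * best_score
            penalty = pa.get((t, best), 0.0) * best_score
            remaining[t] += reward - penalty
    return chosen


# Hypothetical inputs for a single URL.
initial = {"ajax": 0.9, "javascript": 0.8, "js": 0.7, "web2.0": 0.5}
correlation = {("web2.0", "ajax"): 0.4}
overlap = {("javascript", "ajax"): 0.1, ("js", "javascript"): 0.9}
suggested = suggest_tags(initial, correlation, overlap, k=3)
```

With these made-up numbers, "web2.0" is promoted after "ajax" is chosen because the two are correlated for this object, while "javascript" is slightly demoted because it overlaps with "ajax".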

Yahoo Collaborative Tag Suggestion

  • Experiments based on Yahoo My Web 2.0 tag data

  • Suggested Tags for the URL

Tagging Systems

  • What is Tagging?

    • keyword, description, classification, user-based, collaboration, easy, linear thinking, flickr,, hotissue, muststudy

    • Best examples: flickr,, gmail, technorati

    • Goals: organizing, sharing, navigating, filtering, searching, etc

  • Taxonomy vs. Folksonomy

    • Taxonomy: hierarchical & exclusive

    • Folksonomy(by tagging): non-hierarchical & inclusive

  • Automatic annotation vs. Tagging

    • Automatic annotation: content-based & good for only text

    • Tagging: context-based & good for multimedia data

Kinds of Tags

  • Identifying What it is About

    • ex) ajax, cat, mountain, etc

  • Identifying What it Is

    • ex) article, blog, book, etc

  • Identifying Who Owns It

    • ex) MichaelArrington, DionHinchcliffe, etc

  • Refining Categories

    • ex) 25, 100, etc

  • Identifying Qualities or Characteristics

    • ex) funny, stupid, interesting, inspiration

  • Self Reference

    • ex) mystuff, mycomments, etc

  • Task Organizing

    • ex) toread, jobsearch, musthave, etc

Unified User-Resource-Relation Model Analysis for Tagging


(URL, Blog Post, Photo, Video, etc)















Issues of Tagging

Amazon Recommendations

  • Item-to-Item Collaborative Filtering

    • Matching each of the user’s purchased and rated items to similar items → similar-items table

    • Identify similarities between different items

      • Items more static than users

      • Offline: Item similarity computation

      • Online: Prediction computation
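
The offline/online split above can be sketched as follows. This is a simplified illustration rather than Amazon's actual implementation: it assumes binary purchase data and cosine similarity between items, and the function names `build_similar_items` and `recommend` and the sample purchases are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def build_similar_items(purchases):
    """Offline step: build the similar-items table from co-purchases."""
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    table = defaultdict(dict)
    items = sorted(buyers)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            common = len(buyers[a] & buyers[b])
            if common:
                # cosine similarity between the two items' buyer sets
                sim = common / sqrt(len(buyers[a]) * len(buyers[b]))
                table[a][b] = sim
                table[b][a] = sim
    return table


def recommend(user_items, table, top_n=2):
    """Online step: look up items similar to what the user already has."""
    scores = defaultdict(float)
    for item in user_items:
        for other, sim in table.get(item, {}).items():
            if other not in user_items:
                scores[other] += sim
    return sorted(scores, key=lambda x: (-scores[x], x))[:top_n]


# Hypothetical purchase histories.
purchases = {
    "u1": {"book", "lamp"},
    "u2": {"book", "lamp", "pen"},
    "u3": {"book", "pen"},
    "u4": {"mug"},
}
similar_items = build_similar_items(purchases)
picks = recommend({"lamp"}, similar_items)
```

Because the expensive pairwise comparison happens offline, the online step only needs table lookups, which is why the approach relies on items being more static than users.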

Recommendation Systems

  • Recommendation

    • Solution for Information Overload

      • cf. Reputation System: Solution for Transacting with Strangers

    • Example applications

      • E-commerce : product recommendations

      • Corporate Intranets : Finding domain experts

      • Digital Libraries : Finding pages/books people will like

      • Medical Applications : Matching patients to doctors

      • Customer Relationship Management (CRM) : Matching customer problems to internal experts

  • Types of Recommendation Systems

    • Content-based Recommendations: The user will be recommended items similar to the ones the user preferred in the past.

    • Collaborative Recommendations: The user will be recommended items that are preferred by other people with similar tastes and preferences.

    • Hybrid approaches: These methods combine collaborative and content-based methods.

Case Study: Content-based

  • Pandora

    • Created by the Music Genome Project

      • The most comprehensive analysis of music

      • Over the past 6 years, the songs of over 10,000 different artists have been analyzed, capturing the musical qualities of each song one attribute at a time.

    • Musical Genome (Hundreds of musical attributes)

      • melody, harmony, rhythm, instrumentation, orchestration, arrangement, lyrics, singing and vocal harmony, etc

Content-based Recommendation

  • Content-based approach

    • Has its roots in information retrieval and information filtering

    • Focuses on recommending items containing textual information, such as documents, Web sites (URLs), and news messages

    • Improvement by the use of user profiles that contain information about users’ tastes, preferences, and needs

  • The process of Content-based recommendation

    • Construction of per-user content-based profile

      • TF : Term Frequency, IDF : Inverse Document Frequency

      • N : Number of the documents

      • ni : Number of documents in which keyword ki appears

      • fi,j : Number of times keyword ki is seen in the document dj
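
The weighting formulas are missing from this slide. A minimal sketch under the usual definitions, assumed here, is TF(i,j) = fi,j divided by the largest keyword count in document dj, IDF(i) = log(N / ni), and weight w(i,j) = TF(i,j) x IDF(i). The function name `tfidf_weights` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(documents):
    """Return one {keyword: weight} dict per tokenized document."""
    n_docs = len(documents)                      # N
    doc_freq = Counter()                         # n_i for each keyword
    for doc in documents:
        doc_freq.update(set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)                    # f_ij for this document
        if not counts:
            weights.append({})
            continue
        max_count = max(counts.values())
        weights.append({
            k: (f / max_count) * math.log(n_docs / doc_freq[k])
            for k, f in counts.items()
        })
    return weights


# Hypothetical tokenized documents.
docs = [
    ["web", "ajax", "ajax"],
    ["web", "rss"],
    ["web", "tagging", "rss"],
]
doc_weights = tfidf_weights(docs)
```

A keyword that occurs in every document, such as "web" here, gets an IDF of log(1) = 0 and therefore contributes nothing to a profile, while a keyword concentrated in one document is weighted most heavily.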

Content-based Recommendation

  • Similarity measurement

    • u(c,s) : the utility function

    • ContentBasedProfile(c) = (wc1, …, wck) : the profile of user c

    • cosine similarity measure

  • Limitations of content-based approach

    • Limited Content Analysis

      • automatic feature extraction is much harder for multimedia data

      • cannot distinguish between a well-written article and a badly written one

    • Overspecialization

      • no experience, no recommendation

    • New User Problem
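
The cosine similarity measure listed under "Similarity measurement" on this slide can be sketched as below, treating both the user profile and an item as sparse keyword-weight vectors. The function name `cosine_similarity` and the sample vectors are illustrative assumptions.

```python
import math


def cosine_similarity(profile, item):
    """Cosine of the angle between two sparse keyword-weight vectors."""
    dot = sum(w * item.get(k, 0.0) for k, w in profile.items())
    norm_p = math.sqrt(sum(w * w for w in profile.values()))
    norm_i = math.sqrt(sum(w * w for w in item.values()))
    if norm_p == 0 or norm_i == 0:
        return 0.0          # an empty vector has no direction to compare
    return dot / (norm_p * norm_i)


# Hypothetical user profile and two candidate items.
user_profile = {"ajax": 0.8, "rss": 0.6}
item_close = {"ajax": 0.4, "rss": 0.3}
item_far = {"cooking": 1.0}
u_close = cosine_similarity(user_profile, item_close)
u_far = cosine_similarity(user_profile, item_far)
```

The utility u(c,s) of item s for user c is then the similarity score, so items whose weight vectors point in the same direction as the profile rank highest regardless of vector length.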

    Case Study: Collaborative Filtering

    • (

      • Automatic construction of personalized music profile

        • Scrobbling a song : sending the list of listened songs to

      • Recommendation by collaborative analysis of music profiles

        • Recommended tracks

        • Recommended readings

        • Recommended users

        • Similar artists

    Collaborative Filtering

    [Figure: C.F. Engine workflow]

    1) Submit Ratings

    2) Store Ratings

    3) Compute

    4) Request

    5) Identify

    6) Select Items & Predict Ratings




    Collaborative Filtering Algorithm

    User-Item Matrix

    Meg & David: similarity -0.59

    Meg & Amy: similarity 0.67

    Meg & Joe: similarity 0.47

    Recommendations for Meg:

    Movies 7
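
The user-item matrix from this slide is not preserved, so the sketch below uses made-up ratings and will not reproduce the similarity values listed above. It assumes Pearson correlation as the similarity measure and a similarity-weighted average of neighbours' deviations as the prediction, which is one common formulation. The function names `pearson` and `predict` are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over the items both users have rated."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    spread_a = sqrt(sum((a[i] - mean_a) ** 2 for i in common))
    spread_b = sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    den = spread_a * spread_b
    return num / den if den else 0.0


def predict(ratings, user, item):
    """Predict a rating from neighbours' deviations, weighted by similarity."""
    user_mean = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        sim = pearson(ratings[user], their)
        other_mean = sum(their.values()) / len(their)
        num += sim * (their[item] - other_mean)
        den += abs(sim)
    return user_mean + num / den if den else user_mean


# Made-up ratings on a 1-5 scale; only the user names come from the slide.
ratings = {
    "Meg": {"m1": 5, "m2": 1, "m3": 4},
    "Amy": {"m1": 4, "m2": 2, "m3": 5, "m4": 5},
    "David": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
}
meg_m4 = predict(ratings, "Meg", "m4")
```

A negatively correlated user still carries information: if David dislikes a movie and his taste is the opposite of Meg's, that counts as mild evidence that Meg will like it.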

    Collaborative Filtering

    • Limitations

      • New User Problem

        • Can be addressed using hybrid recommendation

      • New Item Problem

        • Until the new item is rated by a substantial number of users, it is not recommended

        • Can be addressed using hybrid recommendation

      • Rating Sparsity

        • the number of ratings already obtained is usually very small compared to the number of ratings that need to be predicted

        • Can be addressed using demographic filtering

    Unified User-Resource-Relation Model Analysis for Recommendation

    [Figure labels]

    • (Book, Music, Movie, Product, Article, Web Page)

    • Links derived from similar attributes, similar content, explicit cross references

    • Links derived from similar attributes, explicit connections

    • Observed preferences (ratings, purchases, page views, wish lists, play lists)

    Decentralization

    Contents Production → Consumption

    (User-Created Contents, Ready-Made Contents)

    Contents Consumption

    Recommendation, Search, Tagging

    • Searching

    • Discovering (links, tags, directories)

    • Recommended

    Long Tail

    Personalized News

    Personalized Search

    Personalized Homepage

    The Long Tail

    • The Long Tail

      • Coined by Chris Anderson

      • Infrequent events (the long tail) can cumulatively outweigh the initial portion of the graph, so that in aggregate they make up the majority

      • Overcoming space-time limitation of offline market

    • Long Tail in Online Ads Market

    Web 1.0 : DoubleClick

    Web 2.0 : Google AdSense

    Openness

    Openness & Mashup

    • Openness

      • Data by RSS

      • Service by Open API

    • Mashup

      • Website or web application that seamlessly combines content from more than one source into an integrated experience

      • Examples

        • Google News : News aggregation

        • Newsmap : News visualization (Using Google News)

        • WingBus : Travel info service (Aggregating blog posts on traveling)

        • = Google Maps + Craigslist

        • Amazon Light = Amazon + Other Searches + Yahoo News + …

        • Chicago Crime Map = Google Maps + CPD's Citizen ICAM Web site

        • Aladdin TTB Review = Aladdin + Blog APIs

    RSS

    • Really Simple Syndication

      • A family of XML based web-content distribution and republication (Web syndication) protocols

      • Primarily used by news sites and weblogs

      • Currently used by various types of contents like search results, bug reports, wiki updates, podcasting&videocasting, even fortune-telling

    • Two Kinds of Software for RSS

      • RSS Feeder: web application by which RSS feeds are dynamically updated with the change of contents

      • RSS Reader(Aggregator): program that checks RSS-enabled feeds and displays any updated information that it finds (ex. HanRSS)

    • Standards

      • RSS 0.9x, RSS 1.x, RSS 2.x, Atom
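
What an RSS reader does with a feed can be sketched in a few lines. The example assumes an RSS 2.0 document, where each entry is an `item` element with `title` and `link` children. The feed content, the example.org URLs, and the function name `read_items` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, as a feeder might publish it.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.org/</link>
    <description>Hypothetical feed</description>
    <item>
      <title>First post</title>
      <link>http://example.org/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/2</link>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


entries = read_items(SAMPLE_FEED)
```

A real aggregator would fetch the feed periodically over HTTP and compare the items against those it has already seen. Atom and RSS 1.x use different element names and namespaces, so this sketch does not cover them.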

    Examples of Open APIs

    API Scorecard

    Types of Open APIs

    • SOAP

      • Protocol for exchanging XML-based messages, normally using HTTP

      • Much more robust way to make requests, but more robust than most APIs need

      • More complicated to use

    • REST

      • Software architectural style for distributed hypermedia systems like WWW

      • Quickly gained popularity through its simplicity

    • XML-RPC

      • RPC protocol with XML as an encoding and HTTP as a transport

      • More complex than REST but simpler than SOAP

    • JavaScript

      • The newest trend in APIs

      • Offering a free JavaScript library that is the only way to access data

      • Limited integration with other services

    • JSON-RPC

      • RPC protocol encoded in JSON instead of XML

      • Very simple protocol (and very similar to XML-RPC)
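
To make the XML-RPC and JSON-RPC comparison above concrete, the sketch below encodes the same call in both styles without sending anything over the network. The `add` method and the helper names are hypothetical. The JSON form follows the JSON-RPC 1.0 layout (method, params, id), which is an assumption about which version the slide has in mind.

```python
import json
import xmlrpc.client


def xmlrpc_request(method, *params):
    """Encode a call as an XML-RPC <methodCall> document."""
    return xmlrpc.client.dumps(params, methodname=method)


def jsonrpc_request(method, *params, request_id=1):
    """Encode the same call as a JSON-RPC 1.0 request object."""
    return json.dumps({"method": method, "params": list(params), "id": request_id})


# The same hypothetical remote procedure call in both encodings.
as_xml = xmlrpc_request("add", 2, 3)
as_json = jsonrpc_request("add", 2, 3)
```

Both payloads would typically be sent in the body of an HTTP POST. The JSON version is much shorter and maps directly onto JavaScript objects, which helps explain its appeal for browser-based clients.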

    Example Scenario: Book Shopping


    SOAP (Simple Object Access Protocol)

    • Simple, lightweight XML protocol for exchanging structured and typed information on the Web

      • Used for communication between applications

      • XML as the standard message format

      • Mostly HTTP as the transport method

      • Platform & Language independent

      • Simple and Extensible

    • Baseline standard for rich set of messaging features

      • Addressing, Routing

      • In-Message Security, Identity Federation

      • Reliable Messaging

    SOAP Message Format (Request)

    <t:transId xmlns:t="">1234</t:transId>

    <m:Add xmlns:m="">






    SOAP namespace

    Encoding style of message

    Transaction ID
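
Most of the request message on this slide is lost, so the following sketch only shows the general shape of a SOAP 1.1 request: an Envelope with an encoding style, a Header carrying the transaction ID, and a Body carrying the `Add` call. The two example.org namespaces, the operand names `a` and `b`, and the function name `build_add_request` are placeholders invented for illustration. Only the envelope and encoding namespaces are the standard SOAP 1.1 ones.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"   # SOAP 1.1 encoding style
TRANS_NS = "http://example.org/transaction"              # hypothetical namespace
CALC_NS = "http://example.org/calculator"                # hypothetical namespace


def build_add_request(trans_id, a, b):
    """Build a SOAP 1.1 envelope for a hypothetical Add(a, b) operation."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    envelope.set(f"{{{SOAP_ENV}}}encodingStyle", SOAP_ENC)

    # Header: out-of-band information such as the transaction ID
    header = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    ET.SubElement(header, f"{{{TRANS_NS}}}transId").text = str(trans_id)

    # Body: the actual call and its parameters
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    add = ET.SubElement(body, f"{{{CALC_NS}}}Add")
    ET.SubElement(add, "a").text = str(a)
    ET.SubElement(add, "b").text = str(b)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_add_request(1234, 2, 3)
```

The resulting string would normally be sent as the body of an HTTP POST to the service endpoint.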



    Example Scenario with SOAP 1.1

    Mashup Example using SOAP

    • MapMash (Google Maps Mashup)

      • Google Maps +

      • Example from JavaWorld article

      • Apache Tomcat 5.5 (Java Servlet), Apache Axis library, Direct Web Remoting(DWR) library

      • Demo

    • Code example

      • Google Maps (JavaScript)

        <script src="" type="text/javascript"></script>

      • (SOAP)

        GeoCoder.geocode(address, moveMapCallback);

    MapMash Sequence Diagram

    REST (Representational State Transfer)

    • REST

      • Proposed by Roy Fielding in a 2000 doctoral dissertation

        • Architectural Styles and the Design of Network-based Software Architectures

      • Design pattern for creating Web Services (Not a standard)

    • Three fundamental aspects of the REST Design Pattern


      • Every distinguishable entity is a resource

      • Every resource is uniquely identified by a URL

      • Simple operations: most web interactions are done using HTTP CRUD
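
The three aspects above can be illustrated by mapping CRUD operations onto HTTP methods applied to a resource URL. The mapping shown (create to POST, read to GET, update to PUT, delete to DELETE) is the conventional one and is assumed here. The example.org URL and the helper name `build_request` are hypothetical, and the requests are only constructed, never sent.

```python
from urllib.request import Request

# Conventional mapping of CRUD operations onto HTTP methods.
CRUD_TO_HTTP = {
    "create": "POST",
    "read": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


def build_request(operation, resource_url, body=None):
    """Build (but do not send) the HTTP request for a CRUD operation."""
    method = CRUD_TO_HTTP[operation]
    data = body.encode("utf-8") if body is not None else None
    return Request(resource_url, data=data, method=method)


# Hypothetical resource: one book in a bookshop's catalogue.
book_url = "http://example.org/books/42"
read_req = build_request("read", book_url)
update_req = build_request("update", book_url, "<book><price>12</price></book>")
```

The URL names the resource and the HTTP method names the operation, so no extra envelope or method name has to be carried in the message body, which is a large part of the simplicity attributed to REST.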

    Example Scenario with REST

    Mashup Example using REST

    • Integrated Search Mashup

      • Yahoo News + YouTube Video + Technorati

      • Example from IBM developerWorks tutorial

      • Apache Tomcat 5.5 (Java Servlet), Xalan-Java 2.7.0 (XPath, XSLT)

      • Demo

    • Code example

      • Yahoo News Search + queryString

      • YouTube Tag Search + queryString

    Web as Platform



    Affiliate Sites


    RSS + Open API

    Web Database: Web pages, maps, offline data, UCC (text, photo, audio, video), user logs, purchase logs, etc

    Web Platform: environment where services are developed, deployed, and executed based on a massive Web DB

    Platform = Straw mat

    Web 2.0 : Participation and Open

    by Dr. JoongHee Ryu

    Case Study: eBay Business Platform

    Business and Solution Partner

    • 45% of auction items are exposed to partner sites through API

    • 2 billion API requests every month

    • 30,000 registered external developers who make services and tools for eBay

    • The number of customers using these services and tools increases at a rate of 45% every year

    Auction &

    Open Market

    Electronic Payment

    eBay Business Platform

    Shopping mall



    Online comparison shopping

    Case Study: Google Ad Platform

    • Compete with large shopping malls

    • Google search + ad → Evolving into an online shopping place

    Shopping & Purchase

    Large shopping mall

    (Walled Garden)



    Google Sites



    Contents Network



    Content Referral Network




    shopping mall

    & Producer

    All types of Ad model

    Google Ad Distribution Network

    (Open Platform)

    Web as Platform: Enabling Technologies

    • How can the Web evolve into a platform?

      → Web technologies have become mature!

    • Server Tech.: Ruby, RoR, Web Frameworks, …

    • Client Tech. (RIA): Ajax, Flash & Flex, XAML, XUL, SVG, Widgets, …

    • Contents Tech.: Blog, Wiki, RSS, Tagging, Podcasting, Mashup, …

    • Global Platform Tech.: Development, Deployment, Operation, Management, …

    Client Technologies

    • Web Browsers : Client program of Web Platform

      • Limitations: restricted interactivity and user interface

    • RIA (Rich Internet Application) technologies

      • Ajax (Asynchronous Javascript and XML)

      • Macromedia Flash & Flex

      • SVG (Scalable Vector Graphics)

      • Laszlo

      • XAML on Windows Vista

      • XUL Application for Firefox

      • Yahoo! Widget (aka. Konfabulator)

      • Apple Dashboard

    Ajax

    • New web application model

      • Asynchronous data retrieval using XMLHttpRequest and JavaScript

      • data interchange and manipulation using XML and XSLT

      • dynamic display and interaction using the DOM

      • standards-based presentation using XHTML and CSS

    • Ajax is not a single technology but an approach, like LAMP

      • Defined by Jesse James Garrett in 2005

    • Examples

      • Many Google services (Gmail, Google Suggest, Google Maps…)

      • Web page accessory (Naver Suggest, Amazon…)

      • Web-based Office services (Zimbra, Writely, gOffice, Kiko…)

      • Personalized homepages (Windows Live, Google IG, Protopage…)

      • Shopping ( …)

    Ajax Communication Model

    Server Technologies

    • Importance of server technologies

      • Network computing era

      • Rapid development and Low operating cost of services

    • Agile Web Development

      • Script languages for web development

        • PHP, Python, Ruby

      • Web frameworks

        • Ruby on Rails(RoR), Struts, PEAR, Ajax frameworks (Dojo, Prototype, DWR, Atlas, Google Web Toolkit(GWT))

    • Lightweight server environment based on Open Source

      • LAMP (Linux, Apache, MySQL, PHP&Python&Perl)

    Global Platform Technologies

    Case Study: Google Service Platform

    Service Software Tech.

    Search engine, Email server, IM server,

    Map database, Various Web sites …


    System Software Tech.

    Google Linux, Google File System,

    MapReduce Library, Chubby, BigTable

    Intelligent System,

    Programming Model(River, TACC),

    Replication/Redundancy …

    Service Library

    Google OS




    Hardware Tech.

    Clusters, Geographic distribution,

    Automated Setup, Automated Backup,

    Standard components, Commodity drives,

    Flexible co-location, Easy-access design …

    Google Cluster

    • Advantages

    • Easy Development

    • Scalability

    • Robustness

    • 450,000 or more servers (maybe!)

    • All PC servers less than $1,000

    • 40 or more pizza box servers per rack

    Web 2.0 Startups

    List of Korean Web 2.0 sites

    References

    • The Anatomy of a Large-Scale Hypertextual Web Search Engine, WWW 1998

    • The PageRank Citation Ranking: Bringing Order to the Web, Technical Report 1999

    • Towards the Semantic Web: Collaborative Tag Suggestions, Collaborative Web Tagging Workshop 2006

    • From Tagging to Folksonomy, KAIST Google SIG TagDay 2006

    • Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions, IEEE Transactions on Knowledge and Data Engineering 2005

    • Recommendations, IEEE Internet Computing 2003

    • Item-based Collaborative Filtering Recommendation Algorithms, WWW 2001

    • The Google File System, SOSP 2003

    • MapReduce: Simplified Data Processing on Large Clusters, OSDI 2004

    • Bigtable: A Distributed Storage System for Structured Data, OSDI 2006

    • The Chubby lock service for loosely-coupled distributed systems, OSDI 2006

    Contact : Web 2.0 Hub