Pushing the Quality Level in Networked News Business

Pushing the Quality Level in Networked News Business: semantic-based content retrieval and composition in international news publishing. Markus Schranz, [email protected]. Topics: Problem and Project Description; Goals and Objectives; Approaches and Results.

Pushing the Quality Level in Networked News Business: semantic-based content retrieval and composition in international news publishing

Markus Schranz

[email protected]


Agenda

Problem and Project Description

Goals and Objectives

Approaches and Results

Architectural Design & Communication

Multinational and Multilingual Services

Semantic Content Relations

Future Steps and Exploitation


Environmental Situation

problem description

  • Internet gains in importance in the news distribution area

  • Large amount of distributed business information is available

  • European business today is highly segmented and widely unrecognised beyond national borders

  • Business news mostly bear national relevance but hold the potential to spread cooperation opportunities and business chances towards an economically and socially integrated Europe.

  • Business is global, so news needs to be global too

  • Support for old and new economy within entire Europe is required

  • An appropriate solution is beneficial for business in the EU, with special focus on the support and integration of new member states


Existing approaches

problem description

  • National solutions available

    • Business News Distribution Service in German speaking area

    • Increasing interest in multinational solutions within the existing services, from both

      • Subscribers

      • Press distributors

  • Limitations

    • Single language limitation

    • Not attractive for European companies to join


project description

Objectives

  • NEDINE was EC-funded (Apr 2004 to Apr 2006). The objective of the project is to establish a distributed news network aimed at European journalists and opinion leaders.

  • NEDINE provides participants with a network for news exchange and distribution. It supports mutual awareness of relevant topics and information content within all European countries.

  • NEDINE focuses on the availability and affordability for all partners to transport national information to the addressed target group, regardless of the origin, nationality and financial capability of the information provider.


project description

The Challenge

[Diagram labels: small company with a good product; Austrian, Czech and Slovakian news agencies (NA); Austrian, Czech and Slovakian readers.]


project description

The Solution

  • News agency offers to its readers:

    • Multilingual news

    • International news

      • From various sources

      • (Semantic) relationships independent from source

      • Relevance ranking for search

  • News agency offers to its customers:

    • Single access point for international press releases

      • Distribution

      • Payment

      • Editing / Translation

    • Price advantage compared to collection of single press releases

  • News agency benefits from the NEDINE network:

    • Common business model

    • Additional customers

      • more revenues

      • new contacts

      • international presence

[Diagram labels: small company with a good product; Austrian, Czech and Slovakian news agencies (NA); Austrian, Czech and Slovakian readers.]


Architecture Reasoning

approaches and results

First Approach – Centralized Architecture

Pros:

Single maintenance point

Clear infrastructure

One traffic channel (News agency ↔ NEDINE)

No additional infrastructure required for partners

Cons:

Single point of failure (whole network down)

Huge amount of network traffic

Storage of complete articles

Which organization maintains the central server?


Centralized configuration

approaches and results

[Diagram: the agencies ČIA, SITA and PTE each connect through a web service interface to the NEDINE central server.]


Architecture Reasoning

approaches and results

Alternative Approach – Hybrid P2P Architecture

Why peer-to-peer?

Better scalability

No single point of failure

No downtime if central services are down

Less network traffic

Network remains transparent for the peers (they only see Nedine)


Final Approach – Hybrid P2P Architecture

Properties of this architecture:

Democratic system

Identical software components are installed at each partner

Nedine becomes a logically centralized platform

From the view of all participating peers, Nedine is technically distributed

Semantic relations and the necessary steps for news distribution are handled in a local context

approaches and results
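The properties above can be illustrated with a minimal sketch. It assumes, beyond what the slides state, that each peer keeps full articles in its local archive and shares only a small keyword vector with the other peers. The class `Peer` and its methods `connect`, `publish` and `fetch` are hypothetical names, not the NEDINE API.

```python
class Peer:
    """Hypothetical peer: identical code runs at every partner agency."""

    def __init__(self, agency):
        self.agency = agency
        self.archive = {}            # full articles never leave the local peer
        self.index = {}              # (agency, doc_id) -> small keyword vector
        self.peers = {agency: self}  # known peers, including this one

    def connect(self, other):
        # Symmetric link: no peer plays the role of a central server.
        self.peers[other.agency] = other
        other.peers[self.agency] = self

    def publish(self, doc_id, text, vector):
        # Store the article locally and share only its vector (an assumption).
        self.archive[doc_id] = text
        for peer in self.peers.values():
            peer.index[(self.agency, doc_id)] = vector

    def fetch(self, agency, doc_id):
        # Ask the owning peer for the full article on demand.
        return self.peers[agency].archive[doc_id]


pte = Peer("PTE")
cia = Peer("ČIA")
pte.connect(cia)
pte.publish("doc-1", "Full article text", {"merger": 2})
```

In this sketch the peer at ČIA can match against the vector of `doc-1` without storing the article itself, which reflects the "local context" property only under the stated assumption.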


P2P configuration

approaches and results

[Diagram: each of the agencies ČIA, SITA and PTE connects through a web service interface to its own NEDINE peer; together the peers form the virtually central services.]


Communication: Peer ↔ Agency

Web Services as the communication protocol

Standard interfaces for default peers (SOAP, NewsML data transfer, queries, network data)

Customized interfaces for each partner, if necessary (database access based on document ID)

Location and functionality of the NEDINE peer are defined in the corresponding WSDL file

Functionality is only visible to the local peer, which increases network security

approaches and results
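As an illustration of a document-ID-based request over SOAP, the sketch below builds a SOAP 1.1 envelope with Python's standard library. The operation name `GetDocument`, the element `DocumentId`, the namespace `urn:example:nedine-peer` and the document ID are all hypothetical; the actual interface would be whatever the peer's WSDL file defines.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
PEER_NS = "urn:example:nedine-peer"                    # hypothetical service namespace


def build_get_document_request(document_id):
    """Return a SOAP 1.1 request asking a peer for one document by its ID."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # Hypothetical operation and parameter names, for illustration only.
    request = ET.SubElement(body, f"{{{PEER_NS}}}GetDocument")
    ET.SubElement(request, f"{{{PEER_NS}}}DocumentId").text = document_id
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_get_document_request("example-001")
```

The resulting string would be sent as the body of an HTTP POST to the endpoint named in the WSDL file; transport and authentication are left out of this sketch.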


Inter-Peer Communication

Also implemented by XML Web Services

Inter-peer communication is invisible to the agencies

High flexibility, easy to upgrade or change without influencing the rest of the network

Network traffic is encrypted via PKI (public key infrastructure)

approaches and results


Multinational and Multilingual Services

Multinational Service Integration

Standardized news exchange formats → NewsML

Local service to peer communication → SOAP

Local service providers hold business-critical information

Installation of a local peer with well-known (open) source increases the trust of the participating organizations and underlines the local character of the relevant business data

Peer-to-peer communication → SOAP

approaches and results


Multilingual News Publishing and Distribution

Automatic translation?

Multilingual content presentation?

Multilingual information distribution & retrieval

Semantic relations between the (multilingual) business news contents

approaches and results


Semantic News Enrichment

Pushing the Quality Level by Semantics

International news describe local business and lack relevant interrelations

“Linking” between sensible business news has been manual work and thus costly

Semantic relationships increase the business value of news items, but how can they be created with reasonable effort?

approaches and results


The Vector Space Engine

Vectors are assigned to every news article, representing keyword occurrences (weights)

Vectors are technically small portions of data, so it is feasible to integrate them into the peer component

Semantic relationships increase the business value of news items

Automatically recognize similarities by creating a vector space on relevant keywords

approaches and results
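A minimal sketch of how an article could be turned into a vector of keyword occurrences. The stopword list and the function name `keyword_vector` are hypothetical, and the counts are raw occurrences rather than the weighted values a production engine would store.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use one list per language.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "for", "is"}


def keyword_vector(text):
    """Map an article to a sparse vector: keyword -> number of occurrences."""
    # [^\W\d_]+ matches runs of Unicode letters, so accented words survive.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(token for token in tokens if token not in STOPWORDS)


article = "The merger of the two banks is the largest merger in Austria."
vector = keyword_vector(article)
```

A sparse mapping like this stays small even for long articles, which fits the point that vectors are small portions of data.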


approaches and results

  • What is a keyword?

    • all words (except stopwords)

    • relevant words

      • from frequencies

      • with weights (vector space model)

      • from the domain

  • What does a keyword look like?

    • A word: bodies

    • A stem: bodi

    • A lemma: body

    • A phrase: public bodies
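The four keyword forms can be reproduced with a small sketch. The stem is computed with step 1a of the Porter stemmer only (the plural-suffix rules), which is enough for this example but is not a full stemmer. The lemma table `LEMMAS` is a hypothetical stand-in for a real lemmatizer, and phrases are approximated as adjacent word pairs.

```python
def porter_step_1a(word):
    """Apply only step 1a of the Porter stemmer (plural suffixes)."""
    if word.endswith("sses"):
        return word[:-2]   # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]   # bodies -> bodi
    if word.endswith("ss"):
        return word        # caress -> caress
    if word.endswith("s"):
        return word[:-1]   # cats -> cat
    return word


# Hypothetical lemma dictionary; a real system would use a lemmatizer.
LEMMAS = {"bodies": "body"}


def bigrams(tokens):
    """Return adjacent word pairs as candidate phrases."""
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


tokens = ["public", "bodies"]
forms = {
    "word": tokens[1],
    "stem": porter_step_1a(tokens[1]),
    "lemma": LEMMAS.get(tokens[1], tokens[1]),
    "phrase": bigrams(tokens)[0],
}
```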


approaches and results

[Diagram: documents and queries pass through the same processing steps (stemming and/or PN detection and/or N-gram detection …). Document processing yields D = (wd1,…,wdn), query processing yields Q = (wq1,…,wqn), and the two vectors are compared in the matching step.]


approaches and results

  • Vector Space Model combined with statistical and linguistic processing.

  • Statistical metrics included are:

    • $tf_{ij}$ = term frequency for word i in document j

    • $IDF_i$ = inverse document frequency for word i in the whole document collection: $IDF_i = 1 + \log(N / df_i)$, where N = total number of documents and $df_i$ = document frequency for term i

    • $w_{ij} = tf_{ij} \cdot IDF_i$
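The weighting can be sketched in a few lines. The IDF formula is truncated in the transcript after "1 +", so the logarithmic term `log(N / df_i)` used here is an assumption based on the common textbook form and on the definitions of N and df_i. The function name `tfidf_vectors` is hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)          # N = total documents
    df = Counter()                   # df_i = number of documents containing term i
    for tokens in documents:
        df.update(set(tokens))
    vectors = []
    for tokens in documents:
        tf = Counter(tokens)         # tf_ij = occurrences of term i in document j
        # Assumed IDF form: 1 + log(N / df_i); w_ij = tf_ij * IDF_i
        vectors.append({t: tf[t] * (1 + math.log(n_docs / df[t])) for t in tf})
    return vectors


docs = [["bank", "merger", "merger"], ["bank", "strike"]]
weights = tfidf_vectors(docs)
```

With this form a term that appears in every document keeps an IDF of 1, so its weight equals its term frequency, while rarer terms are boosted.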


Vector Space Model

approaches and results

  • Documents are indexed by vectors

  • Documents are retrieved by similarity

    • Query and documents are compared using the cosine formula:

      $Sim(Q,D) = \frac{\sum_{i=1}^{n} w_{qi} \, w_{di}}{\sqrt{\sum_{i=1}^{n} w_{qi}^{2}} \, \sqrt{\sum_{i=1}^{n} w_{di}^{2}}}$

    • Local archives must provide term frequency data (internal and document)
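A minimal sketch of retrieval by cosine similarity over sparse term-weight vectors. The function names `cosine_similarity` and `rank`, the example weights, and the archive keys are illustrative assumptions.

```python
import math


def cosine_similarity(q, d):
    """Cosine of the angle between two sparse vectors given as {term: weight}."""
    dot = sum(weight * d.get(term, 0.0) for term, weight in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_q * norm_d)


def rank(query, archive):
    """Return document names ordered from most to least similar to the query."""
    return sorted(archive, key=lambda name: cosine_similarity(query, archive[name]),
                  reverse=True)


query = {"bank": 1.0, "merger": 1.0}
archive = {
    "a": {"bank": 1.0, "merger": 2.0},
    "b": {"strike": 1.0},
}
ranking = rank(query, archive)
```

Because the measure normalizes by vector length, a long article is not favoured over a short one merely for containing more words.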


The used model

approaches and results

[Diagram: preprocessing of news texts produces document vectors. Diagram labels: statistical process; metadata information; linguistic processing; taggers and stemmers; proper names heuristics; syntactic patterns; semantic resources (EWN).]


Use case: distributing news in the Czech Republic and in Austria (ČIA: CZ, DE)

approaches and results

[Diagram: three NEDINE peers serve ČIA (CZ, DE, EN), SITA (SK, EN) and PTE (DE, EN). Numbered steps: 1. Distribution & Enrichment; 2. Enrichment (DE); 5. CZ, DE (subscriber); 7. DE (subscriber). Steps 3, 4 and 6 are unlabelled.]


Future Exploitation

Recent developments and open issues

Nedine has been extended with translation services (additional service on P2P architecture)

Secure communication infrastructure has been implemented

Performance and scalability tests

Market & business orientation

→ Nedine Association was founded at the end of 2005


  • Have a look at NEDINE: we are open to recommendations, news providers and partners from all over Europe.

  • Website http://www.nedine.org/

  • E-Mail [email protected]

  • Nedine Contact Person: Dr. Markus Schranz

  • Tel. ++43-1-81140-444, [email protected]

Good News from Europe

