On the Evaluation of Semantic Web Service Matchmaking Systems

Vassileios Tsetsos, Christos Anagnostopoulos and Stathes Hadjiefthymiades

Pervasive Computing Research Group

Communication Networks Laboratory

Department of Informatics and Telecommunications

University of Athens – Greece

ECOWS ’06 @ Zurich


Outline

  • Introduction

  • Problem Statement

  • A Generalized Fuzzy Evaluation Scheme for Service Retrieval

  • Experimental Results

  • A Pragmatic View

  • Conclusions


SWS Matchmaking

  • Matching service requests and advertisements, based on their semantic annotations (expressed through ontologies)

  • Numerous matchmaking approaches

    • Logic-, similarity-, structure-based (graph matching)

  • Various matched entities

    • Functional service parameters (e.g., IOPE attributes)

    • Non-functional parameters (e.g., QoS attributes)

  • Ultimate goal: more effective service discovery, based on the semantics and not just the syntax of service descriptions


Degree of Match

  • A value that expresses how similar two entities are, with respect to some similarity metric(s)

  • Important feature of almost all SWS matchmaking approaches

  • Allows for ranking of discovered services

  • Example DoM set: exact, plugin, subsumes, subsumed-by, fail
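
The ranking role of the DoM can be sketched in a few lines of Python. This is only an illustration: the ordering (exact strongest, fail weakest) is assumed from the order of the example set above, and the `DoM` enum, `rank_services` helper and sample data are hypothetical names, not part of any matchmaker's API.

```python
from enum import IntEnum


class DoM(IntEnum):
    """Example DoM set; the numeric order (weakest to strongest) is an assumption."""
    FAIL = 0
    SUBSUMED_BY = 1
    SUBSUMES = 2
    PLUGIN = 3
    EXACT = 4


def rank_services(matches):
    """Return (service, DoM) pairs sorted from strongest to weakest match."""
    return sorted(matches.items(), key=lambda item: item[1], reverse=True)


# Hypothetical matchmaking output for a single request R.
matches = {
    "S1": DoM.SUBSUMES,
    "S2": DoM.EXACT,
    "S3": DoM.FAIL,
    "S4": DoM.PLUGIN,
}

ranking = [name for name, _ in rank_services(matches)]
print(ranking)  # ['S2', 'S4', 'S1', 'S3']
```

An ordered enum keeps the comparison logic out of the ranking code, so a different DoM set only requires changing the enum.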


Evaluation Basics

[Figure: for a request R and advertisements S1, S2, ..., Sn, the matchmaking engine produces the values e(R,S1), e(R,S2), ..., e(R,Sn) and the expert provides the values r(R,S1), r(R,S2), ..., r(R,Sn).]

  • Most works evaluate the performance of SWS Discovery (i.e., response times, scalability)

  • Limited contributions to the evaluation of retrieval effectiveness (i.e., the ability to discover relevant services)

Q: possible service requests

S: advertisements of published services

e: Q×S→W (DoM, analogous to the Retrieval Status Value in IR)

r: Q×S→W (expert mappings)

Evaluation is the determination of how closely vector e approximates vector r


Evaluation Schemes

  • W is the set of values denoting DoM (for e) or degree of relevance (for r)

  • W defines different evaluation schemes (EVS):


Boolean Evaluation (EVS1)

W={0,1}

Information Retrieval (IR) measures can be used:

Precision (PB) and Recall (RB)

RT: set of retrieved advertisements

RL: set of relevant advertisements
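
The formulas for PB and RB are not part of this transcript. In their standard IR form, precision is |RT ∩ RL| / |RT| and recall is |RT ∩ RL| / |RL|. A minimal Python sketch of that standard form follows; the function name, the sample sets and the convention of returning 0.0 for an empty set are assumptions made for illustration.

```python
def boolean_precision_recall(retrieved, relevant):
    """Return (P_B, R_B) for two sets of advertisement identifiers."""
    hits = len(retrieved & relevant)  # |RT ∩ RL|
    # Returning 0.0 for an empty set is a convention chosen for this sketch.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


RT = {"S1", "S2", "S3", "S7"}  # hypothetical retrieved advertisements
RL = {"S1", "S3", "S4"}        # hypothetical relevant advertisements

p_b, r_b = boolean_precision_recall(RT, RL)
print(p_b, r_b)  # precision is 2/4 = 0.5, recall is 2/3
```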


Problem Statement (1/2)

Example: Booleanization of e with Threshold = “B”

Si    e(R,Si)    e’(R,Si)
S1    A          1
S2    B          1
S3    A          1
S4    D          0
S5    D          0
S6    C          0
S7    B          1

  • Since SWS matchmaking systems produce multi-valued vectors e, applying Boolean evaluation requires introducing a relevance threshold

  • Problem 1: This “Booleanization” process filters out any service semantics captured through DoM

  • Problem 2: An optimal threshold value is hard to find
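
The Booleanization step can be sketched as follows, using the values of this slide's example (grades A to D for S1 to S7, threshold “B”). The grade ordering (A strongest, D weakest) and the `booleanize` helper are assumptions for illustration.

```python
GRADE_ORDER = "DCBA"  # weakest to strongest; this ordering is assumed


def booleanize(e, threshold):
    """Map each graded DoM to 1 if it reaches the threshold, else 0."""
    cut = GRADE_ORDER.index(threshold)
    return {s: int(GRADE_ORDER.index(g) >= cut) for s, g in e.items()}


# Graded values e(R,Si) from the slide's example.
e = {"S1": "A", "S2": "B", "S3": "A", "S4": "D", "S5": "D", "S6": "C", "S7": "B"}

e_prime = booleanize(e, "B")
print(e_prime)
# {'S1': 1, 'S2': 1, 'S3': 1, 'S4': 0, 'S5': 0, 'S6': 0, 'S7': 1}
```

After thresholding, S1 (grade A) and S2 (grade B) are indistinguishable, as are S4 (D) and S6 (C), which is the information loss described in Problem 1.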


Problem Statement (2/2)

  • Problem 3: Boolean expert mappings are too coarse-grained and do not always reflect the intention of the domain expert.

  • Experiment

    • Manually defined multi-valued mappings between 6 requests and 135 advertisements of TC2 with W={0, 0.25, 0.5, 0.75, 1}

    • Calculation of deviation from existing Boolean mappings

  • Only ~33% of the Boolean mappings agree with the multi-valued ones

  • ~40% of the Boolean mappings are not even close to the multi-valued ones (deviation > 0.25)
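
One way such a comparison could be computed is sketched below. It assumes that deviation means the absolute difference between the Boolean and the multi-valued mapping of the same request/advertisement pair; the slides do not spell out the definition. The sample mappings are invented for illustration and do not reproduce the reported percentages.

```python
def deviation_stats(boolean_r, graded_r, tolerance=0.25):
    """Return (share of exact agreements, share of deviations above tolerance)."""
    deviations = [abs(boolean_r[key] - graded_r[key]) for key in boolean_r]
    agree = sum(d == 0 for d in deviations) / len(deviations)
    far = sum(d > tolerance for d in deviations) / len(deviations)
    return agree, far


# Hypothetical expert mappings for one request (not the data of the experiment).
boolean_r = {"S1": 1, "S2": 1, "S3": 0, "S4": 0}
graded_r = {"S1": 1.0, "S2": 0.5, "S3": 0.25, "S4": 0.0}

agree, far = deviation_stats(boolean_r, graded_r)
print(agree, far)  # 0.5 0.25
```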


A Generalized Fuzzy Evaluation Scheme

  • Such a scheme (EVS2) can provide solutions to the aforementioned problems

  • Main design decisions

    • Expert mappings are fuzzy linguistic terms

    • DoM are fuzzy sets

    • Boolean measures are substituted by generalized ones

  • Why fuzzy modeling?

    • Relevance is an “amorphic” concept (L. Zadeh), i.e., its complexity prevents a precise mathematical definition

    • Numeric values have vague semantics

    • Fuzzy linguistic variables assume values from a linguistic term set, with each term being a fuzzy variable

    • Warning: Fuzziness does not refer to the matchmaking process per se


Fuzzification of e and r

[Figure: two plots of membership value (0.0 to 1.0) over a [0, 1] axis, one for the degree of relevance (terms I, S, SW, R, V) and one for the degree of match (terms F, SB, S, P, E).]

Degree of relevance terms:

  • I: Irrelevant

  • S: Slightly relevant

  • SW: Somewhat relevant

  • R: Relevant

  • V: Very relevant

Degree of match terms:

  • F: FAIL

  • SB: SUBSUMED-BY

  • S: SUBSUMES

  • P: PLUGIN

  • E: EXACT

fr: Q×S→[0,1]

fe: Q×S→[0,1]

If the two sets do not have the same number of fuzzy variables, so that no one-to-one correspondence exists, fuzzy modifiers (e.g., dilation, concentration) could be used
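
The shapes of the membership functions are not recoverable from this transcript. As a rough sketch, the following Python code assumes triangular membership functions and uses the textbook definitions of the two modifiers (concentration squares a membership value, dilation takes its square root). The peak position of the “Relevant” term is a hypothetical choice.

```python
import math


def triangular(x, left, peak, right):
    """Membership of x in a triangular fuzzy set (shape assumed for this sketch)."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def concentrate(mu):
    """Concentration modifier: lowers partial memberships, narrowing the set."""
    return mu ** 2


def dilate(mu):
    """Dilation modifier: raises partial memberships, widening the set."""
    return math.sqrt(mu)


# Hypothetical "Relevant" term peaking at 0.75 on the [0, 1] axis.
mu = triangular(0.625, 0.5, 0.75, 1.0)
print(mu, concentrate(mu), dilate(mu))
# mu is 0.5; concentration gives 0.25, dilation gives about 0.707
```

Modifiers of this kind derive additional terms from existing ones, which is one way to bridge term sets of different sizes.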


Generalized Evaluation Measures

  • Based on [Buell and Kraft, “Performance measurement in a fuzzy retrieval system”, 1981] the following measures are defined:

  • The cardinalities of the sets RT and RL are transformed to fuzzy set cardinalities, since the above sets are fuzzy.

  • Note: the evaluation measures take into account all services Si
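
The formulas themselves are missing from this transcript. The sketch below shows one common fuzzy-set reading, and it should be taken as an assumption rather than as the exact definitions used in the talk: the intersection of RT and RL is the element-wise minimum of fe and fr, and a fuzzy cardinality is the sum of membership values. The function and variable names are hypothetical.

```python
def generalized_precision_recall(fe, fr):
    """Return (P_G, R_G); fe and fr map every service to a value in [0, 1]."""
    overlap = sum(min(fe[s], fr[s]) for s in fe)  # fuzzy |RT ∩ RL| (assumed: min)
    card_rt = sum(fe.values())                    # fuzzy |RT| (assumed: sum)
    card_rl = sum(fr[s] for s in fe)              # fuzzy |RL| (assumed: sum)
    precision = overlap / card_rt if card_rt else 0.0
    recall = overlap / card_rl if card_rl else 0.0
    return precision, recall


# Hypothetical fuzzified engine output (fe) and expert mappings (fr).
fe = {"S1": 1.0, "S2": 0.75, "S3": 0.0, "S4": 0.5}
fr = {"S1": 0.75, "S2": 1.0, "S3": 0.25, "S4": 0.0}

p_g, r_g = generalized_precision_recall(fe, fr)
print(p_g, r_g)  # precision is 1.5 / 2.25 (about 0.667), recall is 1.5 / 2.0 = 0.75
```

Every service contributes to the sums, including those with low membership values, which matches the note that the measures take all services Si into account. With 0/1 inputs the same code reduces to Boolean precision and recall.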


Experimental Results (1/3)

  • Manual assessment of fuzzy relevance in the “Education” subset of TC v2

  • Matchmaking engine: OWLS-MX Matcher

    • Used only logic-based matching algorithms

    • Threshold = FAIL

The difference between RG and RB is due to the considerable deviation between the Boolean and the fuzzy expert mappings


Experimental Results (2/3)

  • Sensitivity of the proposed scheme

  • Only the generalized measures are affected by “stronger” false negatives/positives
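
A toy illustration of this sensitivity follows. It assumes a sum-of-min form of generalized precision, which is not necessarily the exact measure used in the experiments, and all values are invented. The engine retrieves S3 in every case; only the expert's judgment of S3 changes.

```python
def precision(e, r):
    """Sum-of-min precision (assumed form); equals Boolean precision on 0/1 inputs."""
    retrieved = sum(e.values())
    return sum(min(e[s], r[s]) for s in e) / retrieved if retrieved else 0.0


e = {"S1": 1.0, "S2": 1.0, "S3": 1.0}          # engine retrieves all three

boolean_r = {"S1": 1, "S2": 1, "S3": 0}        # Boolean expert: S3 is not relevant
weak_fp = {"S1": 1.0, "S2": 1.0, "S3": 0.25}   # fuzzy expert: S3 is slightly relevant
strong_fp = {"S1": 1.0, "S2": 1.0, "S3": 0.0}  # fuzzy expert: S3 is irrelevant

p_boolean = precision(e, boolean_r)
p_weak = precision(e, weak_fp)
p_strong = precision(e, strong_fp)
print(p_boolean, p_weak, p_strong)  # 2/3, 0.75, 2/3
```

With Boolean expert mappings both situations yield the same precision, while the graded mappings penalize the stronger false positive more.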


Experimental Results (3/3)

[Figure: plot comparing the two schemes, with series EVS1, EVS2, EVS1 (average) and EVS2 (average).]

  • EVS1 and EVS2 show similar overall behavior, but EVS2 has better accuracy/sensitivity, as already shown


A Pragmatic View

[Figure: reasoning about “relevance”: a Boolean value (e.g., “1”) is turned into an adjusted fuzzy value (e.g., “relevant”) using statistics, logic implications and other inference rules.]

  • A reasonable assumption

    • experts are not willing to provide more than Boolean mappings

  • Automatic fuzzification of Boolean expert mappings would be valuable


A First Approach

[Figure: service profile ontology with the nodes Service, S1, Sx, S3, R, S5, S6 and S7.]

  • Services are represented as concepts and form a service profile ontology

  • Then an inference matrix is used for adjusting the Boolean r values


Experimental Results

  • The new scheme (EVS2’) approximates EVS2 better than EVS1 does

  • Assuming that EVS2 is more accurate, EVS2’ seems promising

[Figure: plot with series EVS1, EVS2, EVS1 (average), EVS2 (average) and EVS2’.]


Conclusions

  • Service retrieval evaluation should be semantics-aware

  • A generalization of the current evaluation measures is deemed necessary

  • Fuzzy set theory may assist in this direction

  • However, many practical issues remain open


Thank You!

Questions???

http://p-comp.di.uoa.gr

