
Similarity Evaluation Techniques for Filtering Problems


Vagan Terziyan

University of Jyvaskyla

[email protected]



Evaluating Distance between Various Domain Objects and Concepts - one of the basic abilities of an intelligent agent

Are these two the same?

… No!

The difference is equal to 0.234



Contents

  • Goal

  • Basic Concepts

  • External Similarity Evaluation

  • An Example

  • Internal Similarity Evaluation

  • Conclusions



Reference

Puuronen S., Terziyan V., A Similarity Evaluation Technique for Data Mining with an Ensemble of Classifiers, In: A.M. Tjoa, R.R. Wagner and A. Al-Zobaidie (Eds.), Proc. of the 11th Intern. Workshop on Database and Expert Systems Applications, IEEE CS Press, Los Alamitos, California, 2000, pp. 1155-1159.

http://dlib.computer.org/conferen/dexa/0680/pdf/06801155.pdf


Goal

  • The goal of this research is to develop a simple similarity evaluation technique to be used for social filtering

  • The result of social filtering here is a prediction of a customer’s evaluation of a certain product, based on known opinions about this product from other customers


Basic Concepts: Virtual Training Environment (VTE)

  • VTE is a quadruple:

    <D,C,S,P>

    • D is the set of goods D1, D2,..., Dn in the VTE;

    • C is the set of evaluation marks C1, C2,..., Cm, that are used to rank the products;

    • S is the set of customers S1, S2,..., Sr, who select evaluation marks to rank the products;

    • P is the set of semantic predicates that define relationships between D, C, S
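
The quadruple can be pictured as a small data structure. The following is only a minimal sketch: the class name `VTE`, its field names, and the methods `set_vote` and `predicate` are hypothetical, and the vote values 1 (select), -1 (refuse), and 0 (neither) are taken from the worked example later in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class VTE:
    """Hypothetical container for the quadruple <D, C, S, P>."""
    products: list[str]    # D: goods D1..Dn
    marks: list[str]       # C: evaluation marks C1..Cm
    customers: list[str]   # S: customers S1..Sr
    # P: votes keyed by (product index, mark index, customer index)
    votes: dict[tuple[int, int, int], int] = field(default_factory=dict)

    def set_vote(self, i: int, j: int, k: int, value: int) -> None:
        # Assumed encoding: 1 = selects the mark, -1 = refuses it, 0 = neither.
        if value not in (-1, 0, 1):
            raise ValueError("vote must be -1, 0, or 1")
        self.votes[(i, j, k)] = value

    def predicate(self, i: int, j: int, k: int) -> int:
        # A missing vote is treated as 0 (neither selected nor refused).
        return self.votes.get((i, j, k), 0)


vte = VTE(products=["Reel"], marks=["Reliable", "Expensive"], customers=["Wolf"])
vte.set_vote(0, 0, 0, 1)    # Wolf selects Reliable for Reel
vte.set_vote(0, 1, 0, -1)   # Wolf refuses Expensive for Reel
print(vte.predicate(0, 0, 0), vte.predicate(0, 1, 0))
```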


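Basic Concepts: Semantic Predicate P

P(Di,Cj,Sk) = 1 if customer Sk selects mark Cj to evaluate product Di; -1 if Sk refuses to select Cj to evaluate Di; 0 if Sk neither selects nor refuses to select Cj for Di.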


Problem 1: Deriving External Similarity Values


External Similarity Values

External Similarity Values (ESV): binary relations DC, SC, and SD between the elements of (sub)sets of D and C; S and C; and S and D.

ESV are based on total support among all the customers for voting for the appropriate connection (or refusal to vote)


Problem 2: Deriving Internal Similarity Values


Internal Similarity Values

Internal Similarity Values (ISV): binary relations between two subsets of D, two subsets of C and two subsets of S.

ISV are based on total support among all the customers for voting for the appropriate connection (or refusal to vote)


Why We Need Similarity Values (or a Distance Measure)

  • Distance between products is used to advertise a new product to customers, based on evaluations of similar products that are already known

  • Distance between evaluations is needed to estimate the evaluation error, e.g. when adaptive filtering technologies are used

  • Distance between customers is useful for weighting the customers, e.g. to integrate their opinions by weighted voting.
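
The weighted voting mentioned in the last point can be sketched as follows. This is an illustration only: the function name `weighted_vote` and the example weights are assumptions, not values from the slides.

```python
def weighted_vote(votes, weights):
    """Combine votes in {-1, 0, 1} into one score in [-1, 1].

    Each vote counts in proportion to the weight (competence) of the
    customer who cast it. The weights here are hypothetical inputs.
    """
    if len(votes) != len(weights):
        raise ValueError("one weight is needed per vote")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(votes, weights)) / total


# Three customers vote on one mark for one product; the first customer
# is assumed to be twice as competent as the other two.
print(weighted_vote([1, -1, 1], [2, 1, 1]))  # 0.5
```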


Deriving External Relation DC: How well an evaluation fits the product

[Figure: diagram relating evaluation marks, products, and customers]


Deriving External Relation SC: Measures a customer’s competence in the use of evaluation marks

  • The value of the relation (Sk,Cj) represents the total support that customer Sk obtains by selecting (or refusing to select) the mark Cj when evaluating all the products.


Example of SC Relation

[Figure: diagram relating evaluation marks, products, and customers]


Deriving External Relation SD: Measures a customer’s competence in the products

  • The value of the relation (Sk,Di) represents the total support that customer Sk receives by selecting (or refusing to select) all the evaluation marks when evaluating the product Di.


Example of SD Relation

[Figure: diagram relating products, evaluation marks, and customers]


Normalizing External Relations to the Interval [0,1]

n is the number of products

m is the number of evaluation marks

r is the number of customers
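
The normalization formulas themselves did not survive in this transcript. As a hedged illustration, a linear rescaling from a known value range onto [0,1] could look like the sketch below. The helper name `normalize` is hypothetical, and the bounds used for DC (from -r to r, which holds if DC is a sum of r votes in {-1, 0, 1}) are an assumption.

```python
def normalize(value, lowest, highest):
    """Linearly map a value from [lowest, highest] onto [0, 1]."""
    if highest <= lowest:
        raise ValueError("highest must be greater than lowest")
    if not lowest <= value <= highest:
        raise ValueError("value lies outside the stated range")
    return (value - lowest) / (highest - lowest)


r = 4  # number of customers in the worked example
# Assumption: a DC value is a sum of r votes, so it lies in [-r, r].
print(normalize(-4, -r, r))  # 0.0
print(normalize(2, -r, r))   # 0.75
```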


Competence of a customer

[Figure: a customer is linked to the goods Di through the conceptual pattern of goods’ features (competence in the goods) and to the evaluation marks Cj through the conceptual pattern of evaluation marks definitions (competence in the evaluation marks)]


Customer’s Evaluation: competence quality in products


Customer’s Evaluation: competence quality in evaluation marks use


Quality Balance Theorem

The evaluation of a customer’s competence (ranking, weighting, quality evaluation) does not depend on the competence area, whether the “virtual world of products” or the “conceptual world of evaluation marks”, because both competence values are always equal.
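
The theorem can be checked numerically on the votes of the worked example. The sketch below rests on assumed formulas that are not preserved in this transcript: DC(i,j) is taken as the sum of votes over customers, SD(k,i) as the sum over marks of DC(i,j)·P(Di,Cj,Sk), and SC(k,j) as the sum over products of the same term. It compares raw, unnormalized totals only, and the function name `competence_totals` is hypothetical.

```python
# P[i][k][j]: vote of customer k on mark j for product i, as read from the
# three evaluation tables (1 = selects, -1 = refuses, 0 = neither).
P = [
    [[1, -1, -1, 0, -1], [0, -1, 0, 1, -1], [0, 0, -1, 1, 0], [1, -1, 0, 0, 1]],
    [[-1, 0, -1, 0, 1], [1, -1, -1, 0, 0], [1, -1, 0, 1, 1], [-1, 0, 0, 1, 0]],
    [[1, 0, 1, -1, 0], [0, 1, 0, -1, 1], [-1, -1, 1, -1, 1], [-1, -1, 1, -1, 1]],
]


def competence_totals(votes):
    """Return each customer's total support, summed via products and via marks."""
    n, r, m = len(votes), len(votes[0]), len(votes[0][0])
    # Assumed: DC[i][j] is the sum of all customers' votes.
    dc = [[sum(votes[i][k][j] for k in range(r)) for j in range(m)] for i in range(n)]
    via_products, via_marks = [], []
    for k in range(r):
        sd = [sum(dc[i][j] * votes[i][k][j] for j in range(m)) for i in range(n)]
        sc = [sum(dc[i][j] * votes[i][k][j] for i in range(n)) for j in range(m)]
        via_products.append(sum(sd))
        via_marks.append(sum(sc))
    return via_products, via_marks


by_products, by_marks = competence_totals(P)
print(by_products)  # [18, 16, 22, 18]
print(by_marks)     # [18, 16, 22, 18]
```

Both totals sum the same terms in a different order, which is why they coincide for every customer.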



Proof

...

...


An Example

  • Suppose that four customers have to evaluate three products from a virtual shop using five available evaluation marks.

  • Each customer selects the appropriate marks for every product.

  • The final goal is to obtain a cooperative evaluation of the quality of the products from all the customers.


C (evaluation marks) Set in the Example

Evaluation mark    Notation
Nicely designed    C1
Expensive          C2
Easy to use        C3
Reliable           C4
Safe               C5


S (customers) Set in the Example

Customer ID    Notation
Fox            S1
Wolf           S2
Cat            S3
Hare           S4



D (products) Set in the Example

D1 - Ultra Cast Spinning Reel

D2 - Nokia Communicator 9110

D3 - iGrafx Process Management Software


Evaluations Made for the Good “Reel”

D1

P(D,C,S)   C1     C2      C3      C4     C5
S1          1     -1      -1       0     -1
S2          0+    -1**     0++     1*    -1***
S3          0      0      -1       1      0
S4          1     -1       0       0      1

Customer Wolf selects the mark Reliable* to evaluate “Reel” and refuses to select Expensive** or Safe***. Wolf neither selects nor refuses to select the Nicely designed+ or Easy to use++ marks.


Evaluations Made for the Good “Communicator”

D2

P     C1    C2    C3    C4    C5
S1    -1     0    -1     0     1
S2     1    -1    -1     0     0
S3     1    -1     0     1     1
S4    -1     0     0     1     0


Evaluations Made for the Good “Software”

D3

P     C1    C2    C3    C4    C5
S1     1     0     1    -1     0
S2     0     1     0    -1     1
S3    -1    -1     1    -1     1
S4    -1    -1     1    -1     1


Example: Calculating Value DC3,4

D3

P     C1    C2    C3    C4    C5
S1     1     0     1    -1     0
S2     0     1     0    -1     1
S3    -1    -1     1    -1     1
S4    -1    -1     1    -1     1
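
The calculation itself is missing from this transcript. A minimal sketch follows, assuming that DC(i,j) is the plain sum of the votes P(Di,Cj,Sk) over all customers, which fits the description of external similarity values as "total support among all the customers". The function name `dc_relation` is hypothetical.

```python
# P[i][k][j]: vote of customer k on mark j for product i, as read from the
# three evaluation tables (1 = selects, -1 = refuses, 0 = neither).
P = [
    # D1 "Reel": rows are S1..S4, columns are C1..C5
    [[1, -1, -1, 0, -1], [0, -1, 0, 1, -1], [0, 0, -1, 1, 0], [1, -1, 0, 0, 1]],
    # D2 "Communicator"
    [[-1, 0, -1, 0, 1], [1, -1, -1, 0, 0], [1, -1, 0, 1, 1], [-1, 0, 0, 1, 0]],
    # D3 "Software"
    [[1, 0, 1, -1, 0], [0, 1, 0, -1, 1], [-1, -1, 1, -1, 1], [-1, -1, 1, -1, 1]],
]


def dc_relation(votes):
    """Assumed formula: DC[i][j] = sum over customers k of P(Di, Cj, Sk)."""
    return [
        [sum(customer[j] for customer in product) for j in range(len(product[0]))]
        for product in votes
    ]


DC = dc_relation(P)
print(DC[2][3])  # DC3,4: all four customers refuse "Reliable" for "Software" -> -4
for row in DC:
    print(row)
```

Under this assumption the full relation is [2, -3, -2, 2, -1] for D1, [0, -2, -2, 2, 2] for D2, and [-1, -1, 3, -4, 3] for D3.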



Resulting DC relation


Normalized and “Thresholded” DC relation

[Figure: normalized DC values on a scale from 0 to 1 with ticks at 0, 0.25, 0.5, 0.75, and 1, mapped to the thresholded values -1, 0, and 1]



Result of Cooperative Goods Evaluation Based on DC Relation

D1 is nicely designed, reliable, not expensive, but not easy to use

D2 is reliable, safe, not expensive, but not easy to use

D3 is easy to use, safe, but not reliable
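
These three statements can be reproduced with a short sketch. Several things in it are assumptions rather than facts from the slides: the DC values are the column sums of the votes in the three evaluation tables, the normalization is (DC + r) / 2r, and the cut-offs 0.25 and 0.75 are read off the tick marks of the threshold scale. The names `threshold` and `describe` are hypothetical.

```python
MARKS = ["nicely designed", "expensive", "easy to use", "reliable", "safe"]

# Assumed DC values: sums of the customers' votes per product and mark.
DC = {
    "D1": [2, -3, -2, 2, -1],
    "D2": [0, -2, -2, 2, 2],
    "D3": [-1, -1, 3, -4, 3],
}
R = 4  # number of customers


def threshold(dc_value, r=R, low=0.25, high=0.75):
    """Map a raw DC value to 1 (mark fits), -1 (does not fit), or 0 (neutral)."""
    normalized = (dc_value + r) / (2 * r)  # assumed normalization onto [0, 1]
    if normalized >= high:
        return 1
    if normalized <= low:
        return -1
    return 0


def describe(row):
    """List the marks that fit, followed by the negated marks that do not."""
    verdicts = [threshold(value) for value in row]
    positive = [mark for mark, t in zip(MARKS, verdicts) if t == 1]
    negative = ["not " + mark for mark, t in zip(MARKS, verdicts) if t == -1]
    return positive + negative


for product, row in DC.items():
    print(product, "is", ", ".join(describe(row)))
```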


An Example: Calculating Value SD1,1
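
The worked calculation is not preserved in this transcript. The sketch below shows one way to obtain the value, assuming SD(k,i) is the sum over marks of DC(i,j)·P(Di,Cj,Sk), with DC(i,j) taken as the sum of all customers' votes. Both formulas and the function name `sd_relation` are assumptions.

```python
# P[i][k][j]: vote of customer k on mark j for product i, as read from the
# three evaluation tables (1 = selects, -1 = refuses, 0 = neither).
P = [
    [[1, -1, -1, 0, -1], [0, -1, 0, 1, -1], [0, 0, -1, 1, 0], [1, -1, 0, 0, 1]],
    [[-1, 0, -1, 0, 1], [1, -1, -1, 0, 0], [1, -1, 0, 1, 1], [-1, 0, 0, 1, 0]],
    [[1, 0, 1, -1, 0], [0, 1, 0, -1, 1], [-1, -1, 1, -1, 1], [-1, -1, 1, -1, 1]],
]


def sd_relation(votes):
    """Assumed formula: SD[k][i] = sum over marks j of DC[i][j] * P(Di, Cj, Sk)."""
    n, r, m = len(votes), len(votes[0]), len(votes[0][0])
    dc = [[sum(votes[i][k][j] for k in range(r)) for j in range(m)] for i in range(n)]
    return [
        [sum(dc[i][j] * votes[i][k][j] for j in range(m)) for i in range(n)]
        for k in range(r)
    ]


SD = sd_relation(P)
print(SD[0][0])  # SD1,1: support for Fox on "Reel" -> 8
print(SD[0])     # Fox on D1, D2, D3 -> [8, 4, 6]
```

A vote that agrees with the majority adds to the customer's support, and a vote against the majority subtracts from it.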


An Example: Calculating Value SC4,4
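
The worked calculation is not preserved in this transcript. A hedged sketch, assuming SC(k,j) is the sum over products of DC(i,j)·P(Di,Cj,Sk), with DC(i,j) taken as the sum of all customers' votes; the function name `sc_relation` is hypothetical.

```python
# P[i][k][j]: vote of customer k on mark j for product i, as read from the
# three evaluation tables (1 = selects, -1 = refuses, 0 = neither).
P = [
    [[1, -1, -1, 0, -1], [0, -1, 0, 1, -1], [0, 0, -1, 1, 0], [1, -1, 0, 0, 1]],
    [[-1, 0, -1, 0, 1], [1, -1, -1, 0, 0], [1, -1, 0, 1, 1], [-1, 0, 0, 1, 0]],
    [[1, 0, 1, -1, 0], [0, 1, 0, -1, 1], [-1, -1, 1, -1, 1], [-1, -1, 1, -1, 1]],
]


def sc_relation(votes):
    """Assumed formula: SC[k][j] = sum over products i of DC[i][j] * P(Di, Cj, Sk)."""
    n, r, m = len(votes), len(votes[0]), len(votes[0][0])
    dc = [[sum(votes[i][k][j] for k in range(r)) for j in range(m)] for i in range(n)]
    return [
        [sum(dc[i][j] * votes[i][k][j] for i in range(n)) for j in range(m)]
        for k in range(r)
    ]


SC = sc_relation(P)
print(SC[3][3])  # SC4,4: support for Hare when using "Reliable" -> 6
print(SC[0])     # Fox across C1..C5 -> [1, 3, 7, 4, 3]
```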



Resulting SD and SC relations


Normalized and “Thresholded” SD relation

[Figure: normalized and thresholded SD relation for the customers Fox, Wolf, Cat, and Hare]

Evaluations obtained from the customer Fox should be accepted if he evaluates goods similar to “Reels” or similar to “Software”.

Fox’s evaluations should be rejected if they concern goods similar to “Communicator”.


Normalized and “Thresholded” SD relation

[Figure: normalized and thresholded SD relation for the customers Fox, Wolf, Cat, and Hare]

Only evaluations from the customer Cat can be accepted if they concern goods similar to “Communicator”.

All four customers are expected to give acceptable evaluations concerning “Software”-related goods.


Normalized and “Thresholded” SC relation

[Figure: normalized and thresholded SC relation between the customers Fox, Wolf, Cat, and Hare and the marks Nicely designed, Easy to use, Expensive, Safe, and Reliable]

Evaluations obtained from the customer Fox should be accepted if they concern the usability (easy to use) or the reliability of a good.

Fox’s evaluations should be rejected if they concern the design of goods.




Deriving Internal Similarity Values

Via one intermediate set

Via two intermediate sets


Internal Similarity for Customers: Goods-Based Similarity

[Figure: diagram linking customers through goods]


Internal Similarity for Customers: Evaluation Marks-Based Similarity

[Figure: diagram linking customers through evaluation marks]


Internal Similarity for Customers: Evaluation Marks-Goods-Based Similarity

[Figure: diagram linking customers through evaluation marks and goods]



Internal Similarity for Evaluation Marks

Goods-based similarity

Customers-based similarity

Goods-customers-based similarity



Internal Similarity for Goods

Evaluation marks-based similarity

Customers-based similarity

Evaluation marks-customers-based similarity
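
The formulas for internal similarity are not preserved in this transcript. One plausible reading of similarity "via one intermediate set" is to compose an external relation with itself, so that two goods are similar when their DC rows point the same way. The sketch below illustrates that reading only. The composition rule, the function name `internal_similarity`, and the DC values (sums of the customers' votes in the example tables) are all assumptions.

```python
# Assumed DC values for D1, D2, D3 against the marks C1..C5.
DC = [
    [2, -3, -2, 2, -1],
    [0, -2, -2, 2, 2],
    [-1, -1, 3, -4, 3],
]


def internal_similarity(relation):
    """Assumed rule: similarity of rows a and b is the sum over the
    intermediate set of relation[a][j] * relation[b][j]."""
    return [
        [sum(x * y for x, y in zip(row_a, row_b)) for row_b in relation]
        for row_a in relation
    ]


DD = internal_similarity(DC)
print(DD[0][1])  # D1 vs D2 -> 12 (evaluated alike)
print(DD[0][2])  # D1 vs D3 -> -16 (evaluated in opposite ways)
print(DD[1][2])  # D2 vs D3 -> -6
```

Under this reading the result is symmetric, and it would still need normalizing and thresholding before it could be labelled similar, neutral, or different.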


Normalized and “Thresholded” DDC relation

[Figure: normalized and thresholded DDC relation, with pairs of goods marked as similar, neutral, or different]


Conclusion

  • We discussed methods for deriving the total support of each binary similarity relation. This can be used, for example, to derive the most supported evaluation of the goods and to rank the customers according to their competence

  • We also discussed relations between elements taken from the same set: goods, evaluation marks, or customers. This can be used, for example, to divide customers into groups of similar competence relative to the goods evaluation environment

