
Presented by: Muhammad Nuruddin, student ID: 2961230, email: nuruddin@L3S.de.

Presentation on the papers ZenCrowd and Pick-A-Crowd.


Presentation Transcript


  1. Presentation on the papers: ZenCrowd: Leveraging Probabilistic Reasoning and Crowdsourcing Techniques for Large-Scale Entity Linking, by Gianluca Demartini, Djellel Eddine Difallah, and Philippe Cudré-Mauroux, eXascale Infolab, U. of Fribourg, Switzerland, {firstname.lastname}@unifr.ch; and Pick-A-Crowd: Tell Me What You Like, and I'll Tell You What to Do, by Djellel Eddine Difallah, Gianluca Demartini, and Philippe Cudré-Mauroux, eXascale Infolab, U. of Fribourg, Switzerland. Presented by: Muhammad Nuruddin, student ID: 2961230, email: nuruddin@L3S.de, Internet Technologies and Information Systems (ITIS), M.Sc. 4th semester, Leibniz Universität Hannover. Course: Advanced Methods of Information Retrieval, by Dr. Elena Demidova.

  2. Entity Linking. Entity linking algorithm (probabilistic reasoning based). Entity linking: a suggested way to automate the construction of the Semantic Web.

  3. Example: Wikipedia provides annotated pages (linked entities such as Military, Germany, France, Pacific Ocean, Historical Incident).

  4. Crowdsourcing • Obtaining services, ideas, or content by soliciting contributions from a large group of people, especially from an online community. • Examples: - Wikipedia = wiki + encyclopedia = quick + encyclopedia - The IMDb movie top chart. - AMT (Amazon Mechanical Turk).

  5. Paper 1: ZenCrowd: Leveraging Probabilistic Reasoning and Crowdsourcing Techniques for Large-Scale Entity Linking. An entity linking algorithm (probabilistic reasoning based) combined with crowdsourcing, with an improvement between 4% and 35%.

  6. Current techniques of entity linking • Entity linking is known to be extremely challenging, since parsing and disambiguating natural language text is still very difficult for machines. • The current matching techniques: • Algorithmic matching: mostly based on probabilistic reasoning (e.g., TF-IDF based); not as reliable as manual human matching. • Manual matching: fully reliable, but costly and time consuming; e.g., the New York Times (NYT) employs a whole team whose sole responsibility is to manually create links from news articles to NYT identifiers. • This paper presents a step towards bridging the gap between these two classes.

  7. System Architecture. The results of algorithmic matching are stored in a probabilistic network, and the decision engine decides: if a result has a very high probability value, it is directly linked to the entity; if a result has a very low confidence value, it is discarded; promising but uncertain candidates are passed to the Micro-Task Manager to crowdsource the problem and reach a decision (see the sketch below).
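The routing step on this slide can be summarized in a few lines of code. This is a minimal sketch, assuming illustrative threshold values and a hypothetical micro-task manager interface; the paper does not fix these names or numbers.

```python
# Minimal sketch of the decision engine's routing step (illustrative only).
# The thresholds and the micro_task_manager interface are assumptions,
# not values or APIs taken from the paper.

HIGH_CONFIDENCE = 0.9   # hypothetical upper threshold
LOW_CONFIDENCE = 0.1    # hypothetical lower threshold

def route_candidate(entity_mention, candidate_link, probability, micro_task_manager):
    """Decide what to do with one (mention, candidate link) pair."""
    if probability >= HIGH_CONFIDENCE:
        return ("link", candidate_link)      # accept the algorithmic match directly
    if probability <= LOW_CONFIDENCE:
        return ("discard", None)             # drop clearly wrong candidates
    # Promising but uncertain: create a micro-task and let the crowd decide.
    micro_task_manager.publish(entity_mention, candidate_link)
    return ("crowdsource", candidate_link)
```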

  8. System Architecture. After collecting votes from the crowdsourcing platform, all information gathered from both the algorithmic matchers and the crowd is fed into a scalable probabilistic store and used by the decision engine to process all entities accordingly. Let's look at the mechanism the decision engine uses to reach a decision.

  9. Example scenario (figure): the sentence "After the UNC workshop, Jordan gave a tutorial on nonparametric Bayesian methods." (HTML page, doc. 1) is linked to the LOD cloud. The candidate entities l1 (Berkeley professor), l2 (country), l3 (river) each carry a prior probability plj computed from the algorithmic matchers; workers W1 and W2, each with a reliability factor pw() (good or bad), produce clicks C11..C13 and C21..C23 on the candidates.

  10. The decision engine uses a factor graph • A factor graph can deal with a complicated global problem by viewing it as a factorization of several local functions. • l1, l2, l3 – the three candidate entities for a link. • plj – the probability of lj computed from the algorithmic matchers. • W1, W2 – the two workers employed to check the relevance of l1, l2, l3. • pw1(), pw2() – the reliability factors of workers w1 and w2. • lfi() – a linking factor, connecting li to the related clicks (e.g., C11) and workers (e.g., W1). • sa1-2() – a factor for entities that have a SameAs link in the LOD cloud.

  11. Equations used for the linking-factor calculation in the factor graph (shown as an image on the original slide; a hedged reconstruction follows).
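Since the equations were an image in the original deck, the block below is a hedged reconstruction of the three kinds of factors the previous slides describe: priors on candidate links, worker reliability priors, and the linking factor. The case values chosen for the linking factor are an assumption consistent with the slides, not necessarily the paper's exact formulation.

```latex
% Hedged reconstruction of the factors described on slides 9-10.
\begin{align*}
  &\text{Prior on candidate link } l_j \text{ (from the algorithmic matchers):}\\
  &\qquad pl_j(l_j{=}1) = p_{l_j}, \qquad pl_j(l_j{=}0) = 1 - p_{l_j}\\[4pt]
  &\text{Two-valued reliability prior for worker } w_i \text{ (good / bad):}\\
  &\qquad pw_i(w_i{=}\mathrm{good}) = r_i, \qquad pw_i(w_i{=}\mathrm{bad}) = 1 - r_i\\[4pt]
  &\text{Linking factor tying a click } c_{ij} \text{ to } l_j \text{ and } w_i
    \text{ (one simple instantiation):}\\
  &\qquad lf(l_j, c_{ij}, w_i) =
  \begin{cases}
    1 & \text{if } w_i = \mathrm{bad} \quad (\text{the click carries no information})\\
    1 & \text{if } w_i = \mathrm{good} \text{ and } c_{ij} = l_j\\
    0 & \text{if } w_i = \mathrm{good} \text{ and } c_{ij} \neq l_j
  \end{cases}\\[4pt]
  &\text{SameAs factor } sa_{1\text{-}2}(l_1, l_2)\text{: favours assigning the same value to links}\\
  &\text{connected by an owl:sameAs edge in the LOD cloud.}
\end{align*}
```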

  12. Reaching a Decision • A posterior probability is computed for every link by running probabilistic inference in the network. • Links with posterior probability > 0.5 are considered correct (a simplified sketch follows).
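The following is a deliberately simplified sketch of how a posterior for one link could be obtained and thresholded at 0.5, assuming independent workers whose votes are weighted by an estimated reliability. The full factor-graph inference in the paper is richer than this.

```python
# Simplified posterior computation for a single candidate link (illustrative
# sketch, not the paper's full factor-graph inference).

def posterior(prior, votes):
    """prior: probability of the link from the algorithmic matchers.
    votes: list of (vote, reliability) pairs, where vote is True if the
    worker confirmed the link and reliability is the worker's estimated accuracy."""
    p_link, p_nolink = prior, 1.0 - prior
    for vote, reliability in votes:
        if vote:   # worker says the link is correct
            p_link *= reliability
            p_nolink *= (1.0 - reliability)
        else:      # worker says the link is wrong
            p_link *= (1.0 - reliability)
            p_nolink *= reliability
    return p_link / (p_link + p_nolink)

# Links with posterior probability > 0.5 are accepted, as stated on the slide.
accept = posterior(0.6, [(True, 0.8), (False, 0.55)]) > 0.5
```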

  13. Updating Priors • As entity-linking decisions are reached, the workers' profiles are updated. • From the decided links, each worker's accuracy can be recalculated (reliability factors of W1 and W2 in the figure; a counting sketch follows).
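A minimal sketch of such an update, assuming a simple counting scheme: compare each worker's vote with the final decision and keep a running accuracy. The paper updates the priors of the factor graph; explicit counters are an assumption for illustration.

```python
# Sketch of a worker-reliability update once a link decision is reached.

class WorkerProfile:
    def __init__(self):
        self.correct = 0
        self.total = 0

    def record(self, vote, decided_value):
        """Compare the worker's vote with the final decision for a link."""
        self.total += 1
        if vote == decided_value:
            self.correct += 1

    @property
    def reliability(self):
        # Fall back to an uninformative 0.5 before any evidence is seen.
        return self.correct / self.total if self.total else 0.5
```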

  14. EXPERIMENTS. Experimental setup • The collection consists of 25 English news articles. • News from CNN.com, NYTimes.com, washingtonpost.com, timesofindia.indiatimes.com, and swissinfo.com. • 489 entities extracted using the Stanford parser. • Crowdsourcing was performed using Amazon Mechanical Turk. • 80 distinct workers. • Precision, recall, and accuracy were measured.

  15. Comparison of three matching techniques

  16. Observations • A hybrid model (based on both automated matching and manual human work) for entity linking. • 4% to 35% improvement over a manually optimized agreement-voting approach. • On average, 14% improvement over the best automated system. • In both cases, the improvement is statistically significant (t-test, p < 0.05). • The manual work makes the total annotation process significantly slower, so there are questions about the time-quality tradeoff. • They classified workers into {Good, Bad} manually and calculated the workers' reliability P(w), but did not describe any relation between these two factors.

  17. End of presentation of first paper

  18. Paper 2: Pick-A-Crowd: Tell Me What You Like, and I'll Tell You What to Do • This paper is about a different crowdsourcing approach based on a push methodology. • The push methodology yields better results (a 29% relative improvement in accuracy) than the usual pull strategy, in which any worker can pull any task. Figure: the traditional crowdsourcing pull strategy.

  19. Example of the traditional approach: any worker can pull any task [1]. So what's wrong with this? • It does not consider a worker's field of expertise. • Not all workers are a good fit for all tasks; background knowledge matters for many tasks. • "I had no idea what to answer to most questions..." was a comment from a worker on AMT (Amazon Mechanical Turk). [1] https://requester.mturk.com/images/graphic_process.png?1403199990

  20. So how do they improve it? • The system ranks the workers according to the type of work and the workers' skills, and pushes the work to the most suitable workers. • First, they construct a user model for each worker in the crowd in order to assign HITs (Human Intelligence Tasks) to the most suitable available worker. • The user model/profile is built from the worker's social network activity and fields of interest.

  21. So how does this system rank the workers? A recommender system • Assigning HITs to workers is similar to the task performed by recommender systems. • The recommender system matches HITs (Human Intelligence Tasks) to worker (i.e., user) profiles that describe the workers' interests and skills. • The system then generates a ranking of the candidate workers most likely to do the work well.

  22. System Overview

  23. Workflow of the system • Calculate work difficulty: every piece of work is different, so the HIT Difficulty Assessor takes each HIT and determines a complexity score for it. • Assess worker skill: the system creates a worker profile from the worker's liked pages and previous work experience. • Calculate the reward for the work: since every task is different and every worker's ability differs from task to task, rewards differ across tasks and workers; the system calculates rewards from these factors. • Assign work to the top-k suitable candidates: the recommender system finds the k most suitable candidates and assigns (pushes) the work only to these k workers.

  24. Calculating work difficulty: 3 possible algorithms • Text comparison: compare the textual description of the task with the skill description of each worker and assess the difficulty. • LOD (Linked Open Data) entity based: each Facebook page liked by a worker can be linked to its respective LOD entity; the set of entities related to the HITs and the set of entities representing the interests of the crowd can then be compared directly; the task is classified as difficult when the entities involved in the task differ heavily from the entities liked by the crowd (see the sketch after this list). • Machine learning based: a classifier trained on previously completed tasks, their descriptions, and their result accuracy; the description of a new task is given to the classifier as a test vector.
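A small sketch of the LOD-entity-based assessment: compare the entities a HIT refers to with the entities the crowd has liked. Using Jaccard overlap and the example DBpedia identifiers below are assumptions for illustration; the slide only requires that the two entity sets be compared.

```python
# Sketch of LOD-entity-based difficulty assessment via set overlap.

def hit_difficulty(hit_entities, crowd_entities):
    """Return a difficulty score in [0, 1]; 1.0 means no overlap at all."""
    hit_entities, crowd_entities = set(hit_entities), set(crowd_entities)
    if not hit_entities:
        return 0.0
    overlap = len(hit_entities & crowd_entities) / len(hit_entities | crowd_entities)
    return 1.0 - overlap

# Example: a cricket HIT posed to a crowd that mostly likes movie pages.
print(hit_difficulty({"dbpedia:Sachin_Tendulkar", "dbpedia:Cricket"},
                     {"dbpedia:Brad_Pitt", "dbpedia:Inception", "dbpedia:Cricket"}))
```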

  25. 4 possible ways of reward estimation. Input: a monetary budget B and a HIT hi. • Reward the same amount of money for each task of the same type. • Take into account the difficulty d() of the HIT h (see the sketch below). • Compute a reward based on both the specific HIT and the skill of the worker who performs it. • Use game-theoretic approaches to compute the optimal reward for paid crowdsourcing incentives in the presence of workers who collude to game the system.
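A sketch of the second strategy on this list: split the budget B over HITs in proportion to their difficulty d(h). The proportional split is an assumption; the slide names the strategy without fixing a formula.

```python
# Sketch of a difficulty-proportional reward split under a fixed budget.

def rewards(budget, difficulties):
    """difficulties: dict mapping HIT id -> difficulty score d(h)."""
    total = sum(difficulties.values())
    if total == 0:
        # Equal split when all HITs look equally easy (strategy 1 on the slide).
        return {h: budget / len(difficulties) for h in difficulties}
    return {h: budget * d / total for h, d in difficulties.items()}

print(rewards(10.0, {"h1": 0.2, "h2": 0.8}))  # the harder HIT h2 gets the larger share
```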

  26. Worker Profile Selector • This module uses the same similarity measure that is used for matching workers to tasks. • The entities included in the workers' profiles can be considered. • The Facebook categories of their liked pages also play a significant role. • A generic similarity/scoring equation is used (a hedged reconstruction follows), where A is the set of candidate answers for task hi and sim() is the similarity between the worker profile and the task description. • 3 assignment models for HIT (Human Intelligence Task) assignment.
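The generic equation was shown as an image on the original slide; the block below is a hedged reconstruction from the definitions given there. The summation over candidate answers is an assumption.

```latex
% Hedged reconstruction: rank workers w_j for HIT h_i by aggregating the
% similarity between the worker profile P_j and the candidate answers a in A.
\[
  \mathrm{score}(w_j, h_i) \;=\; \sum_{a \in A} \mathrm{sim}\big(P_j,\, a\big)
\]
```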

  27. HIT ASSIGNMENT MODELS • Category-based assignment model: tasks are assigned according to Facebook pages or page categories (e.g., Entertainment -> Movie); the requester specifies the category of the task. • Expert profiling assignment model: the scoring function is based on a voting model, which considers the number of pages related to the task, the number of pages the worker liked, and how many are common to both. • Semantic-based assignment model: answers and liked pages are linked to entities, and the underlying graph structure is used to measure the distance (similarity); the paper shows an example SPARQL query.

  28. Example: Expert Finding Voting Model. Figure: an example of the expert finding voting model; the final ranking identifies worker A as the top worker, as he likes the most pages related to the query (a sketch follows).
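A minimal sketch of the voting idea in the figure: each page related to the query "votes" for the workers who like it, and workers are ranked by their vote counts. The page names and data below are made up for illustration.

```python
# Sketch of the expert finding voting model from the figure.

def rank_workers(query_pages, worker_likes):
    """query_pages: pages related to the task; worker_likes: worker -> liked pages."""
    scores = {w: len(query_pages & likes) for w, likes in worker_likes.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

query = {"PageX", "PageY", "PageZ"}
likes = {"A": {"PageX", "PageY", "PageZ"}, "B": {"PageX"}, "C": {"PageQ"}}
print(rank_workers(query, likes))  # worker A ranks first, as in the figure
```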

  29. Summary of the system: the HIT Assigner pushes tasks to suitable workers, instead of letting any worker pull any task.

  30. Experimental Evaluation • Experimental setting: • 170 workers. • Overall, more than 12K distinct liked Facebook pages. • Workers were recruited via Amazon Mechanical Turk. • Task categories: actors, soccer players, anime characters, movie actors, movie scenes, music bands, and questions related to cricket. • 50 images per category. • Precision and recall over majority votes obtained from 3 or 5 workers.

  31. Figure: crowd performance on the cricket task. Square points indicate the 5 workers selected by the proposed system; the best worker performs at 0.9 precision and 0.9 recall.

  32. Figure: OpenTurk worker accuracy vs. the number of relevant pages a worker likes. Observation: the more relevant pages in the worker profile (e.g., >30), the higher the accuracy.

  33. Table: average accuracy for different HIT assignment models, assigning each HIT to 3 and 5 workers. • Results are based on 320 questions. • Voting Model ti achieves a 29% relative improvement over the best accuracy obtained by the AMT model. Legend: AMT – Amazon Mechanical Turk baseline. Category-based – comparison based on the category of liked pages and the category of the task. En. type – entity types in the DBpedia knowledge base. Voting Model ti – voting model based on page text relevant to the task. Voting Model Ai – voting model based on similarity over all possible answers. 1-step – considers directly related entities within one step in the entity graph. The 3/5 columns correspond to assigning each HIT to 3 or 5 workers.

  34. Observations • The push approach may lead to longer task completion times. • Real-time annotation is not possible. • But in most cases, obtaining high-quality answers matters more than getting real-time data.

  35. Thank you! Any questions?
