EBM- An Entropy Based Model to Infer Social Strength from Spatiotemporal Data By Group 6

Presentation Transcript


  1. EBM- An Entropy Based Model to Infer Social Strength from Spatiotemporal Data By Group 6 Sruthi Gaddam, Madhavi Paidipalli, Amra Ananda

  2. Contents • Introduction • Related Work • Problem Definition • EBM Model • Shannon Entropy based Diversity • Renyi Entropy based Diversity • Location Entropy • Weighted Frequency • Social Strength • Evaluation • Conclusion

  3. INTRODUCTION • The goal is to derive people's social network and the strength of their social ties from real-world location data. • Spatiotemporal data, a collection of people's locations over time, is a rich source of information for studying various social behaviors. • Co-occurrences, where two people are at the same place at about the same time, are the other source of information.

  4. Problem: • Consideration of co-occurrence attributes. • Social strength. • Missing data: people's location data may be sparse. • Spatiotemporal data is extremely large. • Overestimation of coincidences.

  5. INTRODUCTION TO ENTROPY BASED MODEL • First, Shannon entropy is used to measure the diversity of co-occurrences; this version uses diversity alone as social strength. • Renyi entropy is then used in its place, since its order q controls how much weight high local frequencies receive. • However, Renyi entropy cannot capture the characteristics of a location. • So weighted frequency is incorporated, which uses location entropy to weigh each co-occurrence depending on the characteristics of the location.

  6. Problem Definition • Social strength is a quantitative measure of how socially close two people are, inferred from the following information: • Users U=(u1,u2, ..., uM) • Locations L=(l1,l2,…,lN) • User-location-time triplets <u,l,t> • The problem is to infer the social strength for each pair of users.

  7. Problem Definition • Location Representation A quad tree storing areas of different levels of popularity and visits by users

  8. Problem Definition • Visit Vector: represents the visit history of a user as the cell IDs visited and the corresponding check-in times. • Co-occurrence Vector: represents all the co-occurrences of users i and j. The local frequency is the number of co-occurrences between users i and j at location l.
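The co-occurrence vector described above can be sketched as follows. This is a minimal illustration, not the presenters' implementation: the function name `cooccurrence_vectors`, the time window `tau`, and the rule that two check-ins at the same location within `tau` seconds count as one co-occurrence are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_vectors(checkins, tau=3600):
    """Build co-occurrence vectors from <user, location, time> triplets.

    Assumption: two check-ins by different users at the same location,
    at most `tau` seconds apart, count as one co-occurrence.
    Returns {(user_i, user_j): Counter({location: local_frequency})}.
    """
    # Group check-ins by location so only same-place visits are compared.
    by_location = defaultdict(list)
    for user, location, timestamp in checkins:
        by_location[location].append((timestamp, user))

    vectors = defaultdict(Counter)
    for location, visits in by_location.items():
        # Quadratic pairwise comparison: fine for a sketch, not for large data.
        for (t1, u1), (t2, u2) in combinations(visits, 2):
            if u1 != u2 and abs(t2 - t1) <= tau:
                pair = tuple(sorted((u1, u2)))
                vectors[pair][location] += 1  # local frequency at this location
    return dict(vectors)


# Hypothetical check-ins, invented for illustration only.
checkins = [
    ("a", "l1", 0), ("b", "l1", 100),
    ("a", "l1", 10000), ("b", "l1", 10050),
    ("a", "l2", 500), ("b", "l2", 600),
    ("c", "l2", 90000),
]
vectors = cooccurrence_vectors(checkins)
```

Each value in the returned dictionary plays the role of a (sparse) co-occurrence vector: locations with zero co-occurrences are simply absent.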

  9. THE EBM MODEL • EBM quantifies the social strength between two users from their co-occurrence vector.

  10. Diversity in Co-occurrences • Diversity is the measure that quantifies how many effective locations the co-occurrences between two people represent.

  11. Shannon Entropy based Diversity:

  12. Shannon Entropy based Diversity: • The higher the number of co-occurrence locations, the higher the uncertainty and consequently the higher the diversity. • If the number of co-occurrence locations is fixed, the diversity and the Shannon entropy reach their maximums when all the probabilities are equal to each other.
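The two properties above can be checked numerically. The slide formulas did not survive in this transcript, so the sketch below assumes the standard definitions: each local frequency is turned into a probability by dividing by the total number of co-occurrences, the Shannon entropy is computed with the natural logarithm, and the diversity is the exponential of the entropy (the "effective number of locations"). The function name is hypothetical.

```python
import math


def shannon_diversity(local_freqs):
    """Return (entropy, diversity) for a list of local frequencies.

    Assumption: diversity = exp(Shannon entropy), i.e. the effective
    number of co-occurrence locations.
    """
    counts = [c for c in local_freqs if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one co-occurrence is required")
    probs = [c / total for c in counts]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy, math.exp(entropy)


# Ten co-occurrences spread evenly over two places versus concentrated in one.
even_entropy, even_diversity = shannon_diversity([5, 5])
skewed_entropy, skewed_diversity = shannon_diversity([9, 1])
```

With the number of locations fixed at two, the evenly spread pair reaches the maximum diversity of 2, while the skewed pair falls between 1 and 2.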

  13. Shannon Entropy based Diversity:

  14. Renyi Entropy-based Diversity

  15. Renyi Entropy-based Diversity • When q > 1, the Renyi entropy, and consequently the diversity Dij, favors high values of the local frequencies. • When q < 1, in contrast, the diversity gives more weight to local frequencies with low values. • When q = 0, the diversity is completely insensitive to the local frequencies and gives the pure number of co-occurrence locations. • Case q = 1: The Renyi entropy favors local frequencies in opposite ways when q < 1 versus when q > 1, so q = 1 is the crossover point where the Renyi entropy and its diversity drop any such bias and weight the local frequencies without favoring high or low values.
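The effect of the order q can be seen in a small sketch. It assumes the standard Renyi entropy of order q, H_q = log(sum of p^q) / (1 - q), with diversity exp(H_q); at q = 1 the Renyi entropy is defined by its limit, which is the Shannon entropy. The function name and the sample frequencies are hypothetical.

```python
import math


def renyi_diversity(local_freqs, q):
    """Diversity of order q: exp of the Renyi entropy of the local frequencies.

    Assumption: standard definition, D_q = (sum_l p_l^q)^(1 / (1 - q)),
    with the q = 1 case taken as the Shannon limit.
    """
    counts = [c for c in local_freqs if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one co-occurrence is required")
    probs = [c / total for c in counts]
    if math.isclose(q, 1.0):
        # Limit q -> 1: Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in probs))
    return sum(p ** q for p in probs) ** (1.0 / (1.0 - q))


# One dominant location (possibly a coincidence hotspot) and one minor one.
freqs = [9, 1]
by_order = {q: renyi_diversity(freqs, q) for q in (0, 0.5, 1, 2)}
```

For these frequencies the diversity shrinks as q grows: q = 0 simply counts the two locations, and larger q lets the dominant location pull the diversity toward 1.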

  16. Coincidences • Coincidences often produce high local frequencies which, if taken at face value, lead to overestimated relationships. • Renyi entropy and its diversity give us the ability to control the impact of coincidences on diversity through q, which is sensitive to the values of local frequencies. • q is one of the optimization parameters and will be determined experimentally.

  17. Location Entropy • In the simplified case where every visitor checks in equally often, location entropy is the logarithm of the number of unique users who have been at the place. • Figure: the dependence of location entropy's value on the number of unique visitors for this simplified case. • Using location entropy, we can determine the places where coincidences are highly probable, even when the frequency of a user pair in such a place is low. • When the number of co-locations is low, the diversity will also be low, so co-occurrences of this type cannot be captured by the diversity measure.
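A short sketch of location entropy, assuming the usual definition: the Shannon entropy of the distribution of check-ins at a place over the users who made them. When every visitor contributes the same number of check-ins, this reduces to the logarithm of the number of unique visitors, as stated above. The function name and the sample visitor lists are invented for the example.

```python
import math
from collections import Counter


def location_entropy(visitor_ids):
    """Entropy of a location from one user ID per check-in at that location.

    Assumption: P(u, l) is the share of the location's check-ins made by
    user u, and the entropy is -sum_u P(u, l) * log P(u, l).
    """
    counts = Counter(visitor_ids)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("the location has no check-ins")
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# A crowded place with many distinct visitors versus a private one.
crowded = location_entropy(["u%d" % i for i in range(100)])
private = location_entropy(["a", "a", "a", "b"])
```

High entropy flags places such as malls or stations, where a co-occurrence is more likely to be a coincidence; low entropy flags places visited by few people.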

  18. Location Entropy

  19. Weighted Frequency • Co-occurrences in small, uncrowded places, such as private houses, often result in more social interaction than those in crowded places. • Therefore, the probability of friendship strongly depends on the locations of the co-occurrences. • Weighted frequency plays an important role under data sparseness, i.e., when the available spatiotemporal data is very limited: with only a few co-occurrences for each pair, Renyi's diversity can be very low.
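One way to turn this idea into a number is sketched below. The slides do not give the weighting formula in this transcript, so the exponential weight exp(-H_l), where H_l is the location entropy, is an assumption chosen only because it makes co-occurrences at uncrowded places count close to fully and those at crowded places count for very little. The function name and the sample entropies are hypothetical.

```python
import math


def weighted_frequency(cooccurrences, entropy_by_location):
    """Sum local frequencies, each discounted by the location's entropy.

    Assumption: weight = exp(-location entropy); this is an illustrative
    choice, not a formula taken from the slides.
    cooccurrences: {location: local_frequency} for one pair of users.
    entropy_by_location: {location: location_entropy}.
    """
    return sum(
        count * math.exp(-entropy_by_location[location])
        for location, count in cooccurrences.items()
    )


# Hypothetical entropies: a private house (one visitor pattern) and a mall.
entropies = {"house": 0.0, "mall": math.log(100)}
at_house = weighted_frequency({"house": 2}, entropies)
at_mall = weighted_frequency({"mall": 2}, entropies)
```

Two meetings at the house keep their full weight, while two co-occurrences at the mall contribute only a small fraction, which is how a sparse pair with few but meaningful co-occurrences can still score well.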

  20. Social Strength • Two independent ways, through which co-occurrences contribute to social strength: 1) Diversity (through Renyi entropy) - which measures how diverse the co-occurrences of two people are, and at the same time, can control and tell us how much coincidences can impact diversity. 2) Weighted frequency - which favorably captures the local frequencies of co-occurrences at uncrowded places and can compensate for diversity in case of data sparseness.
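How the two contributions are combined is not shown in this transcript. A simple possibility, assumed here purely for illustration, is a linear combination whose coefficients would be fitted on the training set; the names `alpha`, `beta`, `gamma` and their values below are hypothetical placeholders.

```python
def social_strength(diversity, weighted_freq, alpha, beta, gamma=0.0):
    """Combine the two independent contributions into one score.

    Assumption: a linear model, strength = alpha * diversity
    + beta * weighted frequency + gamma. The coefficients are
    placeholders that would be learned from training data.
    """
    return alpha * diversity + beta * weighted_freq + gamma


# Placeholder coefficients, not values reported by the presenters.
sparse_pair = social_strength(diversity=1.0, weighted_freq=2.0, alpha=0.5, beta=0.5)
diverse_pair = social_strength(diversity=4.0, weighted_freq=0.1, alpha=0.5, beta=0.5)
```

The sparse pair has low diversity but still receives a non-trivial score through its weighted frequency, which is the compensation effect described above.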

  21. Performance Evaluation • Dataset • Data is taken from a location-based social network where users share their locations through check-ins. • The data consists of two sets: 1. Spatiotemporal data, with check-in format <user ID, latitude, longitude, timestamp, location ID> 2. The graph of the social network

  22. Dataset • Data is divided into two subsets - Training set: Contains check-ins in the West of USA and social network - Evaluation set: Contains check-ins in the East of USA and social network • Overlaps occurred but they are insignificant

  23. Methodology • Metrics used are Precision and Recall. • TC- Set of true social connections reported by social network (ground truth). • RC- Set of user pairs reported as socially connected.
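With TC and RC defined as above, the standard definitions are precision = |TC ∩ RC| / |RC| and recall = |TC ∩ RC| / |TC|. The sketch below applies them to sets of user pairs; the function name and the sample pairs are hypothetical, and pairs are treated as unordered.

```python
def precision_recall(tc, rc):
    """Precision and recall of reported connections against ground truth.

    tc: true social connections (ground truth), as iterables of two users.
    rc: user pairs reported as socially connected.
    Pairs are unordered, so ("a", "b") and ("b", "a") are the same pair.
    """
    true_pairs = {frozenset(pair) for pair in tc}
    reported_pairs = {frozenset(pair) for pair in rc}
    hits = len(true_pairs & reported_pairs)
    precision = hits / len(reported_pairs) if reported_pairs else 0.0
    recall = hits / len(true_pairs) if true_pairs else 0.0
    return precision, recall


# Invented pairs for illustration only.
tc = [("a", "b"), ("b", "c"), ("c", "d")]
rc = [("b", "a"), ("a", "d")]
precision, recall = precision_recall(tc, rc)
```

Raising the threshold on social strength above which a pair is reported typically trades recall for precision, which is what a precision-versus-recall curve traces out.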

  24. Order of Diversity

  25. Precision vs Recall

  26. Conclusion • The EBM model shows how to infer the social strength between two people while limiting the impact of coincidences. • It mitigates the problem of data sparseness. • The algorithm is efficient and can be parallelized with the MapReduce framework.
