
Emerging Trend Detection



Presentation Transcript


  1. Emerging Trend Detection Shenzhi Li

  2. Introduction • What is an emerging trend? • An emerging trend is a topic area for which one can trace the growth of interest and utility over time. • Example: "XML", a technology that emerged in the mid-1990s. • Goals • Teach students how to do a literature search. • Offer students ways to go beyond the knowledge presented in the coursework by exploring current research trends.

  3. Semi-automatic methodology • Step 1 • Decide on a domain area to search in. • Step 2 • Find the well-known journals, conferences or workshops in this domain area. • Read papers to find several candidate emerging trends. • Step 3 • Find more evidence for the candidate trends. • Step 4 • Verify the candidate trends using the INSPEC database.

  4. Semi-automatic methodology (search procedure)

     # Define searchTerm = "candidate emerging trend" <AND> "constraining term"
     # Define m = frequency of searchTerm in page
     # Define n = sum of the frequencies of all helper terms in page
     # Define L2 = an empty list used to store the candidate emerging trends
     link = 1   // first link of interest in the search results
     While (there are more links in the pages retrieved by the search engine <AND> the desired number of candidate trends has not been found) {
         Click on link;
         if ("year of page" is in the range [current year - 4, current year]) {
             if (m >= 2 <AND> n >= 2) {
                 Accept the page;
                 Add searchTerm to L2 if it is a candidate emerging trend;
                 Look for the phrases with the highest frequency of occurrence. Add them to L2 if they qualify as candidate emerging trends (use domain knowledge);
                 Give special attention to the sentence (or paragraph) containing the pattern "constraining term <AND> helper term". Add phrases from that sentence (or paragraph) that are judged to be candidate emerging trends to L2 (use domain knowledge);
             } else {
                 Reject the page;
             }
         } else {
             Reject the page;
         }
         link++ or exit   // click on the next link of interest, or exit
     }
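The accept/reject test at the core of the procedure above can be sketched in Python. This is only an illustration under stated assumptions: the `Page` type, the function names and the helper-term list are hypothetical, frequencies are taken as plain case-insensitive substring counts, and m is read as the number of occurrences of the candidate phrase on a page that also mentions the constraining term (the slide does not pin this down). The steps that call for domain knowledge are left to the human reader.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical container for a retrieved page."""
    text: str
    year: int


def count(phrase: str, text: str) -> int:
    """Case-insensitive, non-overlapping substring count (an assumed notion of frequency)."""
    return text.lower().count(phrase.lower())


def accept_page(page, candidate, constraining_term, helper_terms, current_year):
    """Return True if the page passes the recency and frequency thresholds."""
    # Recency check: the page year must lie in [current_year - 4, current_year].
    if not (current_year - 4 <= page.year <= current_year):
        return False
    # Assumed reading of m: occurrences of the candidate phrase,
    # counted only when the constraining term also appears on the page.
    has_constraint = count(constraining_term, page.text) > 0
    m = count(candidate, page.text) if has_constraint else 0
    # n: summed frequency of all helper terms.
    n = sum(count(term, page.text) for term in helper_terms)
    return m >= 2 and n >= 2


page = Page(
    text="XML is a new markup language. Recent XML tools for the web "
         "are emerging; new web standards.",
    year=2003,
)
helpers = ["new", "recent", "emerging"]  # hypothetical helper terms
print(accept_page(page, "XML", "web", helpers, current_year=2004))
```

Pages that pass this filter would still be read by a person, who decides which high-frequency phrases qualify as candidate trends.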

  5. Improvements • Literature search • How to form queries • How to judge the quality of sources • How to assess the authority of documents • Introduce several search engines

  6. Improvements - tools

  7. Improvements - tools • Noun phrase extraction • Fuzzy phrase matching • Searches can combine part-of-speech tags and exact words to find candidate trends that appear in certain patterns. • Example: "JJ+programming" (JJ is the part-of-speech tag for an adjective) returns phrases such as "object-oriented programming".
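A pattern such as "JJ+programming" can be matched against text that has already been part-of-speech tagged. The sketch below is not the tool shown in the presentation. It assumes the input is a list of (word, tag) pairs using Penn Treebank tags, that "+" separates consecutive pattern elements, and that an all-uppercase element denotes a tag while anything else is an exact word. The function names are hypothetical.

```python
def _matches(part, word, tag):
    """An all-uppercase pattern element is treated as a POS tag, otherwise as an exact word."""
    if part.isupper():
        return tag == part
    return word.lower() == part.lower()


def match_pattern(pattern, tagged_tokens):
    """Return every run of consecutive tokens that fits the '+'-separated pattern."""
    parts = pattern.split("+")
    hits = []
    for i in range(len(tagged_tokens) - len(parts) + 1):
        window = tagged_tokens[i:i + len(parts)]
        if all(_matches(p, w, t) for p, (w, t) in zip(parts, window)):
            hits.append(" ".join(w for w, _ in window))
    return hits


# Hand-tagged example sentence (the tags are supplied by hand, not by a tagger).
tagged = [
    ("We", "PRP"), ("teach", "VBP"),
    ("object-oriented", "JJ"), ("programming", "NN"),
    ("and", "CC"),
    ("functional", "JJ"), ("programming", "NN"), ("languages", "NNS"),
]
print(match_pattern("JJ+programming", tagged))
# ['object-oriented programming', 'functional programming']
```

The uppercase convention is a simplification: a pattern that needs an uppercase literal word would require an explicit marker for tags.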

  8. Experiments • Methodology • Students were randomly split into two groups, A and B, of roughly equal size. • Students in both groups were expected to have attended the class lectures and to have introductory knowledge of the main topic area before participating in the experiment. • All students had access to their textbooks, reference books and the handouts given in class. • Only Group B had access to the multimedia tutorial, which introduces the methodology for the algorithmic identification of an emerging trend and the tools designed to help with searching.

  9. Experiment • Metrics • Precision • Recall was not used: we did not have the resources to obtain a complete list of emerging trends at the time of this experimental evaluation, nor was it our pedagogical goal to have students retrieve all trends. • Results • Three experiments were conducted. • At a 95% confidence level, Group B in both classes performed significantly better than Group A.
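The slide does not spell out how precision was computed. Under the standard definition (the fraction of the trends a student reported that were verified as genuine emerging trends), it can be sketched as follows. The function name and the sample data are hypothetical.

```python
def precision(reported, verified):
    """Fraction of reported trends that are in the verified set (standard definition)."""
    reported = set(reported)
    if not reported:
        return 0.0  # convention assumed here for an empty submission
    return len(reported & set(verified)) / len(reported)


# Hypothetical submission: four reported trends, two of them verified.
print(precision(["XML", "Java", "topic A", "topic B"], {"XML", "Java"}))  # 0.5
```

Unlike recall, this needs no complete list of true trends, only a judgment on each reported one.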

  10. Experiment • Methodology (pretest-posttest control group design; R = random assignment, O1 = pretest, X = multimedia tutorial, O2 = posttest) • Multimedia group: R O1 X O2 • Control group: R O1 O2 • Results • There was no difference on the pretest between the two randomly assigned groups. • The multimedia group showed a significant improvement in scores from the pretest to the posttest. • The posttest scores of the multimedia group were higher than those of the control group. • The group that used the multimedia tool showed a greater increase in learning than the group that only attended the lecture.

  11. Lattice • Definition • Let $P$ be a set. An order (or partial order) on $P$ is a binary relation $\leq$ on $P$ such that, for all $x, y, z \in P$: • $x \leq x$ • $x \leq y$ and $y \leq x$ imply $x = y$ • $x \leq y$ and $y \leq z$ imply $x \leq z$ • Let $P$ be a non-empty ordered set. • If $x \vee y$ and $x \wedge y$ exist for all $x, y \in P$, then $P$ is called a lattice. • If $\bigvee S$ and $\bigwedge S$ exist for all $S \subseteq P$, then $P$ is called a complete lattice.
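As a concrete instance of these definitions, the divisors of 12 ordered by divisibility form a lattice in which the meet is the greatest common divisor and the join is the least common multiple. This example is not from the slides. The sketch checks the three order axioms by brute force and confirms that every pairwise meet and join stays inside the set.

```python
from math import gcd

DIVISORS = [1, 2, 3, 4, 6, 12]


def leq(x, y):
    """x <= y in this order means: x divides y."""
    return y % x == 0


def meet(x, y):
    """Greatest lower bound under divisibility is the gcd."""
    return gcd(x, y)


def join(x, y):
    """Least upper bound under divisibility is the lcm."""
    return x * y // gcd(x, y)


def is_partial_order(elements, leq):
    """Brute-force check of reflexivity, antisymmetry and transitivity."""
    reflexive = all(leq(x, x) for x in elements)
    antisymmetric = all(
        x == y for x in elements for y in elements if leq(x, y) and leq(y, x)
    )
    transitive = all(
        leq(x, z)
        for x in elements for y in elements for z in elements
        if leq(x, y) and leq(y, z)
    )
    return reflexive and antisymmetric and transitive


def closed_under_meet_and_join(elements):
    """Every pairwise meet and join must be an element of the set."""
    return all(
        meet(x, y) in elements and join(x, y) in elements
        for x in elements for y in elements
    )


print(is_partial_order(DIVISORS, leq), closed_under_meet_and_join(DIVISORS))
print(meet(4, 6), join(4, 6))  # 2 12
```

Because the set is finite and non-empty, this lattice is also complete: the meet and join of any subset can be built up pairwise, with 12 and 1 serving for the empty subset.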

  12. Fuzzy Lattice • Definition: A fuzzy lattice is a pair $(L, \mu(x,y))$, where $L$ is a conventional lattice and $\mu: S \to [0,1]$ is a fuzzy membership function on the universe of discourse $S = \{(x,y) : x, y \in L\}$, such that $\mu(x,y) = 1$ if and only if $x \leq y$ in $L$. $\mu$ is called an inclusion measure. • Definition: Let $h: L \to \mathbb{R}$ be a function on a complete lattice $L$ that satisfies the following three properties: • $h(O) = 0$, where $O$ is the least element of $L$ • $u < w \Rightarrow h(u) < h(w)$, for $u, w \in L$ • $u \leq w \Rightarrow h(x \vee w) - h(x \vee u) \leq h(w) - h(u)$, for $x, u, w \in L$ • The inclusion measure of the lattice is then defined as $k(x,y) = h(y) / h(x \vee y)$. • Source: V. Petridis and V.G. Kaburlasos. "Clustering and Classification in Structured Data Domains Using Fuzzy Lattice Neurocomputing". IEEE Trans on Knowledge and Data Engineering, Vol 12, No 2, March, 2001.
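To make the inclusion measure $k(x,y) = h(y) / h(x \vee y)$ concrete, the sketch below uses the lattice of subsets of a finite set, where the join is set union, and picks set cardinality as $h$. That choice is an assumption made for illustration, not taken from the cited paper. Cardinality gives $h(\emptyset) = 0$, is strictly increasing under proper inclusion, and satisfies the third property because $|x \cup w| - |x \cup u| \leq |w| - |u|$ whenever $u \subseteq w$.

```python
def h(s: frozenset) -> int:
    """Assumed valuation function: the cardinality of the set."""
    return len(s)


def inclusion(x: frozenset, y: frozenset) -> float:
    """k(x, y) = h(y) / h(x join y), with union as the join on a power-set lattice."""
    joined = x | y
    if not joined:
        # Both sets are empty: x is trivially included in y (a convention assumed here).
        return 1.0
    return h(y) / h(joined)


print(inclusion(frozenset("ab"), frozenset("abc")))  # 1.0, since {a,b} is a subset of {a,b,c}
print(inclusion(frozenset("a"), frozenset("b")))     # 0.5
```

The measure equals 1 exactly when x is contained in y and falls toward 0 as x brings in more elements that y lacks, which is what lets it grade partial inclusion.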

  13. Formal Concept Analysis • A context is a triple $(G, M, I)$ where $G$ and $M$ are sets and $I \subseteq G \times M$. The elements of $G$ and $M$ are called objects and attributes respectively. For $A \subseteq G$ and $B \subseteq M$, define • $A' = \{ m \in M \mid (\forall g \in A)\ gIm \}$, • $B' = \{ g \in G \mid (\forall m \in B)\ gIm \}$. • A concept of the context $(G, M, I)$ is defined to be a pair $(A, B)$ where $A \subseteq G$, $B \subseteq M$, $A' = B$ and $B' = A$. • The set of all concepts of the context $(G, M, I)$ is denoted by $\mathfrak{B}(G, M, I)$. • For concepts $(A_1, B_1)$ and $(A_2, B_2)$ in $\mathfrak{B}(G, M, I)$ we write $(A_1, B_1) \leq (A_2, B_2)$, and say that $(A_1, B_1)$ is a subconcept of $(A_2, B_2)$, or that $(A_2, B_2)$ is a superconcept of $(A_1, B_1)$, if $A_1 \subseteq A_2$ (which is equivalent to $B_2 \subseteq B_1$). • $(\mathfrak{B}(G, M, I); \leq)$ is a complete lattice. It is known as the concept lattice of the context $(G, M, I)$.
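The two derivation operators and the set of all concepts can be computed by brute force for a small context. In the sketch below, the context (objects 1 to 5 over attributes A to F) is not stated anywhere in the slides. It is inferred from the node labels of the lattice diagram on slide 14 and should be read as an assumption. The function names are hypothetical.

```python
from itertools import combinations

# Context inferred from the node labels on slide 14 (an assumption, not given explicitly).
CONTEXT = {
    1: set("ABC"),
    2: set("ABCD"),
    3: set("CE"),
    4: set("BDF"),
    5: set("CDE"),
}
ATTRIBUTES = set("ABCDEF")


def common_attributes(objects):
    """A': the attributes shared by every object in A."""
    result = set(ATTRIBUTES)
    for g in objects:
        result &= CONTEXT[g]
    return frozenset(result)


def common_objects(attrs):
    """B': the objects that have every attribute in B."""
    return frozenset(g for g, ms in CONTEXT.items() if set(attrs) <= ms)


def concepts():
    """All (extent, intent) pairs, found by closing every subset of objects."""
    found = set()
    objs = list(CONTEXT)
    for r in range(len(objs) + 1):
        for subset in combinations(objs, r):
            intent = common_attributes(subset)  # A'
            extent = common_objects(intent)     # A''
            found.add((extent, intent))
    return found


for extent, intent in sorted(concepts(), key=lambda c: (len(c[0]), sorted(c[0]))):
    print("".join(sorted(intent)) or "-", sorted(extent))
```

Closing every subset of objects is exponential in the number of objects, so this only suits toy contexts. For this context it yields 12 concepts, the same number of nodes as listed on slide 14.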

  14. ABCDEF, CDE, (5) ABCD,(2) BDF,(4) CE,(35) CD, (25) ABC, (12) BD, (24) D, (245) B, (124) C, (1235) , (12345) Lattice vs. Association Rules

  15. Lattice vs. Graph • The power set of a graph G is lattice-ordered; the corresponding lattice order, lattice meet and lattice join are conventional set inclusion, set intersection and set union. • So the power set of G is a complete lattice. Its least element and greatest element are the empty set and the master graph.

  16. Conclusions • Apply the fuzzy concept lattice to association rule building. • Embed a time element and latent semantics into the concept lattice to model the term-document matrix. • Develop algorithms that mine the fuzzy concept lattice to detect emerging trends.
