Framework for Inferring Ongoing Activities of Workstation Users

Framework for Inferring Ongoing Activities of Workstation Users. Yifen Huang, Sophie Wang and Tom Mitchell School of Computer Science Carnegie Mellon University. Activity Example: Learned Activity Frame from TM email corpus [1448 msgs, Feb 2004]. ActivityCluster4 (105 emails)

Framework for Inferring Ongoing Activities of Workstation Users

Yifen Huang, Sophie Wang and Tom Mitchell
School of Computer Science
Carnegie Mellon University

Activity Example: Learned Activity Frame from TM email corpus [1448 msgs, Feb 2004]
  • ActivityCluster4 (105 emails)
  • Keywords: CALO, TFC, SRI, examples, heads, labeled, Leslie, HMM, contacts, email, task, estimates, zero, reschedule, baseline, Rebecca
  • PrimarySenders: Mitchell(39), Kaelbling(7), McCallum(6), Perrault(4),
  • UserActivityFraction: 105/1448=.072 of total emails
  • IntensityOfUserInvolvement: created 37% of traffic; (default 31%)
  • ExtractedNames: Leslie(23), Rebecca(21), Carlos(12), Ray(10), Stuart(9), William(9), April(9), …
  • ExtractedDates: Wed(39), Tues(33), Fri(25), Mon(23), Thurs(20),… Feb 18 (16)
  • ExtractedTimes: 5pm(24), noon(14), morning(8), 8am(7), before 5pm(7),...
  • RequestEmails: <emailA>, <emailB>, …

I need to get to DARPA by COB tomorrow a list of CALO participants who need access to the IPTO booth. It seems to me we should ask for this for any of you who is likely to be there. Could you let me know asap if you *might* be there? No big deal if you end up not going.

Thanks, --r
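
The count-based fields of the frame above can be reproduced with simple tallies. The following is a minimal sketch, not the authors' implementation: the `frame_statistics` function, the `sender` key, and the reading of IntensityOfUserInvolvement as the share of cluster emails sent by the user are all assumptions made for illustration. The toy data only mimics the slide's counts (105 cluster emails, 39 from Mitchell, 1448 in the corpus).

```python
from collections import Counter

def frame_statistics(cluster_emails, total_emails, user):
    """Summarize a cluster in the style of the activity frame fields.

    cluster_emails: list of dicts with a hypothetical "sender" key.
    total_emails: number of emails in the whole corpus.
    user: sender name of the workstation user.
    """
    senders = Counter(e["sender"] for e in cluster_emails)
    n = len(cluster_emails)
    return {
        "UserActivityFraction": n / total_emails,
        # Assumed definition: fraction of the cluster's emails sent by the user.
        "IntensityOfUserInvolvement": senders[user] / n if n else 0.0,
        "PrimarySenders": senders.most_common(4),
    }

# Toy cluster shaped like the slide's counts; the remaining senders are made up.
emails = (
    [{"sender": "Mitchell"}] * 39
    + [{"sender": "Kaelbling"}] * 7
    + [{"sender": "McCallum"}] * 6
    + [{"sender": "Perrault"}] * 4
    + [{"sender": f"other{i}"} for i in range(49)]
)
stats = frame_statistics(emails, total_emails=1448, user="Mitchell")
```

Under this reading, 39 of 105 emails gives roughly 37%, which agrees with the figure on the slide.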

Content
  • Inferring on-going activities by clustering, social network filtering and information extraction
  • Getting information from the whole workstation
  • Accepting user’s feedback
  • Future work
Inferring Activities Using Emails
  • Steps: clustering, social network filtering, information extraction
  • Output: activity clusters and descriptions

Unsupervised Learning of Activities
  • Cluster emails
    • (Text) We use a multinomial Naïve Bayes model and refine the clusters by applying the EM algorithm.
      • Represent each email by the bag of words in its subject and body
    • (Social network) Subdivide each cluster based on the graph of email co-recipients
      • Make each clique of co-recipients a subcluster
  • For each cluster, extract information from the email text and headers
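
The text-clustering step can be sketched as EM over a mixture of multinomials. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `em_cluster`, the random soft initialization, the Laplace smoothing constant `alpha`, and the fixed iteration count are all choices made here. The social-network subdivision and the information extraction step are not shown.

```python
import math
import random
from collections import Counter

def em_cluster(docs, k, iters=20, alpha=1.0, seed=0):
    """Cluster tokenized documents with a multinomial mixture fit by EM.

    docs: list of token lists (bag of words from subject and body).
    Returns (labels, resp), where resp[d][c] estimates p(cluster c | doc d).
    """
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    counts = [Counter(d) for d in docs]
    # Random soft initialization of the responsibilities.
    resp = []
    for _ in docs:
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        resp.append([r / total for r in row])
    for _ in range(iters):
        # M-step: smoothed cluster priors and per-cluster word probabilities.
        prior = [(sum(r[c] for r in resp) + alpha) / (len(docs) + k * alpha)
                 for c in range(k)]
        log_word = []
        for c in range(k):
            wc = {w: alpha for w in vocab}
            for r, cnt in zip(resp, counts):
                for w, n in cnt.items():
                    wc[w] += r[c] * n
            total = sum(wc.values())
            log_word.append({w: math.log(v / total) for w, v in wc.items()})
        # E-step: posterior over clusters per document, computed in log space.
        for d, cnt in enumerate(counts):
            logp = [math.log(prior[c]) + sum(n * log_word[c][w] for w, n in cnt.items())
                    for c in range(k)]
            m = max(logp)
            exp = [math.exp(v - m) for v in logp]
            z = sum(exp)
            resp[d] = [e / z for e in exp]
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return labels, resp

# Toy usage with made-up token lists.
docs = [["fmri", "meeting", "paper"], ["fmri", "meeting", "paper"],
        ["calo", "sri", "baseline"], ["calo", "sri", "baseline"]]
labels, resp = em_cluster(docs, k=2)
```

EM only finds a local optimum, so the result depends on the initialization; the slides later discuss keyword-based initialization as an alternative to a random start.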
Information sources: Web Activity, Directories, Calendar, Email

Example email thread:

To: [email protected] cmu.edu
Subj: fMRI meeting
We need to meet soon to discuss the paper deadline.

To: Sue @ cmu.edu
Subj: Re: fMRI meeting
Ok, I suggest Wednesday at 4pm.

To: [email protected] cmu.edu
Subj: Re: fMRI meeting
See you then. Attached is the current draft.

Extracted activity frame: fMRI paper writing
  • People: Sue, Bill
  • Document: <fileptr>
  • Meetings: Aug 24,
  • Emails: 1423, 1644,
  • Leader: Bill
  • Deadline: Jan 15
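
One way to hold a frame like the fMRI example in code is a small record type. This is a hypothetical sketch: the `ActivityFrame` class, its field types, and the `add_email` helper are assumptions for illustration and do not come from the slides, which only show the field names and values.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ActivityFrame:
    """Hypothetical container mirroring the fields of the example frame."""
    name: str
    people: List[str] = field(default_factory=list)
    document: Optional[str] = None   # file pointer, left unresolved here
    meetings: List[str] = field(default_factory=list)
    emails: List[int] = field(default_factory=list)
    leader: Optional[str] = None
    deadline: Optional[str] = None

    def add_email(self, email_id: int, participants: List[str]) -> None:
        """Record an email id and any participants not yet listed."""
        self.emails.append(email_id)
        for p in participants:
            if p not in self.people:
                self.people.append(p)

# Values copied from the example frame; which people belong to which email is assumed.
frame = ActivityFrame(name="fMRI paper writing", leader="Bill", deadline="Jan 15")
frame.add_email(1423, ["Sue", "Bill"])
frame.add_email(1644, ["Sue"])
frame.meetings.append("Aug 24")
```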


Getting Information from the Whole Workstation
  • Bag-of-words features for any query, obtained using Google desktop search
  • We can produce feature vectors for meetings, person names, and project keywords.
    • Cluster initialization using project keywords
    • Co-clustering meetings and emails
    • Mapping any query to activities
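
The idea of turning a query into a feature vector and mapping it to an activity can be sketched as follows. This is a sketch under assumptions: `search` is a hypothetical callable standing in for a desktop search tool (it is not a real API), and cosine similarity is a technique swapped in here for illustration, since the slides do not say how queries are matched to activities.

```python
import math
from collections import Counter

def bag_of_words(texts):
    """Sum lowercase word counts over the texts retrieved for a query."""
    bag = Counter()
    for t in texts:
        bag.update(t.lower().split())
    return bag

def cosine(a, b):
    """Cosine similarity between two word-count bags."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def closest_activity(query, search, activity_bags):
    """Return the activity whose word bag is most similar to the query's bag.

    search: hypothetical callable mapping a query string to a list of text
    snippets (a stand-in for a desktop search tool).
    """
    q = bag_of_words(search(query))
    return max(activity_bags, key=lambda name: cosine(q, activity_bags[name]))

# Toy usage with a made-up index in place of a real search tool.
fake_index = {"Sue": ["fMRI meeting paper deadline", "fMRI draft attached"]}

def fake_search(query):
    return fake_index.get(query, [])

activity_bags = {
    "fMRI paper writing": bag_of_words(["fMRI paper deadline draft meeting"]),
    "CALO": bag_of_words(["CALO SRI baseline reschedule"]),
}
best = closest_activity("Sue", fake_search, activity_bags)
```

The same bags could seed cluster initialization from user-provided project keywords, as the first sub-bullet suggests.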
Cluster Initialization Using Bag of Features of Project Keywords, from YH email corpus [623 msgs, 2004]
  • DI: an improved version of random initialization (0.46)
  • GI: bag of features from Google desktop search for user-provided keywords (0.44)

Speclustering Model: split specific topics from general topics
  • Figure: graphical model with labels X, W, β, S, ξ, π, G, N, M, and Activity
  • Each document has a cluster label S.
  • For each word in a document, there is a hidden variable X that indicates whether the word is generated by the cluster-specific topic S or by the general topic G.
  • Parameters can be estimated using the EM algorithm.
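
The per-word choice between the specific and the general topic can be illustrated with a two-component posterior. This is a sketch under assumptions: the slides do not define the symbols, so treating `xi` as the prior probability that a word comes from the cluster-specific topic, and the two `beta_*` dictionaries as word distributions, is a guess made for illustration.

```python
def specific_posterior(word, beta_specific, beta_general, xi):
    """Posterior probability that `word` came from the cluster-specific topic.

    beta_specific, beta_general: dicts mapping words to probabilities
    (assumed word distributions of the specific and general topics).
    xi: assumed prior probability that a word is drawn from the specific topic.
    """
    p_spec = xi * beta_specific.get(word, 0.0)
    p_gen = (1.0 - xi) * beta_general.get(word, 0.0)
    total = p_spec + p_gen
    # Fall back to the prior when the word is unseen under both topics.
    return p_spec / total if total > 0 else xi

# Toy distributions: "fmri" is typical of the cluster, "the" of general text.
beta_s = {"fmri": 0.4, "the": 0.1}
beta_g = {"fmri": 0.01, "the": 0.3}
p_fmri = specific_posterior("fmri", beta_s, beta_g, xi=0.5)
p_the = specific_posterior("the", beta_s, beta_g, xi=0.5)
```

In an EM fit, a posterior of this kind would weight each word's contribution to the specific and general topic estimates, so common words are absorbed by the general topic instead of blurring the clusters.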

EM Modification with User’s Feedback
  • Email-cluster association
    • Re-assign the posterior probability p(cluster|email) according to the user’s approval or disapproval.
  • Keyword-cluster association
    • Re-assign the keyword's probability under the cluster according to whether the user confirms or removes the keyword.
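
A possible form of the email-cluster re-assignment is to clamp the posterior after each E-step. This is a sketch under assumptions: the slides do not give the re-assigned values, so setting an approved cluster to probability 1, setting rejected clusters to 0, and renormalizing the rest are choices made here, and `apply_feedback` is a hypothetical name.

```python
def apply_feedback(posterior, approved=None, rejected=()):
    """Adjust p(cluster | email) for one email using user feedback.

    posterior: dict mapping cluster name to probability.
    approved: cluster the user confirmed (receives all the mass), or None.
    rejected: clusters the user disapproved (receive zero mass).
    """
    if approved is not None:
        return {c: 1.0 if c == approved else 0.0 for c in posterior}
    kept = {c: (0.0 if c in rejected else p) for c, p in posterior.items()}
    total = sum(kept.values())
    if total == 0:
        return dict(posterior)  # nothing left to renormalize; keep the input
    return {c: p / total for c, p in kept.items()}

# Toy usage: the user says this email does not belong to cluster "a".
before = {"a": 0.5, "b": 0.3, "c": 0.2}
after_reject = apply_feedback(before, rejected=("a",))
after_approve = apply_feedback(before, approved="b")
```

Applying the clamp inside every EM iteration keeps the feedback from being overwritten by later E-steps.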
Folder Reconstruction Accuracy Using Speclustering Algorithm
  • Figure: accuracy plotted against iteration
  • 149 feedback entries (76 keyword-cluster pairs and 73 email-cluster pairs)

Future Work
  • Jointly cluster meetings, people, files and other interesting entities.
    • Preliminary results of jointly clustering emails and meetings
      • Found good match between emails and meetings
      • Didn’t visibly improve cluster quality
  • Allow richer user feedback.
  • Move from bag of features to structural data.