MEeting Schedule Extractor (MES)

MEeting Schedule Extractor (MES) November 25, 2010 Kyoungryol Kim Semantic Web Research Center, KAIST

Table of Contents • Introduction • Related Works • MES System • Specific Problems • Meeting Location • Start Time, End Time • Attendee • Schedule

Introduction • Schedule management has become one of the most important tasks in daily life, so there is strong demand for automatic extraction of schedule information. • Schedule information can be found in 2 types of emails: • Meeting announcement emails: usually follow a standard format and include all information about the meeting. • Appointment emails (conversation-style): use spoken language, may contain incomplete or uncertain information, and the schedule can change frequently. • Example (meeting announcement): We will have a meeting as follows : Time : Nov. 25, 2010 PM 7:30-9:00 Location : KAIST CS B/D #2441 ..... • Example (appointment email): P1 : ...Are you available on Monday?.. P2 : .. Alright, how about around 3 PM? P1 : .. Dear professor, then I will be right there at 3.. P2 : ...By the way, can you come together with the TA?... • * Referenced from [DH Choi 2010]

Introduction • A meeting announcement consists of 2 parts: • itemized text (66%) • non-itemized text (34%): natural language text • 96% of announcements contain mixed content (itemized + non-itemized). • To get accurate information, we should look at both parts. (* The statistics are based on our corpus.) • Example meeting announcement, itemized part: • Time : Nov. 25, 2010 PM 7:30 • Attendee : Prof. Choi, Henry • ... • Example meeting announcement, non-itemized part: We will have a meeting on Thursday at CS B/D #2441. ...

Introduction • We assume that information can already be extracted from itemized text, and that each sentence can be classified as itemized or non-itemized. • In this research, to gather more information, we focus only on the information buried in non-itemized text. • Figure: a meeting announcement with an itemized part (Time : .... Location : ...) and a non-itemized part (We will have a meeting on Thursday at CS B/D #2441....) is turned into an extracted schedule.
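
The slides assume a sentence type classifier without describing it. As a rough illustration only, the split between itemized and non-itemized lines could be approximated with a simple pattern. The rule below (optional bullet, a short label without digits, then a colon) and the name `sentence_type` are assumptions made for this sketch, not the classifier used in MES.

```python
import re

# Hypothetical heuristic: an itemized line is an optional bullet, a short
# label containing no digits, then a colon followed by some value.
# Forbidding digits in the label keeps clock times such as "7:30" in a
# normal sentence from being mistaken for a "label : value" item.
ITEMIZED = re.compile(r"^\s*(?:[•\-*]\s*)?[^\d\s:][^\d:]{0,19}\s*:\s*\S")


def sentence_type(line: str) -> str:
    """Return 'itemized' or 'non-itemized' for one line of an announcement."""
    return "itemized" if ITEMIZED.match(line) else "non-itemized"


if __name__ == "__main__":
    for line in ["Time : Nov. 25, 2010 PM 7:30",
                 "We will have a meeting on Thursday at CS B/D #2441."]:
        print(sentence_type(line), "|", line)
```

A rule like this would only be a baseline; a trained classifier could use the same pattern as one feature among others.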

Introduction: Research Goal • Domain: Korean meeting announcements • Task: schedule information extraction for the following information types: • Start time, End time • Meeting Location • Attendee • Input: meeting announcement email • Output: extracted information • Possible application: Meeting Announcement Email → Extracted Schedule → Internet-based calendar (Google Calendar, iCalendar)
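
To make the possible application concrete, here is a minimal sketch of how an extracted schedule might be serialized as an iCalendar event. The `ExtractedSchedule` class, the `to_vevent` function, and the default summary text are hypothetical names introduced for illustration; the slides do not specify an output format beyond the three information types.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class ExtractedSchedule:
    """Hypothetical container for the three target information types."""
    start: datetime
    end: datetime
    location: str
    attendees: List[str] = field(default_factory=list)


def escape_text(value: str) -> str:
    # iCalendar TEXT values escape backslash, semicolon, comma and newline.
    return (value.replace("\\", "\\\\").replace(";", "\\;")
                 .replace(",", "\\,").replace("\n", "\\n"))


def to_vevent(s: ExtractedSchedule, uid: str, stamp_utc: datetime,
              summary: str = "Meeting") -> str:
    """Serialize one schedule as a VEVENT component (CRLF line endings)."""
    local = "%Y%m%dT%H%M%S"  # floating local time, no time zone attached
    lines = [
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{stamp_utc.strftime(local)}Z",
        f"DTSTART:{s.start.strftime(local)}",
        f"DTEND:{s.end.strftime(local)}",
        f"SUMMARY:{escape_text(summary)}",
        f"LOCATION:{escape_text(s.location)}",
    ]
    if s.attendees:
        # The ATTENDEE property needs a calendar address, which the
        # extractor does not provide, so names are put in DESCRIPTION.
        names = ", ".join(s.attendees)
        lines.append(f"DESCRIPTION:{escape_text('Attendees: ' + names)}")
    lines.append("END:VEVENT")
    return "\r\n".join(lines) + "\r\n"
```

To import the result into a calendar, the event would still need to be wrapped in a `VCALENDAR` component with `VERSION` and `PRODID` properties.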

Related Works • CMRadar [Modi et al. 2005] • A personal assistant agent for calendar management, covering everything from natural language processing of incoming scheduling-related emails to making autonomous scheduling decisions. • They focused on the design of the system: • Template data structure for communication between the components • Modular design • Limitations • They only followed existing research on applying state-of-the-art NLP techniques. • They defined parsing rules specific to English. • They did not consider meeting announcement emails.

Related Works • SIES (Sogang Information Extraction System) [Min et al. 2005] • Corpus: Korean scheduling emails, 245 emails (23.5 sentences on average) • Target information types: Attendee, Location, Time, Date • Features • Context feature: lexico-semantic patterns → to avoid the data sparseness problem • Sentence and document features: position, number of occurrences, surrounding words • Limitations • Overall performance was low except for time and date, and the corpus was too small. • Because time and date were not normalized, the output cannot be integrated with a calendar system.
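
The normalization gap noted above can be illustrated with a small sketch that turns a surface time expression into calendar-ready values. It handles only the English format used in the example announcement (`Nov. 25, 2010 PM 7:30-9:00`) and assumes the end time shares the AM/PM marker of the start time. The pattern and the `normalize` function are illustrative assumptions; Korean expressions would need their own rules.

```python
import re
from datetime import datetime

MONTHS = {name: i for i, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# Matches e.g. "Nov. 25, 2010 PM 7:30-9:00" (the end time is optional).
PATTERN = re.compile(
    r"(?P<mon>[A-Z][a-z]{2})\.?\s+(?P<day>\d{1,2}),\s*(?P<year>\d{4})\s+"
    r"(?P<ampm>AM|PM)\s+(?P<h1>\d{1,2}):(?P<m1>\d{2})"
    r"(?:\s*-\s*(?P<h2>\d{1,2}):(?P<m2>\d{2}))?")


def to_24h(hour: int, ampm: str) -> int:
    """Convert a 12-hour clock value to 24-hour (12 AM -> 0, 12 PM -> 12)."""
    return hour % 12 + (12 if ampm == "PM" else 0)


def normalize(text: str):
    """Return (start, end) datetimes, end may be None; None if no match."""
    m = PATTERN.search(text)
    if m is None or m.group("mon") not in MONTHS:
        return None
    year, month, day = int(m.group("year")), MONTHS[m.group("mon")], int(m.group("day"))
    ampm = m.group("ampm")
    start = datetime(year, month, day,
                     to_24h(int(m.group("h1")), ampm), int(m.group("m1")))
    end = None
    if m.group("h2") is not None:
        # Assumption: the end time uses the same AM/PM marker as the start.
        end = datetime(year, month, day,
                       to_24h(int(m.group("h2")), ampm), int(m.group("m2")))
    return start, end
```

Once times are in this form, start and end values can be compared, sorted, and passed to a calendar system directly.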

MES System • Input: Meeting Announcement (Email) • Sentence Type Classifier, which routes text to one of two modules: • itemized text processing module • non-itemized text processing module: NER (Time / Location / Person), followed by the Start/End Time Classifier, the Meeting Location Classifier, and the Attendee Classifier • Output: Extracted Information: - Start time / End time - Meeting Location - Attendee

Classifier : Meeting Location • Input: location-tagged document • Output: meeting location • Features: • Start Time • Surrounding words • TODO: • Apply position features for sentences and documents • Light syntactic pattern features (e.g. lexico-syntactic patterns) • Figure: a training system (Corpus Tagging, Training Corpus, Classification Training, Classification Model) and a classification system (Input, Named Entity Recognition, Meeting Location Classification, Output)
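
As a sketch of the surrounding-word and position features mentioned above, the function below builds a feature dictionary for one location candidate. The window size, the feature names, the `<PAD>` marker, and the function name `location_features` are assumptions for illustration. The "Start Time" feature is not defined on the slide and is left out here.

```python
from typing import Dict, List, Union

Feature = Union[str, float]


def location_features(sentences: List[List[str]], sent_idx: int,
                      tok_idx: int, window: int = 2) -> Dict[str, Feature]:
    """Features for the location candidate at sentences[sent_idx][tok_idx]."""
    tokens = sentences[sent_idx]
    feats: Dict[str, Feature] = {}
    # Surrounding words inside a fixed window around the candidate.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = tok_idx + off
        inside = 0 <= j < len(tokens)
        feats[f"w[{off:+d}]"] = tokens[j].lower() if inside else "<PAD>"
    # Relative position of the sentence in the document (0.0 = first).
    feats["sent_pos"] = sent_idx / max(len(sentences) - 1, 1)
    # Relative position of the candidate inside its sentence.
    feats["tok_pos"] = tok_idx / max(len(tokens) - 1, 1)
    return feats


if __name__ == "__main__":
    doc = [["We", "will", "have", "a", "meeting", "on", "Thursday",
            "at", "CS_B/D_#2441", "."]]
    print(location_features(doc, 0, 8))
```

A dictionary of this shape can be fed to most off-the-shelf classifiers after one-hot encoding of the string-valued features.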

Other Components • NER • Named entity recognizer specialized for meeting announcements • 3 target types: Time / Location / Person • Classifiers • Start/End time classifier for time-type NEs (design to be decided later) • Attendee classifier for person-type NEs (design to be decided later)

Schedule • ~December 3, 2010 • Study and summarize more than 5 related research papers • Apply position features for sentences and documents • Light syntactic pattern features (e.g. lexico-syntactic patterns) • Goal: F1 > 85% (Meeting Location classifier) • ~December 31, 2010 • Study and summarize time extraction papers • Design a classifier for start/end time information • Construct the system (goal: F1 > 85%)
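
The F1 targets above combine precision and recall. A minimal sketch of the computation is given below, assuming gold and predicted meeting locations are represented as sets of (document id, value) pairs. That representation and the name `prf1` are assumptions for this example, not the evaluation protocol of MES.

```python
from typing import Set, Tuple

Item = Tuple[str, str]  # (document id, extracted value)


def prf1(gold: Set[Item], predicted: Set[Item]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact-match scoring."""
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    gold = {("d1", "A"), ("d2", "B"), ("d3", "C"), ("d4", "D")}
    pred = {("d1", "A"), ("d2", "B"), ("d3", "X")}
    print(prf1(gold, pred))
```

Under exact-match scoring, reaching F1 > 85% requires both precision and recall to be high, since F1 is their harmonic mean.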
