
DOCUMENT SUMMARIZATION




  1. DOCUMENT SUMMARIZATION. Shinta P.

  2. Introduction • What is the first thing you read in a novel? Provide summaries of retrieved web pages that are relevant to the user's query. • An automatic summarizer is needed. Human-written summaries are expensive.

  3. Information: Headline news. SIGIR'99 Tutorial Automated Text Summarization, August 15, 1999, Berkeley, CA

  4. TV guides: decision making

  5. Abstracts of papers: saving time

  6. Graphical maps: orientation

  7. Example: Write a summary for each of the following readers: 1. Event organizer 2. Teenage fans

  8. TRIBUNNEWS.COM, BANDA ACEH - The nine members of the girl band Cherrybelle performed at the Hermes Palace Hotel, Banda Aceh, on Tuesday night (30/4/2013). In last night's concert they performed songs from both their first album and their latest album, Diam-diam Suka. When they opened with their first song, Best Friend Forever, Cherrybelle's fans greeted them hysterically. On stage, the nine cute young women were seen wearing different outfits. Unlike their appearances in other regions, they left their skimpy outfits behind. Instead, long-sleeved blouses in muted colors covered their bodies. What about their hair? Cherrybelle appeared plain, without any accessories in their hair. Only ponytails and a few hair clips were visible. Nor were there headscarves or hijabs on their heads, as is customary for women in Aceh. Only during a break in preparations before last night's performance did the Cherrybelle members set aside time to answer questions from Serambinews.com (Tribunnews.com Network). Only during the interview and photo session did they put on white headscarves. In the brief interview they said that Banda Aceh was the first city they had visited on this roadshow. They were drawn to Banda Aceh from the moment they arrived at Sultan Iskandar Muda Airport, located in Blangbintang, Aceh Besar. The Aceh concert is part of the Cherrybelle Beat Indonesia roadshow, which covers 33 provinces in 31 days. "We were already taken with Banda Aceh from the moment we set foot in the airport. The airport is really nice, very different from airports in other cities. Here the roof is shaped like a mosque dome, really beautiful," they said in unison during an exclusive interview with Serambinews.com.

  9. Computational Approach: Basics • Bottom-Up: • I’m dead curious: what’s in the text? • The user wants all the important information. • The system needs the important data to drive its search. • Top-Down: • I know what I want! Don’t confuse me with drivel! • The user wants only certain types of information. • The system needs specific criteria of interest, used to focus the search.

  10. Top-Down: Info. Extraction (IE) • IE task: Given a form and a text, find all the information relevant to each slot of the form and fill it in. • Summ-IE task: Given a query, select the best form, fill it in, and generate the contents. • Questions: • 1. IE works only for very particular forms; can it scale up? • 2. What about info that doesn’t fit into any form: is this a generic limitation of IE? [Figure: placeholder text beside a form with labeled slots]

  11. Bottom-Up: Info. Retrieval (IR) • IR task: Given a query, find the relevant document(s) from a large set of documents. • Summ-IR task: Given a query, find the relevant passage(s) from a set of passages (i.e., from one or more documents). • Questions: • 1. IR techniques work on large volumes of data; can they scale down accurately enough? • 2. IR works on words; do abstracts require abstract representations? [Figure: block of placeholder text]
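
The Summ-IR task on slide 11 can be illustrated with a small word-level ranking sketch. This is only one possible reading, not the tutorial's method: it assumes bag-of-words cosine similarity as the relevance measure, and the names `bag_of_words`, `cosine`, and `rank_passages`, as well as the sample passages, are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_passages(query, passages):
    """Return (score, passage) pairs, most similar to the query first."""
    q = bag_of_words(query)
    scored = [(cosine(q, bag_of_words(p)), p) for p in passages]
    return sorted(scored, key=lambda pair: -pair[0])


# Hypothetical passages, loosely based on the news story on slide 8.
passages = [
    "The band played songs from their first album.",
    "The airport roof is shaped like a mosque dome.",
    "Fans greeted the band at the hotel.",
]
best_score, best_passage = rank_passages("airport roof", passages)[0]
print(best_passage)
```

Because the comparison works purely on surface words, a query phrased with synonyms would score zero, which is the concern raised in question 2 on the slide.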

  12. Paradigms: IE vs. IR • IR: • Approach: operate at word level; use word frequency, collocation counts, etc. • Need: large amounts of text. • Strengths: robust; good for query-oriented summaries. • Weaknesses: lower quality; inability to manipulate information at abstract levels. • IE: • Approach: try to ‘understand’ text; transform content into ‘deeper’ notation; then manipulate that. • Need: rules for text analysis and manipulation, at all levels. • Strengths: higher quality; supports abstracting. • Weaknesses: speed; still needs to scale up to robust open-domain summarization.

  13. A Summarization Machine [Diagram labels] • Input: DOC, MULTIDOCS, QUERY • Length: Headline, Very Brief, Brief, Long; 10%, 50%, 100% • Output: EXTRACTS, ABSTRACTS, ? • Type: Extract / Abstract; Indicative / Informative; Generic / Query-oriented; Just the news / Background • Intermediate representations: CASE FRAMES, TEMPLATES, CORE CONCEPTS, CORE EVENTS, RELATIONSHIPS, CLAUSE FRAGMENTS, INDEX TERMS

  14. The Modules of the Summarization Machine [Diagram labels] • Modules: EXTRACTION, INTERPRETATION, GENERATION, FILTERING • Data: MULTIDOC EXTRACTS, ABSTRACTS, DOC EXTRACTS, EXTRACTS, ? • Intermediate representations: CASE FRAMES, TEMPLATES, CORE CONCEPTS, CORE EVENTS, RELATIONSHIPS, CLAUSE FRAGMENTS, INDEX TERMS

  15. Typical 3 Stages of Summarization 1. Topic Identification: find/extract the most important material 2. Topic Interpretation: compress it 3. Summary Generation: say it in your own words …as easy as that!

  16. Some Definitions • Language: • Syntax = grammar, sentence structure. “sleep colorless furiously ideas green”: no syntax • Semantics = meaning. “colorless green ideas sleep furiously”: no semantics • Evaluation: • Recall = of the things you should have found/done, how many did you actually find/do? • Precision = of those you actually found/did, how many were correct?
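
The two evaluation measures on slide 16 can be made concrete with a short sketch. It assumes a summary is evaluated as a set of selected sentence indices compared against a human-chosen reference set; the function name `precision_recall` and the example data are hypothetical.

```python
def precision_recall(selected, reference):
    """Return (precision, recall) of the selected items against a reference set."""
    selected, reference = set(selected), set(reference)
    correct = selected & reference  # items that were both found and wanted
    precision = len(correct) / len(selected) if selected else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical data: sentence indices picked by a system and by a human.
system_extract = [0, 2, 5, 7]
human_extract = [0, 1, 2]
p, r = precision_recall(system_extract, human_extract)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four selected sentences are correct (precision 0.5), and two of the three reference sentences were found (recall about 0.67).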

  17. Overview of Extraction Methods • General method: score each sentence; combine scores; choose best sentence(s) • Scoring techniques: • Position in the text: lead method; optimal position policy; title/heading method • Cue phrases in sentences • Word frequencies throughout the text • Cohesion: links among words; word co-occurrence; coreference; lexical chains • Discourse structure of the text • Information Extraction: parsing and analysis
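
The general method on slide 17 (score each sentence, combine scores, choose the best) might be sketched as follows. This is a minimal illustration under stated assumptions, not the tutorial's implementation: it uses only three of the listed techniques (lead position, cue phrases, word frequency), combines them as a weighted sum, and the stopword list, cue phrases, weights, and function names are all hypothetical choices.

```python
import re
from collections import Counter

# Assumed, deliberately tiny resources; a real system would use larger lists.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "are"}
CUE_PHRASES = ("in conclusion", "in summary", "importantly")


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercased content words of a sentence."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def score_sentences(text, w_pos=1.0, w_cue=1.0, w_freq=1.0):
    """Return (score, index, sentence) triples using a weighted sum of three scores."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))
    top = max(freq.values(), default=1)
    scored = []
    for i, s in enumerate(sentences):
        words = tokenize(s)
        position = 1.0 / (i + 1)  # lead method: earlier sentences score higher
        cue = 1.0 if any(c in s.lower() for c in CUE_PHRASES) else 0.0
        # Average document-wide frequency of the sentence's words, scaled to [0, 1].
        frequency = sum(freq[w] for w in words) / (len(words) * top) if words else 0.0
        scored.append((w_pos * position + w_cue * cue + w_freq * frequency, i, s))
    return scored


def extract(text, n=1):
    """Pick the n best-scoring sentences and return them in document order."""
    best = sorted(score_sentences(text), key=lambda t: -t[0])[:n]
    return [s for _, _, s in sorted(best, key=lambda t: t[1])]
```

For example, `extract(document, n=2)` would return the two highest-scoring sentences in their original order. The equal weights are arbitrary; how to set them is exactly the "combine scores" step the slide leaves open.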
