Project Part 2

Project Part 2 LING 572 Fei Xia 1/26/06

NLP Packages
• FST: Carmel, AT&T toolkit
• TBL: fnTBL
• MaxEnt:
• DT: C4.5
• Boosting: AdaBoost
• LM: SRI LM
• MT: GIZA++, Pharaoh, …

Main steps
• Download and compile the package, and test the code with given examples.
  • License, citation
  • Compilers, libraries, operating system
• Create your own test data, write a few wrappers/converters, and test the code.
  • Fix bugs
• Understand the main algorithm of the package:
  • Read README files, tutorials, and related papers
  • Check the source code.
• Modify and improve the package
• Run experiments

Using fnTBL
• Download and compile the package, and test the code: (< 1 hour)
• Create your own test data, write a few wrappers/converters, and test the code: (about 6 hrs, my time)
• Understand the main algorithm of the package: (?? hrs)
• Modify and improve the package: (?? hrs)
• Run experiments: (computer time)
  • 12 experiments

Main tasks
• Understand the code:
  • Core algorithm: fnTBL-1.1/src
  • POS tagger: perl_code/pos-train.prl and pos-apply.prl
  • A wrapper: perl_code/build_TBL_tagger1.pl
• Modify the code:
  • Here you don’t need to change the core algorithm.
  • Add a new way of treating unknown words.
→ In Report2, explain the algorithms and your modification.
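To make "a new way of treating unknown words" concrete, here is a minimal sketch of one possible simple treatment. It is an assumption for illustration only, not the treatment the assignment prescribes and not how fnTBL handles unknown words: known words get their most frequent training tag, and unknown words get a Penn Treebank tag guessed from surface cues. The function names `guess_unknown_tag` and `initial_tags` and the `lexicon` dictionary are hypothetical.

```python
def guess_unknown_tag(word):
    """Hypothetical heuristic: guess a Penn Treebank tag from surface cues."""
    if any(ch.isdigit() for ch in word):
        return "CD"    # contains a digit: treat as a number
    if word[:1].isupper():
        return "NNP"   # capitalized: treat as a proper noun
    if word.endswith("ing"):
        return "VBG"   # gerund / present participle
    if word.endswith("ly"):
        return "RB"    # adverb
    if word.endswith("s"):
        return "NNS"   # plural noun
    return "NN"        # default: singular noun


def initial_tags(words, lexicon):
    """Tag known words from the lexicon and guess a tag for the rest.

    lexicon is assumed to map each training word to its most frequent tag.
    """
    return [lexicon.get(w) or guess_unknown_tag(w) for w in words]


if __name__ == "__main__":
    toy_lexicon = {"the": "DT", "dog": "NN"}  # made-up example lexicon
    print(initial_tags(["the", "dog", "barks", "loudly"], toy_lexicon))
```

A heuristic like this only sets the starting tags; the learned rules would still be free to correct them afterwards.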

Main tasks (cont)
• Run the code with different settings
  • Corpus size (in sentences): 1K, 5K, 10K, 40K
  • Feature templates: all the types or a subset
  • Treatment of unknown words
→ Report1
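The settings above combine into 4 corpus sizes × 3 tagger variants = 12 experiments. A minimal sketch of enumerating them, assuming the cell names a11 … a43 used for Report1 and the result/ directory; the short tagger labels and the function name `experiment_grid` are illustrative, not part of the provided code:

```python
from itertools import product

# Settings taken from the slides; the short labels are illustrative.
CORPUS_SIZES = ["1K", "5K", "10K", "40K"]      # rows of Report1
TAGGERS = ["tagger1", "tagger2", "tagger3"]    # columns of Report1


def experiment_grid():
    """Map each cell name (a11 ... a43) to its (corpus size, tagger) pair."""
    grid = {}
    rows = enumerate(CORPUS_SIZES, start=1)
    cols = enumerate(TAGGERS, start=1)
    for (i, size), (j, tagger) in product(rows, cols):
        grid[f"a{i}{j}"] = (size, tagger)
    return grid


if __name__ == "__main__":
    for cell, (size, tagger) in experiment_grid().items():
        print(f"result/{cell}/  <-  {size} sentences, {tagger}")
```

Driving the runs from one table like this keeps the output directories aligned with the cells of the report.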

Report1

# of sents   standard case   fewer feature types   w/ simple treatment for unknown words
             (tagger1.pl)    (tagger2.pl)          (tagger3.pl)
=========================================================================================
1K           a11             a12                   a13
5K           a21             a22                   a23
10K          a31             a32                   a33
40K          a41             a42                   a43

Replace each cell with a(b, c, d):
• a: tagging accuracy
• b: # of lexical rules
• c: # of context rules
• d: running time
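As a sketch of how one cell could be filled in: tagging accuracy is the fraction of tokens whose predicted tag matches the gold tag, and the cell follows the a(b, c, d) layout. The function names, the percentage format, the seconds unit, and the numbers in the example are assumptions for illustration, not measured results.

```python
def tagging_accuracy(gold_tags, predicted_tags):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("gold and predicted sequences differ in length")
    if not gold_tags:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(g == p for g, p in zip(gold_tags, predicted_tags))
    return correct / len(gold_tags)


def format_cell(accuracy, n_lexical_rules, n_context_rules, seconds):
    """Render one Report1 cell as a(b, c, d)."""
    return f"{accuracy:.2%} ({n_lexical_rules}, {n_context_rules}, {seconds:.0f}s)"


if __name__ == "__main__":
    gold = ["DT", "NN", "VB", "NN"]   # made-up toy data
    pred = ["DT", "NN", "NN", "NN"]
    acc = tagging_accuracy(gold, pred)
    print(format_cell(acc, 10, 20, 30))  # rule counts and time are placeholders
```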

Files for the project
• Files given to you:
  • fnTBL-1.1.linux.tar.gz
  • params/
  • data/
  • perl_code/
• Files that will be produced by you:
  • new_params/: feature templates
  • new_perl_code/: build_TBL_tagger3.pl, pos-train3.prl and pos-apply3.prl
  • report/: Report1 and Report2
  • result/: a11/, a12/, …, a43/
