
The North American Computational Linguistics Olympiad (NACLO)

Modified version of the presentation given by Lori Levin and Dragomir Radev in June 2008.


Presentation Transcript


  1. The North American Computational Linguistics Olympiad (NACLO) Modified version of the presentation given by Lori Levin and Dragomir Radev in June 2008

  2. Outline • Background and History • Pedagogical goals and high school outreach • Audience participation – sample problems • Organization • Making a NACLO problem book • Running the contest and scoring • Preparing for the ILO • Outcomes • Plans • Discussion of problem ideas

  3. Background and History

  4. What is NACLO • A high school contest in linguistics, computational linguistics, and language technologies. • No prerequisites • Does not require knowledge of specific human languages, advanced math, or computer programming • http://www.naclo.cs.cmu.edu

  5. Goal • To increase participation and diversity in linguistics, language technologies, and other language-related careers by introducing linguistics and computational linguistics to high school students. • Easy problems – everyone has fun; everyone learns something. Students will be encouraged to study linguistics, languages, and computational linguistics. • Hard problems – talent search for our next generation of colleagues. • Unlike the ILO and other national LOs, additional focus on computational/formal problems

  6. Sponsors • US National Science Foundation • Google • NAACL • Cambridge University Press • M*Modal • Vivisimo • Just Systems Evans Research • Leonard Gelfand center for outreach, CMU • Powerset • Donations from individuals

  7. History of Linguistics Olympiads • Similar to the 11 other “science olympiads” – some of which are extremely popular, e.g., in math (90 countries), physics, informatics (programming), biology, philosophy, etc. • Moscow: 1960’s -- • Bulgaria: 1980’s -- • A number of other countries • Linguistic Challenge • Thomas Payne, Eugene, Oregon, 1998-2000 • International Linguistics Olympiad: • 2009: the 7th ILO

  8. Pedagogical Goals and High School Outreach

  9. Contacting high schools • Different procedure in each city • Materials available

  10. What we want high school students to learn • Linguistics: • English has rules that you are not aware of. • Fan-blooming-tastic • *Fantas-bloomin-tic • There is a methodology for discovering the rules. • You can use the methodology to discover rules in languages that you don’t speak. • Computer Science • Computer science is no more about computers than astronomy is about telescopes (attributed to Dijkstra). • Algorithmic thinking • Abstraction in representing a problem space • Search and reduction of the problem space • Evaluation of the solution • Etc.

  11. Fish Story Aymara is a South American language spoken by more than 2 million people in the area around Lake Titicaca, which, at 12,507 feet above sea level, is the highest navigable lake in the world. Among the speakers of Aymara are the Uros, a fishing people who live on artificial islands, woven from reeds, that float on the surface of Lake Titicaca.

  12. [Pictures a-d of the four catches are not reproduced in this transcript.] Below, four fishermen describe their catch. Who caught what? ___ 1. “Mä challwalla mä hach’a challwampiwa challwataxa.” ___ 2. “Paya hach’a challwawa challwataxa.” ___ 3. “Mä hach’a challwa kimsa challwallampiwa challwataxa.” ___ 4. “Mä hach’a challwawa challwataxa.” Also, watch out! One of the fishermen is lying.

  13. [Pictures a-d of the four catches are not reproduced in this transcript.] Below, four fishermen describe their catch. Who caught what? ___ 1. “Mä challwa+lla mä hach’a challwa+mpi+wa challwataxa.” ___ 2. “Paya hach’a challwa+wa challwataxa.” ___ 3. “Mä hach’a challwa kimsa challwa+lla+mpi+wa challwataxa.” ___ 4. “Mä hach’a challwa+wa challwataxa.” Also, watch out! One of the fishermen is lying.

  14. Fish Story Solution • challwataxa: I caught • -wa: accusative • -mpi: and • mä: one • hach’a: big • -lla: small • kimsa: three • 1, d • 2, b (lying about size) • 3, c • 4, a
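The decoding step behind this solution can be sketched in a few lines of Python. This is only an illustration, not part of the original presentation: the function name `gloss` and the data layout are hypothetical, and the glosses for "paya" (two) and "challwa" (fish) are inferred from the puzzle rather than listed on the slide. The sketch reads the morpheme-segmented sentences from slide 13 and turns each one into a list of (count, size) claims, which a solver would then compare against the pictures.

```python
import unicodedata

# Glosses from the solution slide. "paya" (two) and "challwa" (fish) are not
# listed there; they are inferred from the puzzle and are assumptions here.
NUMBERS = {"mä": 1, "paya": 2, "kimsa": 3}
BIG = "hach'a"
NOUN = "challwa"
VERB = "challwataxa"  # "I caught"

# Segmented sentences as given on slide 13 ("+" marks a suffix boundary).
SENTENCES = [
    "Mä challwa+lla mä hach’a challwa+mpi+wa challwataxa.",
    "Paya hach’a challwa+wa challwataxa.",
    "Mä hach’a challwa kimsa challwa+lla+mpi+wa challwataxa.",
    "Mä hach’a challwa+wa challwataxa.",
]


def gloss(sentence):
    """Return the claimed catch as a list of (count, size) pairs."""
    text = unicodedata.normalize("NFC", sentence).lower()
    words = text.replace("’", "'").rstrip(".").split()
    catch, count, size = [], None, None
    for word in words:
        stem, *suffixes = word.split("+")
        if stem == VERB:
            continue  # the verb carries no information about the catch
        if stem in NUMBERS:
            count = NUMBERS[stem]
        elif stem == BIG:
            size = "big"
        elif stem == NOUN:
            if "lla" in suffixes:  # -lla marks "small"
                size = "small"
            catch.append((count, size))
            count, size = None, None  # start a fresh noun phrase
        else:
            raise ValueError(f"unknown word: {word}")
    return catch


for number, sentence in enumerate(SENTENCES, start=1):
    print(number, gloss(sentence))
```

The suffixes -mpi and -wa are split off but otherwise ignored, since they do not change how many fish of which size are claimed. Deciding which fisherman is lying still requires the pictures, which this transcript does not include.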

  15. Organization

  16. Organization • Co-chairs: Lori Levin and Thomas Payne • Program chair and head coach: Dragomir Radev • Sponsorship chair: James Pustejovsky • Web: Eugene Fink, Ida Mayer, Justin Brown • High school liaison: Amy Troyani • Publicity and outreach chair: • Local chairs of host sites • Many volunteers at each site

  17. Making a NACLO Problem Book

  18. Problem committee • 2007: Emily Bender, John Blatz, Ivan Derzhanski, Jason Eisner, Eugene Fink, Boris Iomdin, Mahesh Joshi, Anagha Kulkarni, Will Lewis, Patrick Littell, Ruslan Mitkov, Thomas Payne, James Pustejovsky, Roy Tromble, and Dragomir Radev (chair). • 2008: Emily Bender, Eric Breck, Lauren Collister, Eugene Fink, Adam Hesterberg, Joshua Katz, Stacy Kurnikova, Lori Levin, Will Lewis, Patrick Littell, David Mortensen, Barbara Partee, Thomas Payne, James Pustejovsky, Richard Sproat, Todor Tchervenkov, and Dragomir Radev (chair). • 2009: + Harold Somers, Xiaojin Zhu, Bozhidar Bozhanov, Kate Spriggs, others…

  19. Problem submissions • Call for problems is issued several months before the contest. • Anyone may submit a problem. • 2007 (more than 30 submissions) • Some were reserved for practice • 2008 (more than 40 submissions) • 2009 (15 so far + 15 ideas)

  20. Making a problem • Submit an idea to the problem committee • Draft the problem • Review by the problem committee • Test on high school and college students • Refine and re-submit

  21. Running a contest and scoring

  22. Contest procedures • Email contest book to the host sites • Each site prints and copies • On the contest day: • Different start time for each time zone • Questions can be answered only by the jury by email

  23. Scoring • One person is in charge of scoring each problem. • Alone or with a team of scorers • Scoring rubric • Components of the solution and how many points to assign to each component • Practice (correct solution): about 40% • Theory (explanation): about 60%
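As a rough illustration of such a rubric, the sketch below splits a hypothetical 10-point problem into practice and theory components in the 40/60 proportion mentioned on the slide. The component names, the point values, and the `score` function are invented for this example; the actual NACLO rubrics are not given in the presentation.

```python
# Hypothetical rubric for one 10-point problem: component -> (category, max points).
# The 4 + 6 split mirrors the "about 40% practice, about 60% theory" guideline.
RUBRIC = {
    "correct matching": ("practice", 4),
    "glossary of morphemes": ("theory", 4),
    "word-order explanation": ("theory", 2),
}


def score(awarded):
    """Sum awarded points per category, capping each component at its maximum."""
    totals = {"practice": 0, "theory": 0}
    for component, points in awarded.items():
        category, maximum = RUBRIC[component]
        totals[category] += min(points, maximum)
    return totals


# A contestant with the right answers but only a partial explanation.
result = score({
    "correct matching": 4,
    "glossary of morphemes": 3,
    "word-order explanation": 0,
})
print(result, "total:", sum(result.values()))
```

Keeping the rubric as data makes it easy for a team of scorers to apply the same point values consistently.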

  24. Preparing for the ILO

  25. Preparation for the ILO • Online and offline training • Live meetings • Lectures

  26. Outcomes

  27. Diversity • About half of the participants in NACLO were girls in 2007 and 2008. In 2007, 25 out of the top 50 students were female. • The two US teams that went to the ILO in 2007 included three girls, out of eight total team members (two teams of four). The 2008 teams include only one girl. • Some random statistics: (a) of the top 20 students in 2008, 14 are from public schools, (b) 26 states, 3 Canadian provinces, and the District of Columbia were represented in 2008 • Canada participated for the first time in 2008 (about 20 students from Toronto, a handful from Ottawa and one from Vancouver). Two students did really well at the 2008 Open (one ranked second and two tied for 13th) but were not in the top 20 at the Invitational.

  28. Outcomes and lessons learned • Tremendous interest among students • A number of clubs started • A large amount of positive feedback • Australian contest • Success at the ILO • A huge team effort

  29. NACLO 2007 • 195 participants • 3 university sites • CMU • Cornell • Brandeis • 20 high school sites

  30. 2007 winners

  31. ILO 2007 • Held in Russia (St. Petersburg) • Two rounds: team and individual • Problems • Turkish/Tatar • Braille • Ndom (Papua New Guinea) • Movima (Bolivia) • Georgian (Caucasus) • Hawaiian

  32. ILO 2007 • Team contest (tied for first place): • USA Team 2: Rebecca Jacobs, Anna Tchetchetkine, Josh Falk, and Michael Gottlieb • Rebecca and Josh are on the 2008 team • Individual contest (highest score): • Adam Hesterberg • Now at Princeton

  33. NACLO 2008 • 763 participants • Top 115 were invited to the second round • 13 university sites • CMU, Cornell, Brandeis • Penn, Columbia, Michigan, Wisconsin, Illinois • Oregon, MTSU, SJSU • Ottawa, Toronto • 65 high school sites

  34. 2008 NACLO winners

  35. ILO 2008 • In Bulgaria • August 4-8, 2008 • Problems in: • Drehu, Cemuhí, Micmac, Old Norse, Chinese dialects

  36. ILO 2008 • Many medals: two team golds • 1 individual gold: Hanzhi Zhu • 2 individual silvers: Morris Alper and Anand Natarajan • 3 individual bronzes: Guy Tabachnick, Rebecca Jacobs, and Jeffrey Lim

  37. NACLO 2009 • 21 university sites signed up (new ones in Seattle, Vancouver, Dallas, Memphis, Washington, Baltimore, Lethbridge, Great Falls, Mankato, Princeton).

  38. Future ILOs • 2009 in Poland • 2010 in Sweden? • 2011 in the US?

  39. Plans • SGER for improving computational problem types. • Automated scoring • Reach out to the endangered languages community (note that the ILO avoids very common languages but we still have a long tail of 6,000+ languages to work with) • Interactive on-line problems • Fundraising • Become a non-profit organization • Staying in touch with our students

  40. Acknowledgments • We want to thank everyone who helped turn NACLO into a successful event. Specifically, Amy Troyani from Taylor Allderdice High School in Pittsburgh, Mary Jo Bensasi of CMU, all problem writers and graders (which include the PC listed above as well as Rahel Ringger and Julia Workman) and all local contest organizers (James Pustejovsky, Lillian Lee, Claire Cardie, Mitch Marcus, Kathy McKeown, Barry Schiffman, Lori Levin, Catherine Arnott Smith, Richard Sproat, Roxana Girju, Steve Abney, Sally Thomason, Aleka Blackwell, Roula Svorou, Thomas Payne, Stan Szpakowicz, Diana Inkpen, Elaine Gold). James Pustejovsky was also the sponsorship chair, with help from Paula Chesley. Ankit Srivastava, Ronnie Sim and Willie Costello co-wrote some of the problems with members of the PC. Eugene Fink helped with the solutions booklets, Justin Brown worked on the web site, and Adam Hesterberg was an invaluable member of the team throughout. • Other people who deserve our gratitude include Cheryl Hickey, Alina Johnson, Patti Kardia, Josh Cannon, Christina Hunt, Jennifer Wofford, and Cindy Robinson. Finally, NACLO couldn’t have happened without the leadership and funding provided by NSF and Tanya Korelsky in particular as well as the generous sponsorship from Google, Cambridge University Press, and the North American Chapter of the ACL (NAACL) and our other sponsors. • The authors of this paper are also thankful to Martha Palmer for giving us feedback on an earlier draft. • NACLO was partially funded by the National Science Foundation under grant IIS 0633871 Planning Workshop for a Computational Linguistics Olympiad.

  41. Join us in preparing NACLO 2009!
