
SIGCSE 2008 It Sounded Like a Good Idea at the Time… Manipulated by Strings

Margaret Menzin, Simmons College


Presentation Transcript


  1. SIGCSE 2008: It Sounded Like a Good Idea at the Time… Manipulated by Strings
     Margaret Menzin, Simmons College

  2. A Data Structures Course
     The assignment:
     • Read a file of names like President George Washington
     • Identify the titles from a list and strip them
     • Isolate the last name and invert it to Washington, George
     • Alphabetize the list

  3. Known issues for students to handle:
     • Equivalence of upper and lower cases for purposes of alphabetization
     • Generating a list of titles
     • Matching from the list
     • Isolating the last name by looking backwards from the end of the name for the last blank
     • Usual file handling
     • Use of a simple sort
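A minimal Python sketch of the "known issues" version of the assignment, assuming titles have already been stripped. The function names `invert_name` and `sort_names` are hypothetical, and the built-in `sorted` stands in for the simple sort the course would have students write:

```python
def invert_name(full_name: str) -> str:
    """Turn 'George Washington' into 'Washington, George'.

    Looks backwards from the end of the name for the last blank.
    """
    name = full_name.strip()
    first, sep, last = name.rpartition(" ")
    if not sep:  # single-word name: nothing to invert
        return name
    return f"{last}, {first.strip()}"


def sort_names(names):
    """Invert every name, then sort treating upper and lower case as equivalent."""
    return sorted((invert_name(n) for n in names), key=str.casefold)


names = ["George Washington", "john adams", "Thomas Jefferson"]
print(sort_names(names))
```

Because this rule takes only the final word as the last name, "Vasco da Gama" comes out as "Gama, Vasco da", which is the kind of case the later slides question.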

  4. Some surprises:
     • Some titles are at the beginning, but some are also at the end
     • Titles must be stripped recursively:
       Hon. Father Robert F. Drinan, S.J., L.L.D.
       Rev. Dr. Martin Luther King, jr.
       Augusta Ada Byron King, Lady Lovelace
       Major General Stanley
     • Some titles occur in the middle:
       Bernard Cardinal Law
     • Some of these titles can also be first, middle and last names, a problem which is exacerbated when we add other languages
     • Jr, II, etc. must be handled
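One way to strip titles repeatedly from both ends can be sketched as follows. This is only an illustration: the title lists are a small hypothetical sample, the function name `strip_titles` is an assumption, and a loop is used where the slide says "recursively". Titles in the middle of a name ("Bernard Cardinal Law") and trailing titles such as "Lady Lovelace" are not handled.

```python
# Hypothetical, deliberately short title lists (compared in lower case).
PREFIX_TITLES = {"hon.", "father", "rev.", "dr.", "major", "general", "president"}
SUFFIX_TITLES = {"s.j.", "l.l.d.", "jr.", "jr", "ii", "iii"}


def strip_titles(full_name: str):
    """Return (core_name, removed_prefixes, removed_suffixes)."""
    words = full_name.replace(",", " ").split()
    prefixes, suffixes = [], []
    # Keep at least one word, since a "title" may really be the person's name.
    while len(words) > 1 and words[0].lower() in PREFIX_TITLES:
        prefixes.append(words.pop(0))
    while len(words) > 1 and words[-1].lower() in SUFFIX_TITLES:
        suffixes.insert(0, words.pop())
    return " ".join(words), prefixes, suffixes


print(strip_titles("Hon. Father Robert F. Drinan, S.J., L.L.D."))
print(strip_titles("Rev. Dr. Martin Luther King, jr."))
```

Returning the removed suffixes separately leaves room to reattach "Jr." or "II" after the name has been inverted.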

  5. More surprises:
     • In alphabetizing, apostrophes and hyphens are ignored (O’Reilly and OReilly are equivalent)
     • We need to worry about alphabetical order using other alphabets: alphabetize using first the Latin alphabet and then other alphabets in the order of their names in English (Cyrillic before Greek)
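Both rules can be folded into a sort key, sketched below in Python. The names `script_of` and `sort_key` are hypothetical. Taking the first word of a character's Unicode name as its alphabet, and judging a whole name by its first letter, are simplifying assumptions rather than anything the slides prescribe.

```python
import unicodedata

# Delete ASCII apostrophe, typographic apostrophe, and hyphen before comparing.
IGNORED = str.maketrans("", "", "'\u2019-")


def script_of(ch: str) -> str:
    """First word of the Unicode character name, e.g. 'LATIN', 'CYRILLIC', 'GREEK'."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def sort_key(name: str):
    cleaned = name.translate(IGNORED).casefold()
    first_letter = next((c for c in cleaned if c.isalpha()), "")
    script = script_of(first_letter) if first_letter else ""
    # Latin names first, then other alphabets by their English name.
    return (script != "LATIN", script, cleaned)


print(sorted(["Σωκράτης", "Пушкин", "Smith", "O’Reilly"], key=sort_key))
```

With this key, "O’Reilly" and "OReilly" compare as equal, and a Cyrillic name sorts before a Greek one because "CYRILLIC" precedes "GREEK" alphabetically.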

  6. Simplification:
     • Ignore titles in the middle
     • Use an abbreviated list of titles
     • Ignore other alphabets

  7. Still more surprises: where does the last name of each of these people begin?
     • Leonardo da Vinci
     • Catherine de Medici
     • Ponce de Leon
     • Vasco da Gama
     • Jean de la Fontaine
     • Gabriel Garcia Marquez
     • Vicente Fox Quesada
     • Wernher von Braun
     • Elizabeth Alexandra May Windsor
     • Thomas a Beckett
     • Mao Tse-tung (Mao Zedong)

  8. The answers
     • Leonardo da Vinci
     • Catherine de Medici
     • Juan Ponce de Leon
     • Vasco da Gama
     • Jean de La Fontaine
     • Gabriel Garcia Marquez
     • Vicente Fox Quesada
     • Wernher von Braun
     • Elizabeth (Alexandra May Windsor) II
     • Thomas (a) Beckett
     • Mao Tse-tung

  9. The solution
     • Use the alphabetization standards of the American Library Association
     • According to the A.L.A. you alphabetize using the rules of the language the person wrote/spoke in
     • There are special rules for monarchs and saints: they are alphabetized by first name
     • Note: The A.L.A. keeps the name as first_name last_name and has another field to specify the character where the last name begins!
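The idea in the note, keeping the name intact and recording where the last name begins, can be sketched as a small record type. `NameRecord`, `make_record`, and the field names are hypothetical and do not reflect the A.L.A.'s actual record layout. The start position is supplied by whoever catalogues the name; the code does not try to guess it.

```python
from dataclasses import dataclass


@dataclass
class NameRecord:
    full_name: str        # stored as first_name last_name
    last_name_start: int  # index of the character where the last name begins

    def inverted(self) -> str:
        """Filing form: 'Last, First', or the name unchanged if the start is 0."""
        first = self.full_name[:self.last_name_start].strip()
        last = self.full_name[self.last_name_start:].strip()
        return f"{last}, {first}" if first else last


def make_record(full_name: str, last_name: str) -> NameRecord:
    """Convenience helper: locate the given last name inside the full name."""
    return NameRecord(full_name, full_name.rindex(last_name))


records = [
    make_record("George Washington", "Washington"),
    make_record("Martin Luther King", "King"),
    make_record("Elizabeth II", "Elizabeth II"),  # monarch: filed by first name
]
for rec in sorted(records, key=lambda r: r.inverted().casefold()):
    print(rec.inverted())
```

A start index of 0 covers the monarch and saint rule without a special case, since the whole name then serves as the filing key.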

  10. Conclusion
     • Internationalization is much harder than it looks!
     • P.S. The British use different rules for alphabetization than the U.S. does; surely other countries use other rules.
