
Topics in AI: Applied Natural Language Processing

Information Extraction and Recommender Systems for Video Games: Gameplay. Krishna Achuthan, Stephanie Hasz, Carl Staab. November 23, 2009.


Presentation Transcript


  1. Topics in AI: Applied Natural Language Processing • Information Extraction and Recommender Systems for Video Games: Gameplay • Krishna Achuthan, Stephanie Hasz, Carl Staab • November 23, 2009

  2. Initial Tasks • Research prior work • Video game review analysis • Other product review analysis • Recommender methods • Create a lexicon of domain-specific terms for named entity recognition • Crawling sites, existing lexicons

  3. Previous Research • Jose Zagal's paper • Reviews include different commentary types • Found that game review NLP is a largely unexplored topic • One paper derives the polarity of adjectives from review scores • A couple of papers use the presence of feature nouns in user reviews for search

  4. NER & Recommender Research • Reviewed allgame, GameFly, GameSpot, GameSpy, GiantBomb, IGN, IMDB, MobyGames • GiantBomb: API for retrieving metadata • IGN: lexicon of video game terminology • Most sites had no “similar games” feature • Those that did used page views, genre, or user-submitted data

  5. GiantBomb Extraction • Crawled GiantBomb game database and extracted entity names and types for each game • Necessary for efficient tagging • Established a fixed dataset to avoid unexpected errors from edits to the live database • Entity types: games, franchises and their games, platforms, companies, genres, characters, locations, concepts
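A fixed snapshot like the one described above could be loaded into a simple name-to-types lookup for the tagger. The sketch below is only an illustration: the rows, the `COMPANY` label, and the function name `build_lexicon` are assumptions, and no GiantBomb API call is shown.

```python
from collections import defaultdict

# Hypothetical snapshot rows: (entity name, entity type).
# These rows are illustrative, not actual crawled data.
SNAPSHOT = [
    ("Mario", "CHARACTER"),
    ("Super Mario", "FRANCHISE"),
    ("Super Mario World", "OTHER_GAME"),
    ("World", "LOCATION"),
    ("Nintendo", "COMPANY"),
]


def build_lexicon(rows):
    """Map each lowercased entity name to the set of types it appears with."""
    lexicon = defaultdict(set)
    for name, entity_type in rows:
        lexicon[name.lower()].add(entity_type)
    return dict(lexicon)


lexicon = build_lexicon(SNAPSHOT)
print(sorted(lexicon["super mario"]))
```

Keying on the lowercased name keeps lookup cheap during tagging; whether case should be normalized at all is a design choice the slides leave open.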

  6. Named Entity Tagging • Used GiantBomb data to identify named entities in review text and their types • Tagger underwent several iterations • Result is flexible: capitalization and level of abbreviation can be specified for different starting strings and types of NEs • Most effective strategy: prioritize-but-overwrite-shorter

  7. Named Entity Tagging • Example: occurrence of “Super Mario World” in review text for “Mario Galaxy” • Super <Mario CHARACTER> World • <Super Mario FRANCHISE> World • <Mario TITLE_PART> tag rejected - not longer than <Super Mario FRANCHISE> • <Super Mario FRANCHISE> <World LOCATION> • <Super Mario World OTHER_GAME>
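One possible reading of "prioritize-but-overwrite-shorter", reconstructed from the example above rather than from the authors' code: candidate tags arrive in priority order, and a new candidate replaces any overlapping tags only if it is strictly longer than each of them. The function names and the token-offset representation are assumptions.

```python
def apply_candidates(candidates):
    """Accept candidate spans in priority order.

    Each candidate is (start, end, label) over token offsets, end exclusive.
    A candidate is kept only if it is strictly longer than every accepted
    span it overlaps; the spans it beats are removed.
    """
    accepted = []
    for start, end, label in candidates:
        overlapping = [s for s in accepted if s[0] < end and start < s[1]]
        length = end - start
        if all(length > (s[1] - s[0]) for s in overlapping):
            accepted = [s for s in accepted if s not in overlapping]
            accepted.append((start, end, label))
    return sorted(accepted)


def render(tokens, spans):
    """Render tokens with accepted spans wrapped as <text LABEL>."""
    by_start = {s[0]: s for s in spans}
    out = []
    i = 0
    while i < len(tokens):
        if i in by_start:
            start, end, label = by_start[i]
            out.append("<" + " ".join(tokens[start:end]) + " " + label + ">")
            i = end
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


tokens = ["Super", "Mario", "World"]
candidates = [
    (1, 2, "CHARACTER"),   # Mario
    (0, 2, "FRANCHISE"),   # Super Mario
    (1, 2, "TITLE_PART"),  # Mario, rejected: not longer than FRANCHISE
    (2, 3, "LOCATION"),    # World
    (0, 3, "OTHER_GAME"),  # Super Mario World
]

# Print the tagging after each candidate is considered.
for n in range(1, len(candidates) + 1):
    print(render(tokens, apply_candidates(candidates[:n])))
```

Run on the slide's example, the intermediate states follow the same sequence as the bullets, ending with `<Super Mario World OTHER_GAME>`.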

  8. Defining Gameplay • Read reviews, looking for sentences describing gameplay: Age of Empires III, Legend of Zelda: Twilight Princess, Animal Crossing, Gauntlet: Dark Legacy, Tony Hawk’s Pro Skater 3, Mario & Luigi: Partners in Time • Lack of emotional content in user reviews • Flaws described in more detail than strengths • Reviews focus on plot description • Categories emerged: purchasing advice, story/structure, staying power/replay value, non-emotional and emotional gameplay experience, external factors

  9. Gameplay Adjectives • Google bigram dataset gave us 531 adjectives describing gameplay • Separated review files into sentences, extracted sentences containing Google adjectives • Also extracted adjectives from GameSpot reviews • Needed domain-specific data • Adjectives might show that users are describing things we haven't considered • Later used for noun extraction
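The sentence-extraction step could look roughly like the sketch below. It assumes a naive punctuation-based sentence splitter and a three-word stand-in for the 531-adjective list; the function names and the sample review are illustrative only.

```python
import re

# Illustrative stand-in for the adjective list; not the actual 531 words.
GAMEPLAY_ADJECTIVES = {"addictive", "tedious", "innovative"}


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentences_with_adjectives(text, adjectives):
    """Return (sentence, matched adjectives) for sentences with any match."""
    hits = []
    for sentence in split_sentences(text):
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        found = words & adjectives
        if found:
            hits.append((sentence, sorted(found)))
    return hits


review = "The combat is addictive. The menus are fine! Later levels get tedious and slow."
for sentence, found in sentences_with_adjectives(review, GAMEPLAY_ADJECTIVES):
    print(found, "|", sentence)
```

A real pipeline would likely need a more robust sentence splitter, since review text often contains abbreviations and irregular punctuation.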

  10. Review Adjectives • Using Stanford POS tagger, extracted adjectives from a subset of 3,074 reviews • Review subset taken from all genres with > 200 games • 60,000+ “adjectives” • Manually analyzed the list for gameplay words • Eliminated: words with < 20 occurrences, generic qualitative adjectives, personality descriptors • Kept: action and experience words
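The frequency cut-off can be sketched as below. The code assumes tokens have already been tagged with Penn Treebank tags (`JJ`, `JJR`, `JJS` for adjectives) and does not invoke the Stanford tagger itself. The manual elimination step is approximated by a hypothetical stoplist, and the sample counts are made up for illustration.

```python
from collections import Counter

MIN_OCCURRENCES = 20  # threshold stated on the slide


def count_adjectives(tagged_tokens):
    """Count lowercased words whose tag marks them as adjectives (JJ*)."""
    return Counter(
        word.lower() for word, tag in tagged_tokens if tag.startswith("JJ")
    )


def filter_adjectives(counts, stoplist, min_occurrences=MIN_OCCURRENCES):
    """Drop rare adjectives and those on the manually built stoplist."""
    return {
        word: n
        for word, n in counts.items()
        if n >= min_occurrences and word not in stoplist
    }


# Made-up tagged tokens standing in for tagger output.
tagged = (
    [("addictive", "JJ")] * 25
    + [("good", "JJ")] * 40
    + [("quirky", "JJ")] * 3
    + [("game", "NN")] * 50
)
stoplist = {"good"}  # hypothetical generic qualitative adjective

kept = filter_adjectives(count_adjectives(tagged), stoplist)
print(kept)
```

Here "good" is removed by the stoplist, "quirky" falls below the threshold, and "game" is never counted because it is not tagged as an adjective.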

  11. Resultant Adjective List • 1,141 adjectives, ranging from 20 to 16,094 occurrences • Words describing: • Size: massive/tiny • Pace: quick/slow • Ease: easy/impossible • Uniqueness: innovative/uninspired • Experience: addictive/tedious • Aesthetics: gorgeous/ugly

  12. Towards Using Adjectives • Extracted sentences with potentially interesting adjectives from a sample of reviews and parsed with the Minipar parser • Will allow us to further refine our lists of adjectives and especially nouns of interest • Eventually, will also use the MK-means clustering algorithm implemented this quarter to determine which adjectives are most useful

  13. Interface • Back-end functionality for basic interface coded by Krishna • Utilizes a different database, but ASP code might be portable • Database contains all GiantBomb data vs. the GameSpot subset with review data

  14. Next Steps • Cluster gameplay adjectives using MK-means • Description vs. experience? • Derive categories of gameplay • Assign games to gameplay categories • Extract sentences with both a gameplay adjective and noun • Assign games to their adjectives' categories • Incorporate gameplay features into database • Back-end coding of website
