
A presentation by W H Inmon

ANALYZING TEXT IN THE BIG DATA ENVIRONMENT. Much of the information found in the Big Data environment is text. The queries of the world can be divided into two classes: simple search and analytical query.



Presentation Transcript


  1. ANALYZING TEXT IN THE BIG DATA ENVIRONMENT. A presentation by W H Inmon.

  2. Much of the information found in the Big Data environment is text.

  3. The queries of the world can be divided into two classes: simple search and analytical query.

  4. There are many differences between a simple search and an analytical query. One of those differences is the state of the data that the query is made against.

  5. Some raw text.

  6. The tone of the message is good. Another carrier that is mentioned is Singapore Air.

  7. The tone of the message is very bad. The message mentions a late flight. The message is not written in proper English, as might be found in a tweet or IM.

  8. This cryptic message must first be expanded into understandable words. There is no tone to this message; it is purely informational. Other types of data that have been extracted include flight number, city, activity, operand, and claim number. Note that two cities have been used to determine that the flight type is a US domestic flight.

  9. This message, written in French, has both a good tone and a very good tone. Where there are two tones, the higher is designated as the official tone of the message. Other types of information found include city and a reference to personnel. Note that two cities are used to determine that this is an overseas flight.

  10. This message is written in Spanish. The tone is very bad. City, service, on time, and connection are found in the message. A lawsuit is mentioned. Note that two cities are used to determine that this is an international flight but not an overseas flight.

  11. The tone of this message is very bad. The flight number, date, and city are mentioned. City is used twice to determine that this was a domestic US flight.

  12. The source of the text can be any electronic source.

  13. After transformation occurs, the results are placed in a standard relational database.

  14. Once a standard relational database is created, there are literally hundreds of analytical tools that can operate against the database.

  15. If you are an analyst, what kind of data do you want to operate on?
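
The flow the slides describe (text is transformed, the results land in a relational database, and analytical tools query it) can be sketched roughly as follows. This is a minimal illustration, not the presenter's actual method: the tone scale, the `CITY_INFO` lookup, the table name `message_facts`, the sample records, and the rule that "overseas" means the two cities sit on different continents are all assumptions made for the example. The text-extraction step itself is not shown; the sample records stand in for annotations that have already been pulled out of the raw messages.

```python
import sqlite3

# Assumed tone scale, ordered from worst to best. The slides name
# "very bad", "good", and "very good"; the other levels are illustrative.
TONE_SCALE = ["very bad", "bad", "neutral", "good", "very good"]

# Hypothetical lookup: city -> (country code, continent).
CITY_INFO = {
    "Denver": ("US", "North America"),
    "Dallas": ("US", "North America"),
    "Mexico City": ("MX", "North America"),
    "Paris": ("FR", "Europe"),
}


def official_tone(tones):
    """Return the highest tone found, or None for a purely informational message."""
    if not tones:
        return None
    return max(tones, key=TONE_SCALE.index)


def flight_type(city_a, city_b):
    """Classify a flight from its two cities (assumed rules, see lead-in)."""
    country_a, continent_a = CITY_INFO[city_a]
    country_b, continent_b = CITY_INFO[city_b]
    if country_a == country_b:
        return "US domestic" if country_a == "US" else "domestic"
    if continent_a != continent_b:
        return "overseas"
    return "international"


def load_facts(records):
    """Place the transformed results in an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE message_facts ("
        "message_id INTEGER PRIMARY KEY, tone TEXT, flight_type TEXT)"
    )
    rows = [
        (message_id, official_tone(tones), flight_type(city_a, city_b))
        for message_id, tones, (city_a, city_b) in records
    ]
    conn.executemany("INSERT INTO message_facts VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


# Hypothetical, already-extracted annotations: (id, tones found, two cities).
SAMPLE_RECORDS = [
    (1, ["good", "very good"], ("Dallas", "Paris")),
    (2, ["very bad"], ("Dallas", "Mexico City")),
    (3, [], ("Denver", "Dallas")),
]

if __name__ == "__main__":
    conn = load_facts(SAMPLE_RECORDS)
    query = "SELECT message_id, tone, flight_type FROM message_facts ORDER BY message_id"
    for row in conn.execute(query):
        print(row)
```

Once the facts are in a table like this, an ordinary SQL query (for example, counting messages by tone and flight type) becomes an analytical query in the sense of slide 3, which a simple search over the raw text could not answer directly.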
