
Data, Information & Knowledge 2

Introduction

  • The previous presentation covered what data is*

  • In this presentation we cover where data comes from and factors we need to take into account when gathering data for processing

* Should really be data “are” but nobody talks like this!

Data Sources

Data can be collected either:

    • Directly, gathered from an original source

    • Indirectly, gathered from another source or as a by-product of another operation

  • In the world of business these would be described as primary and secondary sources of data

Direct (Original) Data Sources

  • Sale of an item in a supermarket recorded at an EFTPOS terminal

  • Data from sensors e.g. a weather station

  • Data collected in a survey e.g. a questionnaire or an interview

Indirect Data Sources 1

  • Data collected for one purpose and used for another

    • A credit card company collects data about your spending in order to bill you each month. However, a secondary use of this data is to build up a “profile” of your spending habits. This data can then be used to send you direct marketing about goods and services that may appeal to you.

[Diagram: a credit card transaction leads to both a direct use of data and an indirect use of data]
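
The credit card example can be sketched in a few lines of Python. This is only an illustration: the transactions, category names and amounts (in pence) are made up. The point is that the same records serve a direct use (billing) and an indirect use (profiling).

```python
from collections import defaultdict

# Made-up transactions for one cardholder: (merchant category, amount in pence).
transactions = [
    ("groceries", 4250),
    ("travel", 12000),
    ("groceries", 1825),
    ("books", 999),
]

# Direct use: total the month's spending to produce the bill.
bill_total = sum(amount for _, amount in transactions)

# Indirect use: reuse the same records to profile spending by category.
profile = defaultdict(int)
for category, amount in transactions:
    profile[category] += amount

top_category = max(profile, key=profile.get)

print(f"Amount to bill: {bill_total / 100:.2f}")
print(f"Highest-spend category: {top_category}")
```

Nothing new was collected for the profile. It is a by-product of data gathered for billing, which is what makes it an indirect source.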



Indirect Data Sources 2

  • Purchased data/data passed on

    • Data can be acquired from third parties in a number of ways and then used for a different purpose

    • A good example is the electoral roll. Its main purpose is to record who is eligible to vote. However, marketing companies make extensive use of the roll to target customers.

Coding Data

  • Before being stored in a computer, information can be coded as data, e.g.

    • M or F

    • Mo, Tu, We, Th, Fr, Sa, Su

    • I, II, IIIM, IIIN, IV, V

    • S, M, L, XL, XXL

  • In the picture shown we can see the date code for the tyre. This represents the eighth week of 2006
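
As a rough sketch of coding and decoding, the Python below encodes a clothing size and decodes a tyre date code. The lookup table and function names are hypothetical. The tyre code is assumed to be four digits in week-then-year order (WWYY), and "0806" is an assumed value chosen to match the eighth week of 2006.

```python
# Hypothetical lookup table: full description -> short code stored in the computer.
SIZE_CODES = {
    "small": "S",
    "medium": "M",
    "large": "L",
    "extra large": "XL",
    "extra extra large": "XXL",
}


def encode_size(description: str) -> str:
    """Turn a free-text size into its stored code."""
    return SIZE_CODES[description.strip().lower()]


def decode_tyre_date(code: str) -> tuple[int, int]:
    """Split a four-digit tyre date code into (week, year).

    Assumes the layout WWYY and a year in the 2000s.
    """
    if len(code) != 4 or not code.isdigit():
        raise ValueError(f"expected four digits, got {code!r}")
    week = int(code[:2])
    year = 2000 + int(code[2:])
    if not 1 <= week <= 53:
        raise ValueError(f"week out of range: {week}")
    return week, year


print(encode_size("Extra Large"))   # XL
print(decode_tyre_date("0806"))     # (8, 2006)
```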

Benefits of Coding

  • Less storage space is required

    • M and F require less storage space than male and female

  • Faster data input

    • M and F are quicker to type than male and female

  • Validation is easier

    • With a limited number of codes it is easier to match them against rules to check they are entered correctly
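
The storage and validation benefits can be illustrated with a short Python sketch. The set of allowed codes and the function names are assumptions made for the example, and storage is measured assuming UTF-8 text.

```python
# Hypothetical set of allowed codes for a "sex" field, as in the M or F example.
VALID_SEX_CODES = {"M", "F"}


def is_valid_code(value: str) -> bool:
    """Validation rule: the entry must be one of the allowed codes."""
    return value in VALID_SEX_CODES


def stored_bytes(text: str) -> int:
    """Bytes needed to store the text, assuming UTF-8 encoding."""
    return len(text.encode("utf-8"))


print(stored_bytes("F"), stored_bytes("female"))   # 1 6
print(is_valid_code("M"))      # True
print(is_valid_code("X"))      # False
```

With only two legal values, the validation rule is a single membership check. Free text such as "femail" would be much harder to catch.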

Drawbacks of Coding

  • Precision of data can be lost (coarsened)

    • In the example all shades of blue are coded as “blue”

  • The user needs to know the codes used

    • How many of these top level domains do you know?

    • au, ch, de, ie, pk, fr, il, lk, es

[Diagram: "Data in" and "Stored data"]
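
Coarsening can be shown with a small Python sketch. The shade names and the mapping are made up for illustration. Once each shade is stored as "blue", the original detail cannot be recovered from the stored data.

```python
# Hypothetical mapping from precise shade names to the coarse code that is stored.
SHADE_TO_CODE = {
    "navy": "blue",
    "sky blue": "blue",
    "turquoise": "blue",
    "royal blue": "blue",
}

observed = ["navy", "sky blue", "turquoise"]
stored = [SHADE_TO_CODE[shade] for shade in observed]

print(stored)                                   # ['blue', 'blue', 'blue']
print(len(set(observed)), len(set(stored)))     # 3 1
```

Three distinct inputs become one stored value, so any later analysis can only ever talk about "blue".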

Coding Value Judgements

  • Coding value judgements can be a particular problem as they are subject to personal opinion

  • What do you think of this presentation?

    • Good? Average? Poor?

    • One person’s good may be another person’s poor!

  • Value judgements are very difficult to encode without some coarsening (loss of detail)

  • How would you improve the analysis? What are the time/cost implications?

Quality of the Data Source 1

  • GIGO (Garbage In Garbage Out)

  • If the data input is poor, the resulting information output will also be poor, i.e. corrupt, inaccurate etc.

  • Can you think of any “real life” examples?

[Diagram: Garbage In, Garbage Out]

Quality of the Data Source 2

Examples of GIGO can include:

  • Unreliable questionnaires/surveys

    • e.g. inappropriate samples, badly worded questions etc.

  • Incorrectly calibrated instruments

    • e.g. an incorrectly calibrated balance will give incorrect measures of mass

  • Human error

    • e.g. transcription errors when entering data

  • Incomplete data sets

    • e.g. failing to account for “shrinkage” when measuring supermarket stock
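
The calibration example can be sketched in Python to show how an input error carries through to the output. The masses and the size of the error are made-up values, and the fault is assumed to be a constant offset.

```python
from statistics import mean

# Made-up true masses in grams.
true_masses = [100.0, 250.0, 400.0]

# Assumed fault: the balance reads 5 g too high on every measurement.
CALIBRATION_ERROR = 5.0
readings = [m + CALIBRATION_ERROR for m in true_masses]

print(mean(true_masses))  # 250.0
print(mean(readings))     # 255.0
```

The processing (taking a mean) is correct in both cases. The output is wrong only because the input was wrong, which is GIGO in miniature.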

Summary/Revision Topics

  • Data can arise from direct and indirect sources

  • Information can be coded as data

  • This has a number of benefits but can lead to coarsening

  • The source/accuracy of data has a major impact on the quality of information produced i.e. GIGO

Revision Tasks

  • Use your textbook/Internet sources to make your own notes on:

    • Sources of Data

    • Encoding Data

    • Quality of Data Sources

  • Try questions 18-24 on this worksheet

    Diagram/example on slide 9 courtesy of See the original here.