
Managing Information Quality in Organisations

Managing Information Quality in Organisations. Based on a presentation by Dr Mikhaila Burgess School of Computer Science & Informatics Cardiff University. Session overview. What is quality? What is Data Quality (DQ)? And why is it important anyway?



Presentation Transcript


  1. Managing Information Quality in Organisations Based on a presentation by Dr Mikhaila Burgess School of Computer Science & Informatics Cardiff University

  2. Session overview • What is quality? What is Data Quality (DQ)? And why is it important anyway? • Potential impact of poor DQ (data quality) • Defining Data Quality • Designing for Quality Data • Ensuring DQ in databases • So what goes wrong? • Potential causes of poor DQ • Managing DQ … and some exercises

  3. Data vs Information • Data • Items about things, events, activities, transactions, … • Numeric, alphanumeric, figures, sounds, images, … • Recorded, stored, but not organised to convey any specific meaning • Information • “data that have been organised in a manner that gives them meaning for the recipient” (Turban et al, 2005) • known; ‘surprise’ value • One person’s data is another’s information

  4. What is ‘quality’? What does the word actually mean?

  5. Why is DQ important? Impact of Poor Data Quality … some examples

  6. Defining Data Quality How do we know what we all mean when we talk about DQ?

  7. Designing for Quality Data Ensuring a level of quality in your databases

  8. So what goes wrong? Some causes of poor quality data & information

  9. Data Entry: Human Aspect • Unintentional errors in data entry • Lack of understanding • Poor Training • Intentional incorrect data entry • Malicious / Non-malicious • Poorly defined or out-of-date collection process • Multiple levels of data entry Garbage in, Garbage out
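One way to limit "garbage in" is to check each record at the point of entry. The following is a minimal sketch only: the record fields (`name`, `email`, `date_of_birth`) and the rules are hypothetical and do not come from the presentation.

```python
from datetime import date


def validate_customer(record: dict) -> list[str]:
    """Return a list of problems found in one data-entry record.

    Field names and rules are hypothetical, chosen only to illustrate
    entry-time checks; they are not taken from the presentation.
    """
    problems = []

    # Missing or blank mandatory fields.
    for field in ("name", "email", "date_of_birth"):
        if not str(record.get(field) or "").strip():
            problems.append(f"missing {field}")

    # A very loose format check: exactly one "@" with text on both sides.
    email = str(record.get("email") or "")
    if email and (email.count("@") != 1 or email.startswith("@") or email.endswith("@")):
        problems.append("malformed email")

    # Dates must parse as ISO (YYYY-MM-DD) and cannot lie in the future.
    dob = str(record.get("date_of_birth") or "")
    if dob:
        try:
            if date.fromisoformat(dob) > date.today():
                problems.append("date_of_birth is in the future")
        except ValueError:
            problems.append("date_of_birth is not a valid ISO date")

    return problems
```

Checks like these catch unintentional errors only; they cannot tell whether a well-formed value was entered incorrectly on purpose.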

  10. Data Entry: Technical Aspect • Inaccurate measuring or counting device • Errors in the data storage process • Missing data fields • Data scanner • Poor quality data scanner • Inappropriate scanner • Incorrect set-up Microfiche Microfilm Aperture cards

  11. Herbarium Catalogue • Approx 7 million specimens • Pressed & dried • Preserved in spirit • 30,000 per year • HerbCat • www.kew.org/herbcat/ • ePIC – electronic Plant Information Centre • www.kew.org/epic/

  12. Type Specimen • Over 350,000 • Original specimen • Fixed species name & description • 18th century • Reference point for botanists – applying names correctly (taxonomy & systematics) http://www.kew.org/collections/herb_types.html

  13. Random Data • Thursday 18th March 2010 – NYPD’s Identity Theft Squad deliver cheesecake to Walter (83) and Rose (82) Martin, Brooklyn, NY • 50 raids over 8 years • Headlines: “Cops Sorry For Coming To Wrong Home 50 Times”; “50 errant visits blamed on computer glitch” • “The snafu started when police used the address as part of what Browne called ‘random material’ to test an automated computer system that tracks crime complaints and records of other internal police information” • Apologise & explain … and to check people “weren’t using that address for identity theft” (Associated Press & Boston Globe)

  14. Organisational Issues • Scattering of databases throughout different departments or organisations • Lack of awareness of data quality issues • Obsession with technology • Old (Legacy) databases • Poorly documented data • Missing/poor documentation about purpose • Obsolete data • Mergers & Acquisitions • Non-merging of databases - autonomy • Merging of databases • Data stored in multiple locations and not correctly linked

  15. Merging Databases • Homonyms & synonyms • Surname, Name, Customer, CustName, … • OrderID • ID for order processed for a customer • ID for order placed with a supplier • Representational inconsistency • Data: eg address • Database: eg • Oracle & SQLServer • Access & Objectivity
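The synonym and homonym problems on this slide can be sketched as a renaming step applied before two databases are merged. This is an illustration under assumptions: the canonical names, the source labels `sales` and `purchasing`, and the function `harmonise` are all hypothetical; only the column names `Surname`, `Name`, `Customer`, `CustName` and `OrderID` come from the slide.

```python
# Hypothetical synonym table: each source column name maps to one
# agreed canonical name. The source names echo the slide's examples.
CANONICAL_COLUMNS = {
    "Surname": "customer_name",
    "Name": "customer_name",
    "Customer": "customer_name",
    "CustName": "customer_name",
}

# OrderID is a homonym: its meaning depends on the source system, so it
# is keyed by (source, column) instead of by column name alone.
HOMONYMS = {
    ("sales", "OrderID"): "customer_order_id",
    ("purchasing", "OrderID"): "supplier_order_id",
}


def harmonise(source: str, row: dict) -> dict:
    """Rename one row's columns to canonical names before merging."""
    result = {}
    for column, value in row.items():
        # Homonym rules take priority; unknown columns keep their name.
        target = HOMONYMS.get((source, column)) or CANONICAL_COLUMNS.get(column, column)
        if target in result:
            raise ValueError(f"two source columns map to {target!r}")
        result[target] = value
    return result
```

For example, `harmonise("sales", {"CustName": "M Burgess", "OrderID": 17})` keeps the customer's order distinct from a supplier order that happens to carry the same number. Building the two tables is the hard part in practice, since it depends on documentation that legacy databases often lack.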

  16. Merging Databases • Designed for different purpose • Database design • Data collection • Student database • storing module marks, working out number of resits, allowing to proceed, degree classification • storing financial details, whether fees have been paid, ensuring no awards presented until account is clear • RAF, Navy, Army • Codes for individual stock items • Merged db’s … • Iraq – 3 days out of action!

  17. Merging Databases • Duplicate data • eg customer name: Mikhaila Burgess Variations: Dr Mikhaila Burgess Dr M S E Burgess Ms M Burgess Mr M Burgess M Burgess Michaela Burges Michael Burge Misspellings: Mikalia Burgers Mikkalia Burgese Michelle Barron
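Variations and misspellings like these are usually found by approximate string matching. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the title list, the 0.8 threshold and the function names are assumptions made for illustration, not a method from the presentation.

```python
from difflib import SequenceMatcher

# Hypothetical list of titles to ignore when comparing names.
TITLES = {"dr", "mr", "mrs", "ms", "miss", "prof"}


def normalise(name: str) -> str:
    """Lower-case a name and drop titles and full stops."""
    words = name.lower().replace(".", " ").split()
    return " ".join(w for w in words if w not in TITLES)


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def likely_duplicates(target: str, candidates: list[str], threshold: float = 0.8) -> list[str]:
    """Return the candidates whose similarity to target meets the threshold."""
    return [c for c in candidates if similarity(target, c) >= threshold]
```

Character-level similarity handles misspellings reasonably well but not abbreviations: a record such as "Dr M S E Burgess" would need separate rules for initials, and any threshold will produce some false matches that need human review.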

  18. Introducing DQ problems (Strong et al 1997) • Roles: Creator, Custodian, Consumer • Data production • Same data collected in different data sets • Customer data: Sales, Support, Finance, … • Hospital: clinical, diagnosis, specialist treatment, finance, … • Different purpose, different data stored • Not necessarily the same values • Different entry procedures & constraints • Different relevant information • Cascading updates?

  19. Introducing DQ problems (Strong et al 1997) • Roles: Creator, Custodian, Consumer • Data storage • Potentially large volumes of data • Accessibility challenges • Access codes (eg country: 1-UK, 2-USA, …) • Distributed data • Heterogeneous storage systems • Potentially inconsistent data formats & values
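Coded values become a quality problem when distributed systems assign the codes differently. A small sketch, assuming two hypothetical systems `system_a` and `system_b`; only the mapping "1-UK, 2-USA" appears on the slide, and the second table is invented to show the conflict.

```python
# Hypothetical code tables for two systems that number countries
# differently; only "1-UK, 2-USA" comes from the slide.
COUNTRY_CODES = {
    "system_a": {1: "UK", 2: "USA"},
    "system_b": {1: "USA", 2: "UK"},
}


def decode_country(system: str, code: int) -> str:
    """Translate a system-specific country code into a shared label."""
    try:
        return COUNTRY_CODES[system][code]
    except KeyError:
        # Fail loudly: silently passing an unknown code through would
        # store a value that no consumer can interpret.
        raise ValueError(f"unknown country code {code!r} for {system!r}") from None
```

Decoding at the boundary means the merged store holds the label, not a bare number whose meaning depends on where it came from.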

  20. Introducing DQ problems (Strong et al 1997) • Roles: Creator, Custodian, Consumer • Data usage • Information needs change • Personal requirements • Organisational environment • Data no longer relevant • Conflicts between accessibility and security, privacy & confidentiality • Access limitation due to lacking IT resources • But who are these people?

  21. An Issue of Change • Organisations change • The environment changes • government, competition, market needs, customers, customer requirements … • Requirements & specifications change • Different projects have different requirements • Require data for different purposes • Ideal world: stop data entry, clean, ensure fit for purpose, restart with perfect database • Tomorrow it will no longer be perfect!

  22. 10 Potholes to IQ (Strong et al 1997) #1 Multiple sources of the same information produce different values. #2 Information is produced using subjective judgments, leading to bias. #3 Systemic errors in information production lead to lost information. #4 Large volumes of stored information make it difficult to access information in a reasonable time. #5 Distributed heterogeneous systems lead to inconsistent definitions, formats, and values. #6 Nonnumeric information is difficult to index. #7 Automated content analysis across information collections is not yet available. #8 As information consumers’ tasks and the organisational environment change, the information that is relevant and useful changes. #9 Easy access to information may conflict with requirements for security, privacy, and confidentiality. #10 Lack of sufficient computing resources limits access.
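Pothole #1 (multiple sources producing different values) is one of the few that can be detected mechanically. The sketch below compares the same entity across sources and reports every attribute that disagrees; the data layout and the function name `conflicting_values` are assumptions for illustration.

```python
from collections import defaultdict


def conflicting_values(sources: dict[str, dict[str, dict]]) -> dict:
    """Find attributes whose values disagree between sources.

    `sources` maps a source name to {entity_id: {attribute: value}}.
    Returns {(entity_id, attribute): {source_name: value}} for every
    attribute recorded with more than one distinct value.
    Values must be hashable (strings, numbers, tuples).
    """
    seen = defaultdict(dict)
    for source_name, records in sources.items():
        for entity_id, attributes in records.items():
            for attribute, value in attributes.items():
                seen[(entity_id, attribute)][source_name] = value

    return {
        key: by_source
        for key, by_source in seen.items()
        if len(set(by_source.values())) > 1
    }
```

The report shows where sources disagree but not which one is right; deciding that is an organisational question about which source is authoritative for each attribute.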

  23. Managing Information Manage data/information as a product, not a by-product … TQM for Data!

  24. The Deloitte CIO club • October 2005 • 50% of CIOs report that data quality issues have had a negative impact on their business in the last year, and 6% say it affects them on a daily basis. A further 19% are occasionally affected. • Panel admits to lack of strategic approach to managing data quality • 50% of CIOs consider data quality to be an IT issue, even though 88% also believe that their non-IT colleagues are aware of the benefits of better quality data. • Data cleansing is reactive, not proactive. Many CIOs stated it only happens “when it’s needed” – for example, when new systems are introduced – with none carrying out regular, programmed data cleansing sweeps. http://www.deloitte.com/uk/cio/

  25. Managing data as a product (Lee et al 2006) (Wang et al 1998) • Data & Information – typically treated as a by-product • Focus on system, not data • Treat data/information as a product • An end deliverable that will satisfy customer needs • Focus on data & fitness for purpose • Fundamental change in organisations’ understanding of data • Follow four principles … • Understand consumer’s information needs • Manage the data production process • Manage data as a product with a product life-cycle • Data product manager – responsible for managing the data product Consumer Creator Custodian

  26. Product & Information Manufacturing

  27. TQM to TDQM • TQM – typical foundation for DQ/IQ programmes • Define the IP • Identify characteristics of the IP, determine IQ dimensions • Identify IP requirements • Identification of IP manufacturing process, and those involved • Measurement • Determining extent of IQ problems • Looks at results of previous attempts to resolve issues – learning from experience • Analysis • Pinpoints causes of poor IQ; effects on organisation; consider users; Pareto charts, SPC • Improvement • Delivering methods of continuous improvement
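The Measurement and Analysis steps can be illustrated with two small calculations: a completeness ratio for one field, and a Pareto ranking of recorded causes of poor quality. This is a sketch under assumptions; the function names, the choice of completeness as the dimension, and the example causes are hypothetical and not taken from the presentation.

```python
from collections import Counter


def completeness(records: list[dict], field: str) -> float:
    """Share of records with a non-empty value for `field` (0.0 to 1.0)."""
    if not records:
        raise ValueError("no records to measure")
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def pareto(issue_causes: list[str]) -> list[tuple[str, int, float]]:
    """Rank causes by frequency, with cumulative percentage of all issues."""
    counts = Counter(issue_causes).most_common()
    total = sum(n for _, n in counts)
    ranked, running = [], 0
    for cause, n in counts:
        running += n
        ranked.append((cause, n, round(100 * running / total, 1)))
    return ranked
```

Given a log of ten issues with six typing errors, three missing fields and one wrong code, `pareto` returns the causes in that order with cumulative shares of 60.0, 90.0 and 100.0, which is the information a Pareto chart plots. Repeating the measurement after each improvement gives the feedback loop that continuous improvement relies on.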

  28. Data Quality Policy • For organisation to remain engaged & succeed in maintaining a viable, sustained DQ effort • Proactively support business activities A DQ policy must reflect the vision of the organisation. • Start DQ management programme … effort not sustained • Single DQ Champion or department … others fail to come on board … not disseminated across business Organisational policy must involve all functions and activities relating to the maintenance of data products.

  29. 10 Policy Guidelines (Lee et al 2006) The organisation … • … adopts the basic principle of treating information as product, not by-product. • … establishes and keeps data quality as a part of the business agenda. • … ensures that the data quality policy and procedures are aligned with its business strategy, business policy, and business processes. • … establishes clearly defined data quality roles and responsibilities as part of its organisation structure. • … ensures that the data architecture is aligned with its enterprise architecture.

  30. 10 Policy Guidelines (Lee et al 2006) • … takes a proactive approach in managing changing data needs. • … has practical data standards in place. • … plans for and implements pragmatic methods to identify and solve data quality problems, and has in place a means to periodically review its data quality and data quality environment. • … fosters an environment conducive to learning and innovating with respect to data quality activities. • … establishes a mechanism to resolve disputes and conflicts among different stakeholders.

  31. Examples … http://www.lancashirecare.nhs.uk/documents/FOI_12DataQualityPolicy.pdf http://www.suffolk.gov.uk/CouncilAndDemocracy/OurPerformance/DataQualityPolicy.htm

  32. Review • What is quality? • Defining Quality & DQ • Importance of quality data • DQ in databases • Database design • Database Integrity • Some examples of poor DQ and its impact • http://www.iqtrainwrecks.com/ • Measuring DQ • Managing data as product

  33. References
      CROSBY, P.B. (1978) Quality is Free: The Art of Making Quality Certain, McGraw-Hill.
      DROMEY, R.G. (1996) Concerning the Chimera. IEEE Software, 13(1), pp 33-43.
      JURAN, J.M. & GODFREY, A.B. (1999) Juran's Quality Handbook (5th ed), McGraw-Hill, USA.
      LEE, Y.W., PIPINO, L.L., FUNK, J.D. & WANG, R.Y. (2006) Journey to Data Quality, MIT Press, MA, USA.
      PIRSIG, R.M. (1974) Zen and the Art of Motorcycle Maintenance, Random House.
      REDMAN, T.C. (1995) “Improve Data Quality for Competitive Advantage,” Sloan Management Review, 36(2), Winter 1995, pp 99-107.
      REDMAN, T.C. (1997) Data Quality for the Information Age, Artech House.
      STRONG, D.M., LEE, Y.W. & WANG, R.Y. (1997) 10 Potholes in the Road to Information Quality, IEEE Computer, August 1997, pp 38-46.
      TURBAN, E., ARONSON, J.E. & LIANG, T.P. (2005) Decision Support Systems and Intelligent Systems (7th ed), Prentice-Hall.
      WANG, R., LEE, Y.W., PIPINO, L.L. & STRONG, D.M. (1998) “Managing Your Information as a Product,” Sloan Management Review, 39(4), Summer 1998, pp 95-105.
      WANG, R. & STRONG, D. (1996) Beyond Accuracy: What data quality means to data consumers. Journal of Management Information Systems, 12(4), Spring 1996, pp 5-33.
      WATSON, R.T. (2003) Data Management: Databases and Organizations, Wiley & Sons.
