
Comparing linked maternity data sets to check data quality in SPSS 

Preeti Datta-Nemdharry, Nirupa Dattani and Alison Macfarlane


Presentation Transcript


  1. Comparing linked maternity data sets to check data quality in SPSS  Preeti Datta-Nemdharry, Nirupa Dattani and Alison Macfarlane

  2. Background (1) Birth registration • By law, live births must be registered within 42 days of birth • Information recorded from parents is mainly socio-demographic, such as names, address of residence, occupation of parents, marital status and country of birth

  3. Background (2) NHS Numbers for Babies (NN4B) • Central Issuing System introduced in 2002 for issuing NHS numbers at birth for babies born in England, Wales and the Isle of Man • A small set of data is collected, including gestational age for live births, ethnicity of baby and date and time of birth

  4. Background (3) Maternity Hospital Episode Statistics (HES) • Data should be collected for all births occurring in England • Core admitted patient care record for mother plus ‘maternity tail’ with details of delivery and the baby. • Core birth record for baby plus ‘baby tail(s)’

  5. Background (4) National Community Child Health database (NCCHD) and Patient Episode Database for Wales (PEDW) • Data collected for all births occurring in Wales • Information collected on maternity similar to HES

  6. Method • Link data for 2005 and 2006 for England and Wales • Phase 1 involving linkage of birth registration data to NN4B data • Phase 2 involving linkage of registration/NN4B data to Maternity HES for England and Child Health/PEDW databases for Wales

  7. Method cont… Phase 2 • Linkage to maternity HES carried out by Northgate Solutions using an algorithm devised by City University • Key data items for linkage, e.g. NHS number, date of birth and unique ID, compiled by ONS and sent to Northgate Solutions for linkage • Linkage to Child Health and PEDW databases carried out by NHS Wales Informatics Service using the same algorithm

  8. After the linkage was done… • HES records linked to registration/NN4B data contained multiple records for the same mother for each episode. • The duplicates therefore had to be removed, keeping the record with the most information. • This ensures one-to-one linkage to registration/NN4B

  9. Identifying duplicates, triplicates...
     GET
       FILE='C:\Users\trial\Desktop\exampleHES.sav'.
     Dataset name DataSet1 Window=Front.
     * Identify duplicate cases after sorting by id and, within id, by epikeys.
     Dataset activate DataSet1.
     * Sort the cases first by id and then by epikeys, both descending.
     Sort cases by id(D) epikeys(D).
     * Compute a variable called flag with a default value of 1.
     compute flag=1.
     * Reset flag from 1 to 0 if id equals the id in the row before.
     if (id=lag(id)) flag=0.
     exe.

  10. id and epikeys sorted in descending order; flag = 1.00 is allocated to the highest epikeys value per id

  11. Creating a file with only one id per row…
     * Create wodups, the dataset without duplicates.
     * exampleHES (DataSet1) is the active dataset at this point.
     Dataset activate DataSet1.
     Dataset copy wodups.
     * Activate the copy so that the selection is applied to wodups and not to DataSet1.
     Dataset activate wodups.
     * Select the record with the most information, i.e. the highest epikeys.
     Select if (flag=1).
     Exe.
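The flag-and-select logic on slides 9 to 11 can also be sketched outside SPSS. The Python below is only an illustration of the same idea, assuming a handful of made-up records; the values are hypothetical and are not taken from the HES data.

```python
# Hypothetical records standing in for linked HES rows (values are made up).
records = [
    {"id": 1, "epikeys": 101},
    {"id": 1, "epikeys": 105},
    {"id": 2, "epikeys": 200},
    {"id": 3, "epikeys": 301},
    {"id": 3, "epikeys": 300},
    {"id": 3, "epikeys": 302},
]

# Sort by id and, within id, by epikeys, both descending (like SORT CASES ... (D)).
ordered = sorted(records, key=lambda r: (r["id"], r["epikeys"]), reverse=True)

# flag = 1 for the first row of each id, 0 when id equals the previous row's id.
previous_id = None
for row in ordered:
    row["flag"] = 0 if row["id"] == previous_id else 1
    previous_id = row["id"]

# Keep only flagged rows: one record per id, the one with the highest epikeys.
wodups = [row for row in ordered if row["flag"] == 1]

for row in wodups:
    print(row["id"], row["epikeys"])
```

Because the rows are sorted in descending order before flagging, the first row seen for each id is the one with the highest epikeys, which is why a simple comparison with the previous row is enough.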

  12. Merge with exampleNN4BREG data
     * Merge exampleHES with exampleNN4BREG.
     * First sort both datasets by the key variable, e.g. id.
     * Main dataset.
     Dataset activate wodups.
     Sort cases by id(A).
     * Dataset to be merged.
     Dataset activate NN4BREG.
     Sort cases by id(A).
     * Merging. The command ends with the period after /by id.
     Match files file=wodups
       /file=NN4BREG
       /by id.
     Exe.
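As a rough illustration of what the one-to-one match by id on slide 12 produces, the sketch below joins two small Python dictionaries keyed by id. The ids and birthweights are hypothetical. Where an id is present in only one source, the fields from the other source are simply left out here, whereas SPSS would set them to system-missing.

```python
# Hypothetical de-duplicated HES rows and registration/NN4B rows, keyed by id.
wodups = {1: {"birthweightHES": 3200}, 2: {"birthweightHES": 2900}}
nn4breg = {2: {"birthweightReg": 2900}, 3: {"birthweightReg": 3500}}

# Keep every id that appears in either source, in ascending id order.
merged = {}
for key in sorted(set(wodups) | set(nn4breg)):
    row = {"id": key}
    row.update(wodups.get(key, {}))   # fields from the HES side, if any
    row.update(nn4breg.get(key, {}))  # fields from the registration side, if any
    merged[key] = row

for row in merged.values():
    print(row)
```

Counting how many merged rows carry fields from both sources gives a quick check on how many records actually linked.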

  13. Data quality checks • Quality of maternity HES assessed on the completeness and consistency of the HES data in relation to birth registration data wherever possible • NN4B data used to validate maternity HES where information was not available from registration.

  14. Missing data
     * Missing data for string variables, e.g. NHS number.
     Dataset activate wodups.
     missing values NHSnoHES (" ").
     * FORMAT=NOTABLE gives only the total numbers.
     freq var = NHSnoHES /format=notable.
     * OR.
     compute var1 = (length(rtrim(NHSnoHES)) = 0).
     execute.
     desc var = var1
       /statistics = sum.

  15. * Missing data for dates, after checking formats.
     freq var=dobHES /format=notable.
     * Missing data for numeric variables, e.g. birthweight.
     Freq var=birthweightHES /format=notable.
     * OR.
     * noBWT is 1 where birthweight is missing and 0 otherwise.
     Compute noBWT=missing(birthweightHES).
     Exe.
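The two missing-data indicators on slides 14 and 15 (a blank string after trimming, and a missing numeric value) can be sketched in Python as follows. The rows are hypothetical, and `None` is used here as a stand-in for an SPSS system-missing value.

```python
# Hypothetical rows; None stands in for a system-missing birthweight.
rows = [
    {"NHSnoHES": "1234567890", "birthweightHES": 3100},
    {"NHSnoHES": "   ", "birthweightHES": None},
    {"NHSnoHES": "", "birthweightHES": 2800},
]

# String variable: missing when nothing is left after trimming trailing blanks,
# mirroring length(rtrim(NHSnoHES)) = 0.
missing_nhs = sum(1 for r in rows if len(r["NHSnoHES"].rstrip()) == 0)

# Numeric variable: missing when no value is recorded, mirroring missing(birthweightHES).
missing_bwt = sum(1 for r in rows if r["birthweightHES"] is None)

print(missing_nhs, missing_bwt)
```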

  16. Cross checking dates…
     * Cross checking baby's dob.
     * 1) Formatting dates.
     * If one date is a string, reformat it to a date.
     * This converts a date held as a string, e.g. 01/01/2005, into date format.
     Compute datevar2=Number(dobReg,ADATE10).
     Formats datevar2 (ADATE10).
     Execute.
     * If both are in date format but need to be displayed as e.g. yyyy/mm/dd.
     * Use (aDate10) for the other way around, i.e. mm/dd/yyyy.
     formats dobHES (sDate10).
     execute.
     * 2) Cross checking dates.
     * equal is 1 if the dates are the same and 0 if they differ.
     * If dobReg was a string, compare dobHES with datevar2 instead.
     compute equal=(dobHES=dobReg).
     Execute.
     * Shows how many are equal.
     freq var=equal /format=notable.
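A small Python sketch of the same date check, assuming one source holds a real date and the other a string in mm/dd/yyyy form (the layout that ADATE10 reads). The dates are invented for illustration. The second row shows how a day/month swap between sources appears as a mismatch.

```python
from datetime import date, datetime

# Hypothetical rows: dobHES is already a date, dobReg is a mm/dd/yyyy string.
rows = [
    {"dobHES": date(2005, 1, 1), "dobReg": "01/01/2005"},
    {"dobHES": date(2005, 3, 2), "dobReg": "02/03/2005"},
]

for r in rows:
    # Convert the string to a date before comparing, as on the slide.
    reg_date = datetime.strptime(r["dobReg"], "%m/%d/%Y").date()
    # equal = 1 if the two dates agree, 0 if they differ.
    r["equal"] = 1 if r["dobHES"] == reg_date else 0

n_equal = sum(r["equal"] for r in rows)
print(n_equal, "of", len(rows), "dates agree")
```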

  17. Birthweight
     * Cross checking birthweight between two datasets.
     * One way: create another variable which is 1 if the values are equal and 0 if not.
     DATASET ACTIVATE wodups.
     Compute birthweight3=(birthweightHES=birthweightReg).
     Execute.
     Freq var=birthweight3.

  18. * OR group birthweight into categories and see how many cases fall into each category.
     * Recoding birthweight data for HES.
     Recode birthweightHES (0=0) (9998=0) (MISSING=0) (1 thru 499=1) (500 thru 999=2)
       (1000 thru 1499=3) (1500 thru 1999=4) (2000 thru 2499=5) (2500 thru 2999=6)
       (3000 thru 3499=7) (3500 thru 3999=8) (4000 thru 4499=9) (4500 thru 4999=10)
       (5000 thru 5499=11) (5500 thru Highest=12) INTO BWTgroupHES.
     Var labels BWTgroupHES 'BWTgroupHES'.
     Exe.
     * Recoding birthweight data for registration.
     Recode birthweightReg (0=0) (9998=0) (MISSING=0) (1 thru 499=1) (500 thru 999=2)
       (1000 thru 1499=3) (1500 thru 1999=4) (2000 thru 2499=5) (2500 thru 2999=6)
       (3000 thru 3499=7) (3500 thru 3999=8) (4000 thru 4499=9) (4500 thru 4999=10)
       (5000 thru 5499=11) (5500 thru Highest=12) INTO BWTgroupReg.
     Var labels BWTgroupReg 'BWTgroupReg'.
     Exe.
     * Cross-tabulate the grouped variables.
     * Add ROW or COLUMN to /cells if row or column percentages are wanted.
     Crosstabs
       /tables=BWTgroupHES BY BWTgroupReg
       /format=avalue tables
       /cells=count
       /count round cell.
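The banding and cross-tabulation can be illustrated with a short Python sketch. It assumes birthweights are whole, non-negative numbers of grams, with `None` standing in for a missing value; the function name `bwt_group` and the sample pairs are hypothetical.

```python
from collections import Counter

def bwt_group(weight):
    """Band a birthweight in grams into the 0-12 groups used in the RECODE."""
    # 0, 9998 and missing values all go to group 0.
    if weight is None or weight in (0, 9998):
        return 0
    # 5500 g and above is the top band.
    if weight >= 5500:
        return 12
    # 1-499 -> 1, 500-999 -> 2, ..., 5000-5499 -> 11 (500 g bands).
    return weight // 500 + 1

# Hypothetical (HES, registration) birthweight pairs for linked records.
pairs = [(3200, 3200), (2490, 2510), (None, 3100), (9998, 4000)]

# Count how many records fall into each (HES group, registration group) cell.
crosstab = Counter((bwt_group(h), bwt_group(r)) for h, r in pairs)

for (hes_group, reg_group), count in sorted(crosstab.items()):
    print(hes_group, reg_group, count)
```

The second pair shows why grouping is less strict than an exact comparison but can still split near-identical weights: 2490 g and 2510 g differ by 20 g yet fall either side of the 2500 g boundary.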

  19. Gestational age
     * Recoding gestational age data.
     Recode gestNN4B (0=0) (missing=0) (1 thru 21=1) (44 thru Highest=2) (Else=Copy)
       into GestGroupNN4B.
     Var Labels GestGroupNN4B 'GestGroupNN4B'.
     Execute.
     Recode gestHES (0=0) (missing=0) (1 thru 21=1) (44 thru Highest=2) (else=Copy)
       into GestGroupHES.
     Var labels GestGroupHES 'GestGroupHES'.
     Execute.
     Crosstabs
       /tables=GestGroupHES BY GestGroupNN4B
       /format=avalue tables
       /cells=count row column total
       /count round cell.
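The gestational age recode keeps plausible values as they are and collapses the rest into three codes. A Python sketch of that rule follows, assuming gestational age in completed weeks, with `None` standing in for a missing value; `gest_group` is a hypothetical name.

```python
def gest_group(weeks):
    """Recode gestational age in weeks: 0 = missing or zero, 1 = 1-21, 2 = 44+, else unchanged."""
    if weeks is None or weeks == 0:
        return 0
    if 1 <= weeks <= 21:
        return 1
    if weeks >= 44:
        return 2
    # Values from 22 to 43 weeks are copied through unchanged (ELSE=COPY).
    return weeks

# Hypothetical values covering each branch of the recode.
sample = [None, 0, 12, 22, 40, 43, 44, 50]
recoded = [gest_group(w) for w in sample]
print(recoded)
```

Note that the codes 1 and 2 share the output variable with copied week values, which is unambiguous only because copied values always lie between 22 and 43.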

  20. Ethnicity
     * Recoding ethnicity.
     Recode ethnicNN4B ('A'=1) ('B'=1) ('C'=1) ('D'=9) ('E'=9) ('F'=9) ('G'=9) ('H'=2)
       ('J'=3) ('K'=4) ('L'=9) ('M'=6) ('N'=5) ('P'=7) ('R'=8) ('S'=9) ('Z'=10)
       (missing=10) into ethnicgroupNN4B.
     Var labels ethnicgroupNN4B 'ethnicgroupNN4B'.
     Execute.
     Recode ethnicHES ('A'=1) ('B'=1) ('C'=1) ('D'=9) ('E'=9) ('F'=9) ('G'=9) ('H'=2)
       ('J'=3) ('K'=4) ('L'=9) ('M'=6) ('N'=5) ('P'=7) ('R'=8) ('S'=9) ('Z'=10)
       (missing=10) into ethnicgroupHES.
     Var labels ethnicgroupHES 'ethnicgroupHES'.
     Execute.
     * Also label the recoded values with the relevant ethnic group.
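The ethnicity recode is a lookup from letter codes to numeric groups, which a dictionary expresses directly. The sketch below copies the mapping from the slide; treating a blank or absent code as group 10 and returning `None` for a letter not listed are assumptions made for this illustration, as are the function name and the sample pairs.

```python
# Letter code -> numeric group, copied from the RECODE on the slide.
ETHNIC_GROUP = {
    "A": 1, "B": 1, "C": 1, "D": 9, "E": 9, "F": 9, "G": 9, "H": 2,
    "J": 3, "K": 4, "L": 9, "M": 6, "N": 5, "P": 7, "R": 8, "S": 9, "Z": 10,
}

def ethnic_group(code):
    """Map a letter code to its group; blank or absent codes go to 10 (assumption)."""
    if code is None or code.strip() == "":
        return 10
    # Letters not in the mapping are left unrecoded (None) in this sketch.
    return ETHNIC_GROUP.get(code.strip())

# Hypothetical (NN4B, HES) code pairs for linked records.
pairs = [("A", "B"), ("H", "J"), ("", "Z"), ("R", "R")]

# Count pairs whose recoded groups agree between the two sources.
agree = sum(1 for nn4b, hes in pairs if ethnic_group(nn4b) == ethnic_group(hes))
print(agree, "of", len(pairs), "pairs agree after grouping")
```

Comparing groups rather than raw letters means that 'A' and 'B' count as agreement, since both fall in group 1.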

  21. Results 91% of maternity HES delivery records could be linked to the birth registration/NN4B records

  22. Linked records for singleton births with missing data items in common data fields, 2005

  23. Comparison of sex for singletons in the linked records, 2005

  24. Concordance in data items between NN4B and maternity HES, 2005 * using birth registration rather than NN4B

  25. Conclusion • A good linkage rate was obtained • To gain maximum benefit, data quality and completeness need to improve in maternity HES • SPSS is useful for data quality checks.
