

Row-Level Metadata

Gregory Steffens

Associate Director, Programming

Novartis



Why Do We Need Row-Level Metadata?

If we know why, we will know how to design it and when to use it

| Presentation Title | Presenter Name | Date | Subject | Business Use Only

  • A requirement for describing tall-thin data sets in studies and in data standards

    • Storing data in --TESTCD / --ORRES kinds of data sets requires more than simple metadata that describes only data sets and variables

    • These data sets have several variables that the simple metadata cannot describe, including ORRES, ORRESU, STRESN, STRESC, STRESU, STRESPOS, etc.

    • In the simpler world these test results and attributes would be stored in short-wide data sets in variables like HEIGHT, HEIGHT_UNIT, WEIGHT, WEIGHT_UNIT, SYSBP, SYSBP_UNIT, SYSBP_POS

    • Storing these test results in ORRES kinds of variables does not mean we need less metadata: fewer variables does not mean less metadata. ORRES contains many virtual variables that we need to describe just as if they were in a simple short-wide data set.

  • A prerequisite for software to transform data for reporting purposes



An Example of a Short-Wide Data Set

A variable for each result and result unit


Some of the Metadata to Describe Short-Wide Data

A simple description of the attributes of these variables
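The slide's table is not preserved in this transcript. As a rough illustration only, the Python sketch below shows a short-wide data set and the simple, one-entry-per-variable metadata that suffices to describe it. The subject values, the attribute names (`type`, `label`) and the helper `undescribed_variables` are assumptions made for the example, not taken from the presentation.

```python
# Hypothetical short-wide vitals records: one row per subject visit,
# with one variable per result and one per result unit.
vitals = [
    {"USUBJID": "001", "VISIT": 1, "HEIGHT": 175.0, "HEIGHT_UNIT": "cm",
     "WEIGHT": 70.5, "WEIGHT_UNIT": "kg"},
    {"USUBJID": "002", "VISIT": 1, "HEIGHT": 162.0, "HEIGHT_UNIT": "cm",
     "WEIGHT": 58.2, "WEIGHT_UNIT": "kg"},
]

# Simple variable-level metadata: exactly one entry per variable.
# The attribute names here are illustrative assumptions.
columns = {
    "USUBJID": {"type": "char", "label": "Subject identifier"},
    "VISIT": {"type": "num", "label": "Visit number"},
    "HEIGHT": {"type": "num", "label": "Height"},
    "HEIGHT_UNIT": {"type": "char", "label": "Height unit"},
    "WEIGHT": {"type": "num", "label": "Weight"},
    "WEIGHT_UNIT": {"type": "char", "label": "Weight unit"},
}


def undescribed_variables(rows, metadata):
    """Return the variables present in the data but missing from the metadata."""
    names = {name for row in rows for name in row}
    return sorted(names - set(metadata))
```

In this layout every variable has a single meaning, so a lookup by variable name alone is enough to find its attributes.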


Same Values in a Tall-Thin Data Set

Results now all in 1 variable and units in 1 other variable


Some Metadata to Describe the Tall-Thin Data Set

Row-level metadata must define all the attributes of a variable, but separately for each subset of rows defined by a unique value of xxTESTCD
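The slide's table is not preserved here. As a hedged Python sketch of the idea, the same variable (for example VSORRES) gets a different set of attributes depending on the value of VSTESTCD in the row. The data values, the attribute names and the function `attributes_for` are assumptions made for illustration.

```python
# Hypothetical tall-thin vital signs records: all results live in VSORRES
# and all units in VSORRESU; VSTESTCD says which test each row holds.
vs = [
    {"USUBJID": "001", "VISIT": 1, "VSTESTCD": "HEIGHT",
     "VSORRES": 175.0, "VSORRESU": "cm"},
    {"USUBJID": "001", "VISIT": 1, "VSTESTCD": "WEIGHT",
     "VSORRES": 70.5, "VSORRESU": "kg"},
]

# Row-level metadata: one entry per (test code, variable) pair, so each
# "virtual variable" is described as fully as a short-wide variable would be.
row_level = {
    ("HEIGHT", "VSORRES"): {"type": "num", "label": "Height"},
    ("HEIGHT", "VSORRESU"): {"type": "char", "label": "Height unit"},
    ("WEIGHT", "VSORRES"): {"type": "num", "label": "Weight"},
    ("WEIGHT", "VSORRESU"): {"type": "char", "label": "Weight unit"},
}


def attributes_for(row, variable, testcd_var="VSTESTCD"):
    """Look up the attributes of `variable` for the test stored in this row."""
    return row_level[(row[testcd_var], variable)]
```

Here `attributes_for(vs[0], "VSORRES")` and `attributes_for(vs[1], "VSORRES")` return different labels for the same physical variable, which is exactly what variable-level metadata cannot express.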


Categories of Variables in Tall-Thin Data Sets

Metadata must fully describe all the attributes of all the categories


What row-level metadata is NOT!

Not meant to define other relationships in study metadata


  • NOT a list of values; a ValueList is not simply a list of values

  • Row-level metadata is not designed to define all the other relationships between study variables

  • It is designed as metadata, i.e. to describe the ItemDef attributes of virtual variables. That is, to describe the attributes of parameter-related variables for each value of --TESTCD

  • It should not be used for non-metadata purposes

    • NOT to define the height unit of measure as being inches in the USA but centimeters in the EU

    • NOT to look for males with positive pregnancy test results

    • NOT to define all the edit checks. Edit checks can be data driven, but NOT by row-level metadata, which is inadequate for this task because it supports only single-domain where conjuncts



Problem Solved

Metadata and a pair of macros enable easy transformation of data


Transforming data between short-wide and tall-thin data sets is now a very simple macro call

%dt_wide2thin(data=vitals,out=vs,mdlib=md)

%dt_thin2wide(data=vs,out=vitals,mdlib=md)
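The internals of the two SAS macros are not shown in the presentation. The Python sketch below is only an illustration of how a metadata-driven transformation of this kind might work; it is not the macros' implementation. The mapping `param_map`, the key list `KEYS`, and the function names `wide_to_thin` and `thin_to_wide` are all hypothetical.

```python
# Hypothetical short-wide input: one variable per result and per unit.
vitals = [
    {"USUBJID": "001", "VISIT": 1, "HEIGHT": 175.0, "HEIGHT_UNIT": "cm",
     "WEIGHT": 70.5, "WEIGHT_UNIT": "kg"},
    {"USUBJID": "002", "VISIT": 1, "HEIGHT": 162.0, "HEIGHT_UNIT": "cm",
     "WEIGHT": 58.2, "WEIGHT_UNIT": "kg"},
]

# Assumed metadata: for each parameter (test code), which short-wide
# variable corresponds to each tall-thin parameter-related variable.
param_map = {
    "HEIGHT": {"VSORRES": "HEIGHT", "VSORRESU": "HEIGHT_UNIT"},
    "WEIGHT": {"VSORRES": "WEIGHT", "VSORRESU": "WEIGHT_UNIT"},
}
KEYS = ("USUBJID", "VISIT")


def wide_to_thin(rows, param_map, keys=KEYS, param_var="VSTESTCD"):
    """Emit one tall-thin row per (short-wide row, parameter)."""
    out = []
    for row in rows:
        for param, mapping in param_map.items():
            thin = {k: row[k] for k in keys}
            thin[param_var] = param
            for thin_var, wide_var in mapping.items():
                thin[thin_var] = row.get(wide_var)
            out.append(thin)
    return out


def thin_to_wide(rows, param_map, keys=KEYS, param_var="VSTESTCD"):
    """Collapse tall-thin rows that share key values into one short-wide row."""
    wide = {}
    for row in rows:
        key = tuple(row[k] for k in keys)
        target = wide.setdefault(key, {k: row[k] for k in keys})
        for thin_var, wide_var in param_map[row[param_var]].items():
            target[wide_var] = row[thin_var]
    return list(wide.values())
```

Because both directions read the same mapping, `thin_to_wide(wide_to_thin(vitals, param_map), param_map)` reproduces the original rows in this example. That is the sense in which the metadata, rather than study-specific code, drives the transformation.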

Neither the tall-thin nor the short-wide data structure is ideal for every use, such as summary tables, listings, deriving new parameter results from multiple parameter results, and comparing parameter results.

Tall-thin is better for storage and summary tables

Short-wide is better for listings, deriving, and comparing

Define file and data transparency achieved



Variable Categories Described


  • Primary Keys

    • Defined in the COLUMNS metadata set

  • Parameter Name

    • A special primary key that defines the kind of result for the current row

    • Defined in the COLUMNS_PARAM metadata set

  • Parameter Value

  • Parameter-related

    • Each non-key variable whose attributes differ across rows but are the same within the subset of rows defined by each value of the parameter variable xxTESTCD. These are “virtual variables”.

    • Defined in the COLUMNS_PARAM metadata set

  • Parameter-nonrelated

    • Each non-key variable whose attributes do not differ across rows and are not dependent on the parameter variable

    • Defined in the COLUMNS metadata set
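As a rough sketch of how the two metadata sets partition a data set's variables, the Python below assigns each variable to one of the categories above. The field names (`table`, `column`, `primary_key`, `param`, `paramrel`), the sample rows and the function `categorize` are assumptions made for illustration. The "Parameter Value" category is left out because the slide gives no definition for it.

```python
# Hypothetical COLUMNS rows: primary keys and parameter-nonrelated variables.
columns = [
    {"table": "VS", "column": "USUBJID", "primary_key": True},
    {"table": "VS", "column": "VISIT", "primary_key": True},
    {"table": "VS", "column": "VSDTC", "primary_key": False},
]

# Hypothetical COLUMNS_PARAM rows: the parameter name variable and, for each
# parameter value, the parameter-related variables it governs.
columns_param = [
    {"table": "VS", "column": "VSTESTCD", "param": "HEIGHT", "paramrel": "VSORRES"},
    {"table": "VS", "column": "VSTESTCD", "param": "HEIGHT", "paramrel": "VSORRESU"},
    {"table": "VS", "column": "VSTESTCD", "param": "WEIGHT", "paramrel": "VSORRES"},
    {"table": "VS", "column": "VSTESTCD", "param": "WEIGHT", "paramrel": "VSORRESU"},
]


def categorize(table, columns, columns_param):
    """Map each variable of `table` to its category name."""
    categories = {}
    for row in columns:
        if row["table"] == table:
            categories[row["column"]] = (
                "primary key" if row["primary_key"] else "parameter-nonrelated"
            )
    for row in columns_param:
        if row["table"] == table:
            categories[row["column"]] = "parameter name"
            categories[row["paramrel"]] = "parameter-related"
    return categories
```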



Columns_param Metadata Set


The list of attributes in columns_param is identical to the list in columns. That is to say, everything you need to describe about a short-wide column must also be described about the tall-thin parameter-related column.

Storing the study data in tall-thin data sets does not reduce the amount of metadata definition that is required

In data set TABLE, when variable COLUMN equals the value PARAM then the attributes of variable PARAMREL are described in the columns_param metadata set row

There are many more variable attributes than those shown in the example; the list was subsetted to fit on a slide
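The slide's table is not preserved in this transcript. As a hedged Python sketch, the lookup described above can be pictured as matching on TABLE, COLUMN, PARAM and PARAMREL, with the remaining fields carrying the same attribute list that columns carries. The attribute names (`type`, `length`, `label`), the sample rows and the helper functions are assumptions made for illustration.

```python
KEY_FIELDS_COLUMNS = {"table", "column"}
KEY_FIELDS_PARAM = {"table", "column", "param", "paramrel"}

# Hypothetical COLUMNS row describing a short-wide variable.
columns = [
    {"table": "VITALS", "column": "HEIGHT",
     "type": "num", "length": 8, "label": "Height"},
]

# Hypothetical COLUMNS_PARAM rows describing the matching virtual variables.
columns_param = [
    {"table": "VS", "column": "VSTESTCD", "param": "HEIGHT", "paramrel": "VSORRES",
     "type": "num", "length": 8, "label": "Height"},
    {"table": "VS", "column": "VSTESTCD", "param": "HEIGHT", "paramrel": "VSORRESU",
     "type": "char", "length": 10, "label": "Height unit"},
]


def attribute_names(rows, key_fields):
    """Return the non-key field names used by a metadata set."""
    return {name for row in rows for name in row} - key_fields


def find_param_row(rows, table, column, param, paramrel):
    """Return the row describing PARAMREL where COLUMN equals PARAM in TABLE."""
    for row in rows:
        if (row["table"], row["column"], row["param"], row["paramrel"]) == (
            table, column, param, paramrel
        ):
            return row
    return None
```

Comparing `attribute_names(columns, KEY_FIELDS_COLUMNS)` with `attribute_names(columns_param, KEY_FIELDS_PARAM)` gives a quick check that the two metadata sets describe the same attributes.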

