Introduction to data management and relational databases
Download
1 / 77

nature.berkeley - PowerPoint PPT Presentation


  • 184 Views
  • Uploaded on

Introduction to Data Management and Relational Databases Outline of presentation Data Management Compare database Versus spreadsheets, word processor docs, … Relational Databases Parts & Terms tables, forms, queries, reports (we’ll skip reports) fields, records, keys Relationships

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'nature.berkeley' - benjamin


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Introduction to data management and relational databases l.jpg

Introduction to Data Management and Relational Databases


Outline of presentation l.jpg
Outline of presentation

  • Data Management

  • Compare database

    • Versus spreadsheets, word processor docs, …

  • Relational Databases

    • Parts & Terms

      • tables, forms, queries, reports (we’ll skip reports)

      • fields, records, keys

    • Relationships

    • Linking Tables


Slide3 l.jpg

1

1

2

2

3

3

Mature Forest

A

B

1

2

1

2

3

3

A

Old Growth

C

1

2

3

B

1

1

1

2

2

2

3

3

3

B

C

A

1

2

3

Clear Cut

C


Data management issues l.jpg
Data Management Issues

  • Organization!

  • Data Entry (error-prone process)

  • Quality Control – Quality Assurance

  • Metadata (possible data values, how collected, etc.)

  • Tracking specimens, samples

  • Data retrieval


Spreadsheet vs relational databases l.jpg
Spreadsheet vs. Relational Databases

  • Relational Database

    • Data entry

    • Data storage

    • Data retrieval

  • Spreadsheet

    • Manipulating Data (eg. Pivot tables)

    • Summarizing & Presenting Data (eg. graphing)

    • (Formatting data for statistics programs)


Slide8 l.jpg

?


Embedded information l.jpg

Embedded Information

Spreadsheets = “flat files”

Databases = “multi-dimensional”



Slide13 l.jpg

It is possible to sort and filter records in the spreadsheet (look under DATA in the menu bar). Filtering temporarily removes all unwanted records from view. This is also possible in a database, with some more sophisticated options available.


Slide14 l.jpg

It is very easy to copy cells or entire rows of data in spreadsheets, but more difficult in databases (one of the few advantages of spreadsheets over databases). However, if one needs to copy-down a lot of data, then the database is not well “normalized” (discussed later).


Slide15 l.jpg

It is easy to search for and replace words in spreadsheets. This is also possible in databases, but with more sophisticated search and replace options.


Slide16 l.jpg

The “auto-fill” option in a spreadsheet completes a word it recognizes from entries immediately above the current one. In databases one can use a “lookup table” (discussed later) for a full list of values (eg. names), which might not yet appear in the data set.


Slide17 l.jpg

Word processor files are the least capable of filtering, finding & replacing, and assisting data entry compared with spread sheets and databases.


Slide18 l.jpg

Although with proper formatting a word processor document can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.


Archiving data l.jpg
Archiving Data can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.

  • As an aside, the best form to archive data in is tab delimited (.txt) or comma separated values (.csv) text files

  • Although programs and formats come and go, all database, spreadsheet, and word processor programs know how to handle .txt and .csv files


Comparing databases spreadsheets and documents l.jpg
Comparing databases, spreadsheets, and documents can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.

1 Auto complete is done in very different ways

2 Not linking in a true relational sense, except through a database

3 Properly set up (normalized) data can be back-filled


Tabulations l.jpg
Tabulations can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.

  • Matrix-style synopsis of data

    • “crosstab query” in MS Access

    • “pivot table” in MS Excel


Slide22 l.jpg

? can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.


Relational databases l.jpg
Relational Databases can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.

  • Four major components

    • Tables – these are where ALL data reside

    • Queries – select subsets of data (retrieve data)

    • Forms – “windows” into data tables (views of data)

    • Reports – summaries of data (formatted synopses)


Slide24 l.jpg

Tables can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.

All data in relational databases reside in tables. Queries, forms, and reports are just convenient ways of looking at the data in the tables. As we shall soon see, the sizes and types of data that can be entered into a table can be regulated for better efficiency and error-proofing. And two or more tables that have a field in common can be linked to draw information from all related tables.


Slide25 l.jpg

Some terminology: can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.

Each square is a “cell” of data


Slide26 l.jpg

Columns = “Fields” can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.


Slide27 l.jpg

Rows = “Records” can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.


Slide28 l.jpg

To change between datasheet view and design view can look like a spreadsheet or database table, one cannot manipulate the rows and columns in the same way.


Slide29 l.jpg

Another way to enter design view is to click on the table name once (so that it is highlighted), then click on the design view icon. Or right-click the table name and choose design view.


Slide30 l.jpg

The DESIGN VIEW of a table is where one dictates the type and range of data that can be entered into each field. This can include formatting (such as capitalization), default values, and valid/non-valid entries.


Slide32 l.jpg

“Lookup” is used to create a list of possible values that a field can take. This example uses a list of values in the field’s properties settings (in DESIGN VIEW). In DATA VIEW the field will have a drop down list of values (“Combo Box”). The full value will be filled in when the first letter is typed.


Slide34 l.jpg

In this example the “lookup” is set to the list of species codes in the table “Species”


Miscellaneous l.jpg
Miscellaneous species codes in the table “Species”

  • New records are always added at the end of the table (many people find this annoying)

  • “Esc” once to undo current typing

  • “Esc” twice to undo the whole record

  • Changes are saved when you move off the cell

    • No need to save the data in a database after any changes (formatting changes must be saved)


Slide37 l.jpg

? species codes in the table “Species”


Linking tables l.jpg
Linking Tables species codes in the table “Species”

  • Fields common b/w two or more tables can link

  • Keyed fields prevent duplicate entries

  • Keyed fields determine relationships b/w tables

  • Linked tables can reduce data entry and storage needs (using an idea called data normalization)


Keys and relationships l.jpg
Keys and relationships species codes in the table “Species”

  • A keyed field is one that does not allow repeated values.

    • For example, if the field “Code Name” is keyed in a table, then the user would not be allowed to enter the same Code Name more than once (an error, “key violation” would appear). In this way, one constructs a list of unique values (eg. Code Names).


Slide46 l.jpg

? species codes in the table “Species”


Slide47 l.jpg

One-to-many relationship species codes in the table “Species”

Because each Species Code is unique in the “keyed” Species table, and can be repeated many times in the Collections table, a “one-to-many” relationship is created between the two (indicated by the “1” and infinity).

Referential integrity means that a Species Code cannot be entered into the Collections table if it is not in the Species table.

Cascade Update allows one to change the species code once and propagate that change through Collections table.

Cascade Delete deletes that species code in all tables that are connected. Use this feature cautiously.

This is the “relationships view” of the database that allows the user to define which tables are linked and how. Keyed fields are in bold.


Slide49 l.jpg

One-to-one relationship? species codes in the table “Species”


Normalization l.jpg
Normalization species codes in the table “Species”

  • Identical data should not be repeated

  • Fields with records that are repeated should be pulled out into a separate table and referenced when needed


Slide56 l.jpg

? species codes in the table “Species”


Slide57 l.jpg

Forms species codes in the table “Species”

“Windows into the tables” (which really contain the data). Forms allow the user to customize/arrange fields on the screen for ease of data entry and retrieval. Forms also allow the user to view several linked tables in one display. Forms display all the records in the underlying tables as opposed to subsets (see queries for this). Data can be entered and updated in forms.


Slide58 l.jpg

Forms can (should) look like field collection sheets species codes in the table “Species”



A form with a subform ex loans masterform here two tables are combined in one display l.jpg
A form with a subform five)(ex: Loans Masterform) Here two tables are combined in one display

subform


Using forms l.jpg
Using Forms: five)

  • Create data entry forms that look like field sheets

  • Viewing multiple related tables in one view

  • The easiest way to create forms in MS Access is to use the “form wizard”

    • Select first table to be in form

    • Select fields to be in form

    • Select next table to be in form

    • Select fields to be in form …

    • Select style of query to create

    • Once the form is created, go to Design View to modify the size and position of fields in the display


Slide62 l.jpg

? five)


Slide63 l.jpg

Queries five)

Queries are used to ask questions of the data. That is, to select subsets of records based on particular search criteria. The real power of relational database queries is that data can be summarized across tables that are linked by common fields.


Queries l.jpg
Queries five)

  • Used to select subsets of data or combinations of data from linked tables

  • Two or more tables may be combined in one query using linked fields

  • Data can be entered and changed (changes will propagate through the underlying tables)

  • Queries can also be used to append one table to another, update records given certain criteria, or make new tables for exporting as spreadsheets or documents


Creating queries l.jpg
Creating Queries five)

  • Queries are easily created in Design View

    • Click “Create query in design view”

    • Select tables to be in the query

    • Select the fields to be in the query

    • Type in any criteria to search for in each field

    • Set any sorting preferences

    • Press run button (!)


Review l.jpg
Review five)

  • Steps for creating a relational database


Slide71 l.jpg

Step 1: Nested Data Structure five)

1

1

2

2

3

3

Mature

A

B

1

2

1

2

Old Growth

3

3

A

C

1

2

3

B

1

1

1

2

2

2

3

3

3

B

C

A

1

2

Clear Cut

3

C


Step 2 sketch out tables l.jpg
Step 2: Sketch-out tables five)

  • Table 1

    • Trap

    • Plot

    • Site

    • Date

    • Species

    • Genus

    • Family

    • Order

    • Quantity

    • Collector

  • Table 2

    • Dissolved Oxygen

    • Dissolved Nitrogen

    • pH

    • Temperature

    • Plot

    • Site

    • Collector


Step 3 data normalization l.jpg
Step 3: Data Normalization five)

  • Look for fields with repeated records

    • Consider pulling these fields out into a separate table that links back to the first table

    • Set up look up tables or look up values

  • Designate Keyed fields

    • These are fields with non-repeated values

    • Keys can be sequential numbers, or unique codes



Step 5 set up forms and queries l.jpg
Step 5: Set up forms and queries five)

  • Set up forms for easier data entry

  • Set up queries to retrieve specific data sets


Slide76 l.jpg

? five)



ad