Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
MySQL Part II IDIA 618 Fall 2014 Bridget M. Blodgett
Indexing • For very small relational databases looking up information directly is quick and easy • But production level databases can quickly become large enough to create delays while searching • An index uses a small amount of memory and HD space in order to speed searching
Index Types • There are three different index types in MySQL: • INDEX • PRIMARY KEY • FULLTEXT • Determining what type to use and where to apply it • ALTER TABLE classics ADD INDEX (author(20)); • CREATE INDEX author ON classics (author(20));
Primary Keys • Primary keys are unique identifiers for each row in a table • They can be added after a table is created • ALTER TABLE classics ADD isbn CHAR(13) PRIMARY KEY; • But this only works for an unpopulated table • Use another column (or create a new one) that will have unique data for each entry and feed it data for the key
FULLTEXT • Saves each words of the data string in an index that can be searched using natural language • Still only works for small tables • Only works for Char, Varchar, Text columns • ALTER TABLE classics ADD FULLTEXT(author,title);
Querying • After creating and deleting databases the most important thing you will need to do is query them • Basic format is: • SELECT author, title FROM classics; • This can then be expanded to refine the information you are receiving back
SELECT Modifications • SELECT COUNT • Returns the number of rows that match that result • SELECT COUNT(*) FROM classics; • SELECT DISTINCT • Will only display rows that match the search criteria just once • SELECT author FROM classics; • SELECT DISTINCT author FROM classics;
WHERE & LIKE • Where allows you to narrow down the results that you get based upon certain qualifiers • SELECT author,title FROM classics WHERE author="Mark Twain"; • This requires specific knowledge of what you need to search for • Like allows for vague searches using a wildcard keyword • SELECT author,title FROM classics WHERE author LIKE "Charles%";
Activity • Display all columns for everyone that is over 40 years old. • Create a select query that finds all people with the last name “Jones” • Create a select query that finds people with first names that start with “Er” • Create a select query that finds people with last names ending in “s”
LIMIT • Limit can be used to make sure that only a certain number of rows are returned in response to a query • Using just one identifier means that it will start at the beginning and return that number. Giving it two will skip the first number of responses • Offsets begin like arrays (at 0) • SELECT author,title FROM classics LIMIT 3; • SELECT author,title FROM classics LIMIT 1,2;
MATCH…AGAINST • Only usable on columns with a FULLTEXT index • Allows you to do searches like in an Internet search engine (using normal language) • There are certain words that will be ignored by this modifier (and, or, not) and return an empty set • WHERE MATCH(author,title) AGAINST('old shop'); SELECT author,title FROM classics • In Boolean mode adding a + requires a word and – bars it from being in the result • SELECT author,title FROM classics WHERE MATCH(author,title) AGAINST('"origin of"' IN BOOLEAN MODE);
UPDATE…SET • Update set allows for the contents within a search set to be updated • Makes sure that only the rows you want changed will be • UPDATE classics SET author='Mark Twain (Samuel Langhorne Clemens)' WHERE author='Mark Twain';
ORDER BY • Order by is a sort feature that you can apply to one or more columns • Allows for both ascending and descending with ascending being the default • SELECT author,title FROM classics ORDER BY author; • SELECT author,title FROM classics ORDER BY title DESC; • SELECT author,title,year FROM classics ORDER BY author,year DESC;
GROUP BY • Also orders the results of a search by groups them according to another category in the table • Useful when you want to see how the data relates to other information in the table • SELECT category,COUNT(author) FROM classics GROUP BY category;
Activity • Select the lastname, firstname, and city for all customers in the customers table. Display the results in Ascending Order based on the lastname. • How many people are in each unique state in the customers table? Select the state and display the number of people in each. Hint: count is used to count rows in a column, sum works on numeric data only. • How many orders did each customer make?
Joining Tables • Sometimes the information you need is spread across several tables • A JOIN allows for that information to be combined in the results table (reducing the amount of information you need to handle) • Simple joins are very easy to perform, simply list both tables after the SELECT • SELECT name,author,title from customers,classics WHERE customers.isbn=classics.isbn;
Types of Joins • Joining can be altered to create more specific results • Natural Join – automatically joins columns that have the same column name • Join…On – allows you to specify the column to join the two tables • AS – allows you to create aliases to shorten table names when used as reference
Activity • Using the previous two tables: • Write a query using a join to determine which items were ordered by each of the customers in the customers table. Select the customerid, firstname, lastname, order_date, item, and price for everything each customer purchased in the items_ordered table.
Logical Operators • Used with WHERE queries to narrow down the results • Useful when data may be saved in a couple different ways: • SELECT author,title FROM classics WHERE author LIKE "%Mark Twain%" OR author LIKE "%Samuel Langhorne Clemens%";
Database Design • Databases thrive on good and efficient design • The correct layout will improve the efficiency, speed, and usefulness of your site • Begin with trying to anticipate what types of queries will be commonly needed on your site • What could they be for the project site? • What types of information is needed to answer these questions? • What seem like some naturally occurring groups?
Primary Keys • One of the most important parts of good database design • Have a quick unique identifier makes storing and retrieving information much easier • Keys should be truly unique and not repeatable for different objects • Auto increment works great for this • Unfortunately its not the most natural of keys
Normalization • You want to avoid duplication of information in the database • Redundancy increases the size of the database and how long it takes for results to be returned • Duplicates also make consistency (one of those key principles!) hard to maintain since making sure all instances of an entry are updated (or deleted)
Normalization Schemas • There are three separate schemas for normalization (yes I know the list has 4) • First • Second • Third • Normal Form • Normalizing for each of these forms will make sure your database stays in that sweet spot
First Normal Form • For a database to satisfy the First Normal Form, it must fulfill three requirements: • There should be no repeating columns containing the same kind of data. • All columns should contain a single value. • There should be a primary key to uniquely identify each row. • Columns which are needed but not fitting this form can (and should) be spun off to another table
Second Normal Form • Only after achieving First Normal Form can Second Normal Form be evaluated • Second Normal Form is achieved by identifying columns whose data repeats in different places and then removing them to their own tables.
Third Normal Form • Third normal form is considered to be the strictest rule to follow and isn’t always needed to have a productive database • Data that is not directly dependent on the primary key but that is dependent on another value in the table should also be moved into separate tables, according to the dependence. • In this example it would require making 3 new tables for ZIP, State, and CITY
When to Use Third Normal Form • When additional information may be needed • Eg. The 2 letter state abbreviation • Book suggests that if you answer “yes” to either of these questions then you should follow third normal form: • Is it likely that many new columns will need to be added to this table? • Could any of this table’s fields require a global update at any point?
When Not to Normalize • Spreading info across so many tables can make MySQL work hard to return your results • On a very popular site, if you have normalized tables, your database access will slow down considerably once you get above a few dozen concurrent users
Databases and Anonymity • A lot of information about people gets stored in databases of dynamic websites • This can be a benefit or a danger • How Target Figured Out A Teen Girl Was Pregnany Before Her Father Did