130 likes | 285 Views
This guide explores the construction of efficient search queries in MySQL, focusing on the Bag-of-Words model commonly used in Information Retrieval (IR). It discusses essential concepts such as stop words exclusion, stemming for content analysis, and the implementation of Boolean logical operators like AND, OR, and NOT within SQL queries. By understanding how to filter and structure your queries effectively, you can enhance search accuracy and manage large datasets efficiently. Additionally, tips for paginating results and linking to documents are included.
E N D
Constructing Search Queries Searching a MySQL Database
IR’s “Bag of Words” Model • Typical IR data model: • Each document is just a bag (multi-set) of words (“terms”) • Detail 1: “Stop Words” • Certain words are considered irrelevant and not placed in the bag • e.g., “the” • e.g., HTML tags like <H1> • Detail 2: “Stemming” and other content analysis • Using English-specific rules, convert words to their basic form • e.g., “surfing”, “surfed” --> “surf” • Database Management Systems, R. Ramakrishnan
Boolean Text Search • Find all documents that match a Boolean containment expression: • “Windows” AND (“Glass” OR “Door”) AND NOT “Microsoft” • Note: Query terms are also filtered via stemming and stop words. • When web search engines say “10,000 documents found”, that’s the Boolean search result size (subject to a common “max # returned’ cutoff). • Database Management Systems, R. Ramakrishnan
SQL Logical Operators • There are three Logical Operators: AND, OR, and NOT. • These operators compare two conditions at a time to determine whether a row can be selected for the output. • When retrieving data using a SELECT statement, you can use logical operators in the WHERE clause, which allows you to combine more than one condition. http://beginner-sql-tutorial.com/sql-logical-operators.htm
"OR" Logical Operator: • If you want to select rows that satisfy at least one of the given conditions, you can use the logical operator, OR. • For example: if you want to find the names of students who are studying either Math or Science, the query would be like, • The following table describes how logical "OR" operator selects a row. http://beginner-sql-tutorial.com/sql-logical-operators.htm
"AND" Logical Operator: • If you want to select rows that must satisfy all the given conditions, you can use the logical operator, AND. • For Example: To find the names of the students between the age 10 to 15 years, the query would be like: • The following table describes how logical "AND" operator selects a row. http://beginner-sql-tutorial.com/sql-logical-operators.htm
"NOT" Logical Operator: • If you want to find rows that do not satisfy a condition, you can use the logical operator, NOT. NOT results in the reverse of a condition. That is, if a condition is satisfied, then the row is not returned. • For example: If you want to find out the names of the students who do not play football, the query would be like: • The following table describes how logical "NOT" operator selects a row. http://beginner-sql-tutorial.com/sql-logical-operators.htm
Nested Logical Operators: • You can use multiple logical operators in an SQL statement. When you combine the logical operators in a SELECT statement, the order in which the statement is processed is • 1) NOT 2) AND 3) OR • For example: If you want to select the names of the students whose age is between 10 and 15 years, or those who do not play football, the SELECT statement would be • The filter works as follows: • Condition 1: All the authors not Smith are selected.Condition 2: All the articles dated between 1/23/10 and 2/16/10 are selected.Condition 3: Finally the result is, the rows which satisfy at least one of the above conditions is returned. • NOTE: The order in which you phrase the condition is important, if the order changes you are likely to get a different result. http://beginner-sql-tutorial.com/sql-logical-operators.htm
Boolean Operators • OR – increases recall (gets more) • AND – Increases precision (gets less)
Building Search into your Application • First decide which fields in your table(s) will be searchable. • Second, Before you code anything in PHP, use PHPMyAdmin or some other utility to test the SQL statement you will use to search your data. • Third, add the SQL statement to your PHP code. • Forth, the PHP script should use ‘pagination’ – if you have 100 results, it should break it up into pages i.e. 5 pages of 20 results each or 10 pages of 10 results each. • Fifth, each result should link back to the actual article, recipe, image etc. • Sixth, any other enhancements you want to add.
Resources • SQL Logical Operators • http://beginner-sql-tutorial.com/sql-logical-operators.htm • PHP search engine - • http://www.roscripts.com/PHP_search_engine-119.html • Autosuggest/autocomplete from database - • http://www.roscripts.com/Ajax_autosuggest_autocomplete_from_database-154.html
Resources • PHP / MySQL select data and split on pages • http://www.phpjabbers.com/phpexample.php?eid=25 • PHP Basic Pagination • http://www.phpfreaks.com/tutorial/basic-pagination • Perfect PHP Pagination • http://articles.sitepoint.com/article/perfect-php-pagination • Tag Schema • http://forge.mysql.com/wiki/TagSchema