Sql server full text search using full text search in sql server 2005
1 / 30

SQL Server Full-Text Search - PowerPoint PPT Presentation

  • Updated On :

SQL Server Full-Text Search Using full-text search in SQL Server 2005. Edward Forgács <[email protected]>. What is full text search?. Allows language-specific queries of text data stored in SQL Server char varchar nchar nvarchar image (containing DOC or PDF data)

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'SQL Server Full-Text Search' - Philip

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Sql server full text search using full text search in sql server 2005 l.jpg

SQL Server Full-Text SearchUsing full-text search in SQL Server 2005

Edward Forgács <[email protected]>

What is full text search l.jpg
What is full text search?

  • Allows language-specific queries of text data stored in SQL Server

    • char

    • varchar

    • nchar

    • nvarchar

    • image (containing DOC or PDF data)

  • Supported by all editions ofSQL Server (except Express)

Results of like l.jpg
Results of LIKE

  • Only one result returned

  • Does not match

    • "email"

    • "e-mails"

    • "emails"

Why use full text search l.jpg
Why use full text search?

  • Much more powerful than LIKE

    • Used correctly, will produce more specific, relevant results

    • Better performance – LIKE queries are designed for small amounts of text data, full-text search scales to huge documents

    • Provides ranking of results

  • Common uses

    • Search through the content in a text-intensive, database driven website, e.g. a knowledge base

    • Search the contents of documents stored in BLOB fields

    • Perform advanced searches

      • e.g. with exact phrases - "to be or not to be" (however needs care!)

      • e.g. Boolean operators - AND, OR, NOT, NEAR

New for fts in 2005 l.jpg
New for FTS in 2005

  • XML support

  • Linked server support for queries

  • Integrated backup, restore and recovery

  • Faster queries and index building

  • Data definition language (DDL) statements for creating and altering indexes

    • System stored procedures deprecated

  • Noise Insensitivity – noise words no longer break the query

  • Accent Insensitivity (optional) – e.g. café and cafe are the same

  • Multiple columns can be included in full-text searches

  • Pre-computed ranking optimizations when using FREETEXTTABLE

  • Improved ranking algorithm

  • Catalogs can be set to populate continuously, track changes, or index when the CPU is idle

Setting up full text search l.jpg
Setting up full-text search

  • Enable FTS for the database – check "Use full-text indexing" on the Files page, orEXEC sp_fulltext_database 'enable'

  • Create catalogs for the required tables and select columns to index

Slide8 l.jpg

Demo: Setting up FTSon a database

Freetext query l.jpg

  • SyntaxSELECT ColumnsFROM TableWHEREFREETEXT((Column1, Column2), 'word1 word2')orFREETEXT(*, 'word1, word2')

  • Returns rows where the specified columns contain ANY form of ANY of the specified words – like AltaVista

  • Usually returns hundreds of irrelevant results, often poorly ranked

  • Quick and dirty – really not suitable for production usage

Freetexttable query l.jpg

  • SyntaxSELECT Columns, KEY_TBL.RANKFROM CategoriesINNER JOIN FREETEXTTABLE(Categories, Column1, Column2, 'word1 word2') AS KEY_TBL ON Categories.CategoryID = KEY_TBL.[KEY]ORDER BY KEY_TBL.RANK DESC

  • Produces the same results as a FREETEXT query, but additionally provides ranking information

Contains query l.jpg

  • SyntaxSELECT Column1, Column2FROM TableWHERE CONTAINS((Column1, Column2), '"exact phrase" AND FORMSOF(INFLECTIONAL, "word")')

  • Special full-text query syntax allows you to utilize the power of full-text search

  • Allows far more sophistocated queries, with boolean operators and exact phrases

  • Customizable word matching

  • Produces far more specific results than FREETEXT

Containstable query l.jpg


  • Returns the same results as a CONTAINS query, but with ranking information

  • Necessary to sort results by relevance

Slide13 l.jpg

Demo: Running a simple

full-text search query

Writing fts terms l.jpg
Writing FTS terms

  • The power of FTS is in the expression which is passed to the CONTAINS or CONTAINSTABLE function

  • Several different types of terms:

    • Simple terms

    • Prefix terms

    • Generation terms

    • Proximity terms

    • Weighted terms

Simple terms l.jpg
Simple terms

  • Either words or phrases

  • Quotes are optional, but recommended

  • Matches columns which contain the exact words or phrases specified

  • Case insensitive

  • Punctuation is ignored

  • e.g.

    • CONTAINS(Column, 'SQL')

    • CONTAINS(Column, ' "SQL" ')

    • CONTAINS(Column, 'Microsoft SQL Server')

    • CONTAINS(Column, ' "Microsoft SQL Server" ')

Prefix terms l.jpg
Prefix terms

  • Matches words beginning with the specified text

  • e.g.

    • CONTAINS(Column, ' "local*" ')

      • matches local, locally, locality

    • CONTAINS(Column, ' "local wine*" ')

      • matches "local winery", "locally wined"

Generation terms l.jpg
Generation terms

  • Two types:

    • Inflectional

      • FORMSOF(INFLECTIONAL, "expression")

    • Thesaurus

      • FORMSOF(THESAURUS, "expression")

  • Both return variants of the specified word, but variants are determined differently

Inflectional l.jpg

  • Matches plurals and words which share the same stem

  • When vague words such as "best" are used, doesn't match the exact word, only "good"

Thesaurus l.jpg

  • Supposed to match synonyms of search terms – but the thesaurus seems to be very limited

  • Does not match plurals

  • Not particularly useful

Proximity terms l.jpg
Proximity terms

  • SyntaxCONTAINS(Column, 'local NEAR winery')CONTAINS(Column, ' "local" NEAR "winery" ')

  • Matches words which are NEAR each other

  • Terms on either side of NEAR must be either simple or proximity terms

Weighted terms l.jpg
Weighted terms

  • Each word can be given a rank

  • Can be combined with simple, prefix, generation and proximity terms

  • e.g.

    • CONTAINS(Column, 'ISABOUT( performance weight(.8), comfortable weight(.4))')

    • CONTAINS(Column, 'ISABOUT( FORMSOF(INFLECTIONAL, "performance") weight (.8), FORMSOF(INFLECTIONAL, "comfortable") weight (.4))')

Fts with blob fields l.jpg
FTS with BLOB fields

  • Full text search can be used with data stored in BLOB fields, such as Word and PDF documents

  • Two columns are necessary in the table

    • A BLOB column, such as IMAGE or VARBINARY, which holds the data to search

    • A data type column, e.g. VARCHAR(3), which holds the file extension to specify the type

  • When the catalog is created, the data type field also needs to be specified

  • Needs to be used with caution to avoid a massive performance hit

Managing population schedules l.jpg
Managing Population Schedules

  • In SQL 2000, full text catalogs could only be populated on specified schedules

  • SQL 2005 can track database changes and keep the catalog up to date, with a minor performance hit

  • Additionally, schedules can now be set to run when the CPU is idle

  • Immediate population only accessible from the Catalog Properties dialog

Backing up full text catalogs l.jpg
Backing up full text catalogs

  • Full text catalogs are now included in SQL backups by default, and are retained when a database is detached and re-attached

  • Option in detach dialog to include keep the full text catalog

Disadvantages l.jpg

  • Full text catalogs consume large amounts of disk space and must be kept up-to-date

  • If catalogs aren’t set to update continuously, results can be out-of-date

  • Continuous updating results in a performance hit

  • Queries are complicated to generate and must be generated as a string on the client

  • Requires care in stored procedures – full text clauses are always evaluated, regardless of OR operators

  • Contextual information not provided with any full-text search functions

Sql server full text search using full text search in sql server 200530 l.jpg

SQL Server Full-Text SearchUsing full-text search in SQL Server 2005

Edward Forgács <[email protected]>