introduction to full text searching in sql server 2012
Download
Skip this Video
Download Presentation
Introduction to Full-Text Searching in SQL Server 2012

Loading in 2 Seconds...

play fullscreen
1 / 47

Introduction to Full-Text Searching in SQL Server 2012 - PowerPoint PPT Presentation


  • 94 Views
  • Uploaded on

Introduction to Full-Text Searching in SQL Server 2012. Adolfo J. Socorro, Ph.D. IT Impact, Inc. [email protected] Outline. What can we do with FTS? How to install FTS FTS components Creating FTS indexes How to query with FTS FILESTREAM and FileTable. FTS Basics.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Introduction to Full-Text Searching in SQL Server 2012' - uri


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
outline
Outline
  • What can we do with FTS?
  • How to install FTS
  • FTS components
  • Creating FTS indexes
  • How to query with FTS
  • FILESTREAM and FileTable
fts basics
FTS Basics

FTS allows searching against character-based data

  • char
  • varchar
  • nchar
  • nvarchar
  • text
  • ntext
  • image
  • xml
  • varbinary
  • varbinary(max)
a first look
A First Look
  • Let’s run some simple examples to get a feel for FTS!
like vs fts
LIKE vs FTS
  • LIKE works on character patterns only
  • Cannot use the LIKE predicate to query formatted binary data
  • FTS is much faster against large amounts of unstructured text data
supported sql server editions
Supported SQL Server Editions
  • Enterprise
  • Business Intelligence
  • Standard
  • Web
  • Express with Advanced Services

Available since at least SQL Server 2000

language support
Language Support
  • 50+ languages
  • Language-specific components
    • Word breakers and stemmers
    • Stoplists
    • Thesaurus files
fts indexes
FTS Indexes
  • One index per table or indexed view
  • Must have a unique, single-column, non-nullable index on the table
  • Grouped within the same database into one or more full-text catalogs (“containers”)
full text catalogs
Full-Text Catalogs
  • A logical construct
  • A way to manage FT indexes together
index population
Index Population
  • Population: the addition of data to full-text indexes
contains
CONTAINS
  • Precise or prefix matches to single words and phrases
  • Proximity matches
  • Logical operations between conditions: AND, OR, AND NOT
  • Optional use of inflectional formsand thesaurus
freetext
FREETEXT
  • Matching the meaning, but not the exact wording, of specified words or phrases
  • Always uses inflectional forms and thesaurus
containstable and freetexttable
CONTAINSTABLE AND FREETEXTTABLE
  • Return a relevance ranking value (RANK) and full-text key (KEY) for each row
  • The actual RANK values are unimportant and typically differ each time the query is run
  • ISABOUT/WEIGHT influence the rankingin CONTAINSTABLE
stoplists
Stoplists
  • A mechanism to discard commonly occurring strings that do not help the search
thesaurus
Thesaurus
  • Nicknames: Robert/Bob
  • Common misspellings: calendar/calender
  • Homophones: Geoff/Jeff
  • Technical terms: proc/procedure

Very powerful if you log searches and learn what users are commonly searching for

thesaurus1
Thesaurus
  • One file per language
filters
Filters
  • Extract textual information from the document (removing the formatting)
  • Send the text to the word-breaker component for the language associated with the column
  • Need to manually install Office 2010 and PDF filters
why store in the database
Why Store in the Database?
  • Integrating unstructured data into the relational database provides significant benefits:
    • Integrated storage and data management capabilities (e.g., backup)
    • Ease of administration and policy management
    • Full-text search
filestream
FILESTREAM
  • A database/file system hybrid
  • FILESTREAM is an attribute that can be assigned to a varbinary(max) column
  • Allows storing BLOB data in the file system
  • Not restricted to the 2 GB limitSQL Server imposes on BLOBs
filestream1
FILESTREAM
  • SQL Server buffer pool is not used
  • Isolation semantics are governed byDatabase Engine transaction isolation levels
filestream2
FILESTREAM
  • All data access must be transactional
  • Must use specific APIs for file I/O
  • Do not edit the files directly!
when to use filestream
When to Use FILESTREAM
  • Objects that are being stored are, on average, larger than 1 MB
    • Store smaller objects in the database
  • Fast read access is important
  • You are using a middle tier for application logic
filetables
FileTables
  • A special, fixed-schema kind of table
  • Builds on top of existing FILESTREAM capabilities
  • Store files and documents in in the database, but access them from Windows applications as if they were stored in the file system (WIN32 API)
filetables1
FileTables
  • Hierarchical namespace
  • Includes file system properties as columns
  • Preserves full file names
  • Non-transactional access through the FS
filetables2
FileTables
  • Calls to create or change a file or directory through the Windows share are intercepted by a SQL Server component and reflected in the corresponding relational data in the FileTable
filetables vs filestream
FileTables vs FILESTREAM
  • File and directory hierarchy maintained in the database
  • Windows application compatibility
  • Relational access to file attributes
  • Both are available in all editions
wrap up
Wrap Up
  • Advanced searching on character-based data, including documents
  • FTS setup, components, and queries
  • FILESTREAM
  • FileTables
other topics
Other Topics
  • Document-property search
  • Semantic search
  • Optimizations
  • Query plans and execution traces
references
References
  • Posts and presentations by Bob Beauchemin
    • http://www.sqlskills.com/blogs/bobb/
  • Blog: SQL Server FTS Team Blog
    • http://blogs.msdn.com/b/sqlfts
  • SQL Server 2012 Books Online
    • http://msdn.microsoft.com/en-us/library/cc645577(SQL.110).aspx
filter packs
Filter Packs
  • Adobe PDF Filter
    • http://www.adobe.com/support/downloads/thankyou.jsp?ftpID=4025&fileID=3941
  • Office 2010 Filters
    • http://www.microsoft.com/en-us/download/details.aspx?id=17062
ad