1 / 19

Integrated Full Text Search (iFTS) with SQL S e rver 2008

Integrated Full Text Search (iFTS) with SQL S e rver 2008 . Nauzad Kapadia Quartz Systems nauzadk@quartzsystems.com. Session Objectives And Takeaways. Session Objective(s): Discover what we have learned from using the new Integrated Full Text Search

schuyler
Download Presentation

Integrated Full Text Search (iFTS) with SQL S e rver 2008

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Integrated Full Text Search (iFTS) with SQL Server 2008 Nauzad Kapadia Quartz Systems nauzadk@quartzsystems.com

  2. Session Objectives And Takeaways • Session Objective(s): • Discover what we have learned from using the new Integrated Full Text Search • Find out what works and what doesn’t work • iFTS is faster than SQL 2005 full text • A great base for future improvements

  3. Background • Based on MSSearch • Lack of Integration limited the performance of queries • Limits the ability to integrate with high availability and scalability functionality

  4. New Architecture

  5. Demo – Using Full Text Search

  6. StopLists • NewSTOPLISTsupport • Simplified noise words utilization and manageability. • DB object associated with the FT index. • CREATE FULLTEXT STOPLIST stoplist_name • [ FROM {[database_name.] source_stoplist_name} | SYSTEM STOPLIST] • [AUTHORIZATION owner_name] • ALTER FULLTEXT STOPLIST stoplist_name • { • | ADD <keyword> LANGUAGE language_term • | DROP • { • | <keyword> LANGUAGE language_term • | ALL LANGUAGE language_term • | ALL • }

  7. Demo – Stop Lists and Creating FullText Indexes

  8. Thesaurus • Thesaurus improvements • Stored in internal tables (in tempdb) in XML form instead of being parsed from external files • Instance level thesaurus sys.sp_fulltext_load_thesaurus_file (lcid) Loads all the data specified in the Thesaurus XML corresponding to the language with specified lcid.

  9. Demo - Thesaurus

  10. SQL Server 2008: IFTS • New family ofWord-Breakers (WB): WBs are components responsible of parse the textual data in a given language and pass the tokenized result to the Full-Text Index. • 51 languages/WBs out of the box • Improved quality in many already existing word-breakers

  11. SQL Server 2008: IFTS • WBs available in SQL Server 2008: Neutral Punjabi Romanian Serbian Cyrillic Serbian Latin Slovak Slovenian Tamil Telugu Ukrainian Urdu Lithuanian Malay Icelandic Indonesian Hindi Gujarati Vietnamese Arabic Norwegian Portuguese Brazilian Russian Dutch Malayalam Marathi Hebrew Canada Latvian Swedish Danish Polish Turkish • English • English UK • Simplified Chinese • Traditional Chinese • Chinese (Hong Kong) • Chinese (Macau) • Chinese (Singapore) • Thai • Korean • French • German • Japanese • Italian • Spanish • Bengali • Bulgarian • Catalan • Croatian Languages present but disabled by default New languages supported in SQL Server 2008 Existing in SQL Server 2005, and being replaced by new WBs in SQL Server 2008 Unchanged language/WB from SQL Server 2005

  12. The indexing performance has improved in most scenarios Indexing Performance Measured on 4 processor AMD64 2793 MHz, 8G RAM. Numbers are in HH:MM format. Total time is combining time to crawl and time of merge into index For some HW configuration and data types, specific best practices are recommended to improve indexing performance (i.e: capping SQL Server’s memory)

  13. To see the word frequency Sys.dm_fts_index_keywords() Sys.dm_fts_index_keywords_by_document() Get number and size of fragments Sys.fulltext_index_fragments Understanding Query Behavior Sys.dm_fts_parser(““This is test” AND “This is also a test”, 1033,0) Helpful Commands

  14. Demo – Understanding Indexes

  15. Upgrading • - Due to new architecture, we have now new Full-Text Indexes. Former ones are not compatible in SQL Server 2008. • Solution: Full-Text Catalog Upgrade Option • - Import: (default) Faster method although performance and semantic implications are possible. • - Rebuild: Slower method although ideal final state of new FTCatalogs guaranteed. • - Reset: Faster Upgrade method although your Search app will not have the FTCatalogs available afterwards. You need to rebuild them when possible. • - Possible Upgrade methods: • In place Upgrade: User will be prompted for what Upgrade Option to choose for existing FTCatalogs. • Restore/Attach: Instance level setting will be applied to former Full-Text Catalogs brought up with the former DB.

  16. Put full text index on separate file group to avoid fragmenting main data files Use varchar(max) instead of text/image If you see excessive blocking by FTGATHER turn off auto change tracking Schedule manual job to do updates Watch number of fragments to determine how often to schedule Don’t run full text master merges with other index rebuilds or reorgs at the same time If you have large documents (>2MB) may need to reduce SQL memory a bit so FDHost daemon has memory to run. Best Practices

  17. Related Content • Webcasts MSDN Webcast: Using Full-Text Search in SQL Server Express • http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032294979&Culture=en-US • White Paper on SQL Server FTS 2005 http://msdn2.microsoft.com/en-us/library/ms345119.aspx • MSDN http://msdn2.microsoft.com/en-us/library/ms142571.aspx • Technical Case Study http://www.microsoft.com/technet/itshowcase/content/intdocmgmtsql2005.mspx • FTS 2008 (iFTS) White Paper http://msdn.microsoft.com/en-us/library/cc721269(SQL.100).aspx • Program Manager on SQL Server FTS Fernando.azpeitia@microsoft.com

  18. Demo

  19. © 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

More Related