slide1 n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Taking Your Database beyond Relations with Microsoft SQL Server 2008 PowerPoint Presentation
Download Presentation
Taking Your Database beyond Relations with Microsoft SQL Server 2008

Loading in 2 Seconds...

play fullscreen
1 / 54

Taking Your Database beyond Relations with Microsoft SQL Server 2008 - PowerPoint PPT Presentation


  • 56 Views
  • Uploaded on

Taking Your Database beyond Relations with Microsoft SQL Server 2008. Michael Rys Principal Lead Program Manager Microsoft Corp. Session Code: DAT321. Dealing with relational and non-relational data platforms Growth in application complexity and duplicated functionality

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Taking Your Database beyond Relations with Microsoft SQL Server 2008' - kalea


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
taking your database beyond relations with microsoft sql server 2008

Taking Your Database beyond Relations with Microsoft SQL Server 2008

Michael Rys

Principal Lead Program Manager

Microsoft Corp.

Session Code: DAT321

beyond relational data
Dealing with relational and non-relational data platforms

Growth in application complexity and duplicated functionality

Compensating for unavailable services

Goals

Beyond Relational Data

Pain Points

Reduce the cost of managing all types of data

Simplify the development of applications which use relational and non-relational data

Extend services currently available for relational data to non-relational data

Provide non-relational services to relational data

customer quote
Customer Quote

“The non-relational features available in SQL Server 2008 allowed us to manage our large datasets composed of millions of images and associated metadata as single entities and eliminated our need to maintain CLR functions for geospatial calculations.

Switching to SQL Server 2008 reduced our operational costs, increased our team productivity, and allowed us to focus on our core business of collecting and processing geospatial data.”

– Telmo Sampaio, CTO – Blue Dasher Technologies

session objectives and takeaways
Session Objectives and Takeaways
  • SQL Server 2008 has added improved support for a variety of non-relational data management scenarios
  • You will understand what SQL Server 2008 will offer in addition to SQL Server 2005’s functionality to
    • Store and manage unstructured data
    • Store and manage semi-structured data
  • You will better understand when to use which of the presented technologies
beyond relational feature overview
Beyond Relational Feature Overview

SQL Server 2008

SQL Server 2005

  • Remote BLOB Store API
  • Filestream
  • Integrated FTS

Full Text Indexing

Documents &

Multimedia

  • Fully supported Geometry and Geography data types and Functions

XML Data Type and Functions

  • XML Upgrades

Relational

BR Support

Large UDTs

Sparse Columns

Wide Tables/Column Set

Filtered Indices

HierarchyID

Spatial

XML

User Defined Types

street level location based imagery
partner demo

Street-level, location-based imagery

Telmo Sampaio Chief Technology OfficerBlue Dasher Technologies

blue dasher application overview
Blue Dasher Application Overview
  • Capture and provide street-level imagery for all public roads in the 50 top metropolitan areas in the U.S.
  • Manage data with SQL Server 2008:
    • Metadata: 70GB of data
    • Image data: 50TB of data
  • Metadata: Uses spatial GEOGRAPHY Type to pinpoint the location of the images, Spatial Index defined
  • Image data: 3 sizes
    • small (7 Kb) in file system
    • medium (60 Kb) and large (1.3 Mb) stored using filestream
    • Medium used for navigation
blue dasher sql server benefits
Blue Dasher-SQL Server benefits
  • Geography data type and functions
    • No need to manage .NET assemblies - use of STIntersects(), STDistance(), STBuffer()
    • Ability to save different shapes in single GEOGRAPHY column
  • Manageability
    • Single backup/restore operations for metadata and images
    • Replication support
  • Performance
    • Streaming files to client application faster using Win32 API
documents and multimedia
Documents and Multimedia

Documents &

Multimedia

Dedicated BLOB Store

Store BLOBs in Database

Use File Servers

Application

Application

Application

BLOBs

BLOBs

BLOBs

DB

DB

DB

Advantages

  • Low cost per GB
  • Streaming Performance
  • Lower cost/GB at scale
  • Scalability & Expandability
  • Integrated management
  • Data-level consistency
  • Complex application development and deployment
  • Integration with structured data
  • Complex application development and deployment
  • Separate management
  • Enterprise-scales only
  • Poor data streaming support
  • File size limitations
  • Highest cost per GB

Challenges

Example

  • Windows File Servers
  • NetApp NetFiler
  • EMC Centera
  • Fujitsu Nearline
  • SQL Server VARBINARY(MAX)
sql server 2008 blobs
SQL Server 2008 BLOBs

Documents &

Multimedia

Dedicated BLOB Store

Store BLOBs in DB + File System

Store BLOBs in Database

Use File Servers

Application

Application

Application

Application

BLOB

BLOB

BLOB

BLOB

DB

DB

DB

DB

FILESTREAM Storage

Remote BLOB Storage

SQL BLOB

filestream
Filestream

Documents &

Multimedia

Store BLOBs in DB + File System

  • Storage Attribute on VARBINARY(MAX)
    • Works with integrated FTS
  • Unstructured data stored directly in the file system (requires NTFS)
  • Dual Programming Model
    • TSQL (Same as SQL BLOB)
    • Win32 Streaming APIs with T-SQL transactional semantics
  • Data Consistency
  • Integrated Manageability
    • Back Up/Restore
    • Administration
  • Size limit is the file system volume size
  • SQL Server Security Stack

Application

BLOB

DB

filestream api
FILESTREAM API

// New TSQL Function:

// Get_filestream_transaction_context()

//

SELECT Get_filestream_transaction_context()

// New TSQL Function :

// PathName()

//

SELECT ClaimImage.PathName() FROM

Insurancedb..Claims

managed sqlfilestream read
Managed SqlFileStream: READ

// New SqlFileStream Class in VS08 SP1

//

SqlFileStreamsfs= newSqlFileStream(path, txnId,

System.IO.FileAccess.Read);

// output file to read into

//

System.IO.FileStreamfs = new System.IO.FileStream (

"c:\\output2.jpg", System.IO.FileMode.Create);

{

   byte[] buffer = new byte[512 * 1024];

intcbBytesRead = buffer.Length;

   while (cbBytesRead == buffer.Length)

   {

cbBytesRead = sfs.Read(buffer, 0, buffer.Length);

fs.Write(buffer, 0, cbBytesRead);

     }       

managed sqlfilestream write
Managed SqlFileStream: WRITE

sfs = new SqlFileStream(path, txnId, System.IO.FileAccess.Write, 0);

using (System.IO.Streamres =

Pictures.GetResourceStream(HealthCare.MRI.JoeSmith)) //input

{

    byte[] buffer = new byte[512 * 1024]; // 512KB

intcbBytesRead = buffer.Length;

    while (cbBytesRead == buffer.Length)

    {

cbBytesRead = res.Read(buffer, 0, buffer.Length);

sfs.Write(buffer, 0, cbBytesRead);

    }

}

// commit SQL transaction and close SQL connection.

//

updates using filestream file i o access
Updates Using FILESTREAM File I/O Access
  • “In place partial updates” not supported
  • Update through Win32 handle (Write access)
    • Done by creating a new file (0-bytes)
    • New content replaces old (committed) content on Transaction Commit (example: high resolution pictures)
  • For the new file to contain the committed BLOB content first use FSCTL_SQL_FILESTREAM_FETCH_OLD_CONTENT
    • Useful for LOB append
some tips
Some Tips…
  • Reading bigger buffers gives better performance
  • Volumes hosting FILESTREAM data should have 8.3 name generation and LastAccessTime disabled
  • FILESTREAM containers to reside on dedicated volumes
  • Have one volume per FILESTREAM container
    • enables space management at volume level
    • “Magic” SMB buffer size = ~60KB
      • Another “good” value is 480KB
    • ROWGUID unique index for aligned partitioning
    • AntiVirus programs should be configured not to delete infected files but to quarantine them
    • If using compressed volumes, use cluster size 4 KB
migration sql blobs to filestream
Migration SQL BLOBs to FILESTREAM
  • Conceptual walkthrough
    • Create one (or more) FILESTREAM (filegroup)
    • Alter table to add FILESTREAM column
    • For each existing row, update data in new row with empty FILESTREAM value
    • Write LOB data (File I/O or T-SQL access)
    • Optional: drop Varbinary(MAX) column
  • BLOB migration utilities support (intended for SQL Server 2008 R2)
remote blob store architecture
Customer applications can transparently support different BLOB stores

Each Remote Blob Store vendor responsible for delivering their own providers

Remote Blob Store Architecture

Documents &

Multimedia

SQL DB

Customer Application

  • RBS Services
  • Create
  • Fetch
  • GC
  • Delete

SQL RBS API

NetApp lib

IBM lib

Centera lib

Provider API

NetApp

IBM

Centera

rbs workflow
RBS Workflow

Machine Boundary

Write BLOB(Photo)

1

Application

Return Blob ID

2

2

Write Blob ID to PhotoRef field

3

RBS Client Library

BLOB Store Provider Library

1

3

BLOB Store

SQL Server

rbs fundamentals
RBS Fundamentals
  • Most useful in environments where interoperability is required
  • No restrictions on back-end store
  • Back-end can change with no app change
  • Looser (link level) consistency guarantees
  • SQL Server handles link consistency and garbage collection
full text search challenges
Full Text Search Challenges

Documents &

Multimedia

  • Indexes stored outside SQL Server lead to manageability challenges
  • Mixed query performance suffers from having to pull over complete full-text result set
  • Scaling issues on big boxes
  • See also DAT303 – Answering the Queries your users really want to ask, with Full-Text Search in SQL Server 2008
full text search improvements
Full Text Search Improvements

Documents &

Multimedia

  • Full-Text Engine and Indexes fully integrated
    • Catalog, index and stopword lists now inside the database
  • Better performance in many common scenarios
    • Make mixed queries perform and scale
    • Optimizer has knowledge about FT index
    • X= selectivity (%) of rows that match the relational conditionIFTS FTQuery time : (SQL 2005 FTQuery time) x (X/100)
  • Exposing the FT Index content and any given WB behaviours
  • New word breakers (WB)
    • Better supportability tools
    • From 23 to over 40 word breakers/locales

SELECT * FROM candidates WHERE CONTAINS(resume,’”SQL Server”’) AND ZipCode = ‘98052’

ifts query architecture
IFTS Query Architecture

Documents &

Multimedia

SQL SERVER process

FDHOST process

Algebrizer

SQL Algeb.

FTS Algeb.

FTS Algeb.

T-SQLParser

Shared Memory

Parse

WB client

WB

Bind

Stemmer

QUERY

Language Module

STOPLIST

SQL/FTS integrated query tree

iFilters

Results

Ranking Func. Integration

THESAURUS

CREATE FULLTEXT INDEX ft_idx

ON Doc_table(document TYPE COLUMN doc_type)KEY INDEX doc_id_idx;

Execution Plan

QE

QO

Cardinality

FTLogicalOperator

FTLogicalOperator

FTLogicalOperator

FTExecutionOperator

Full-Text Index

FTExecutionOperator

FTExecutionOperator

indexing performance
Indexing Performance

Documents &

Multimedia

2 min

1 min

Populating an index of 20 million rows of 1K data on identical hardware (time in minutes)

spatial data overview
Spatial Data Overview

Spatial

  • Storage and retrieval of spatial data using standard SQL syntax
    • New Spatial Data Types (geometry, geography)
    • New Spatial Methods (intersects, buffer, etc.)
    • New Spatial Indexes
  • Offers full set of Open Geospatial Consortium components (OGC/SQL MM, ISO 19125)
  • Spatial Builder Interface
  • SSMS Visualization
  • Integration with Virtual Earth
  • See also:
  • DAT324 WiE: Building Location-Aware Services with Microsoft SQL Server
  • DAT03-HOL Integrating Microsoft SQL Server 2008 Spatial Support with Microsoft Virtual Earth
xml improvements
XML Improvements

XML

  • Improved XML Schema Validation
    • Support for storing and validating Office 12 Document formats
    • Support for lax validation
    • Full xs:dateTime support
      • Support for values without timezone
      • timezone preservation
    • Improved support for lists and union types
  • Added support for let-clause in XQuery
  • Added fn:upper-case()/fn:lower-case()
  • Added support forinsert sql:variable(“@xml”) into /a/b
semi structured data scenarios
Semi-Structured Data Scenarios
  • Property bag scenarios
    • Distinct customized property sets associated with data
    • Large number of unique properties, user annotations
    • Examples
      • Document management systems (SharePoint)
      • Media stores
  • Databases with heterogeneous record types in a table
    • Type specific properties, inherited properties in a type hierarchy
    • Examples
      • Product catalogs (Commerce Server), Location/business specific properties(VE), etc…

Relational

BR Support

semi structured data characteristics
Semi-Structured Data Characteristics
  • Large number of sparsely populated properties
  • Distinct property sets
  • Heterogeneous structures
  • Sets, nested structures

Relational

BR Support

semi structured data requirements
Semi-Structured Data Requirements
  • Efficient storage for sparse properties
  • Efficient relational access to sparse properties: Query, DML, Indexing
  • Ability to get/set sparse property sets generically
  • Index subsets of relevant properties for a property set
  • Retrieve and analyze hierarchical data

Relational

BR Support

semi structured data sql server 2008 features
Semi-Structured Data SQL Server 2008 features
  • Sparse Columns: Optimized storage for sparse columns
  • Column Sets/Wide Tables: Support thousands of sparse columns
  • Filtered Indexes: Ability to index a subset of rows in a table
  • Hierarchy ID: System CLR type for hierarchical organization of data
  • XML: For fast dynamic evolution (open schema), lists and tree objects

Relational

BR Support

hierarchical data scenarios
Hierarchical Data Scenarios
  • Forum and mailing list threads
  • Business organization charts
  • Content management categories
  • Product categories
  • Files/folders management
  • SQL Server 2005
    • Parent/Child column
    • XML datatype
  • New in SQL Server 2008
    • HierarchyID type

Relational

BR Support

hierarchyid key properties
HierarchyIDKey properties
  • A system data type with variable length
    • CLR UDT
    • Microsoft.SqlServer.Types
    • Varbinary encoding ( < 900 bytes)
  • To represent position in a hierarchy
  • Logically encodes the path from the root of the tree to a node
  • Rich built-in methods for manipulating hierarchies
  • Simplifies storage and querying of hierarchical data
  • Comparison a<b is in depth-first order
  • Support for arbitrary insertions and deletions

See also:

DAT04-INT Using the HIERARCHYID Datatype in Microsoft SQL Server 2008 to Maintain and Query Hierarchies

Relational

BR Support

sparse columns
Sparse Columns
  • “Sparse” as a storage attribute on a column
    • Storage Optimization: 0 bytes stored for a NULL value
    • Co-location of data: Performance benefits
    • NULL Compression in the TDS layer
    • No change in Query/DML behavior

Relational

BR Support

sparse columns1
Sparse Columns
  • Wide Tables/Column Sets
    • Large number of sparse columns allowed in a table (30,000 Columns, 1000 indexes, 30,000 statistics)
    • Requires defining a “Sparse Column Set”
      • An un-typed XML column, with a published format
      • Logical grouping for all sparse columns in a table
      • Select * returns all non-sparse-columns, sparse column set (XML)
      • Allows generic retrieval/update of all sparse columns as a set

Relational

BR Support

sparse column storage
0 Bytes stored for NULL Values

~20% CPU overhead for non-null value access

Additional 2- 4 bytes for non-null values

Sparse columns are beneficial when space savings >40%

Sparse Column Storage

Sparse storage

Non-sparse storage

Relational

BR Support

filtered indexes
Filtered Indexes
  • Filtered Indexes and Statistics
    • Indexing a portion of the data in a table
    • Filtered/co-related statistics creation and usage
    • Query/DML Optimization to use filtered indexes and statistics
  • Restrictions
    • Simple limited grammar for the predicate
    • Only on non-clustered indexes
  • Benefits
    • Lower storage and maintenance costs for large number of indexes
    • Query/DML performance benefits: IO only for qualifying rows

Relational

BR Support

summary
Summary
  • SQL Server 2008 will make it easier to create information-centric applications that require
    • Unstructured documents
    • XML
    • Semi-structured information
    • Combine the above with relational data
    • by:
      • Reducing the cost of managing all types of data
      • Simplifying the development of applications which use relational and non-relational data
      • Extending services currently available for relational data to non-relational data
future of beyond relational
Future of Beyond Relational
  • Rich unstructured data
    • Enable existing BR apps to store data in SQL Server
      • Win 32 File I/O API compatibility
    • Better integration of FileStream and RBS programming models
    • Better scalability of FileStream
    • Property Search and promotion
    • iFTS improvements in functionality and scale/performance
  • Deep Spatial
    • More functionality
    • Across BI components
resources

WebLog

http://sqlblog.com/blogs/michael_rys/

Whitepapers & Videos

FileStream/RBS Whitepapers:

http://msdn.microsoft.com/en-us/library/cc949109.aspx

http://www.microsoft.com/sqlserver/2008/en/us/wp-sql-2008-manage-unstructured.aspx

What’s new for XML in SQL Server 2008: http://www.microsoft.com/sqlserver/2008/en/us/wp-sql-2008-whats-new-xml.aspx

iFTS: http://msdn.microsoft.com/en-us/library/cc721269.aspx

Semistructured data: To be published soon (see blog for announcement)

PDC Presentation on Beyond Relational: http://channel9.msdn.com/pdc2008/BB16/

And of course Books Online!

SQL Server 2008 Business Value Calculator:

http://www.moresqlserver.com

Resources
sql server community resources

Become a FREE PASS Member: www.sqlpass.org/RegisterforSQLPASS.aspx

Learn more about the PASS organization www.sqlpass.org/

Additional Community Resources

SQL Server Community Center www.microsoft.com/sqlserver/2008/en/us/community-center.aspx

TechNet Community for IT Professionals

http://technet.microsoft.com/en-us/sqlserver/bb671048.aspx

Developer Center

http://msdn.microsoft.com/en-us/sqlserver/bb671064.aspx

SQL Server 2008 Learning Portalhttp://www.microsoft.com/learning/sql/2008/default.mspx

SQL Server Community Resources
  • Connect: Local Chapters, Special Interest Groups, Online Community
  • Share: PASSPort Social Networking, Community Connection Event
  • Learn: PASS Summit Annual Conference, Technical Articles, Webcasts
    • More about the PASS organization www.sqlpass.org/

The Professional Association for SQL Server (PASS) is an independent, not-for-profit association, dedicated to supporting, educating, and promoting the Microsoft SQL Server community.

sql server word of the day

POLICY-BASED

MANAGEMENT

SQL Server Word of the Day

Monday, May 11

*Game cards may be picked up at the SQL Server booths in the TLC

related content
Related Content

DAT324 WiE: Building Location-Aware Services with Microsoft SQL Server, Ed Katibah, Olivier Meyer

Fri 5/15 | 1:00 PM-2:15 PM | Room 402

DAT320 Optimizing Microsoft SQL Server 2008 Applications Using Table Valued Parameters, XML, and MERGE, Tobias Ternstrom

Fri 5/15 | 9:00 AM-10:15 AM | Room 153

DAT303 Answering the Queries Your Users Really Want to Ask, with iFTS in Microsoft SQL Server 2008, Greg Low

Tue 5/12 | 4:30 PM-5:45 PM | Room 404

DAT403 What's New in Microsoft SQL Server 2008, Stephen ForteMon 5/11 | 1:00 PM-2:15 PM | Room 151

related content1
Related Content

DAT04-INT Using the HIERARCHYID Datatype in Microsoft SQL Server 2008 to Maintain and Query Hierarchies, Itzik Ben-Gan

Mon 5/11 | 1:00 PM-2:15 PM | Blue Thr 1

PAN56 SQL Server Server Programmability Q & A

Tue 5/12 | 10:00AM | 501C

DAT03-HOL Integrating Microsoft SQL Server 2008 Spatial Support with Microsoft Virtual Earth

resources1
Resources
  • www.microsoft.com/teched

Sessions On-Demand & Community

  • www.microsoft.com/learning
  • Microsoft Certification & Training Resources
  • http://microsoft.com/technet
    • Resources for IT Professionals
  • http://microsoft.com/msdn

Resources for Developers

slide54

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.