Big data working with terabytes in sql server
Download
1 / 45

Big Data Working with Terabytes in SQL Server - PowerPoint PPT Presentation


  • 400 Views
  • Updated On :

Big Data Working with Terabytes in SQL Server. Andrew Novick www.NovickSoftware.com. Agenda. Challenges Architecture Solutions. Introduction. Andrew Novick – Novick Software, Inc. Business Application Consulting SQL Server .Net www.NovickSoftware.com Books:

Related searches for Big Data Working with Terabytes in SQL Server

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Big Data Working with Terabytes in SQL Server' - Mia_John


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Big data working with terabytes in sql server l.jpg

Big DataWorking with Terabytes in SQL Server

Andrew Novick

www.NovickSoftware.com


Agenda l.jpg
Agenda

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

  • Challenges

  • Architecture

  • Solutions


Slide3 l.jpg

Introduction

  • Andrew Novick – Novick Software, Inc.

  • Business Application Consulting

    • SQL Server

    • .Net

  • www.NovickSoftware.com

  • Books:

    • Transact-SQL User-Defined Functions

    • SQL 2000 XML Distilled

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


What s big l.jpg
What’s Big?

  • 100’s of gigabytes and up

    to 10’s of terabytes

  • 100,000,000 rows an up

    to 100’s of Billions of rows

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Big scenarios l.jpg
Big Scenarios

  • Data Warehouse

  • Very Large OLTP databases (usually with reporting functions)

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Big hardware l.jpg
Big Hardware

  • Multi-core 8-64

  • RAM 16 GB to 256 GB

  • SAN’s or direct attach RAID

  • 64 Bit SQL Server

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server



Challenges8 l.jpg
Challenges

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

  • Load Performance (ETL)

  • Query Performance

  • Data Management Performance


Slide9 l.jpg

How fast can you load

rows into the database?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide10 l.jpg

Load 1,000,000 rows into a 400,000,000 million row table that has 12 indexes?

12 Hours

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide11 l.jpg

Load 1,000,000 rows into an empty table and add 12 indexes? that has 12 indexes?

5 Minutes

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide12 l.jpg

How do you speed up queries on that has 12 indexes?1,000,000,000

row tables?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Challenge backup l.jpg
Challenge - Backup that has 12 indexes?

  • Let’s say you have a 10 TB database.

  • Now back that up.

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Backup calculation l.jpg
Backup Calculation that has 12 indexes?

  • 10 TB = 10000 GB

  • Typical Backup speed - 1 to 20 GB / Min

  • At 10 GB/Minute

Who as 16 hours to spare?

Who’s got 1000 minutes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Architecture l.jpg

Architecture that has 12 indexes?

What do we have to work with?


Sql server storage architecture l.jpg

Logical Disk System – Windows Drives that has 12 indexes?

Drive C:

Drive D:

Drive E:

Physical IO - subsystem

Disk

Disk

Disk

Disk

Disk

Disk

SQL Server Storage Architecture

SQL Server Storage

Table1

Table2

FileGroupA

FileGroupB

FileA1

FileB1

FileB2

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Solutions l.jpg

Solutions that has 12 indexes?


Solutions18 l.jpg
Solutions that has 12 indexes?

  • Use Multiple FileGroups/Files

  • INSERT into empty unindexed tables

  • Partitioned Tables and/or Views

  • Use READ_ONLY FileGroups

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


At 3 pm on the 1 st of the month where do you want your data to be l.jpg
At 3 PM on the 1 that has 12 indexes?st of the month:Where do you want your data to be?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide20 l.jpg

Spread to as many disks as possible that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


I o performance l.jpg
I/O Performance that has 12 indexes?

  • Little has changed in 50 years

  • Size for Performance Not for Space

    • My app needs 1500 reads/sec and 800 writes/sec

I/O throughput is a function of the number of disk drives.

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Solution load performance l.jpg
Solution: Load Performance that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

  • Insert into empty tables

  • Index and add foreign keys after the insert

  • Add the Slices to

    • Partitioned Views

    • Partitioned Tables


Partitioning l.jpg
Partitioning that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioned views l.jpg
Partitioned Views that has 12 indexes?

CREATE VIEW Fact AS

SELECT * FROM Fact_20080405

UNION ALL SELECT * FROM Fact_20080406

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

Created like any view


Partitioned views check constraints l.jpg
Partitioned Views: that has 12 indexes?Check Constraints

  • Check constraints tell SQL Server which data is in which table

ALTER TABLE Fact_20080405

ADD CONSTRAINT CK_FACT_20080405_Date CHECK (FactDate >= ‘2008-04-05’

and FactDate < ‘2008-04-06’)

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioned view 2 l.jpg
Partitioned View - 2 that has 12 indexes?

  • Looks to a query like any table or view

SELECT FactDate, ….. FROM Fact WHERE CustID=334343

AND FactDate = ‘2008-04-05’

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioned view l.jpg

Logical Disk System – Windows Drives that has 12 indexes?

Drive C:

Drive D:

Drive E:

Physical IO - subsystem

Disk

Disk

Disk

Disk

Disk

Disk

Partitioned View

SQL Server Storage

View Fact

Table1

Table2

Fact_20080330

Fact_20080401

Fact_20080401

Fact_20080331

FGF1

FGF1

FGF2

FGF2

FGF3

FGF3

FGF4

FGF4

FileGroupA

FileGroupB

FileA1

FileB1

FileB2

F1

F1

F2

F2

F3

F3

F4

F4

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partition elimination l.jpg
Partition Elimination that has 12 indexes?

  • The query compiler can eliminate partitions from consideration in the plan

  • Partition elimination happens at query compile time.

  • It is often necessary to make partition values string constants.

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Demo 1 partitioned views l.jpg
Demo 1 – Partitioned Views that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioned tables l.jpg
Partitioned Tables that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

SQL Server Enterprise/2005

Require a non-null partitioning column

Check constraints tell SQL Server what data is in each parturition

All tables are partitioned!


Partitioned function l.jpg
Partitioned Function that has 12 indexes?

  • Defines how to split data

CREATE PARTITION FUNCTION

Fact_PF(smalldatetime)

RANGE RIGHT FOR VALUES

(‘2001-07-01’, ‘2001-07-02’)

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partition scheme l.jpg
Partition Scheme that has 12 indexes?

  • Defines where to store each range of data

CREATE PARTITION SCHEME Fact_PS

AS PARTITION Fact_pf

TO (PRIMARY, FG_20010701, FG_20010702)

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Creating a partitioned table l.jpg
Creating a Partitioned Table that has 12 indexes?

  • The table is created ON thePartitioned Scheme

CREATE TABLE Fact (Fact_Date smalldatetime

, all my other columns)

ON Fact_PS (Fact_Date)

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioned table l.jpg

Logical Disk System – Windows Drives that has 12 indexes?

Drive C:

Drive D:

Drive E:

Physical IO - subsystem

Disk

Disk

Disk

Disk

Disk

Disk

Partitioned Table

SQL Server Storage

Table Fact

Table1

Table2

Fact.$Partition=1

Fact.$Partition=3

Fact.$Partition=4

Fact.$Partition=2

FileGroupA

FileGroupB

FGF1

FGF2

FGF3

FGF4

FileA1

FileB1

FileB2

F1

F2

F3

F4

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide35 l.jpg

Demo 2 – Partitioned Tables that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioning goals l.jpg
Partitioning Goals that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

  • Adequate Import Speed

  • Maximize Query Performance

    • Make use of all available resources

  • Data Management

    • Migrate data to cheaper resources

    • Delete old data easily


Achieving query speed l.jpg
Achieving Query Speed that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

  • Eliminate partitions during query compile

  • All disk resources should be used

    • Spread Data to use all drives

    • Parallelize by querying multiple partitions

  • All available memory should be used

  • All available CPUs should be used


Issues with partitioning l.jpg
Issues with Partitioning that has 12 indexes?

  • No foreign keys can reference the Partitioned Table

  • Identity columns must be more closely managed.

  • UPDATES on partitioned tables with part of the table in READ_ONLY filegroups must have partition elimination that restricts the updates to READ_WRITE filegroups.

  • INSERTs into partitioned views require all columns and face additional restrictions.

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Solution backup performance l.jpg
Solution: Backup Performance that has 12 indexes?

  • Backup less!

  • Maintain data in a READ_ONLY state

  • Compress Backups

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Read only filegroups l.jpg
Read_Only FileGroups that has 12 indexes?

  • Requires only one Backup after becoming read_only

  • Don’t require page or row locks

  • Don’t require maintenance

  • The ALTER requires exclusive access to the database before SQL 2008

ALTER DATABASE <database> MODIFY FILEGROUP <filegroup> SET READ_ONLY

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partial backup l.jpg
Partial Backup that has 12 indexes?

  • Partial Base

    • Backs up read_write filegroups

  • Partial Differential

    • Differential backup of read_write filegroups

BACKUP DATABASE <db name>

READ_WRITE_FILEGROUPS …..

BACKUP DATABASE <db name>

READ_WRITE_FILEGROUPS

WITH DIFFERENTIAL ….

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Maintenance operations l.jpg
Maintenance Operations that has 12 indexes?

  • Maintain only READ_WRITE data

    • DBCC CHECKFILEGROUP

    • ALTER INDEX

      • REBUILD PARTITION =

      • REORGANIZE PARTITION =

  • Avoid SHRINK

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Sliding window l.jpg
Sliding Window that has 12 indexes?

Always

There

Data

Temporal

Data

2008-01

Temporal

Data

2008-02

Temporal

Data

2008-03

Temporal

Data

2008-04

Temporal

Data

2008-05

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Sql server 2008 what s new l.jpg
SQL Server 2008 – What’s New that has 12 indexes?

  • Row, page, and backup compression

  • Filtered Indexes

  • Optimization for star joins

  • Lock Escalation to the Partition Level

  • Partitioned Indexed Views

  • Fewer operations require exclusive access to the database

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide45 l.jpg

Thanks for Coming that has 12 indexes?

Andrew Novick

  • anovick@NovickSoftware.com

    www.NovickSoftware.com

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server