Big data working with terabytes in sql server
Download
1 / 45

Big Data Working with Terabytes in - PowerPoint PPT Presentation


  • 396 Views
  • Updated On :

Big Data Working with Terabytes in SQL Server. Andrew Novick www.NovickSoftware.com. Agenda. Challenges Architecture Solutions. Introduction. Andrew Novick – Novick Software, Inc. Business Application Consulting SQL Server .Net www.NovickSoftware.com Books:

Related searches for Big Data Working with Terabytes in

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Big Data Working with Terabytes in' - Mia_John


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Big data working with terabytes in sql server l.jpg

Big DataWorking with Terabytes in SQL Server

Andrew Novick

www.NovickSoftware.com


Agenda l.jpg
Agenda

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

  • Challenges

  • Architecture

  • Solutions


Slide3 l.jpg

Introduction

  • Andrew Novick – Novick Software, Inc.

  • Business Application Consulting

    • SQL Server

    • .Net

  • www.NovickSoftware.com

  • Books:

    • Transact-SQL User-Defined Functions

    • SQL 2000 XML Distilled

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


What s big l.jpg
What’s Big?

  • 100’s of gigabytes and up

    to 10’s of terabytes

  • 100,000,000 rows an up

    to 100’s of Billions of rows

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Big scenarios l.jpg
Big Scenarios

  • Data Warehouse

  • Very Large OLTP databases (usually with reporting functions)

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Big hardware l.jpg
Big Hardware

  • Multi-core 8-64

  • RAM 16 GB to 256 GB

  • SAN’s or direct attach RAID

  • 64 Bit SQL Server

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server



Challenges8 l.jpg
Challenges

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

  • Load Performance (ETL)

  • Query Performance

  • Data Management Performance


Slide9 l.jpg

How fast can you load

rows into the database?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide10 l.jpg

Load 1,000,000 rows into a 400,000,000 million row table that has 12 indexes?

12 Hours

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide11 l.jpg

Load 1,000,000 rows into an empty table and add 12 indexes? that has 12 indexes?

5 Minutes

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide12 l.jpg

How do you speed up queries on that has 12 indexes?1,000,000,000

row tables?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Challenge backup l.jpg
Challenge - Backup that has 12 indexes?

  • Let’s say you have a 10 TB database.

  • Now back that up.

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Backup calculation l.jpg
Backup Calculation that has 12 indexes?

  • 10 TB = 10000 GB

  • Typical Backup speed - 1 to 20 GB / Min

  • At 10 GB/Minute

Who as 16 hours to spare?

Who’s got 1000 minutes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Architecture l.jpg

Architecture that has 12 indexes?

What do we have to work with?


Sql server storage architecture l.jpg

Logical Disk System – Windows Drives that has 12 indexes?

Drive C:

Drive D:

Drive E:

Physical IO - subsystem

Disk

Disk

Disk

Disk

Disk

Disk

SQL Server Storage Architecture

SQL Server Storage

Table1

Table2

FileGroupA

FileGroupB

FileA1

FileB1

FileB2

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Solutions l.jpg

Solutions that has 12 indexes?


Solutions18 l.jpg
Solutions that has 12 indexes?

  • Use Multiple FileGroups/Files

  • INSERT into empty unindexed tables

  • Partitioned Tables and/or Views

  • Use READ_ONLY FileGroups

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


At 3 pm on the 1 st of the month where do you want your data to be l.jpg
At 3 PM on the 1 that has 12 indexes?st of the month:Where do you want your data to be?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide20 l.jpg

Spread to as many disks as possible that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


I o performance l.jpg
I/O Performance that has 12 indexes?

  • Little has changed in 50 years

  • Size for Performance Not for Space

    • My app needs 1500 reads/sec and 800 writes/sec

I/O throughput is a function of the number of disk drives.

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Solution load performance l.jpg
Solution: Load Performance that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

  • Insert into empty tables

  • Index and add foreign keys after the insert

  • Add the Slices to

    • Partitioned Views

    • Partitioned Tables


Partitioning l.jpg
Partitioning that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioned views l.jpg
Partitioned Views that has 12 indexes?

CREATE VIEW Fact AS

SELECT * FROM Fact_20080405

UNION ALL SELECT * FROM Fact_20080406

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

Created like any view


Partitioned views check constraints l.jpg
Partitioned Views: that has 12 indexes?Check Constraints

  • Check constraints tell SQL Server which data is in which table

ALTER TABLE Fact_20080405

ADD CONSTRAINT CK_FACT_20080405_Date CHECK (FactDate >= ‘2008-04-05’

and FactDate < ‘2008-04-06’)

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioned view 2 l.jpg
Partitioned View - 2 that has 12 indexes?

  • Looks to a query like any table or view

SELECT FactDate, ….. FROM Fact WHERE CustID=334343

AND FactDate = ‘2008-04-05’

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioned view l.jpg

Logical Disk System – Windows Drives that has 12 indexes?

Drive C:

Drive D:

Drive E:

Physical IO - subsystem

Disk

Disk

Disk

Disk

Disk

Disk

Partitioned View

SQL Server Storage

View Fact

Table1

Table2

Fact_20080330

Fact_20080401

Fact_20080401

Fact_20080331

FGF1

FGF1

FGF2

FGF2

FGF3

FGF3

FGF4

FGF4

FileGroupA

FileGroupB

FileA1

FileB1

FileB2

F1

F1

F2

F2

F3

F3

F4

F4

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partition elimination l.jpg
Partition Elimination that has 12 indexes?

  • The query compiler can eliminate partitions from consideration in the plan

  • Partition elimination happens at query compile time.

  • It is often necessary to make partition values string constants.

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Demo 1 partitioned views l.jpg
Demo 1 – Partitioned Views that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioned tables l.jpg
Partitioned Tables that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

SQL Server Enterprise/2005

Require a non-null partitioning column

Check constraints tell SQL Server what data is in each parturition

All tables are partitioned!


Partitioned function l.jpg
Partitioned Function that has 12 indexes?

  • Defines how to split data

CREATE PARTITION FUNCTION

Fact_PF(smalldatetime)

RANGE RIGHT FOR VALUES

(‘2001-07-01’, ‘2001-07-02’)

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partition scheme l.jpg
Partition Scheme that has 12 indexes?

  • Defines where to store each range of data

CREATE PARTITION SCHEME Fact_PS

AS PARTITION Fact_pf

TO (PRIMARY, FG_20010701, FG_20010702)

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Creating a partitioned table l.jpg
Creating a Partitioned Table that has 12 indexes?

  • The table is created ON thePartitioned Scheme

CREATE TABLE Fact (Fact_Date smalldatetime

, all my other columns)

ON Fact_PS (Fact_Date)

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioned table l.jpg

Logical Disk System – Windows Drives that has 12 indexes?

Drive C:

Drive D:

Drive E:

Physical IO - subsystem

Disk

Disk

Disk

Disk

Disk

Disk

Partitioned Table

SQL Server Storage

Table Fact

Table1

Table2

Fact.$Partition=1

Fact.$Partition=3

Fact.$Partition=4

Fact.$Partition=2

FileGroupA

FileGroupB

FGF1

FGF2

FGF3

FGF4

FileA1

FileB1

FileB2

F1

F2

F3

F4

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide35 l.jpg

Demo 2 – Partitioned Tables that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partitioning goals l.jpg
Partitioning Goals that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

  • Adequate Import Speed

  • Maximize Query Performance

    • Make use of all available resources

  • Data Management

    • Migrate data to cheaper resources

    • Delete old data easily


Achieving query speed l.jpg
Achieving Query Speed that has 12 indexes?

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server

  • Eliminate partitions during query compile

  • All disk resources should be used

    • Spread Data to use all drives

    • Parallelize by querying multiple partitions

  • All available memory should be used

  • All available CPUs should be used


Issues with partitioning l.jpg
Issues with Partitioning that has 12 indexes?

  • No foreign keys can reference the Partitioned Table

  • Identity columns must be more closely managed.

  • UPDATES on partitioned tables with part of the table in READ_ONLY filegroups must have partition elimination that restricts the updates to READ_WRITE filegroups.

  • INSERTs into partitioned views require all columns and face additional restrictions.

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Solution backup performance l.jpg
Solution: Backup Performance that has 12 indexes?

  • Backup less!

  • Maintain data in a READ_ONLY state

  • Compress Backups

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Read only filegroups l.jpg
Read_Only FileGroups that has 12 indexes?

  • Requires only one Backup after becoming read_only

  • Don’t require page or row locks

  • Don’t require maintenance

  • The ALTER requires exclusive access to the database before SQL 2008

ALTER DATABASE <database> MODIFY FILEGROUP <filegroup> SET READ_ONLY

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Partial backup l.jpg
Partial Backup that has 12 indexes?

  • Partial Base

    • Backs up read_write filegroups

  • Partial Differential

    • Differential backup of read_write filegroups

BACKUP DATABASE <db name>

READ_WRITE_FILEGROUPS …..

BACKUP DATABASE <db name>

READ_WRITE_FILEGROUPS

WITH DIFFERENTIAL ….

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Maintenance operations l.jpg
Maintenance Operations that has 12 indexes?

  • Maintain only READ_WRITE data

    • DBCC CHECKFILEGROUP

    • ALTER INDEX

      • REBUILD PARTITION =

      • REORGANIZE PARTITION =

  • Avoid SHRINK

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Sliding window l.jpg
Sliding Window that has 12 indexes?

Always

There

Data

Temporal

Data

2008-01

Temporal

Data

2008-02

Temporal

Data

2008-03

Temporal

Data

2008-04

Temporal

Data

2008-05

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Sql server 2008 what s new l.jpg
SQL Server 2008 – What’s New that has 12 indexes?

  • Row, page, and backup compression

  • Filtered Indexes

  • Optimization for star joins

  • Lock Escalation to the Partition Level

  • Partitioned Indexed Views

  • Fewer operations require exclusive access to the database

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


Slide45 l.jpg

Thanks for Coming that has 12 indexes?

Andrew Novick

  • [email protected]

    www.NovickSoftware.com

PASS Community Summit 2008 AD-202 Big Data: Working with Terabytes in SQL Server


ad