
Advanced Veritas NetBackup Performance Tuning



    1. Advanced Veritas NetBackup Performance Tuning
       Dave Little – Sr. Distinguished System Engineer

    3. Most Common Questions Asked
       According to SEs, Tech Support Engineers, and Consultants, which of the following questions are most commonly asked?
       - Are you taking us golfing this weekend?
       - Where is the “Any” key?
       - What was up with Britney? Seriously, what was up with Britney??
       - I just added 6 new tape drives, so why am I still not meeting my backup window?
       - I just replaced my robot with a VTL, so why am I still not meeting my backup window?

    4. Backup System Infrastructure
       - When we talk about “tuning” today, we mean matching the components in the environment properly AND configuring those components correctly, in addition to actual “tuning”
       - Some of the most commonly asked questions posed to NetBackup technical support relate to performance
       - Most performance issues can be traced to hardware or environmental issues
       - A basic understanding of the entire backup data path is important in determining maximum obtainable performance
       - Poor performance is usually the result of unrealistic expectations and/or poor planning
       - The bottom line: it’s all about bandwidth

    5. Why Tuning Is So Critical
       - All hardware has different throughput; “matching” throughput helps to avoid bottlenecks
         - The ultimate goal is to make the final storage point (i.e. the tape drive) the bottleneck
       - Tuning provides higher ROI
         - Reduces the need to buy more hardware
         - Improves scaling
       - Helps you understand how everything works together
         - As you tune, you become more familiar with your environment
         - A data protection solution has a lot of “moving parts”; each must be balanced from end to end
       - Reduces management overhead
         - If things are running smoothly, less time is needed to babysit backups

    6. Hardware Capacity Planning – Overview
       - Never exceed 70% of the rated capacity of any component
       - Manufacturer throughput and performance specifications are based on a theoretical environment and are seldom (if ever) achieved in the real world
         - Trust us on this one
       - The 70% rule applies to nearly every component in the environment: disk, CPU, internal bus
       - Response time increases significantly once the 70% utilization threshold is exceeded
       - Tape drives are an exception to this rule

    7. Throughput vs. Response Time

    8. The NetBackup Servers
       So … what servers should I purchase?
       - If we had a dollar for every time we had been asked this, we wouldn’t be here presenting to you
       - This question is like the “How long is a piece of string?” question: it all depends on your requirements, what type of shop you are in (i.e. Windows, Unix, both), and your goals and needs
       Some guidelines:
       - Master Servers – high CPU requirement, lower I/O
         - As a general guideline, any Master Server should have multiple CPUs and lots of RAM
       - Media Servers – lower CPU requirement, high I/O
         - As a general guideline, any Media Server should have a PCI Express bus
       - Both – expandability can reduce hardware expenditure later; don’t buy something that will be maxed out tomorrow

    9. The NetBackup Servers – Media Servers
       What kind of I/O do I need in my Media Servers?
       - PCI Express is your friend
         - Newer servers with PCI Express make I/O bottlenecks less of an issue than they were even 18 months ago (most servers now have PCIe)
         - A Media Server with PCI Express means the I/O bus is no longer the bottleneck it was in PCI days (two years ago)
       - Server example – Sun T5220 ($20–25k)
         - 4GB/sec bus throughput when reading and writing at the same time
         - 10GbE optional
         - With a dual-port PCIe 4Gb/sec FC HBA, able to move 400MB/sec if properly configured and tuned
         - This equates to over 17TB in a 12-hour backup window for the server itself
         - Obviously the rest of the environment would need to be matched for this type of speed, which brings us back to why tuning and matching are important
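The 17TB figure above can be checked with a quick back-of-the-envelope calculation (a sketch using the slide’s example numbers of 400MB/sec and a 12-hour window; the helper name is our own):

```python
def window_capacity_tb(throughput_mb_per_sec, window_hours):
    """Total data movable in a backup window, in decimal TB."""
    seconds = window_hours * 3600
    total_mb = throughput_mb_per_sec * seconds
    return total_mb / 1_000_000  # 1 decimal TB = 1,000,000 MB

# 400MB/sec sustained over a 12-hour backup window
print(window_capacity_tb(400, 12))  # 17.28 -> "over 17TB"
```

The same function works in reverse for planning: given your window, it tells you the sustained throughput you must achieve end to end.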

    10. The NetBackup Servers – Media Servers
        How many CPUs do I need in my Media Servers?
        - The I/O bus is more important than CPU in Media Servers
        - Experiments on Sun systems have shown that a useful, conservative estimate is 5MHz of CPU capacity per 1MB/sec of data moved in AND out of the Media Server
        - Example: a LAN Media Server backing up 20 Clients at 5MB/sec each to a tape drive would need 1000MHz of available CPU power
          - 500MHz to receive the data across the LAN
          - 500MHz to send the data to the tape device
        - Depending on the Media Server, other applications and the OS may also use CPU cycles
        - You can see why, with modern servers, the CPU is not as important as the I/O; this is one reason we don’t like to see Master/Media Server combos
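The 5MHz-per-MB/sec rule of thumb above can be sketched as a small estimator (the function name is our own; the rule itself is the slide’s conservative Sun-based estimate):

```python
def media_server_cpu_mhz(num_clients, mb_per_sec_each, mhz_per_mb=5):
    """Estimate CPU needed at 5MHz per 1MB/sec, counting data both
    coming IN (from the LAN) and going OUT (to the tape device)."""
    inbound = num_clients * mb_per_sec_each   # MB/sec received over the LAN
    outbound = inbound                        # the same data sent on to tape
    return (inbound + outbound) * mhz_per_mb

# Slide example: 20 LAN clients at 5MB/sec each, written through to tape
print(media_server_cpu_mhz(20, 5))  # 1000 MHz
```

Remember this is *available* CPU: subtract whatever the OS and any other applications on the Media Server consume before comparing against the estimate.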

    11. The NetBackup Servers – Media Servers
        How much RAM do I need in my Media Servers?
        - More is always better
          - Server prices are coming down, so it doesn’t make sense not to have a robust system
          - Other apps on the Media Server and the OS use memory too
        - NetBackup uses shared memory for local backups
          - Buffer tuning on the Media Server can increase throughput dramatically
          - Buffer tuning (covered later) is a requirement; out of the box, NBU is geared toward low-end systems rather than modern hardware
          - Buffers use shared memory – a finite resource
        - To determine how much memory is being used, use this formula:
          (buffer_size * number_buffers) * number_of_drives * MPX
          - Defaults: size = 65536 (64 * 1024), number = 30
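The shared-memory formula above is easy to evaluate for your own drive count and MPX level; here is a minimal sketch (the drive count and MPX values in the example are illustrative, not defaults):

```python
def buffer_shared_memory_bytes(buffer_size, number_buffers, num_drives, mpx):
    """Shared memory consumed by NetBackup data buffers:
    (buffer_size * number_buffers) * number_of_drives * MPX"""
    return buffer_size * number_buffers * num_drives * mpx

# Default buffers (64KB x 30), with an illustrative 4 drives at MPX 4
usage = buffer_shared_memory_bytes(65536, 30, 4, 4)
print(usage, usage / 2**20)  # 31457280 bytes = 30.0 MB
```

Even at defaults the footprint scales linearly with drives and MPX, which is why RAM sizing and buffer tuning have to be considered together.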

    12. The NetBackup Servers – Master
        How much CPU and RAM do I need on my Master?
        - CPU is more important than I/O on the Master
          - More horsepower is always better than less
          - If you can’t fully configure it now ($$), room for growth is always good, provided the technology will still be available when it is time to upgrade
        - There is no single “correct” Master Server or configuration
          - Back to the “how long is a piece of string?” question
          - Master sizing is typically based on the number of Clients, the amount of data being backed up, the number of drives and Media Servers, and the number of jobs per day
        - The Java GUI takes memory, so consider this when sizing RAM
        - This is all important, but modern hardware has come a long way and most of it can do the job quite well
        - Bottom line: look at the Tuning Guide, do the math, and decide what is right for you

    13. The NetBackup Servers – Master
        How much disk space do I need on my Master?
        - Assume 120 bytes per backed-up file and use a 1.5 multiplier for growth and error
        - Example: 100 systems with 100,000 files each, backed up FULL daily, with 30 backups retained
          - 100 systems * 100,000 files * 30 backups = 300,000,000 files total
          - 300,000,000 files * 120 bytes = 36,000,000,000 bytes = 36GB of catalog space
          - Combine this with your retention settings (and the 1.5 multiplier) to determine long-term catalog needs
        - Or use the 2% rule: total data tracked * 2%
          - If you are backing up 3TB of data across fulls, incrementals, and retention, the catalog space needed is 60GB
        - Using a disk/volume manager so you can grow the catalog on the fly is very important
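Both catalog-sizing methods above reduce to one-line calculations; this sketch reproduces the slide’s numbers (function names are our own):

```python
def catalog_bytes(systems, files_each, backups_retained, bytes_per_file=120):
    """Per-file method: roughly 120 bytes of catalog per backed-up file."""
    return systems * files_each * backups_retained * bytes_per_file

raw = catalog_bytes(100, 100_000, 30)
print(raw / 1e9)        # 36.0 GB, as on the slide
print(raw * 1.5 / 1e9)  # 54.0 GB with the 1.5x growth/error multiplier

# 2% rule: total data tracked (fulls + incrementals + retention) * 2%
print(3000 * 0.02)      # 60.0 GB for 3TB (3000GB) of tracked data
```

Run both methods against your own environment and size for the larger answer; catalogs only grow.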

    14. The NetBackup Servers – Master
        Master example
        - We always recommend a dedicated Master
        - Most modern systems will work very well: V490 (Sun), T5220 (Sun), DL580 (Compaq), or a similar system from your preferred vendor
          - 4 x CPU
          - 16GB RAM
          - 200GB disk to start, for the catalog and EMM database
        - There is no real difference (except management) between Windows and Unix in Master performance

    15. Disk vs. Tape Discussion

    16. Disk & Tape Performance Comparison

    17. Disk & Tape Performance Comparison

    18. Some Disk & Tape Comparisons
        Disk
        - Disk prices are at historical lows
        - Disk offers performance advantages tape cannot provide
        Tape
        - Tape offers TCO advantages
        - Tape can easily and cheaply be sent offsite
        Overall
        - Disk staging can take advantage of the specific strengths of disk and tape while avoiding the weaknesses of each
        Think you know disk pretty well? Let’s take a small quiz…

    19. Other Disk / Array Considerations Quiz: Disk drives made in which year are the fastest?

    20. Other Disk / Array Considerations
        - 1993: IOPS per GB in the 1000’s
          - IOPS per disk = 50
          - I/O spread across lots of disks by necessity
        - 2007: IOPS per GB in the 10’s
          - IOPS per disk = 250
          - Density of data per disk is MUCH higher => fewer spindles!

    21. Other Disk / Array Considerations
        - An important number for performance is IOPS / GB
        - With larger disk capacities, I/O tends to be spread across fewer spindles
        - Place data over as many spindles as possible; high-capacity disks make this less obvious
        - Test your disk subsystem: Iometer (www.iometer.org)
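The spindle-count effect behind the two slides above can be illustrated numerically. This is a sketch: the per-disk IOPS figures (50 and 250) are from the slides, but the drive capacities (0.5GB for a 1993-class drive, 500GB for a 2007-class drive) are our own illustrative assumptions:

```python
import math

def aggregate_iops(dataset_gb, disk_capacity_gb, iops_per_disk):
    """Aggregate random-I/O capability when a dataset occupies just
    enough disks to hold it: fewer spindles means fewer total IOPS."""
    spindles = math.ceil(dataset_gb / disk_capacity_gb)
    return spindles, spindles * iops_per_disk

# 1TB of data on 1993-class vs 2007-class drives (capacities assumed)
print(aggregate_iops(1000, 0.5, 50))   # (2000, 100000) - I/O spread by necessity
print(aggregate_iops(1000, 500, 250))  # (2, 500)       - far fewer spindles
```

The per-disk IOPS went up 5x, but the aggregate collapsed because the spindle count fell by three orders of magnitude, which is exactly why you should spread data over as many spindles as possible.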

    22. Other Disk / Array Considerations

    23. Enhanced Disk Capabilities in NBU 6.5

    24. Disk vs. Tape Discussion

    25. Creating a New Solution? Updating Your Existing Solution? … Recommended Configuration

    26. NetBackup Configuration Tuning
        Main NetBackup tuning settings
        - NetBackup buffers
        - Tape multiplexing
        - Client multi-streaming
        Other recommendations
        - Use exclude lists – not include lists
          - This guarantees you will pick up added drives
          - Do you really need 1000’s of copies of the “WINDOWS” folder?
        - Have you looked at deduplication?
        - Perform fewer full backups
          - Most RTOs are not strict enough to warrant frequent full backups
          - Change your backup paradigm: synthetic backups can help here

    27. NetBackup Configuration Tuning
        Main NetBackup tuning settings
        - NetBackup buffers – very critical
          - SIZE_DATA_BUFFERS, NUMBER_DATA_BUFFERS, NET_BUFFER_SZ
          - Think of buffers as “buckets”: if you are trying to drain a pool of water, the number of buckets matters, as does their size – a higher number of larger buckets will move more water
          - Take care when tuning, as buffers that are too big or too numerous can decrease performance
          - Testing is required to determine the correct settings for your environment
          - Check out the tuning guide listed at the end for detailed tuning steps – document number 281842
        - Tape multiplexing (MPX)
          - Modern tape drives can write a great deal faster than Clients can send
          - MPX = multiple streams of data interleaved onto the same tape
          - Restore performance issues? This used to be a big problem, but not so much any more
        - Client multi-streaming
          - Clients on GbE can send more than one stream, but this can easily overload the Client CPU, so be careful
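The “too big/too many” caution above has a concrete cost: every increase in SIZE_DATA_BUFFERS or NUMBER_DATA_BUFFERS multiplies through the shared-memory formula from the RAM-sizing slide. A sketch of the comparison (the tuned values 262144 and 64 are illustrative choices for testing, not official recommendations):

```python
def shm_mb(size_data_buffers, number_data_buffers, drives, mpx):
    """Shared-memory footprint in MB of the data-buffer settings,
    using the formula (size * number) * drives * MPX."""
    return size_data_buffers * number_data_buffers * drives * mpx / 2**20

# Illustrative: 8 drives at MPX 4
print(shm_mb(65536, 30, 8, 4))   # 60.0 MB  - out-of-the-box defaults
print(shm_mb(262144, 64, 8, 4))  # 512.0 MB - an illustrative tuned setting
```

An 8.5x jump in shared memory may be fine on a well-provisioned Media Server and disastrous on a starved one, which is why the slide insists on testing each change in your own environment.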

    28. NetBackup Configuration
        Consider advanced technologies – many people are not aware of all the capabilities in base NetBackup
        - Snapshot Client has many advanced backup technologies
          - Offload the backup from the Client to a Media Server
        - Deduplication – stop backing up the same file over and over
        - Flexible Disk – create pools of disk to back up to
        - SAN Client – stop using your LAN for high-volume data transfer
        - Quite a few others to choose from

    29. NetBackup Capacity Planning
        Some prep work is required to properly size the solution:
        - How much data will be backed up?
        - What is the amount of daily change?
        - What types of data will be backed up – text, graphics, databases, etc.?
        - How many files will be backed up?
        - What are your SLAs?
        - Do you plan to use tape, disk, snapshots, dedupe, or all of the above?
        - What are your recovery requirements?
        - Can you use chargeback to help pay for a higher-end solution?
        There is more to a properly planned strategy than simply buying a couple of servers and some tape drives or disk

    30. Scaling – Can You Continue to Grow?

    31. Special Backup Problems
        - Millions of files – FlashBackup
          - Recommended with >200,000 files
          - 6.5 now supports all Unix and Linux
        - Very large databases
          - Use the built-in APIs effectively – stagger incremental backups
          - Snapshot Client has many solutions here
        - VMware
          - NetBackup for VMware in 6.5 enhances VCB-based backup technologies
        - Don’t see your problem listed here? Check with your Symantec account team!

    32. Tuning Is Critical – Summary
        - Most hardware, out of the box, is not set up to perform optimally
        - Adding new hardware without thinking about the other components that will be affected can reduce overall performance
        - By matching hardware performance from end to end, higher ROI can be achieved
        - NetBackup Media Servers require tuning for optimal throughput
        - Increased throughput means:
          - Shorter backup windows
          - Reduced infrastructure needs
          - Reduced management requirements
          - Increased scaling
          - Increased ROI
          - Happier CEOs (which makes your lives much better!)

    33. Analyzing Drive Utilization & Performance

    34. Analyzing Utilization – The Devil Is in the Details
        - Drive composite average across 24 hours, sorted by utilization to easily identify underutilized drives
        - Configurable % utilization ranges (you determine the shades of blue)
        - “All Drives Average” – the ultimate utilization number across the entire drive inventory
        - Easily visualize utilization across backup windows via the composite average for each hour of the day
        - Rich set of filters enabling analysis by Policy, Policy Type, Level, and Job Type

    35. Analyze Across Library, Media Server, Drive Type
        - Aggregate at the Library level for cross-library analysis
        - Aggregate at the Media Server level to understand which ones are over/underutilized
        - Aggregate at the Drive Type level

    36. Perspective Across Time Frames
        - Time frame of analysis is day of week – is utilization by day consistent?
        - Aggregation is by logical drive (Library/Media Server/Drive) – visualize drives that are shared across Media Servers

    37. Analyzing Performance
        - Configurable throughput (KB/sec) ranges
        - Hair-splitting aggregation and averaging:
          - Aggregation begins at the Image/Copy/Fragment level
          - Weighted averaging
          - Throughput reflects the time delta between begin-write and end-write time for each fragment
        - Observe drives with fluctuating performance – what is behind this?

    38. Where Is the Bottleneck? Media Server? Client?
        - Analyze throughput by drive type – is performance consistent with the manufacturer’s specs?
        - Aggregate at the Media Server – is the bottleneck here? Is the Media Server overwhelmed and impacting performance?

    39. Can We Meet the Recovery Time Objective (RTO)?
        - Analyze throughput for restores only – which drives are being used? What is the throughput? Can it meet the RTO?
        - See the hours of the day in which restores take place
        - Compare the same drives’ performance for backup jobs
        - A supporting tabular report provides job-level details – which Master/Media Servers and Client? Size of restore?

    40. Developing Your DP Solution
        Some great publications out there to help you out:
        - “Backup & Recovery” by W. Curtis Preston – available on amazon.com
        - “Implementing Backup and Recovery: The Readiness Guide for the Enterprise” by David B. Little and David A. Chapa – available on amazon.com
        - “Veritas NetBackup™ Backup Planning and Performance Tuning Guide” – available for download at support.veritas.com
          http://seer.entsupport.symantec.com/docs/281842.htm
