1st Edition

Enterprise Systems Backup and Recovery: A Corporate Insurance Policy

By Preston de Guise
Copyright 2009
    326 Pages 48 B/W Illustrations
    by Auerbach Publications

    The success of information backup systems does not rest on IT administrators alone. Rather, a well-designed backup system comes about only when several key factors coalesce—business involvement, IT acceptance, best practice designs, enterprise software, and reliable hardware. Enterprise Systems Backup and Recovery: A Corporate Insurance Policy provides organizations with a comprehensive understanding of the principles and features involved in effective enterprise backups.

    Instead of focusing on any individual backup product, this book recommends corporate procedures and policies that need to be established for comprehensive data protection. It provides relevant information to any organization, regardless of which operating systems or applications are deployed, what backup system is in place, or what planning has been done for business continuity. It explains how backup must be included in every phase of system planning, development, operation, and maintenance. It also provides techniques for analyzing and improving current backup system performance.

    After reviewing the concepts in this book, organizations will be able to answer these questions with respect to their enterprise:

    • What features and functionality should be expected in a backup environment?
    • What terminology and concepts are unique to backup software, and what can be related to other areas?
    • How can a backup system be monitored successfully?
    • How can the performance of a backup system be improved?
    • What features are just "window dressing" and should be ignored, as opposed to those features that are relevant?

    Backup and recovery systems touch on just about every system in an organization. Properly implemented, they can provide an enterprise with greater assurance that its information is safe. By utilizing the information in this book, organizations can take a greater step toward improving the security of their data and preventing the devastating loss of data and business revenue that can occur with poorly constructed or inefficient systems.

    Introduction
    Who Should Use this Book?
    Concepts
    What Is a Backup?
    Think Insurance
    Information Lifecycle Protection (ILP)
    Backups versus Fault Tolerance
    Risk versus Cost
    Dispelling Myths
    Myth: Tape Is Going to Die Within a Few Years, and We’ll All Be Backing Up to Cheap Disk
    Myth: Commercial Backup Software Is Not as “Trustworthy” as Operating System Invoked Tools
    Myth: Commercial Backup Software Is Not as Efficient as Customized Backup Scripts Written by a Good System Administrator with Local Environment Knowledge
    Myth: The Use of Commercial Backup Software Would Require Staff Training
    Myth: Commercial Backup Software Offers No Tangible Improvements over Regular Operating System Backups
    Myth: Deploying Commercial Backup Software Requires Budgeting for Additional Yearly Maintenance Fees
    Myth: Backup Is a Waste of Money
    Myth: It Is Cheaper and More Appropriate to Develop In-House Backup Systems Than to Deploy Commercial Backup Systems
    Myth: If a Department Can’t Fund Backups for Its Systems, They Don’t Get Backed Up
    The Top Ten Rules
    Human and Technical Layers
    Introduction
    Human Layers: Roles and Responsibilities
    Overview
    Technical Staff
    Operators
    Help Desk Staff
    Backup Administrators
    System Administrators
    Application Administrators
    Management
    Local Management and Team Leaders
    Upper Management
    The Board and the CEO
    Users
    Key Users
    End Users
    Domain Disputes
    Technical Layers
    Introduction
    Technical Service Layers
    External and Security
    Client Systems
    Processing Systems/Servers
    Virtualization Systems
    Storage Systems
    Backup/Protection Systems
    Service Component Layers
    Backup and Recovery Concepts
    Introduction
    Host Nomenclature
    Backup Topology
    Decentralized Backups
    Centralized Backups
    Backup Levels
    Full Level
    Incremental Level
    Differential Level
    Simple Differential Backups
    Multi-Layered Differential Levels
    Consolidated Level
    Manual Backups
    Skipping Backups
    Full Once, Incrementals Forever
    Data Availability
    Offline
    Online
    Snapshot Backups
    Data Selection Types
    Inclusive Backups
    Exclusive Backups
    Backup Retention Strategies
    Dependency-Based Retention
    Simple Model
    Manual Backups Revisited
    Recovery Strategies
    Recovery Types
    Aggregated Filesystem View
    Last Filesystem View
    Point-in-Time Recovery
    Destructive Recovery
    Non-Index Recovery
    Incremental Recovery
    Recovery Locality
    Local Recovery
    Server-Initiated Recovery
    Directed Recovery
    Cross-Platform Directed Recovery
    Client Impact
    Server-Based Backups
    Serverless Backups
    Filesystem/Volume Clones and Snapshots
    Array Replication
    Summarizing Serverless Backups
    Virtual Machine Snapshots
    Database Backups
    Cold Backup
    Hot Backup
    Export Backup
    Snapshot Backup
    Backup Initiation Methods
    Server Initiated
    Client Initiated
    Externally Scheduled
    Miscellaneous Enterprise Features
    Pre- and Post-Processing
    Arbitrary Backup Command Execution
    Cluster Recognition
    Client Collections
    Backup Segregation
    Granular Backup Control
    Backup Schedule Overrides
    Security
    Duplication and Migration
    Alerts
    Command Line Interface
    Backup Catalogues
    Media Handling Techniques
    Spanning
    Rapid Data Access
    Multiplexing
    Media Tracking
    Backup
    Introduction
    What to Back Up
    Servers
    Storage Devices
    SAN
    NAS
    Non-Traditional Infrastructure
    Desktops and Laptops
    Hand-Held Devices
    Removable Storage: Devices and Media
    Documentation and Training
    Introduction
    Documentation
    System Configuration
    System Map
    Administrative Operations
    Media Handling
    Backup and Recovery Operations
    Disaster Recovery Operations
    Troubleshooting
    Acceptance Test Procedures
    Test Register
    Vendor-Supplied Documentation
    Release Notes
    Training
    The Case for Training
    Backup Administrators
    System Administrators
    Application and Database Administrators
    Operations Staff
    Help Desk Staff
    End Users
    Management
    Performance Options, Analysis, and Tuning
    Introduction
    Performance Techniques
    Backup Bandwidth
    Multiplexing
    NDMP
    Backup Efficiency
    Client-Side Compression
    Bandwidth Limiting
    File Consolidation
    Block-Level Backup
    Data Deduplication
    Diagnosing Performance Issues
    Network Performance Analysis
    Ping Test
    Speed and Duplexing
    File Transfer Test
    Name Resolution Response Times
    Client Performance Analysis
    Hardware
    Filesystem
    Software
    Device Performance Analysis
    Altering Tape Block Size Can Affect Recovery
    Backup Server Performance Analysis
    Improving Backup Performance
    Multi-Tiered Backup Environments
    Incrementals Forever, Revisited
    Upgrade Hardware
    Tape Robots
    Example: The Cost of Not Accepting the Need for Autochangers
    Faster Backup Devices
    Backup to Disk
    Disk Backup Units
    Virtual Tape Libraries
    Dynamic Device Allocation
    Serverless Backup
    NDMP
    Snapshots
    Multiplex Larger Filesystems
    Filesystem Change Journals
    Archive Policies
    Archive Is Not HSM
    Anti-Virus Software
    Slower Backup Devices
    Recovery
    Introduction
    Designing Backup for Recovery
    Recovery Performance
    Facilitation of Recovery
    How Frequently Are Recoveries Requested?
    Backup Recency versus Recovery Frequency
    Who May Want to Perform Recoveries?
    Recovery Procedures and Recommendations
    Read the Documentation before Starting a Recovery
    Choosing the Correct Recovery Location
    Provide an Estimate of How Long the Recovery Will Take
    Give Updates during Recoveries
    Write-Protect Offline Media before Using
    Don’t Assume a Recovery Can Be Done if It Hasn’t Been Tested
    Recall All Required Media at the Start of the Recovery
    Acclimatize Off-Site Recovery Media whenever Possible
    Run Recoveries from Sessions That Can Be Disconnected From/Reconnected To
    Know the Post-Recovery Configuration Changes
    Check Everything Before It Is Done
    Remember Quantum Physics
    Be Patient
    Document the Current Status of the Recovery
    Note Errors, and What Led to Them
    Don’t Assume the Recovery Is an Exam
    If Media/Tape Errors Occur, Retry Elsewhere
    Ensure the Recovery Is Performed by Those Trained to Do It
    Read and Follow the Instructions if They’ve Never Been Used Before
    Write a Post-Recovery Report
    Update Incorrect Instructions
    Preserve the Number of Copies of Backups
    Send Off-Site Media Back Off Site
    Remind Vendors of SLAs
    Cars Have Bandwidth, Too
    Disaster Recovery
    Maintenance Backups
    Perform a Backup before Maintenance
    Perform a Full Backup Following Maintenance
    If Time Permits, Backup after Recovery
    Avoid Upgrades
    Read the Documentation before the Backups Are Performed
    Disaster Recoveries Must Be Run by Administrators
    Test and Test and Test Again
    Use the Same Hardware
    Know Dependencies (and How to Work around Them)
    Keep Accurate System Documentation
    Do You Know Where Your Licenses Are at 1 A.M.?
    Disaster Recovery Exercises
    Off-Site Storage
    Keep the Disaster Recovery Site Current
    Hot or Cold Disaster Recovery Site?
    Service Level Agreements
    Recovery Time Objective SLAs
    Recovery Point Objective SLAs
    Planning SLAs
    Map IT Systems
    Establish SLAs on a Per-System Basis
    Confirm SLAs Are Realistic
    Upgrade IT Environment or Revisit SLAs
    Failure Costs
    Formally Agree to, and Publish SLAs
    Enact Policies to Protect SLAs
    Verify SLAs
    Testing
    Protecting the Backup Environment
    Introduction
    Why Protect the Backup Server?
    Protecting the Backups
    Via Backup Software
    Post-Backup Cloning
    Inline Cloning
    Storage of Duplicates
    Hardware-Level Protection
    Hot-Pluggable Tape Libraries
    RAIT
    RAID for Disk Backup
    Physical Protection
    Physical Security
    Protecting the Backup Server
    Backup Server Components
    Ensuring Availability
    Historical Considerations
    Migration
    Maintenance
    Archives
    Problem Analysis
    Introduction
    Network
    Basic Configuration
    Switch/NIC Settings
    Hostname Resolution
    Basic Connectivity
    Ping Test
    Port Test
    Backup Software Connectivity
    Hardware Validation
    Backup Device Validation
    Physical Inspection
    Operability Validation
    Media Validation
    Firmware Validation
    System Hardware Validation
    Server/Storage Node
    Client
    Software Validation
    Log Review
    Version Compatibility Validation
    Error Review
    Tracking Failures
    Backup Reporting
    Introduction
    Reporting Options
    Automated Reports
    Automated Report Parsing
    Zero-Failure Policy
    Choosing a Backup Product
    Introduction
    Coverage
    Value Products That Value Protection
    Value Frameworks, Not Monoliths
    Operating Systems
    Databases
    Applications
    Clustering
    Hardware
    Functionality Checklist
    Administrative Considerations
    Training
    Support
    Maintenance
    Technical Support
    Best Practices
    Introduction
    Backup to Recover
    Documentation
    What to Backup
    Protect the Backups
    Results Checking and Reporting
    Core Design Considerations
    Track Failures
    Clearly Delineate Roles and Responsibilities
    Network, Not Netwon’t
    Ensure the System Is Supported
    Appendix A: Technical Asides
    Introduction
    Transactional Logging
    Snapshots
    Traditional Snapshots
    Fast Resynchronization Snapshots
    Copy-On-Write Snapshots
    Cache Snapshots
    Apply-Deferred Snapshots
    Appendix B: Sample Recovery Request Form
    Appendix C: Sample Test Form
    Appendix D: Glossary of Terms

    Biography

    Preston de Guise