These notes were compiled whilst watching the SRM5 videos at http://vmwarelearning.com by Andrew Ellwood. The videos are well worth watching when you have the time; if you don’t, these notes will suffice as a primer. The videos cover vSphere Replication rather than array-based replication but, even if you will be using array-based replication, the information they contain is very useful when planning for SRM5.
SRM Video 1: SRM5 Concepts
Four-Step Disaster Recovery with SRM:
1. At the protected site, SRM shuts down the virtual machines, starting with the virtual machines designated with the lowest priority
2. At the recovery site, SRM prepares the datastore groups for failover of the protected virtual machines
3. To provide more resources for the virtual machines to be powered on at the recovery site, SRM suspends any virtual machines running at the recovery site that are designated as noncritical
4. SRM restarts virtual machines at the recovery site, starting with the virtual machines that are designated as the highest priority
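The four steps above can be sketched as a simple ordering exercise. This is only an illustration of the sequence, not how SRM is implemented: the `VM` class, the `plan_failover` function and the convention that priority 1 is the highest are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    priority: int  # assumption for this sketch: 1 = highest priority


def plan_failover(protected, noncritical_at_recovery):
    """Return the four-step sequence as a list of human-readable steps."""
    steps = []
    # Step 1: shut down protected VMs, lowest priority first.
    for vm in sorted(protected, key=lambda v: v.priority, reverse=True):
        steps.append(f"shut down {vm.name} (protected site)")
    # Step 2: prepare the datastore groups at the recovery site.
    steps.append("prepare datastore groups (recovery site)")
    # Step 3: suspend noncritical VMs already running at the recovery site.
    for name in noncritical_at_recovery:
        steps.append(f"suspend {name} (recovery site)")
    # Step 4: power on recovered VMs, highest priority first.
    for vm in sorted(protected, key=lambda v: v.priority):
        steps.append(f"power on {vm.name} (recovery site)")
    return steps
```

For example, `plan_failover([VM("db", 1), VM("web", 3), VM("app", 2)], ["test01"])` shuts down web, app and db in that order and powers them back on in the reverse order.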
Recovery Types:
i. Planned Migration: synchronizes storage and cancels recovery if errors are encountered
Note: Planned Migration can also be used for migrating to a different IP subnet
ii. Disaster Recovery: attempts to synchronize storage but otherwise uses the most recent storage synchronization data. Continues recovery even if errors are encountered
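The practical difference between the two recovery types is how errors are handled. A minimal sketch of that behaviour, assuming a hypothetical `run_recovery` helper and made-up type names (`"planned_migration"`, `"disaster_recovery"`) that are not part of any SRM API:

```python
def run_recovery(steps, recovery_type):
    """Run (name, action) steps and return (completed_names, errors).

    Hypothetical model only: a planned migration stops at the first error,
    a disaster recovery records the error and carries on.
    """
    if recovery_type not in ("planned_migration", "disaster_recovery"):
        raise ValueError(f"unknown recovery type: {recovery_type}")
    completed, errors = [], []
    for name, action in steps:
        try:
            action()
            completed.append(name)
        except Exception as exc:
            errors.append((name, str(exc)))
            if recovery_type == "planned_migration":
                break  # cancel the recovery on the first error
    return completed, errors
```

With a failing storage synchronization step, the planned migration returns with nothing completed, whereas the disaster recovery still runs the remaining steps.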
SRM Features: Automated Failback
One-button re-protect and failback
Support for Various Disaster Recovery Topologies:
- Active-passive failover
- Active-active failover
- Bidirectional failover
- Shared recovery sites
SRM Video 2: Installing SRM5
VMware vCenter Server 5.0 must be installed at both the protected site and the recovery site. A single instance of the VMware vSphere Client, installed with the VMware vCenter Site Recovery Manager Plug-In, can manage both sites.
Prerequisites:
- Windows 64-bit O/S
- Database to hold SRM information (including VM configurations, protection groups, recovery plans)
- Network connectivity from SRM Server to vCenter Server
Guidelines:
- Install SRM at both the protected and recovery site
- Can install on physical or virtual (works well to separate vCenter and SRM in a virtual environment)
- Install the SRAs (Storage Replication Adapters) on the SRM Server host
- If vCenter Server workload is high, consider installing SRM on a separate system
- Place SRM in your management cluster and network (if you have one)
Order of Installation:
1. Install vCenter at the protected site and the recovery site
2. Configure the vCenter Server inventory
3. Configure the SRM database
4. Install SRM Server on both sites
5. Install the SRM plug-in
6. Pair the sites
Configuring the vCenter Server inventory:
Hosts and Clusters
Fig. 1: Resource pools design for SRM
Fig. 2: VMs and Templates design for SRM
Install vSphere Replication:
Even if you do not plan to use vSphere Replication, you can install it when prompted; this just makes the binaries and the VRMS and VRS OVFs available.
Fig. 3: Install vSphere Replication
Important:
Use fully qualified domain names (FQDNs) wherever possible, and lowercase for everything (avoids random issues with capitalization)
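One way to apply the FQDN and lowercase advice consistently is to normalize every name before typing it into an installer or connection dialog. The helpers below are a hypothetical convenience, not part of SRM, and the host names in the usage note are invented.

```python
def normalize_host(name):
    """Trim whitespace and any trailing dot, then lowercase the name."""
    return name.strip().rstrip(".").lower()


def is_fqdn(name):
    """Rough check: at least two non-empty labels, and not a dotted IPv4 address."""
    labels = normalize_host(name).split(".")
    if len(labels) < 2 or not all(labels):
        return False
    return not all(label.isdigit() for label in labels)
```

For instance, `normalize_host(" SRM01.Example.Com. ")` gives `"srm01.example.com"`, while `is_fqdn("srm01")` is false because a short name has no domain part.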
Requirement:
Need an account with administrator-level access to the vCenter Server
Certificates:
Can automatically generate a certificate or purchase a PKCS#12 certificate
Info to Register the VMware vCenter Site Recovery Manager Extension:
- Local Site Name
- Administrator E-mail (to receive notifications)
- Additional E-mail
Database Configuration:
- SRM_DB pre-created
- ODBC DSN Setup (can be done from the SRM Installer)
- Database user account
vSphere Client:
Install the VMware vCenter Site Recovery Manager plug-in
Note: Deploy to management servers in both sites
SRM Video 3: Site Pairing
Getting Started > 1. Connect the Sites > Configure Connection
Note: Alternative is Summary > Configure Connection
Fig. 4: vSphere Client: Home > Solutions and Applications > Site Recovery – Sites
vCenter Server default port = 443
SRM Server default port = 8095
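Before pairing, it can save time to confirm that the default ports noted above are reachable from the machine you are working on. A small sketch using only the Python standard library; the `port_open` helper is an assumption of this example, and the host names in the usage note are placeholders.

```python
import socket

# Default ports as noted above.
DEFAULT_PORTS = {"vcenter": 443, "srm": 8095}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

A typical check would be `port_open("vc-recovery.example.com", DEFAULT_PORTS["vcenter"])` and `port_open("srm-recovery.example.com", DEFAULT_PORTS["srm"])`. A successful TCP connection only shows that something is listening, not that the service is healthy.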
SRM Video 4: Storage Replication – part 1 (vSphere Replication)
Note: Array-based replication and VR (vSphere Replication) are complementary solutions and can be used together.
With array-based replication, the LUN design has to be done in such a way that only the servers requiring replication are replicated. This could be done on a one-server-per-LUN basis, or with multiple replicated servers per LUN and un-replicated LUNs for local servers.
An advantage of vSphere Replication is that it can be done at a per-VM level as opposed to per datastore.
VR cannot automatically reprotect.
vSphere Replication Setup:
Important: vCenter needs to be configured with a managed IP address
Done via the vSphere Client and Site Recovery plug-in
Fig. 5: vSphere Replication Deployment Example
SRM Video 5: Storage Replication – part 2 (vSphere Replication)
VRMS = vSphere Replication Management Server
VRS = vSphere Replication Server (does the heavy lifting)
In a large deployment, you may have multiple VRS instances, but you only need one VRMS per site.
SRM Video 6: Inventory Mapping
Inventory mappings are optional but recommended.
An inventory mapping specifies the resources that a recovered virtual machine uses.
Four elements can be mapped:
- Compute resources, such as resource pools
- Virtual machine folders
- Networks
- Placeholder datastore
If a virtual machine is not fully mapped, its mappings must be configured individually when you protect it!
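To see which virtual machines would need individual configuration, you can compare each VM against the site-wide mappings. The sketch below assumes a hypothetical export of the inventory into plain dictionaries. It does not query SRM or vCenter, and every key name in it is invented for the illustration.

```python
def missing_mappings(vm, mappings):
    """Return the list of mapping types a VM is missing (empty list = fully mapped).

    vm:       {"resource_pool": str, "folder": str, "networks": [str, ...]}
    mappings: {"resource_pools": {src: dst}, "folders": {src: dst},
               "networks": {src: dst}, "placeholder_datastore": str or None}
    """
    missing = []
    if vm["resource_pool"] not in mappings["resource_pools"]:
        missing.append("compute resource")
    if vm["folder"] not in mappings["folders"]:
        missing.append("folder")
    # Every network the VM is attached to needs a counterpart at the recovery site.
    if any(net not in mappings["networks"] for net in vm["networks"]):
        missing.append("network")
    if not mappings.get("placeholder_datastore"):
        missing.append("placeholder datastore")
    return missing
```

Any VM for which this returns a non-empty list is one whose mappings would have to be set by hand when it is protected.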
Fig. 6: vSphere Client: Home > Solutions and Applications > Site Recovery – Sites > Getting Started
Fig. 7: Example Resource Mappings
Placeholder datastores are where we put the virtual machine configuration files at the recovery site, in anticipation of those VMs failing over.
Fig. 8: Placeholder_VMs datastore (in recovery site)
SRM Video 7: Protection Groups
A protection group:
Typically contains virtual machines that are related in some way:
- A three-tier application (application server, database server, web server)
- Virtual machines whose virtual disk files are part of the same datastore group
Note: For VMs that share a datastore group, you cannot fail over just one; you must fail over all of them!
Fig. 9: Protection Groups
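The datastore-group rule above means that picking one VM implicitly picks its neighbours. A minimal sketch of that expansion, assuming a hypothetical dictionary that maps each VM name to its datastore group; the function and all the names are illustrative only.

```python
def required_failover_set(selected, datastore_group_of):
    """Expand a selection to every VM sharing a datastore group with it.

    selected:           iterable of VM names you want to fail over
    datastore_group_of: {vm_name: datastore_group_name}
    """
    groups = {datastore_group_of[vm] for vm in selected}
    return sorted(vm for vm, group in datastore_group_of.items() if group in groups)
```

With `web01` and `app01` in one datastore group and `db01` in another, asking to fail over only `web01` returns both `app01` and `web01`.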
SRM Video 8: Creating a Recovery Plan
What was the failure?
- Complete site failure
- SAN failure
- ISP failure
You can have many recovery plans (for different scenarios) or just one (for a complete site failure).
An SRM recovery plan includes:
- A list of virtual machines from the protection groups
- A startup and priority order for those virtual machines
- Any custom steps added before or after virtual machine startup
SRM recovery plans enable an administrator to determine the following:
- Which protection groups are to be included in the recovery plan
- Which virtual machines at the recovery site are to be suspended
- The power-on order of protected virtual machines during failover
A recovery plan is created at the recovery site (if you lost the main site with your recovery plans in…)
vSphere Client: Home > Solutions and Applications > Site Recovery – Recovery Plans: Create Recovery Plan
You get to choose a Test Network!
Can leverage Priorities and Dependencies to get VMs to start in the correct order!
Add Non-Critical VM: can suspend a VM to release resources for more critical servers.
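How priorities and dependencies combine into a start order can be modelled as a topological sort inside each priority group. This is a sketch under stated assumptions rather than SRM's own logic: priority 1 is taken to be the highest, dependencies are only honoured between VMs of the same priority, and the `power_on_order` function is hypothetical. It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter


def power_on_order(priority_of, depends_on):
    """Return VM names in power-on order.

    priority_of: {vm_name: priority}, assumption: 1 = highest priority
    depends_on:  {vm_name: set of VM names that must start first}
    """
    order = []
    for prio in sorted(set(priority_of.values())):
        group = [vm for vm, p in priority_of.items() if p == prio]
        # Keep only dependencies that point at VMs in the same priority group.
        graph = {
            vm: {dep for dep in depends_on.get(vm, ()) if priority_of.get(dep) == prio}
            for vm in group
        }
        # static_order() raises graphlib.CycleError on circular dependencies.
        order.extend(TopologicalSorter(graph).static_order())
    return order
```

Given a domain controller and database at priority 1 and an app and web server at priority 2, with each depending on the one before it, the order comes out as domain controller, database, app server, web server.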
SRM Video 9: Testing a Recovery Plan
A recovery plan is essentially a script!
Fig. 10: Recovery Plan Test Button
Fig. 11: Option to replicate recent changes to the recovery site
Fig. 12: Recovery Plan Cleanup Button
SRM Video 10: SRM Failover
Fig. 13: Recovery Plans – Recovery Button
The ‘Recovery’ button is clicked at the recovery site (because the protected site is down).
SRM automates many of the tasks required at the time of failover:
- Powers down the protected virtual machines
- Stops data replication and enables the recovery site datastores
- Rescans the ESXi hosts at the recovery site
- Suspends nonessential virtual machines if specified in the recovery plan
- Powers on virtual machines at the recovery site
Running the Recovery Plan:
Tick ‘I understand that this process will permanently alter the virtual machines and infrastructure of both the protected and recovery datacenters’
Recovery type:
- Planned Migration (if errors occur it will stop)
- Disaster Recovery
SRM Video 11: SRM Failback
One-button reprotect when using array-based replication:
- Reverse the replication of storage
- Reprotect the VMs in the reverse direction
- The original protected site becomes the recovery site
Of course, if the original protected site is gone, there is nothing to fail back to. In that case, using the templates from the recovery site, we can rebuild the configuration in the reverse direction!