The Problem
Running a test failover in VMware Site Recovery Manager
produced the following error:
Error - Virtual machine file '[snap-XXX] XXX/XXX.vmx' cannot
be found on recovered datastore.
Inspecting the mounted snapshot at the DR site confirmed
that the VMX file was indeed missing. The virtual machine in question had only
been moved into the datastore a few days earlier, so the conclusion was that
replication had not been running every 15 minutes as it should have been.
Logging into the Navisphere Web UI revealed that the Secondary Image of some of the Remote Mirrors had become AdminFractured.
Fig. 1: AdminFractured Secondary Image
Right-clicking one of the affected mirrors, selecting
Properties, and navigating to the ‘Secondary Image’ tab revealed the
following Last Image Error:
Unable to create protective snap session on secondary array.
Snapview returned an error on an attempt to create a protective snap session
due to lack of free cache LUN's on secondary array. As a result of the error
the mirror will be admin fractured. Add more cache LUN's on the secondary array
and retry the sync request. (0x7152863e)
Fig. 2: Administratively fractured Secondary Image 'Last Image Error'
The Culprit
The error above refers to free cache LUNs, which are
found in the ‘Reserved LUN Pool’ (RLP) on the secondary array. Navigating to the
Reserved LUN Pool in the Navisphere GUI, right-clicking it, and selecting
Properties revealed a misconfiguration.
Fig. 3: Reserved LUN Pool
Fig. 4: Reserved LUN Pool Properties
The recommended size for the reserved LUN pool is 20% of the
combined size of all the LUNs (this is a guideline, and with only a few
LUNs and/or few changes it can be configured lower). What had happened is
that, because of an ISP failure, the replication (which had originally been
set up locally and had been working) was down for a long time. To
re-replicate the data after the replication link was restored, additional storage
was added to the reserved LUN pool. Unfortunately, that storage was added
in one big chunk (one big LUN; notice the 917.149 GB LUN in the
image above). The reserved LUN pool needs to be made up of smaller
LUNs so that, when multiple LUNs are replicating, different mirrors can use
different reserved LUNs. A single big reserved LUN can only be used
by one replicating datastore.
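The effect of pool granularity described above can be illustrated with a toy model. This is only a sketch of the rule stated in the text (one reserved LUN serves one replicating mirror at a time), not a description of how SnapView actually allocates space internally. The function name `mirrors_served` and the split into eight LUNs are hypothetical, chosen for illustration.

```python
def mirrors_served(reserved_luns_gb, mirror_count):
    """Count how many mirrors can obtain a reserved LUN of their own.

    Toy rule (an assumption taken from the text): a reserved LUN is handed
    to one replicating mirror at a time and is never shared.
    """
    free = list(reserved_luns_gb)
    served = 0
    for _ in range(mirror_count):
        if not free:
            break  # no free reserved LUN left: remaining mirrors go unserved
        free.pop()
        served += 1
    return served


# The misconfigured pool from the post: a single 917.149 GB reserved LUN.
one_big = [917.149]
# Hypothetical alternative: the same capacity split into eight equal LUNs.
many_small = [917.149 / 8] * 8

print(mirrors_served(one_big, 3))     # 1
print(mirrors_served(many_small, 3))  # 3
```

With the same total capacity, the single large LUN serves one of three mirrors in this model, while the split pool serves all three.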
Recommended size for a reserved pool LUN is found by:
i. Calculate 20% of the size of all the LUNs added up
ii. Divide this value by a number less than the ‘Maximum number of reserved LUNs’ for the array
You then create your pool of reserved LUNs of that size
and number.
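The two steps above can be sketched as a small calculation. This is a minimal illustration, assuming sizes in GB and rounding the per-LUN size up to a whole GB. The function name `reserved_lun_plan`, the example LUN sizes, and the array maximum of 50 are all hypothetical; check the real maximum for your array model.

```python
import math


def reserved_lun_plan(source_lun_sizes_gb, lun_count, max_reserved_luns, percent=20):
    """Return (pool_size_gb, per_lun_size_gb) for a reserved LUN pool.

    percent is the share of total source LUN capacity to reserve
    (20 is the guideline from the text; lower may be fine for few changes).
    """
    if not 0 < lun_count < max_reserved_luns:
        raise ValueError("lun_count must be positive and less than the array maximum")
    # Step i: 20% of the size of all the LUNs added up.
    pool_size_gb = sum(source_lun_sizes_gb) * percent / 100
    # Step ii: divide by the chosen number of reserved LUNs, rounding up.
    per_lun_size_gb = math.ceil(pool_size_gb / lun_count)
    return pool_size_gb, per_lun_size_gb


# Hypothetical example: ten 500 GB source LUNs, 20 reserved LUNs,
# on an array assumed to allow up to 50 reserved LUNs.
pool, per_lun = reserved_lun_plan([500] * 10, lun_count=20, max_reserved_luns=50)
print(pool, per_lun)  # 1000.0 50
```

In this example the pool would be built from 20 reserved LUNs of 50 GB each, rather than one 1000 GB LUN.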
Fig. 5: Maximum number of reserved LUNs on CLARiiON AX/CX arrays
The Fix
The fix is straightforward: create more LUNs, then
right-click the Reserved LUN Pool, select Configure, and add or remove
LUNs as required (here we first removed the wrongly sized RLP LUN, deleted it,
and recreated the LUNs in the same RAID pool).
Fig. 6: Reserved LUN Pool – Configure
Finally, you want to re-synchronize the AdminFractured
mirrors by right-clicking the Secondary Image and selecting ‘Synchronize…’
Fig. 7: Secondary Image – Synchronize…
And you can monitor the synchronization progress on the
Remote Mirror Properties – Secondary Image tab.
Fig. 8: Remote Mirror Synchronization in Progress
THE END!
The Final Word
To better understand the RLP, watch how more and more
Reserved LUN Pool LUNs are gradually allocated as the secondary
image synchronizes.
Fig. 9: Three RLP LUNs Allocated
Fig. 10: Eight RLP LUNs Allocated
When synchronization is finished, RLP LUNs get
un-allocated and return to ‘Free’ status.