Sunday, 20 October 2019

Exchange 2013: DAG with 3 Mailbox Servers and 5 Mailbox Databases - All Databases Dismounted and All DAG Copies FailedAndSuspended

It’s been a while since my last Microsoft Exchange post (January 2013, nearly 7 years ago)!

Note: The commands below were all run in the Exchange Management Shell (EMS).

In this post I have an Exchange 2013 lab with 3 mailbox servers in a DAG:

[PS] C:\Windows\system32>Get-ExchangeServer | Ft Name,ServerRole,Edition,AdminDisplayVersion -AutoSize

Name            ServerRole    Edition AdminDisplayVersion
----            ----------    ------- -------------------
MB1  Mailbox, ClientAccess Enterprise Version 15.0 (Build 1367.3)
MB2                Mailbox Enterprise Version 15.0 (Build 1367.3)
MB3                Mailbox Enterprise Version 15.0 (Build 1367.3)
CAS1          ClientAccess Enterprise Version 15.0 (Build 516.32)

Note: MB1 & MB2 are in site A, MB3 is in site B.

And 5 mailbox databases:

[PS] C:\Windows\system32>Get-MailboxDatabase | FT Name,Server -AutoSize

Name Server
---- ------
DB2  MB2
DB3  MB1
DB1  MB3
db4  MB2
db5  MB2

But there are a few problems with the lab:

1) All the databases are dismounted.
2) All the database copies are in a ‘failed and suspended’ state.
3) The active databases are not on the correct mailbox servers (they should be: DB1 on MB1, DB2 & DB4 on MB2, DB3 & DB5 on MB3).

[PS] C:\Windows\system32>Get-MailboxDatabaseCopyStatus -Server MB1 | FT Name,Status,CopyQueueLength,ReplayQueueLength -AutoSize

Name                Status CopyQueueLength ReplayQueueLength
----                ------ --------------- -----------------
DB2\MB1 FailedAndSuspended               0                 0
DB3\MB1         Dismounted               0                 0
DB1\MB1 FailedAndSuspended              31                 0
db4\MB1 FailedAndSuspended               2                 0
db5\MB1 FailedAndSuspended               2                 0

[PS] C:\Windows\system32>Get-MailboxDatabaseCopyStatus -Server MB2 | FT Name,Status,CopyQueueLength,ReplayQueueLength -AutoSize

Name                Status CopyQueueLength ReplayQueueLength
----                ------ --------------- -----------------
DB2\MB2         Dismounted               0                 0
DB3\MB2 FailedAndSuspended               2                 0
DB1\MB2 FailedAndSuspended               0                 0
db5\MB2         Dismounted               0                 0
db4\MB2         Dismounted               0                 0

[PS] C:\Windows\system32>Get-MailboxDatabaseCopyStatus -Server MB3 | FT Name,Status,CopyQueueLength,ReplayQueueLength -AutoSize

Name                Status CopyQueueLength ReplayQueueLength
----                ------ --------------- -----------------
DB2\MB3 FailedAndSuspended               0                 0
DB3\MB3 FailedAndSuspended               2                 0
DB1\MB3         Dismounted               0                 0
db4\MB3 FailedAndSuspended               2                 0
db5\MB3 FailedAndSuspended               2                 0
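The three per-server checks above can also be collapsed into one pass that lists only the copies needing attention. This is a minimal sketch, assuming the lab's server names and treating Healthy and Mounted as the only "good" states (that choice is my assumption, not something Exchange defines):

```powershell
# Assumption: the three server names are the ones used in this lab.
$servers = "MB1","MB2","MB3"
# Statuses treated as "fine" for this check (an assumption; adjust as needed).
$ok = "Healthy","Mounted"

$servers | ForEach-Object {
    Get-MailboxDatabaseCopyStatus -Server $_
} | Where-Object {
    # Convert the status to a string so the comparison also works in a remote EMS session.
    $ok -notcontains "$($_.Status)"
} | Format-Table Name,Status,CopyQueueLength,ReplayQueueLength -AutoSize
```

In the state shown above, this would list every copy, since all of them are either Dismounted or FailedAndSuspended.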

The first thing you’d normally try is to mount the databases. But this fails (I had already confirmed that the path mentioned in the error exists):

[PS] C:\Windows\system32>Mount-Database -Identity DB1
Failed to mount database "DB1". Error: An Active Manager operation failed. Error: The database action failed. Error: Database 'DB1' on server 'MB3' cannot be mounted due to a previous error: The Microsoft Exchange Replication service is unable to create required directory H:\DB1\Logs for DB1\MB3. The database copy status will be set to Failed. Please check the file system permissions. Error: System.IO.DirectoryNotFoundException: Could not find a part of the path 'H:\DB1\Logs'.

So, we mount DB1 (currently active on MB3) using the -Force parameter:

[PS] C:\Windows\system32>Mount-Database -Identity DB1 -Force

The -Force parameter does not work for the other 4 databases:

[PS] C:\Windows\system32>Mount-Database -Identity DB2 -Force
Unable to mount database "DB2" because its database file is missing. You can recover the database by using a healthy database copy. If none of its copies are healthy, use the Remove-MailboxDatabaseCopy cmdlet to remove the copies and then attempt the mount operation again.

When I check the mailbox servers, the database files are indeed missing: the disks/LUNs that hold them are not visible. Further checking shows the iSCSI is “reconnecting”. It seems we have a storage issue, so I reboot the storage array.
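The storage check can be done from PowerShell on each mailbox server as well as from the GUI. The following is a rough sketch, assuming the servers run Windows Server 2012 or later (where the iSCSI and Storage cmdlets are available) and that the database LUNs are presented over iSCSI, as they are in this lab:

```powershell
# Run locally on each mailbox server.
# Assumption: Windows Server 2012 or later, so these cmdlets are available.

# Are the iSCSI sessions to the storage array connected?
Get-IscsiSession |
    Format-Table TargetNodeAddress,IsConnected,IsPersistent -AutoSize

# Are the iSCSI-attached disks present and online?
Get-Disk | Where-Object { $_.BusType -eq "iSCSI" } |
    Format-Table Number,FriendlyName,OperationalStatus,IsOffline -AutoSize
```

If the second command returns nothing, or shows the disks as offline, the "database file is missing" error is a symptom of the storage problem and not a problem with Exchange itself.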

After rebooting the storage array and confirming the disks are visible again, I mount the other 4 databases (the -Force parameter is required; the mount does not work without it):

[PS] C:\Windows\system32>Mount-Database -Identity DB2 -Force
[PS] C:\Windows\system32>Mount-Database -Identity DB3 -Force
[PS] C:\Windows\system32>Mount-Database -Identity DB4 -Force
[PS] C:\Windows\system32>Mount-Database -Identity DB5 -Force
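A quick way to confirm the mounts took effect is to query the databases with the -Status switch, which is needed for the Mounted property to be populated. This is just a sketch of the check, not output captured from the lab:

```powershell
# -Status asks Exchange for live state, which populates the Mounted property.
Get-MailboxDatabase -Status | Format-Table Name,Server,Mounted -AutoSize
```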

Now to fix the FailedAndSuspended DAG copies. We use the -BeginSeed parameter because we don’t want to wait for the reseed to complete before getting the prompt back. We also need the -DeleteExistingFiles parameter; without it, we receive an error that log files already exist in the transaction log path for the database:

The seeding operation failed. Error: An error occurred while running prerequisite checks. Error: Log files exist in 'I:\DB2\Logs'. You must remove them before database seeding or reseeding can be performed.

Only one seeding operation per source database can run at a time. If you try to start a second reseed, you get the error:

The seeding operation failed. Error: An error occurred while running prerequisite checks. Error: Seeding for this database is currently in progress.

To update the first mailbox database copies:

Update-MailboxDatabaseCopy -Identity DB2\MB1 -SourceServer MB2 -DeleteExistingFiles -BeginSeed

Update-MailboxDatabaseCopy -Identity DB1\MB1 -SourceServer MB3 -DeleteExistingFiles -BeginSeed
Update-MailboxDatabaseCopy -Identity db4\MB1 -SourceServer MB2 -DeleteExistingFiles -BeginSeed
Update-MailboxDatabaseCopy -Identity db5\MB1 -SourceServer MB2 -DeleteExistingFiles -BeginSeed

Update-MailboxDatabaseCopy -Identity DB3\MB2 -SourceServer MB1 -DeleteExistingFiles -BeginSeed

Wait for the above to finish (check with Get-MailboxDatabaseCopyStatus -Server <ServerName>), then reseed the second copies (at this point you could also reseed from a healthy passive copy; initially we only had the active database to seed from):

Update-MailboxDatabaseCopy -Identity DB1\MB2 -SourceServer MB3 -DeleteExistingFiles -BeginSeed

Update-MailboxDatabaseCopy -Identity DB2\MB3 -SourceServer MB2 -DeleteExistingFiles -BeginSeed
Update-MailboxDatabaseCopy -Identity DB3\MB3 -SourceServer MB1 -DeleteExistingFiles -BeginSeed
Update-MailboxDatabaseCopy -Identity db4\MB3 -SourceServer MB2 -DeleteExistingFiles -BeginSeed
Update-MailboxDatabaseCopy -Identity db5\MB3 -SourceServer MB2 -DeleteExistingFiles -BeginSeed
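The two manual passes above could also be scripted. The sketch below starts one reseed per database, which fits the one-seed-per-source-database behaviour described earlier; running it again once seeding has finished would pick up the remaining copies. It assumes the lab's server names, splits the copy name (for example DB2\MB1) on the backslash to get the database name, and uses the server that currently hosts the active copy as the source. The -WhatIf switch makes it a dry run.

```powershell
# Assumption: server names as used in this lab.
$servers = "MB1","MB2","MB3"

# Collect every copy that still needs a reseed.
$failed = $servers | ForEach-Object {
    Get-MailboxDatabaseCopyStatus -Server $_
} | Where-Object { "$($_.Status)" -eq "FailedAndSuspended" }

# Group by database name (the part of "DB2\MB1" before the backslash)
# and take one copy per database for this pass.
$failed | Group-Object { $_.Name.Split('\')[0] } | ForEach-Object {
    $copy = $_.Group | Select-Object -First 1

    # Source = the server currently hosting the active copy of this database.
    $source = (Get-MailboxDatabase -Identity $_.Name).Server.Name

    # -WhatIf makes this a dry run; remove it to actually start the reseed.
    Update-MailboxDatabaseCopy -Identity $copy.Name -SourceServer $source `
        -DeleteExistingFiles -BeginSeed -WhatIf
}
```

Starting only one copy per database per pass is the conservative choice here; it mirrors what was done by hand above.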

Tip: You can use the following in the EMS to list the copies that are still seeding:
"MB1","MB2","MB3" | Foreach{Get-MailboxDatabaseCopyStatus -Server $_ | where{$_.status -eq "seeding"}}
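The same check can be wrapped in a polling loop so you don’t have to re-run it by hand. A minimal sketch, assuming the lab's server names; the 60-second interval is an arbitrary choice:

```powershell
# Assumption: server names as used in this lab.
$servers = "MB1","MB2","MB3"

do {
    # Find any copies that are still seeding.
    $seeding = $servers | ForEach-Object {
        Get-MailboxDatabaseCopyStatus -Server $_
    } | Where-Object { "$($_.Status)" -eq "Seeding" }

    if ($seeding) {
        $seeding | Format-Table Name,Status -AutoSize | Out-Host
        Start-Sleep -Seconds 60   # polling interval is an arbitrary choice
    }
} while ($seeding)

Write-Host "No copies are seeding."
```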

Note: Seeding can take a long time, which is why you really don’t want to rely on Exchange mechanisms to reseed. A product like NetApp SnapCenter, which can leverage storage snapshots for a faster reseed, makes a lot of sense.

Finally, we need to get the active copies onto the correct mailbox servers.

[PS] C:\Windows\system32>"MB1","MB2","MB3" | Foreach{Get-MailboxDatabaseCopyStatus -Server $_ | FT Name,Status,CopyQueueLength,ReplayQueueLength,ActivationPreference -AutoSize}

Name     Status CopyQueueLength ReplayQueueLength ActivationPreference
----     ------ --------------- ----------------- --------------------
DB2\MB1 Healthy               0                 0                    3
DB3\MB1 Mounted               0                 0                    2
DB1\MB1 Mounted               0                 0                    1
db4\MB1 Healthy               0                 0                    3
db5\MB1 Healthy               0                 0                    3

Name     Status CopyQueueLength ReplayQueueLength ActivationPreference
----     ------ --------------- ----------------- --------------------
DB2\MB2 Mounted               0                 0                    1
DB3\MB2 Healthy               0                 0                    3
DB1\MB2 Healthy               0                 0                    2
db5\MB2 Healthy               0                 0                    2
db4\MB2 Mounted               0                 0                    1

Name     Status CopyQueueLength ReplayQueueLength ActivationPreference
----     ------ --------------- ----------------- --------------------
DB2\MB3 Healthy               0                 0                    2
DB3\MB3 Healthy               0                 0                    1
DB1\MB3 Healthy               0                 0                    3
db4\MB3 Healthy               0                 0                    2
db5\MB3 Mounted               0                 0                    1

From the above, you can see that DB3 should be on MB3 (its copy there has ActivationPreference 1), but it is mounted on MB1 instead.

[PS] C:\Windows\system32>Move-ActiveMailboxDatabase DB3 -ActivateOnServer MB3

Identity Active  Active Status    NumberOf RecoveryPoint MountStatus MountStatus
         Server  Server           LogsLost Objective     AtMoveStart AtMoveEnd
         AtStart AtEnd
-------- ------- ------ ------    -------- ------------- ----------- -----------
DB3      mb1     mb3    Succeeded 0        10/20/2019... Mounted     Mounted

[PS] C:\Windows\system32>get-mailboxdatabase | FT -AutoSize

Name Server Recovery ReplicationType
---- ------ -------- ---------------
DB2  MB2    False    Remote
DB3  MB3    False    Remote
DB1  MB1    False    Remote
db4  MB2    False    Remote
db5  MB3    False    Remote
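With more databases, spotting misplaced active copies by eye gets tedious. The sketch below generalises the check: it looks for Mounted copies whose ActivationPreference is not 1 and proposes a move to the server holding the preference-1 copy, provided that copy is Healthy. It relies only on the Name, Status and ActivationPreference columns shown in the output above, assumes the lab's server names, and uses -WhatIf so nothing is moved until you remove it.

```powershell
# Assumption: server names as used in this lab.
$servers = "MB1","MB2","MB3"

# All copies on all servers; Name has the form "DB3\MB1".
$all = $servers | ForEach-Object { Get-MailboxDatabaseCopyStatus -Server $_ }

# Active copies that are not on their preferred (ActivationPreference 1) server.
$misplaced = $all | Where-Object {
    "$($_.Status)" -eq "Mounted" -and $_.ActivationPreference -ne 1
}

foreach ($copy in $misplaced) {
    $dbName = $copy.Name.Split('\')[0]

    # The copy of the same database with ActivationPreference 1.
    $preferred = $all | Where-Object {
        $_.Name.Split('\')[0] -eq $dbName -and $_.ActivationPreference -eq 1
    }

    # Only propose a move if the preferred copy exists and is healthy.
    if ($preferred -and "$($preferred.Status)" -eq "Healthy") {
        $target = $preferred.Name.Split('\')[1]
        # -WhatIf makes this a dry run; remove it to actually move the database.
        Move-ActiveMailboxDatabase -Identity $dbName -ActivateOnServer $target -WhatIf
    }
}
```

Against the status output shown earlier, the only candidate would be DB3 (mounted on MB1 with preference 2, preferred copy on MB3).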

