With synchronous mirroring, data sent to a VDisk is also mirrored to another VDisk on another QUADStor machine. Mirroring is synchronous, which means a write operation is complete only when both VDisks have acknowledged the write. Unlike traditional synchronous mirroring deployments, QUADStor's implementation is dedupe-aware. For example, if a block is received by node 'A' and needs to be sent to node 'B', the data block needs to travel the network from 'A' to 'B' only if it is not a duplicate of a block already existing on 'B'.
Benefits of Synchronous Mirroring
1. Unlike shared storage high availability, the underlying physical storage is no longer a single point of failure
2. High Availability
Drawbacks of Synchronous Mirroring
1. Increased latency during write operations
2. Twice the storage requirements
Mirroring Setup
For the purpose of understanding the following documentation, consider the following scenario
Node 'A' and Node 'B' are two machines between which synchronous mirroring needs to be setup.
Node 'A' has a VDisk 'M' for which synchronous mirroring needs to be setup.
Setting up Node 'A' and Node 'B'
On node 'A' and 'B' install the QUADStor Core and ITF packages
Create a file called /quadstor/etc/ndrecv.conf and add the following line
RecvAddr=<recv listening ipaddr>
For example
RecvAddr=10.0.13.6
In the above example, the system will bind to 10.0.13.6 for mirror data. This has to be done on both node 'A' and node 'B'. Node 'A' will bind to its own RecvAddr and connect to Node 'B's RecvAddr, and vice versa.
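For example, assuming Node 'A' uses 10.0.13.6 and Node 'B' uses 10.0.13.7 (example addresses, adjust for your network), /quadstor/etc/ndrecv.conf on Node 'A' would contain
RecvAddr=10.0.13.6
and /quadstor/etc/ndrecv.conf on Node 'B' would contain
RecvAddr=10.0.13.7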
Ensure that the following ports are allowed for TCP traffic in your firewall configuration
9953
9955
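For example, on a RHEL/CentOS 6.x node using iptables, the ports could be opened as follows (a hedged sketch; adapt to the firewall tooling your distribution actually uses)
iptables -A INPUT -p tcp --dport 9953 -j ACCEPT
iptables -A INPUT -p tcp --dport 9955 -j ACCEPT
service iptables save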
Restart the quadstor service
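On most Linux distributions this is typically done as follows (the exact service name and init mechanism may vary on your system)
service quadstor restart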
Ensure that physical storage is configured on both nodes
Configuring a VDisk for mirroring
1. Click on the "Virtual Disks" menu
2. For the VDisk click on its "Modify" link
3. Under the "Mirroring Configuration" section enter the destination VDisk name, destination pool for the VDisk, and the remote node's IP Address. If configuring on Node 'A', the remote node's IP Address will be the RecvAddr of Node 'B' and vice versa, if configuring on Node 'B'.
4. Click on Submit
Once successfully configured, the current mirroring state for the VDisk is shown. From this point the two VDisks are attached for mirroring. If data is received by VDisk 'M' on Node 'A' it is also sent to VDisk 'M' on Node 'B', and vice versa.
One of the VDisks assumes the Master role and the other the Slave (standby) role. This is not to be confused with Active-Passive mirroring; an Active-Active configuration is supported with our synchronous mirroring and is described later. The role of a VDisk determines which VDisk has control over metadata-related decisions. If the master VDisk fails and the slave VDisk cannot take over as master, the slave can no longer accept writes on its node. However, if the slave VDisk fails, the master VDisk can still write data on its node. Once the slave VDisk is back online, data is synced from the master VDisk to the slave VDisk.
Node fencing
For FreeBSD and SLES, fencing can be done using stonith (from the heartbeat package); for RHEL/CentOS and Debian, use the fence-agents package
Installing Fence Agents
On RHEL/CentOS 6.x
yum install fence-agents
On Debian 7.x
apt-get install fence-agents
For a list of possible fence agents for your hardware please refer to https://access.redhat.com/site/articles/28603. Refer to the package documentation for configuring and manually fencing a node.
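As a hedged illustration (not part of the original documentation), a node whose BMC supports IPMI could be checked and manually fenced with the fence_ipmilan agent; the BMC address and credentials below are hypothetical
fence_ipmilan -a 10.0.13.100 -l admin -p secret -o status
fence_ipmilan -a 10.0.13.100 -l admin -p secret -o reboot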
In order for a slave VDisk to continue serving writes when the master VDisk has failed, the node containing the master VDisk needs to be fenced first. Fencing can be configured with the qmirrorcheck tool as follows
/quadstor/bin/qmirrorcheck -t fence -r <mirror IPaddress> -v 'fence command'
-r is the RecvAddr of the peer node. For Node 'A', this will be the RecvAddr of Node 'B'
-v 'fence command' is the fence command to run. Note that the command should be specified with single or double quotes
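For example, assuming Node 'B' should fence Node 'A' (RecvAddr 10.0.13.6) using the hypothetical fence_ipmilan command from the earlier illustration, the rule on Node 'B' might look like
/quadstor/bin/qmirrorcheck -t fence -r 10.0.13.6 -v 'fence_ipmilan -a 10.0.13.100 -l admin -p secret -o reboot'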
Let us suppose VDisk 'M' has the slave role on Node 'B' and the master role on Node 'A'. Node 'A' has crashed and data needs to be written to VDisk 'M'. Node 'B' detects that node 'A' is unresponsive and invokes the fence command which corresponds to the RecvAddr of Node 'A'. If the fence command is successful, VDisk 'M' on node 'B' takes over the master role and write operations for VDisk 'M' continue uninterrupted. If the fence command fails, to avoid a split-brain condition write operations to VDisk 'M' on node 'B' will fail until VDisk 'M' on node 'A' is back online.
qmirrorcheck also supports a -t ignore -r <mirror IPaddress> option. This means that on a peer failure no fencing is performed. This should only be used when the nodes are part of a cluster and the failed node will in any case have been fenced.
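For example, to skip fencing for the peer with RecvAddr 10.0.13.6 (a hypothetical address)
/quadstor/bin/qmirrorcheck -t ignore -r 10.0.13.6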
Manual Switching of Roles
It is now possible to manually switch the role of a VDisk using the qsync program (/quadstor/bin/qsync). A few useful options that can be passed to the program are
/quadstor/bin/qsync -l lists all VDisks with synchronous mirroring configured. The current role of each VDisk (master/slave), the current status etc. are also displayed
In order to switch the role of a VDisk run the following command
/quadstor/bin/qsync -s <VDisk Name> -m <Mirror Role>
Where:
VDisk Name is the name of the VDisk for which the role needs to be changed
Mirror Role is the new role of the VDisk. It can either be "master" or "slave" (without the quotes)
For example
/quadstor/bin/qsync -s FOO -m master
In the above example the role of VDisk FOO will switch over to master if the command is successful.
In order to switch the roles of all VDisks, run the qsync command with the -f option instead of the -s option
For example to switch all VDisks to the master role
/quadstor/bin/qsync -f -m master
Manual Failover
The qsync program is useful in implementing manual failover. In order to configure manual failover, do the following on both nodes
Clear any previous qmirrorcheck rules for the same destination IP address by running
/quadstor/bin/qmirrorcheck -x -r <mirror ipaddress>
Where <mirror ipaddress> is the ip address of the peer.
Run qmirrorcheck with a -t manual option on both nodes.
/quadstor/bin/qmirrorcheck -a -t manual -r <mirror ipaddress>
Where <mirror ipaddress> is the ip address of the peer.
For example, if mirroring is to be configured between 10.0.13.3 and 10.0.13.4:
On 10.0.13.3 the commands to be run are
/quadstor/bin/qmirrorcheck -x -r 10.0.13.4
/quadstor/bin/qmirrorcheck -a -t manual -r 10.0.13.4
On 10.0.13.4 the commands to be run are
/quadstor/bin/qmirrorcheck -x -r 10.0.13.3
/quadstor/bin/qmirrorcheck -a -t manual -r 10.0.13.3
Let us suppose that VDisk FOO has been configured for mirroring between 10.0.13.3 and 10.0.13.4, and that the current role of FOO is master on 10.0.13.3 and slave on 10.0.13.4.
If node 10.0.13.4 fails and a write for FOO is then received on 10.0.13.3, the write will complete successfully, since -t manual has been configured and FOO is master on 10.0.13.3.
However, if node 10.0.13.3 fails and a write for FOO is then received on 10.0.13.4, the write will be terminated and an error response sent back to the client, since the current role of FOO on 10.0.13.4 is slave. So that 10.0.13.4 can continue to accept writes to FOO, the qsync command can then be run to manually switch the role of FOO on 10.0.13.4 to master
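For example, run the following on 10.0.13.4
/quadstor/bin/qsync -s FOO -m master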
This can be avoided if fencing has not been configured at all, as described below. However, that would also mean that if FOO on 10.0.13.3 is master and 10.0.13.4 fails, all writes to 10.0.13.3 from then on until 10.0.13.4 is back online will fail
Fencing Failures
In order to continue writing data to a VDisk in the event of a peer failure, fencing must succeed. Newer versions of the software will refuse to complete a write if fencing of a non-responsive peer fails or if fencing has not been configured. The following is a summary of how fencing failures are handled
1. VDisk with slave role receives a write command, peer node may have failed and command to fence the peer node has not been configured
Write command will be terminated and error response sent to client
2. VDisk with slave role receives a write command, peer node may have failed and 'qmirrorcheck -t ignore' specified for the peer address
Write command will succeed, VDisk takes over master role
3. VDisk with slave role receives a write command, peer node may have failed and 'qmirrorcheck -t manual' specified for the peer
Write commands will be terminated until the role is manually switched to master or the peer is back online
4. VDisk with master role receives a write command, peer node may have failed and command to fence the peer node has not been configured
Write command will be terminated, VDisk switches over to Slave role and will require manually switching over to master role
5. VDisk with master role receives a write command, peer node may have failed and 'qmirrorcheck -t ignore' specified for the peer address
Write command will succeed, VDisk continues with master role
6. VDisk with master role receives a write command, peer node may have failed and 'qmirrorcheck -t manual' specified for the peer address
Write command will succeed, VDisk continues with master role
Active-Active configuration
With synchronous mirroring it would seem intuitive to write to VDisk 'M' using both paths to Node 'A' and Node 'B'. If Node 'A' fails, the client continues to write to Node 'B' and vice-versa. This is an Active-Active configuration.
In order to use a VDisk in an active-active configuration the following are the requirements
1. The clients must use SCSI SPC-3 Persistent Reservations or SCSI COMPARE AND WRITE (Atomic Test Set) commands. Legacy SCSI RESERVE/RELEASE commands are not supported (see the verification example after this list)
2. Fencing must be configured on both the nodes
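As a hedged verification example (not from the original documentation), the sg_persist tool from the sg3_utils package can be used on a Linux client to confirm that the VDisk accepts persistent reservation commands; the device /dev/sdX and the key 0x1234 below are placeholders
sg_persist --out --register --param-sark=0x1234 /dev/sdX
sg_persist --in --read-keys /dev/sdX
The first command registers a reservation key with the VDisk and the second lists the registered keys, which should include the key just registered.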
For a simple example visit http://www.quadstor.com/tech-library/141-high-availability-with-synchronous-mirroring-tutorial.html