Some customers have adopted block-level replication of the storage disks (LUNs) as their DR solution for SQL Server databases. Remote mirroring of storage disks sounds great – it is transparent to the application, replicates every block as-is (so very little reconfiguration is required on the other side), scales fairly linearly regardless of the number of databases, and so on. Sounds better than baked bread, huh? However, there are some finer points we would like to share from our experiences in the field.
Our KB articles sometimes read like legal documentation, and rightly so – they are in many cases the ‘rulings’ and ‘by-laws’ by which customers and we reach conclusions in tricky situations. So there are some KB articles which are a must-read in the case of storage.
KB 913945 states that Microsoft does not officially certify that specific third-party products will work with SQL Server. The most important part of that article is here:
Microsoft support documents may use the phrase "not supported by Microsoft" with regard to third-party products. This phrase means that the first line of support for such a product should be the third-party product vendor rather than Microsoft.
Next we come to KB 910716 which specifically talks about remote mirroring a.k.a. SAN replication. Some excerpts from this document:
- This article does not endorse any specific vendor’s solution.
- Work with your hardware vendor and your storage vendor to make sure that the remote mirroring solution that you want to use supports the I/O requirements of SQL Server.
- If your solution meets the requirements that are discussed in this article, Microsoft Customer Support Services will provide support for issues that are not related to storage devices.
- To obtain support from Microsoft Customer Support Services for issues that involve transactional throughput, you must be able to reproduce the issue in a mirror-free environment.
- To obtain support for issues that involve a storage device, see your storage vendor.
- To obtain support for issues that involve third-party replication technology, see your technology vendor.
Lastly, we have KB 943984, which talks about using storage mirroring to support a so-called geo-cluster. In such cases, remote disk mirroring is used to provide a consistent copy of the clustered storage disks on a remote site. This article has one very important line:
Please contact the data replication vendor for any issues related to accessing data on failover.
By now, you must be feeling very jittery about using such remote mirroring. But fear not, there are some basic precautions you can take to be sure.
SQL I/O Requirements
The fundamental issue is that the database and log files are now in two places rather than one. That means the basic I/O requirements that SQL Server assumes or requires must be met in both places. These requirements are described in the age-old (but still relevant) SQL I/O Basics Part 1 document (FYI, the Part 2 document is also a very good read) and are given below:
- Stable Media: Any system running SQL Server must provide stable media capabilities for the database log and data files. If the system has something like a non-battery backed cache or perhaps caching enabled on disk, it is not safe for SQL Server installations. Ensure that your system has proper caching attributes to secure the data to stable media.
- Write Ordering: SQL Server is designed to maintain the WAL protocol as described earlier in this paper. The underlying environment plays a pivotal role in the maintenance of the protocol. Write ordering must be maintained. With any system, but specifically for remote mirroring implementations, write ordering is critical to maintain a point-in-time view of the data. Many remote mirroring vendors implement generation logic to maintain physical write order even when the write operations are transmitted between remote caches with alternate protocols.
- Torn I/O Prevention (Split I/Os): For SQL Server, the 8-KB blocks need to be handled as a single block of data. Systems that split I/Os must be configured to avoid splitting I/O requests into smaller blocks. Some dynamic disk and volume managers can be configured with block and cluster sizes that are smaller than 8 KB or not on 8-KB boundaries. These systems may split the SQL Server I/O request across physical system components. In doing so, they can introduce the possibility of torn pages and they can destroy true write ordering precedence.
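To make the split I/O point concrete, here is a minimal sketch of the arithmetic involved. It is only an illustration: the function names are hypothetical, the model assumes page writes are laid out back to back from a given starting offset, and it says nothing about how any particular volume manager or storage array actually behaves.

```python
PAGE_SIZE = 8 * 1024  # SQL Server data page size in bytes


def is_split(offset: int, length: int, block_size: int) -> bool:
    """Return True if the write [offset, offset + length) crosses a block boundary."""
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size
    return first_block != last_block


def count_split_pages(start_offset: int, block_size: int, pages: int = 1024) -> int:
    """Count how many of the first `pages` consecutive 8-KB page writes are split.

    start_offset is the (hypothetical) byte offset at which page 0 lands on the
    underlying device, and block_size is the unit the device never splits.
    """
    return sum(
        is_split(start_offset + n * PAGE_SIZE, PAGE_SIZE, block_size)
        for n in range(pages)
    )


if __name__ == "__main__":
    # Aligned start, 64-KB blocks: no page crosses a boundary.
    print(count_split_pages(0, 64 * 1024))         # 0
    # Start at 63 sectors (31.5 KB), 64-KB blocks: one page in eight is split.
    print(count_split_pages(63 * 512, 64 * 1024))  # 128
    # 4-KB blocks: every 8-KB page spans two blocks.
    print(count_split_pages(0, 4 * 1024))          # 1024
```

A result of zero only says the layout does not force a split in this simplified model; it is not a substitute for testing the real subsystem.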
How do I test for these?
The easiest way to test if a disk subsystem (standalone or mirrored) meets the above requirements is to simulate SQL Server I/O patterns using the SQLIOSim utility. This utility actually replaces the SQLIOStress utility described in Bob Dorr’s whitepaper mentioned previously.
Write ordering implementations
There are some reference steps you can check for remote mirroring and SQL Server. I’ll keep adding to these as I come across more.
For example, this EMC paper talks about using something called Consistency Groups when using their SRDF technology to remote mirror a group of LUNs. Consistency Groups are used by EMC to ensure write-ordering, which, as you might have read above, is a very important criterion, especially when you have different sets of LUNs for log and data files.
Note: other manufacturers may have their own equivalent to ensure write ordering. I mention EMC’s solution purely as a sample. It is not to be taken in any way as a recommendation or guarantee of any sort.
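To see why write ordering across LUNs matters, here is a toy model of my own. It is not how EMC or any other vendor implements this, and all the names in it are made up for illustration. It assumes each transaction writes its log record first and its data page second (the WAL rule), and compares a remote site that receives each LUN's writes independently with one that receives a single ordered stream for the whole group.

```python
from typing import List, Set, Tuple

Write = Tuple[str, int]  # (LUN name, transaction id); purely illustrative


def primary_writes(transactions: int) -> List[Write]:
    """Writes in the order the primary issues them: log first, then data."""
    writes: List[Write] = []
    for txn in range(transactions):
        writes.append(("log", txn))
        writes.append(("data", txn))
    return writes


def remote_without_group(writes: List[Write], log_applied: int, data_applied: int) -> Set[Write]:
    """Each LUN is replicated on its own, so each can lag by a different amount."""
    log = [w for w in writes if w[0] == "log"][:log_applied]
    data = [w for w in writes if w[0] == "data"][:data_applied]
    return set(log + data)


def remote_with_group(writes: List[Write], applied: int) -> Set[Write]:
    """All LUNs share one ordered stream, so the remote holds a prefix of it."""
    return set(writes[:applied])


def wal_holds(remote: Set[Write]) -> bool:
    """Every data page on the remote must have its log record there too."""
    return all(("log", txn) in remote for lun, txn in remote if lun == "data")


if __name__ == "__main__":
    writes = primary_writes(5)
    # The data LUN is ahead of the log LUN: pages exist with no log record.
    print(wal_holds(remote_without_group(writes, log_applied=2, data_applied=4)))  # False
    # Any cut of the single ordered stream keeps log ahead of data.
    print(all(wal_holds(remote_with_group(writes, k)) for k in range(len(writes) + 1)))  # True
```

The failing case is exactly the one to worry about on failover: the remote copy has data pages whose log records never arrived, so recovery on the DR side cannot produce a consistent database.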
Hyper-V in Windows Server 2012 offers a feature called Hyper-V Replica, which is an easy way to provide for DR of a VM. SQL Server is supported in such scenarios – see KB 956893 for details. The scenario is supported as long as the EnableWriteOrderPreservationAcrossDisks flag is set.
TEMPDB and remote mirroring / storage replication
One thing to definitely keep in mind when using remote mirroring is where TEMPDB is placed. If you are replicating the LUNs where the TEMPDB data / log files are placed, you are asking for trouble! A lot of bandwidth on your storage replication channel will be wasted moving these bits across, when in reality TEMPDB will be recreated on startup of the DR instance.
One good thing here is that SQL Server 2012 allows TEMPDB data / log files to be placed on local disks even for clustered instances – this is a big help, as it makes it easy to keep TEMPDB out of the remote mirroring setup.
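As a quick sanity check, you could compare the TEMPDB file paths against the list of drives that are being replicated. The sketch below is a hypothetical helper, not a Microsoft tool: it assumes you have already collected the TEMPDB physical file names yourself (for example from the sys.master_files catalog view) and that replicated volumes are identified by drive letter only, so mount points are not handled.

```python
import ntpath
from typing import Iterable, List


def replicated_tempdb_files(tempdb_files: Iterable[str], replicated_drives: Iterable[str]) -> List[str]:
    """Return the TEMPDB files that live on a replicated drive.

    tempdb_files: Windows paths of the TEMPDB data / log files.
    replicated_drives: drive letters such as "E:" or "E:\\" that are part of
    the storage replication setup (an input you supply, not something detected).
    """
    replicated = {drive.upper().rstrip("\\") for drive in replicated_drives}
    return [
        path
        for path in tempdb_files
        if ntpath.splitdrive(path)[0].upper() in replicated
    ]


if __name__ == "__main__":
    # Sample paths and drive letters, made up for illustration.
    files = [r"T:\tempdb\tempdb.mdf", r"E:\sqllog\templog.ldf"]
    offenders = replicated_tempdb_files(files, replicated_drives=["E:", "F:"])
    for path in offenders:
        print(f"TEMPDB file on a replicated drive: {path}")
```

An empty result means none of the listed files sit on the drives you named; it is only as good as the drive list you feed it.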
Wrap-up cheat sheet
Here is a summary of the key points to keep in mind when implementing storage replication / remote disk mirroring:
- Check with the manufacturer if they have addressed the SQL Server storage requirements, specifically the write-ordering implementation
- Test with SQLIOSim well before deployment of the actual SQL instances
- Always check that your HBA drivers and SAN firmware / software are up to date
- Ensure that TEMPDB is not being replicated using disk mirroring across the storage replication channel
That’s it for now. Hope you enjoyed this post! Please leave your feedback / questions in the comments, or use the ‘Email Author’ link on the top right section of the blog to reach me.