Windows Server 2012 – Getting Started With Failover Clustering

Failover clustering is a bit intimidating at first. However, once you get started it’s not too bad (like most things).

Overview

  1. Configure VMs
  2. Configure Shared Storage
  3. Configure Nodes
  4. Create the Cluster

Configure VMs

You will need the following VMs to do this in a lab:

  1. iscsi1. This server will act as the iSCSI target.
  2. clusternode1.
  3. clusternode2.

I’ll leave the names and addressing up to you. For help creating VMs in Hyper-V, see my previous blog post Getting Started with Hyper-V.
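
If you’d rather script the lab VMs than click through the wizard, something like this works from PowerShell on the Hyper-V host. This is just a rough sketch; the memory size, VHDX path, and switch name (“LabSwitch”) are assumptions you’ll want to adjust for your own lab.

#Create the three lab VMs (adjust memory, paths, and switch name for your host)
foreach ($name in "iscsi1","clusternode1","clusternode2") {
  New-VM -Name $name -MemoryStartupBytes 2GB -NewVHDPath "D:\Hyper-V\$name.vhdx" -NewVHDSizeBytes 60GB -SwitchName "LabSwitch"
}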

Configure Shared Storage

You’ll need to create the following volumes and connect them to both cluster nodes.

  1. cluster1-quorum, 5GB
  2. cluster1-disk1, 10GB

For help with configuring an iSCSI target server, see my previous blog post Windows Server 2012 – Getting Started With the iSCSI Target Server.
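
If you want to script these on the target server instead, a quick sketch using the cmdlets covered in that post looks like the following. The path (C:\iscsimount) and target name (cluster1-target) are placeholders; adjust them for your lab, and remember you’ll still need to map each node’s IQN to the target as described in that post.

#On the iSCSI target server: create the two shared volumes and map them to a target
mkdir C:\iscsimount
New-IscsiVirtualDisk -Path C:\iscsimount\cluster1-quorum.vhdx -SizeBytes 5GB
New-IscsiVirtualDisk -Path C:\iscsimount\cluster1-disk1.vhdx -SizeBytes 10GB
New-IscsiServerTarget cluster1-target
Add-IscsiVirtualDiskTargetMapping cluster1-target C:\iscsimount\cluster1-quorum.vhdx
Add-IscsiVirtualDiskTargetMapping cluster1-target C:\iscsimount\cluster1-disk1.vhdx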

Configure the Nodes

On each node, run the following commands. You’ll need to read through the code and replace the placeholders as needed; for example, “iscsi target fqdn” needs to be replaced with the address of your iSCSI target server.

#Install Roles
Install-WindowsFeature failover-clustering,multipath-io -includeManagementTools

#Configure iSCSI Service
Start-Service msiscsi
Set-Service msiscsi -startupType "Automatic"

#Connect to the Target
New-iSCSITargetPortal -TargetPortalAddress "iscsi target fqdn"
$nodeAddress = (Get-IscsiTarget).NodeAddress
Connect-iSCSITarget -NodeAddress $nodeAddress -IsPersistent $true

#Configure Multipath Settings
Enable-MSDSMAutomaticClaim -BusType iSCSI #(computer will reboot, possibly bsod)
Restart-Computer

##Once the reboot is complete, continue with the code below.

Get-MSDSMAutomaticClaimSettings #visually confirm that iscsi = $true
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy RR

#Online and Initialize the Disks
Get-Disk | ?{$_.FriendlyName -like "MSFT Virtual HD*" -and ($_.IsReadOnly -eq $true -or $_.isOffline -eq $true)} | % {Set-Disk -Number $_.Number -IsOffline $false}
Get-Disk | ?{$_.FriendlyName -like "MSFT Virtual HD*" -and ($_.IsReadOnly -eq $true -or $_.isOffline -eq $true)} | % {Set-Disk -Number $_.Number -IsReadOnly $false; Initialize-Disk -Number $_.number -partitionStyle GPT}

Now, choose one cluster node to work on. Log in and run the following commands:

#Format the disks and assign a drive letter
Get-Disk | ?{$_.FriendlyName -like "MSFT Virtual HD*" -and $_.partitionstyle -eq "RAW"} | % {
  Initialize-Disk -number $_.number -partitionStyle GPT
  New-Partition -disknumber $_.number -useMaximumSize -AssignDriveLetter
  $driveLetter = $null
  $driveLetter = (Get-Partition -DiskNumber $_.number | ?{$_.type -eq "Basic"}).DriveLetter
  Format-Volume -DriveLetter $driveLetter -FileSystem NTFS
}

#Run a test to see if the cluster can be created.
Test-Cluster -node clusternode1,clusternode2 -ReportName C:\Install_Files\cluster1.report.html
#stop here and review report. Some warnings are ok.

#create the new cluster
New-Cluster -name cluster1 -node clusternode1,clusternode2 -staticAddress 10.10.10.20 -NoStorage -AdministrativeAccessPoint ActiveDirectoryAndDns

#add the volumes to the cluster
Get-ClusterAvailableDisk -cluster cluster1 | Add-ClusterDisk

#run this to find the cluster disks. Choose one to be the quorum.
Get-ClusterResource

#configure quorum settings. change "cluster disk 1" to whatever disk you want to be the quorum.
Set-ClusterQuorum -Cluster Cluster1 -NodeAndDiskMajority "Cluster Disk 1"
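
Optionally, you can sanity-check the new cluster from PowerShell before calling it done. A minimal check (it makes no changes) looks like this:

#Optional sanity check: confirm the cluster, its nodes, and its disk resources
Get-Cluster -Name cluster1
Get-ClusterNode -Cluster cluster1
Get-ClusterResource -Cluster cluster1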

And, that’s it. You should now have a green cluster to play with.

Windows Server 2012 – Getting Started With the iSCSI Target Server

It’s really great that Windows Server 2012 can now act as an iSCSI target. Here’s what I’ve learned.

An important note is that Windows Server will not share out a raw disk. It will only share virtual disks (VHDX files) sitting on a file server. If you’re used to iSCSI via other vendors, this will freak you out a little bit. Don’t worry — it’s different but it works well.

Overview:

  1. Install the Role
  2. Create a virtual disk.
  3. Create a new iSCSI target, and map the virtual disks to the target.
  4. Map initiator IQNs to targets.

Install the Role

It’s easy:

Install-WindowsFeature FS-FileServer,FS-VSS-Agent,FS-iSCSITarget-Server,iSCSITarget-VSS-VDS,Storage-Services -includeManagementTools

Create a virtual disk

mkdir C:\iscsimount
New-IscsiVirtualDisk -path C:\iscsimount\test-disk-1.vhdx -sizebytes 10GB

Create a Target and Mapping

New-iSCSIServerTarget TestTarget1
Add-iSCSIVirtualDiskTargetMapping TestTarget1 C:\iscsimount\test-disk-1.vhdx

Map Initiator IQNs

On each initiator, run the following command to retrieve the IQN.

(Get-InitiatorPort).NodeAddress

Now, back on the target server, do the following:

Set-IscsiServerTarget -TargetName TestTarget1 -InitiatorIds @("IQN://initiator1IQN","IQN://initiator2IQN")

You’re good to go. On the initiators, you should now be able to connect to the iSCSI Target and see the disks. Congrats!
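
For reference, connecting from an initiator follows the same pattern shown in the clustering post above. A minimal sketch (“iscsi target fqdn” is a placeholder for your target server’s address):

#On an initiator: start the iSCSI service, connect to the target, and list the new disks
Start-Service msiscsi
New-IscsiTargetPortal -TargetPortalAddress "iscsi target fqdn"
$nodeAddress = (Get-IscsiTarget).NodeAddress
Connect-IscsiTarget -NodeAddress $nodeAddress -IsPersistent $true
Get-Disk | ?{$_.FriendlyName -like "MSFT Virtual HD*"}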

New Dell MD32xx and MD36xx Firmware with VAAI and Dynamic Disk Pools

For those not in the know, Dell recently released a new firmware version for the MD32xx and MD36xx series. The two big features are VAAI support and Dynamic Disk Pools.

VAAI

VAAI is VMware’s hardware-acceleration API. The MDs support hardware-accelerated locking, which is a huge freakin’ deal: locking operations only lock the needed blocks during a VM operation instead of the whole LUN. That means you can create and manage a few big LUNs with a lot of VMs per datastore, instead of a bunch of tiny LUNs.

Here’s a great article on VAAI: Why VAAI?

Dynamic Disk Pools

DDP is an interesting concept. It’s kinda like RAID 6, but instead of choosing a specific hot spare, you choose how many physical disk failures you want to tolerate. The system pools all the disks together, and uses all the spindles, but reserves a small amount of space on each disk in the pool to tolerate failures.

The benefits

  • You get to use all your spindles instead of reserving a hot spare.
  • You can create larger disk pools than would be safe with RAID6.
  • You can tolerate n disk failures, where n depends on the amount of disk space you’re willing to reserve.
  • Rebuilds are about 4 times faster.

The downside

  • DDPs are not as fast as RAID 6 for fully sequential writes.
  • It’s new, and that might freak people out.


I’m using both features now; they’re sweet.

Windows Server 2008 R2 Post-SP1 ISCSI Hotfixes

I was browsing the Dell MD3 PowerVault documentation the other day, and came across an interesting section regarding hotfixes. Apparently, all those strange iSCSI errors and weirdness I sometimes get have been resolved by Microsoft.

Here’s a link to all Post-SP1 hotfixes: Links to post SP1 hotfixes for Windows Server 2008 R2 Service Pack 1.

I’ll be installing those hotfixes on my Server 2008 R2 SP1 boxes that use iSCSI from now on.
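
If you want to check whether a box already has any of them installed, Get-HotFix makes that quick. The KB numbers below are placeholders; substitute the ones from the linked article.

#Check for a list of hotfixes (replace the placeholder KB numbers with the ones from the article)
$kbList = "KB0000001","KB0000002"
Get-HotFix | ?{$kbList -contains $_.HotFixID}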

File Server Capacity Tool

I recently used Microsoft’s FSCT to load test a new Dell MD3220i iSCSI array. It took a bit of poking around to get it going, and I wanted to share my experience.


Components

FSCT consists of three components:

  • The Server being tested (FSCT-Server)
  • The Clients used for testing (FSCT-Client)
  • The Controller that manages the client\server interaction (FSCT-Controller)

An FSCT setup requires two networks: the ‘data’ network and the ‘control’ network. The clients and server must each have a minimum of two NICs, one for the data network and one for the control network. The controller requires only one NIC, which must be on the control network. The two networks must reside in different subnets.

I used 192.168.0.xxx for the control network and 192.168.10.xxx for the data network. Make sure that all devices on the same network can ping each other. I also enabled jumbo frames, flow control, and QoS priority on my NICs and switch.
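
If your test systems run Server 2012 or later, you can script the addressing and jumbo frames with the built-in networking cmdlets (on 2008 R2, use netsh or the NIC properties instead). This is just a sketch; the adapter aliases (“Control”, “Data”) and addresses are examples matching the ranges above.

#Assign control and data network addresses and enable jumbo frames (Server 2012+ only)
New-NetIPAddress -InterfaceAlias "Control" -IPAddress 192.168.0.11 -PrefixLength 24
New-NetIPAddress -InterfaceAlias "Data" -IPAddress 192.168.10.11 -PrefixLength 24
Set-NetAdapterAdvancedProperty -Name "Data" -RegistryKeyword "*JumboPacket" -RegistryValue 9014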

Set-Up

DNS\Host Files

FSCT relies on DNS lookups. Using the hosts file on each system is the easiest way to satisfy this requirement. Every client, server, and controller system in the test bed should have the control network IPs and names of all the other systems in its local hosts file.
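
On the Windows systems, a quick way to populate the hosts file is from an elevated PowerShell prompt. The names below match the ones used in the example commands later in this post; the control network IPs are placeholders, so use your own.

#Append control network entries to the local hosts file (adjust names and IPs, run as admin)
$hostsFile = "$env:SystemRoot\System32\drivers\etc\hosts"
Add-Content -Path $hostsFile -Value "192.168.0.10 fsct-server"
Add-Content -Path $hostsFile -Value "192.168.0.11 fsct-client-01"
Add-Content -Path $hostsFile -Value "192.168.0.12 fsct-controller"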

Installing FSCT

To install FSCT, simply extract the contents of the downloaded package into a folder on each system. There is no official installation procedure.

Preparing the Server

FSCT formats any volumes used for load testing during its ‘prepare’ stage. Make sure you have a clean volume ready to go with no needed data. In order to properly prepare the server, you must provide the following information.

  • Volumes to use during testing (drive letters only, no mount points).
  • Maximum number of remote ‘users’ that will be used for testing.
  • A password you would like to assign to the ‘users’ that will be created on the server. In order to properly download results, it should also be the local admin password on the client, server, and controller.
  • The computer names of all clients connecting (must be in hosts file with control network IP).

An example of the server preparation command is as follows:

fsct prepare server /users 5000 /password a1234567! /clients fsct-client-01 /volumes "E: F: G: H: I: J: K: L:" /workload homeFolders

Preparing the Client

To prepare the client you must provide the following information.

  • A password for the users on the server. This must match the password in the step ‘preparing the server’.
  • The server’s data network IP (for /server_ip)
  • The server’s computer name (must be in hosts file with control network IP)
  • The maximum number of users you wish to make available to this client for testing.

An example of the client preparation command is as follows:

fsct prepare client /server fsct-server /password a1234567! /users 2500 /server_ip 192.168.10.10 /workload homeFolders

Preparing the Controller

This is easy; simply run the following command.

fsct prepare controller

Running FSCT

The client command is straightforward. The controller command includes the ability to run multiple times with a different number of users per run. To start a single run, set min_users and max_users to the same number with a step of 1. To start a succession of runs, set a higher max_users and increase the step value as needed. The step value indicates how many users to add between runs. For example, if min_users is 1 and max_users is 100 with a step of 1, FSCT will run 100 times. In the same example with a step of 50, FSCT will run twice. Duration is in seconds; 10-15 minutes is the recommended run time per Microsoft’s FSCT Users Guide.

  1. On the client, run the following command:
    fsct run client /controller fsct-controller /server fsct-server /password a1234567!
  2. On the controller, run the following command:
    fsct run controller /server fsct-server /password a1234567! /volumes "E: F: G: H: I: J: K: L:" /clients fsct-client-01 /min_users 1250 /max_users 1250 /step 1 /duration 900 /workload homeFolders

Returning the Output

On the controller, run the following command to gather the results. An output directory will be created at the path given.

fsct cleanup controller /backup C:\workingtemp\fsctbackup01

The output of the FSCT_data file should look similar to this:

*** Results
Users  Overload  Throughput  Errors  Errors [%]  Duration [ms]
1250       125%          52      44          0%        1055923

*** Test's information
FSCT version: 1.0
Workload: homeFolders
Time: 2011/11/02 17:46

*** Performance Counters
1 - \Processor(_Total)\% Processor Time
2 - \PhysicalDisk(_Total)\Disk Write Bytes/sec
3 - \PhysicalDisk(_Total)\Disk Read Bytes/sec
4 - \Memory\Available Mbytes
5 - \Processor(_Total)\% Privileged Time
6 - \Processor(_Total)\% User Time
7 - \System\Context Switches/sec
8 - \System\System Calls/sec
9 - \PhysicalDisk(_Total)\Avg. Disk Queue Length
10 - \TCPv4\Segments Retransmitted/sec
11 - \PhysicalDisk(_Total)\Avg. Disk Bytes/Read
12 - \PhysicalDisk(_Total)\Avg. Disk Bytes/Write
13 - \PhysicalDisk(_Total)\Disk Reads/sec
14 - \PhysicalDisk(_Total)\Disk Writes/sec
15 - \PhysicalDisk(_Total)\Avg. Disk sec/Read
16 - \PhysicalDisk(_Total)\Avg. Disk sec/Write

*** Server resources
Users    CPU     DiskWrite      DiskRead        Memory       avg( 5)       avg( 6)       avg( 7)       avg( 8)       avg( 9)       avg(10)       avg(11)       avg(12)       avg(13)       avg(14)       avg(15)       avg(16)
 1250  35.0%     9938461.0    12517901.0         955.2          34.7           0.3        7152.0        2176.9           5.4         588.7       35680.3      295838.1         376.5          72.3           0.0           0.0

*** Client Resources (1250 users)
Name            CPU     DiskWrite      DiskRead        Memory       avg( 5)       avg( 6)       avg( 7)       avg( 8)       avg( 9)       avg(10)       avg(11)       avg(12)       avg(13)       avg(14)       avg(15)       avg(16)
fsct-client-01   0.0%           0.0           0.0           0.0           0.0           0.0           0.0           0.0           0.0           0.0           0.0           0.0           0.0           0.0           0.0           0.0

*** Label descriptions
Overload   - server's overload in percent. For example if the return value is 900% it means
             that to support the given number of users the server capacity must be increased
             by 900% (so if there was 1 machine, 9 more are needed).

Errors [%] - number of errors / number of executed scenarios * 100%.

             The value can be greater than 100% because multiple errors can occur during
             a single scenario execution.

Cleaning Up

To ‘clean up’ the client and server, run the following commands.

fsct cleanup server /users 5000 /clients fsct-client-01 /volumes "H: I: J: K:"
fsct cleanup client /users 2500

Getting Consistent Results

At first, my outputs varied widely. After researching the issue and re-reading TFM, I found this gem in the FAQ: “To achieve repeatable results, you must reformat the data volumes, recreate the file set, restart all of the computers (server, controller, and clients), and run a single iteration per run…You can run multiple iterations to investigate the maximum number for a configuration, but you should redo the testing as indicated to get a repeatable and reportable result.” This honestly makes sense because of the amount of caching involved in a file system.

Here’s a script to help out with prepping the server between runs. This is a destructive script: it formats volumes without asking, and it will need to be modified for your environment. For it to work, your volumes must be labeled “FSCT”.
First, create a text file named “format-override.txt” with the following contents.

fsct
y

Next, create a file named “prep-server.cmd” with the following contents:

fsct cleanup server /users 1250 /clients fsct-client-01 /volumes "E: F: G: H: I: J: K: L:"
type format-override.txt | format E: /q /X /V:"fsct"
type format-override.txt | format F: /q /X /V:"fsct"
type format-override.txt | format G: /q /X /V:"fsct"
type format-override.txt | format H: /q /X /V:"fsct"
type format-override.txt | format I: /q /X /V:"fsct"
type format-override.txt | format J: /q /X /V:"fsct"
type format-override.txt | format K: /q /X /V:"fsct"
type format-override.txt | format L: /q /X /V:"fsct"
fsct prepare server /users 1250 /password a1234567! /clients fsct-client-01 /volumes "E: F: G: H: I: J: K: L:" /workload homeFolders
pause
shutdown /r /t 00 /f /c "prepping for fsct" /d P:0:0

Good luck and happy performance hunting!

Openfiler Will Not Map iSCSI Luns to iSCSI Targets (Part 2)

In my previous post with this title, I went through installing a script to rebuild the Openfiler volumes.xml file on every reboot. That works great for iSCSI, but yesterday I added an NFS volume, and on reboot it was marked as iSCSI. It turns out the script only works for iSCSI, even though it’s supposed to work for all the supported Openfiler volume types.

First, a quick review of the problem.

  • My mdadm software raid array will not auto-assemble when called by the default lines in /etc/rc.
  • Because of this, LVM does not load the volume.
  • Openfiler deletes the volumes.xml entry once it starts, since the volume is gone.
  • When I manually start /dev/md0 with “mdadm --assemble --scan” and restart the openfiler service, my volumes do not reappear.

Here’s how I got everything working:

  • Call mdadm a second time in /etc/rc.sysinit to get the array to assemble before LVM starts.
  • Remove /etc/rc3.d/S85Openfiler so that openfiler doesn’t automatically start on reboot.
  • Add a crontab entry to back up volumes.xml every minute.
  • Edit /etc/rc.local to restore the backed-up volumes.xml and start the openfiler service.

The result? Everything works as advertised. Here’s how you do it.

The Process

1) Edits to /etc/rc.sysinit

Find the lines below in your /etc/rc.sysinit file, and add the lines that say “added by JP”.

# Start any MD RAID arrays that haven't been started yet
[ -r /proc/mdstat -a -r /dev/md/md-device-map ] && /sbin/mdadm -IRs

if [ -x /sbin/lvm ]; then
        export LVM_SUPPRESS_LOCKING_FAILURE_MESSAGES=1
        action $"Setting up Logical Volume Management:" /sbin/lvm vgchange -a y --ignorelockingfailure --ignoremonitoring
        unset LVM_SUPPRESS_LOCKING_FAILURE_MESSAGES
fi

#This section added by JP 08/19/11 to fix software raid not auto-assembling.
/sbin/mdadm --assemble --scan
#If you notice that a specific volume group won't activate properly add this line too:
#You can diagnose this with 'lvscan' then 'vgscan'
#/sbin/vgchange -ay VGName

if [ -f /etc/crypttab ]; then
    init_crypto 0
fi

if [ -f /fastboot ] || strstr "$cmdline" fastboot ; then
        fastboot=yes
fi

2) rc.3 changes

  1. rm /etc/rc3.d/S85Openfiler

3) Crontab Additions

  1. crontab -e
  2. add the following line:
    * * * * * cp -f /opt/openfiler/etc/volumes.xml /root/volumes.xml.DONOTDELETE.bak

4) /etc/rc.local Additions

Here is my /etc/rc.local:

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

#reactivate lvm volumes
###if your LVs don't come online you might need lines like the ones below:
#lvchange -ay /dev/VGName/LVName
#lvchange -ay /dev/VGName/LVName
#lvchange -ay /dev/VGName/LVName
#restore volumes.xml so the volumes reappear in the Openfiler GUI
cp /root/volumes.xml.DONOTDELETE.bak /opt/openfiler/etc/volumes.xml

#restart iscsi-target service because it doesn't work right at this point for some reason
service iscsi-target restart

#restart openfiler service with all volumes
service openfiler start

After all this, my Openfiler box is now working properly with NFS :). Good luck!

Openfiler Will Not Map iSCSI Luns to iSCSI Targets

NOTE: I found a much better way to do this.

See my new post: Openfiler Will Not Map iSCSI Luns to iSCSI Targets (Part 2). Use the directions below at your own risk! The solution below doesn’t fix the problem for NFS, btrfs, ext2\3\4, or xfs and can possibly lead to data loss.

Last night I was putting the finishing touches on my new Openfiler 2.99 box. One strange thing: although I could see my volumes on the ‘Volume Management’ page, clicking ‘Map’ on the iSCSI target’s LUN Mapping page would just refresh the page without making any changes.

In my case, running lvscan showed that the volumes were ‘inactive’. Changing them to ‘active’ by running ‘lvchange -ay /dev/vol/volname’ allowed me to map again! Sadly, rebooting changed them back to ‘inactive’, probably since they’re on a software raid volume that isn’t automatically assembling on boot.

So, back to editing rc.local.

Solution

  1. run lvscan to find your inactive volumes.
  2. vi /etc/rc.local and add the following lines (change to your /dev/vg/volume names though):
    service openfiler stop
    #reactivate lvm volumes
    lvchange -ay /dev/jetstor/axisdisk
    lvchange -ay /dev/jetstor/dpmbackups
    lvchange -ay /dev/jetstor/desktopimages
    
    #reimport openfiler volumes
    /root/remake_vol_info2
    
    service openfiler start
    

One important note! You might need the remake_vol_info2 script from my previous post here: Openfiler Software Raid Volumes Disappear on Reboot. Because of the software raid problem, my final /etc/rc.local looks like this:

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local

#stop openfiler service to make changes to storage subsystem
service openfiler stop

#recreate software raid volumes on JetStor
mdadm -A -s

#recreate volumes in the Openfiler GUI
/root/remake_vol_info2

#reactivate lvm volumes
lvchange -ay /dev/jetstor/axisdisk
lvchange -ay /dev/jetstor/dpmbackups
lvchange -ay /dev/jetstor/desktopimages

#restart openfiler service with all volumes
service openfiler start