ESXi Errors – Failed write command to write-quiesced partition

I’ve been getting the following emails from all of my ESXi hosts since I upgraded to 4.1 about nine months ago. I’d get 3-8 emails a day, and each one coincided with a large latency spike on the corresponding datastore.


Target: vm6.chemistry.ohio-state.edu
Stateless event alarm
Alarm Definition:
([Event alarm expression: Host error] OR [Event alarm expression: Host warning])
Event details:
Issue detected on vm6.chemistry.ohio-state.edu in Chemistry Datacenter: ScsiDeviceIO: 2352: Failed write command to write-quiesced partition naa.6002219000a17f3d00003dcb4e0ccad3:1
(5:03:26:49.543 cpu1:5362)
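
To tie an alarm back to the datastore showing the latency spike, it helps to pull the device ID out of the event text. The snippet below is only a sketch: the function name `extract_device` is made up for illustration, and the commented-out lookup assumes `esxcfg-scsidevs` is available in your host's shell.

```shell
#!/bin/sh
# Sketch: extract the device ID from a "write-quiesced partition" event line.
# extract_device is a hypothetical helper name, not a VMware tool.
extract_device() {
    printf '%s\n' "$1" |
        sed -n 's/.*write-quiesced partition \([^: ]*\):[0-9][0-9]*.*/\1/p'
}

# Event text copied from the alarm email above.
msg='ScsiDeviceIO: 2352: Failed write command to write-quiesced partition naa.6002219000a17f3d00003dcb4e0ccad3:1'
dev="$(extract_device "$msg")"
echo "$dev"

# On the host, the ID can then be matched against the VMFS mapping
# (assumes esxcfg-scsidevs is present):
#   esxcfg-scsidevs -m | grep "$dev"
```

The trailing `:1` in the event is the partition number, which the pattern drops so only the device ID remains.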

I engaged support from both VMware for ESXi and Dell for my MD3000i’s. I tried Jumbo Frames, Flow Control, different VLAN trunk configurations, etc. After many support calls and sessions, we found the answer on page 40 of the iSCSI SAN Configuration Guide. In any situation where an iSCSI VMKernel can send data down a group of NICs, either because of a Virtual Distributed Switch or because multiple NICs are assigned to an iSCSI port group, it’s mandatory to lock things down so that each VMKernel sends data through only one NIC (a single active uplink). Essentially, this moves multipathing from the network level up to the protocol level.

Resolution:

My VMKernels were on a VDS, so I had to perform the following operations:

  1. Open vSphere client, then navigate to Inventory -> Networking.
  2. Right click your first SAN\VMKernel Port Group -> Edit Settings.
  3. Click “Teaming and Failover”, and limit your active dvUplinks to only a single uplink. The rest should be placed under ‘unused’.
  4. Repeat this for every SAN\VMKernel port group.
Next, you must “bind” the iSCSI Software Adapter to the VMKernels:
  1. In vSphere client, find the name of your first host’s iSCSI adapter by choosing a host then clicking Configuration -> Storage Adapters. It’s typically vmhba34.
  2. Enable remote tech support mode and SSH to your first ESXi host.
  3. Run the following commands. After the first command, write down any vmk#’s that correspond with your iSCSI VMKernels.
    esxcfg-vmknic -l
    esxcli swiscsi nic list -d vmhba34
  4. If the ‘nic list’ command didn’t show any vmkernels, then you need to bind each iSCSI VMKernel with the following command:
    esxcli swiscsi nic add -n vmk# -d vmhba34
  5. When finished, run the following command to verify the work:
    esxcli swiscsi nic list -d vmhba34
  6. Repeat this for all hosts in your inventory.
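
With several VMKernels per host, steps 4 and 5 can be wrapped in a small loop. This is a rough sketch rather than a tested procedure: `vmk1 vmk2` is a placeholder list (use the names you wrote down from `esxcfg-vmknic -l`), `vmhba34` is the adapter name from step 1, and `DRY_RUN` is a variable I made up so the script only prints the commands by default.

```shell
#!/bin/sh
# Sketch: bind several iSCSI VMkernel ports to the software iSCSI adapter.
# HBA comes from Configuration -> Storage Adapters.
# VMKS is a hypothetical list; replace it with your own iSCSI vmk names.
HBA="${HBA:-vmhba34}"
VMKS="${VMKS:-vmk1 vmk2}"
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

bind_vmknics() {
    for vmk in $VMKS; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "esxcli swiscsi nic add -n $vmk -d $HBA"
        else
            esxcli swiscsi nic add -n "$vmk" -d "$HBA"
        fi
    done
}

bind_vmknics

# Verify the bindings once the real commands have run.
if [ "$DRY_RUN" != "1" ]; then
    esxcli swiscsi nic list -d "$HBA"
fi
```

Run it once as-is to review the commands it would issue, then run it with `DRY_RUN=0` on the host to apply them.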

Other Notes

After running these commands, it’s recommended that any unused dynamic and static iSCSI targets be removed. The add\remove delay is, however, shorter with iSCSI bindings in place. For more info, see page 40 of the iSCSI SAN Configuration Guide 4.1.

Print Server CNames Broken! Server 2008 R2 DC

Yesterday morning I got the call: “the printers are down!” Bad news, since printing and file sharing are the two biggest customer interactions we have. It turned out the print server would process connections to its actual hostname just fine, but not to its CNAME. This started directly after installing Server 2008 R2 SP1. I googled around and found this guy with the exact same story. Rejoining my servers to the domain worked like a charm.
