Category Archives: Storage

EMC Avamar Error 10007 VMX file is Suspiciously Small Fix

EMC Avamar Error 10007 is an interesting error with a little history behind it. After we opened a support case, EMC indicated that error 10007, "vmx file is suspiciously small," is a legacy error message dating back to vCenter 4.x, where the VMX file was backed up through web services; this all changed in vCenter 5.x. Our error 10007 was resolved by restarting the MCS services on the Avamar grid. What happened is that the cached credentials and session state had become stale and invalid, and the Avamar grid kept trying to reuse that now-invalid session information, causing the Avamar 10007 error.

This may seem like a strange fix, but restarting the MCS services on the Avamar grid resolved our issue, and it was the fix recommended by EMC support. VMware has published an article about a similar error, Backup job fails for all virtual machines after vSphere Data Protection 5.1 deployment (2038597), but it was not the particular error we were encountering.

The EMC Avamar Error message would look like:

2013-07-08 08:31:41 avvcbimage Error : vmx file  is suspiciously small (under 30 bytes), please examine the log on the Avamar Administrator for root cause analysis (Log #2)
2013-07-08 08:31:41 avvcbimage Error : Backup of VM metadata failed. (Log #2)
2013-07-08 08:31:45 avvcbimage Error : Avtar exited with 'code 163: externally cancelled' (Log #2)

The EMC Avamar proxy log would display the following messages:

avmproxy1:/usr/local/avamarclient/var-proxy-1 # egrep -i "sdk|reused" Default_Domain-1372926600107-e55a99675c5f5a260028ec81d88bf620d9678824-1016-vmimagel.log
2013-07-04 08:32:32 avvcbimage Info : Login(https://vcenter.domain.com:443/sdk) problem with reused sessionID='52f3ba9e-ff44-fa6d-ed6f-61b44fe35ec5' contacting data center 'Cleveland'.
2013-07-04 08:32:32 avvcbimage Warning : Problem logging into URL 'https://vcenter.domain.com:443/sdk' with session cookie.
2013-07-04 08:32:32 avvcbimage Info : Logging into URL 'https://vcenter.domain.com:443/sdk' with user 'DOMAINavamar' credentials.
2013-07-04 08:32:32 avvcbimage Info : Login(https://vcenter.domain.com:443/sdk) problem with reused sessionID='524408a5-443b-1d22-9878-6f5fe4de2816' contacting data center 'Cleveland'.

Resolution: Restart MCS Services on the EMC Avamar Grid.
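A quick way to confirm you are hitting the same stale-session condition is to grep the proxy log for the session-reuse warning shown in the excerpt above. A minimal sketch follows; the sample line is taken from this post's log excerpt, and the dpnctl commands in the comments are the usual way to restart MCS from the Avamar utility node, so verify them against your Avamar version before running them.

```shell
# Count session-reuse warnings in a proxy log fed on stdin.
count_stale_sessions() {
    grep -ci "problem with reused sessionID"
}

# Feed it a line from the proxy log excerpt above; a non-zero count
# suggests the stale-session condition described in this post.
printf '%s\n' \
  "2013-07-04 08:32:32 avvcbimage Info : Login(https://vcenter.domain.com:443/sdk) problem with reused sessionID='52f3ba9e-ff44-fa6d-ed6f-61b44fe35ec5' contacting data center 'Cleveland'." \
  | count_stale_sessions

# The fix from EMC support, run on the Avamar utility node:
#   dpnctl stop mcs
#   dpnctl start mcs
```

In practice you would point the grep at the vmimagel.log files under /usr/local/avamarclient/var-proxy-1, as shown earlier.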


vSA vSphere Storage Appliance Performance Benchmark Test

The article below goes into depth on my experience with VMware vSA performance benchmark testing. I've tried to be as detailed as possible to give you a complete picture of my findings. I believe there is a space where storage virtualization may thrive, but after my recent experience with the VMware vSA product, I am less than satisfied with the results, the manageability, and most of all the performance. I believe storage virtualization is still a few years from the maturity needed to be a serious candidate for small and remote office scenarios. This holds true for other two- and three-node storage virtualization technologies, including FalconStor's storage virtualization.

VMware Version Information

VMware vCenter Server 5.1.0, 947673
VMware vStorage Appliance 5.1.3, 1090545
VMware ESXi 5.1 U1, HP OEM Bundle, 1065491 (VMware-ESXi-5.1.0-Update1-1065491-HP-5.50.26.iso)

HP ProLiant DL385 G2 Hardware Configuration
– 4 CPUs x 2.6 GHz
– Dual-Core AMD Opteron Processor 2218
– AMD Opteron Generation EVC Mode
– HP Smart Array P400, 512MB Cache, 25% Read / 75% Write
– RAID-5, 8x 72 GB 10K RPM Hard Drives
– HP Service Pack 02.2013 Firmware

vStorage Appliance Configuration
– 2 Node Cluster
– Eager Zero Full Format
– VMware Best Practices

IOZone Virtual Machine Configuration
– Oracle Linux 6.4 x86_64
– 2 vCPU
– 1 GB Memory
– 20 GB Disk, Thick Eager Zero Provisioned
– VMware Tools 9.0.5.21789 (build-1065307)

IOZone Test Parameters
/usr/bin/iozone -a -s 5G -o

-a   Used to select full automatic mode. Produces output that covers all tested file operations for record sizes of 4k to 16M for file sizes of 64k to 512M.

-s #   Used to specify the size, in Kbytes, of the file to test. One may also specify -s #k (size in Kbytes) or -s #m (size in Mbytes) or -s #g (size in Gbytes).

-o   Writes are synchronously written to disk. (O_SYNC). Iozone will open the files with the O_SYNC flag. This forces all writes to the file to go completely to disk before returning to the benchmark.
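The practical impact of -o can be reproduced outside IOZone with plain dd, since dd's oflag=sync opens the output file with O_SYNC just as IOZone's -o does. A small illustrative sketch; the temp file path is arbitrary and status=none assumes GNU dd:

```shell
# Write 4 MB in 4k records with O_SYNC semantics; each write must reach
# stable storage before dd continues, which is why synchronous results
# are so much lower than cached ones.
dd if=/dev/zero of=/tmp/iozone-sync-demo bs=4k count=1024 oflag=sync status=none
rm -f /tmp/iozone-sync-demo
echo "sync write demo complete"
```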

VMware ESXi/vSA Network Configuration

VMware vSA Architecture

IOZone Performance Benchmark Results

vSA Read Graph

vSA Stride Read Graph

vSA Random Read Graph

vSA Backward Read Graph

vSA Fread Graph

vSA Write Graph

vSA Random Write Graph

vSA Record Rewrite Graph

vSA Fwrite Graph

Download RAW Excel Data

Summary

The vSA performed far worse than the native onboard storage controller, which was expected due to the additional layer of virtualization. I honestly expected better performance out of the 8-disk RAID-5 even without storage virtualization, since the drives were 10,000 RPM. On average, across all the tests, there is a 76.3% difference between the native storage and the virtualized storage! Wow! That is an expensive downgrade! I understand the test bed was not using the latest and greatest hardware, but disk performance is generally limited by the spinning platter. I would be very interested in seeing the difference on newer hardware.

I believe this depicts only a fraction of the entire picture: performance. I have other concerns with storage virtualization at the moment, such as complexity and manageability. I found the complexity very frustrating while setting up the vSA; there are many design considerations and limitations with this particular storage virtualization solution, most of which I observed during the test trials. The vSA management is a Flash-based application which had its quirks and crashes as well. Crashes at the storage virtualization layer left me thinking this would be a perfect recipe for data loss and/or corruption. In addition, a single instance could not manage multiple vSA deployments due to IP addressing restrictions, which was a must for the particular use case I was testing for.

For now, storage virtualization is not ready for production use, in my opinion. It has a lot of room to grow, and I will certainly be interested in revisiting this subject down the road, since I believe in the concept.



VMware vSA vSphere Storage Appliance Installation Parameters

During recent troubleshooting of a VMware vSphere Storage Appliance installation, I uncovered two installation parameters which may be needed depending on your environment and installation.

The first one allows you to change the username and password that the vSA will use to connect to vCenter.

VMware-vsamanager.exe /v"VM_SHOWNOAUTH=1"

VMware vSA Install Screenshot

The other parameter allows you to specify the vCenter IP address or FQDN.

VMware-vsamanager.exe /v"VM_IPADDRESS=<fqdn/ip>"

VMware vSA Install Screenshot

…or the two can be combined:

VMware-vsamanager.exe /v"VM_SHOWNOAUTH=1 VM_IPADDRESS=<fqdn/ip>"

EMC VNX Check Hotspare Rebuild Successfully After Disk Failure

There are two status codes in an SP Collect that a CE (Customer Engineer) should check before removing a failed drive.

67d is the hexadecimal code in the SP Collect logs which indicates that the drive successfully failed over to the hot spare. It can be found in SPA_navi_getlog.txt and SPB_navi_getlog.txt.

78b is the code which indicates that the failed drive has been removed from the array.
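Once the SP Collect is extracted, a quick grep pulls both codes out of the getlog files. A minimal sketch, assuming you run it from the directory where the SP Collect was unpacked; the file names are the ones mentioned above:

```shell
# Look for 67d (hot spare rebuild completed) and 78b (failed drive
# removed) in each SP's getlog file.
for f in SPA_navi_getlog.txt SPB_navi_getlog.txt; do
    if [ -f "$f" ]; then
        grep -i "67d\|78b" "$f"
    else
        echo "$f not found - run this from the extracted SP Collect directory"
    fi
done
```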

EMC VNX SPA Log Rebuild Screenshot

EMC VNX SPB Log Rebuild Screenshot

I received these two tips from a local CE who came onsite after a disk failure, and they are definitely worth a mention.

Tagged , , , , , , ,

EMC VNX Check Transitioning Equalizing Faulted LUN Disk

After a disk has faulted, the disk goes into a transitioning or an equalizing state; EMC uses different terminology to describe the process of repairing a faulted disk. Transitioning or equalizing times vary based on the type and speed of the disk, and your SAN utilization has a direct impact on how fast a drive rebuilds. Nowhere in Unisphere is there a status or an approximate ETA for when the rebuild will complete.

You can check the state of the transitioning or equalizing process, but it is buried in the SP Collect logs. An SP Collect produces one large zip file; expand it and you will find more zip files inside. One of them ends in _sas.zip, and within that archive there will be SPA_cfg_info.txt or SPB_cfg_info.txt, depending on which Service Processor you performed the SP Collect on. Within that file, look for the section that shows the status of the transitioning/equalizing process!
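The digging can be scripted. A rough sketch, assuming unzip is available; spcollect.zip is a placeholder for your actual SP Collect file name:

```shell
SPCOLLECT="spcollect.zip"   # placeholder - substitute your SP Collect file
if [ -f "$SPCOLLECT" ]; then
    unzip -o "$SPCOLLECT" -d spcollect/
    unzip -o spcollect/*_sas.zip -d spcollect/sas/
    # SPA_cfg_info.txt or SPB_cfg_info.txt, depending on which SP
    # produced the collect, holds the transitioning/equalizing status.
    grep -i "equalize" spcollect/sas/SP?_cfg_info.txt
else
    echo "$SPCOLLECT not found"
fi
```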

EMC VNX Equalizing Log


EMC Isilon Integration Quest Vintela Authentication Services VASD

There are many challenges when an organization is forced to run a mixed-protocol environment to serve up the same data. It introduces additional management tasks and additional complexity: you must manage Windows Access Control Lists (ACLs) to enforce Windows permissions, in addition to managing *nix user/group IDs with permission bits to control access. The EMC Isilon solution is a great platform for mixed-protocol environments. In my opinion thus far, the Isilon platform is the ideal solution for a mixed-protocol environment due to its integration with authentication services such as Windows Active Directory or any LDAP service. There are a number of products that extend Windows Active Directory to provide UID/GID authentication and mappings; one of those is Quest's (Vintela) Authentication Services.

Quest Authentication Services stores its Unix identity data in five Windows Active Directory attributes:

  • gecos
  • uidNumber
  • gidNumber
  • loginShell
  • unixHomeDirectory

Using that information, we can integrate Quest (Vintela) Authentication Services with EMC Isilon NAS storage. The screenshot below displays the correct settings to use on the EMC Isilon storage to integrate with Quest Authentication Services.

EMC Isilon Quest Authentication Services Settings Screenshot

Finally, test your mappings to ensure your AD/LDAP authentication and mappings work correctly.
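One simple client-side check is getent, which resolves users through NSS; on a host running Quest Authentication Services it should return the AD-backed uidNumber, gidNumber, loginShell, and unixHomeDirectory values. The root account is used below only so the command works anywhere; substitute a real AD account:

```shell
# Resolve a user through NSS; on a VAS-joined host the fields after the
# username come from the AD attributes listed above.
getent passwd root
```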


Xiotech Magnitude 3D 3000 4000 Controller Fan Replacement

Just recently, a pesky fan was failing inside one of the controllers on our Xiotech Magnitude 3D 4000. A replacement part was dispatched, but when the controller was taken offline it was difficult to determine which fan was "Board Fan 1," since there were no markings; for all we knew, the fan numbering could have started with zero. After some trial and error, this is the correct fan numbering scheme.

Xiotech Magnitude 3D 4000 Fan Number Locations

Xiotech’s Controller Fan Replacement Guide: Mag3D_Mod-03_Controller_Fan_Replacement_040110

In addition, this is the manual for the SuperMicro X6DH8/X6DHE-XG2 motherboard inside a Xiotech Controller: SuperMicro-X6DH8-X6DHE-XG2-MNL-0737


Restart EMC VNX Management Services in Unisphere

Restarting the management services for EMC Unisphere is simple and does not require an outage. The process requires you to log into each Service Processor separately through a web interface.

1. Open a web browser and go to https://<Service Processor IP/FQDN>/setup
2. Login using the ‘sysadmin’ user account and password.
EMC VNX Restart Management Services

3. Scroll down to ‘Restart Management Services’ and press that button.
EMC VNX Restart Management Service
4. Confirm the management services restart and wait until it has completed. Note, this can take up to 10 minutes.
5. Once the first Service Processor has restarted and you are able to log back into its web interface, proceed to the second Service Processor.

This should clear up any Unisphere GUI issues or incorrect reporting of information on the Host Agent page.


EMC PowerPath Multipathing in RedHat Linux Guide

First and foremost, credit goes to Will's Notes for the original article on multipathing in RHEL5; I was able to use that guide with a Xiotech SAN to configure multipathing. EMC makes multipathing far easier to configure with a product called PowerPath, which can be used with or without a license. Unlicensed, PowerPath gives you an active-passive connection back to the EMC SAN; licensed, it allows an active-active connection, which is not only highly available but also load balanced.

Configuring PowerPath was, to my surprise, rather easy. Download PowerPath for your distribution from powerlink.emc.com and install the RPM.

[root@localhost ~]# rpm -iv EMCPower.LINUX-5.6.0.00.00-143.RHEL5.x86_64.rpm

If you are using PowerPath in licensed mode, register the license key with the first command below and verify the registration with the second.

[root@localhost ~]# emcpreg -install
[root@localhost ~]# powermt check_registration

Once PowerPath is installed, rescan the SCSI bus or, if you do not know how, simply reboot RHEL.
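For completeness, a bus rescan without rebooting can be done through sysfs: writing the wildcard triple "- - -" (channel target lun) to each host's scan file asks the kernel to probe for new LUNs. A sketch; the sysfs path is the standard Linux location, but double-check it on your kernel:

```shell
# Ask every SCSI HBA to rescan for new LUNs instead of rebooting.
for host in /sys/class/scsi_host/host*; do
    if [ -w "$host/scan" ]; then
        echo "- - -" > "$host/scan"
    fi
done
echo "scsi rescan issued"
```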

To view information about PowerPath, issue the following command.

[root@localhost ~]# powermt display dev=all
Pseudo name=emcpowera
CLARiiON ID=<SERIALNUMBER> [STORAGEGROUP_NAME]
Logical device ID=STORAGEGROUP_WWN [LUN 400]
state=alive; policy=BasicFailover; priority=0; queued-IOs=0;
Owner: default=SP B, current=SP B       Array failover mode: 4
==============================================================================
--------------- Host ---------------   - Stor -   -- I/O Path --  -- Stats ---
###  HW Path               I/O Paths    Interf.   Mode    State   Q-IOs Errors
==============================================================================
3 qla2xxx                  sdb       SP A0     unlic   alive       0      0
3 qla2xxx                  sdd       SP B0     unlic   alive       0      0
3 qla2xxx                  sdf       SP A4     active  alive       0      0
3 qla2xxx                  sdh       SP B4     active  alive       0      0
4 qla2xxx                  sdj       SP A1     unlic   alive       0      0
4 qla2xxx                  sdl       SP B1     unlic   alive       0      0
4 qla2xxx                  sdn       SP A5     unlic   alive       0      0
4 qla2xxx                  sdp       SP B5     unlic   alive       0      0

Pseudo name=emcpowerb
CLARiiON ID=<SERIALNUMBER> [STORAGEGROUP_NAME]
Logical device ID=STORAGEGROUP_WWN [LUN 401]
state=alive; policy=BasicFailover; priority=0; queued-IOs=0;
Owner: default=SP A, current=SP A       Array failover mode: 4
==============================================================================
--------------- Host ---------------   - Stor -   -- I/O Path --  -- Stats ---
###  HW Path               I/O Paths    Interf.   Mode    State   Q-IOs Errors
==============================================================================
3 qla2xxx                  sdc       SP A0     unlic   alive       0      0
3 qla2xxx                  sde       SP B0     unlic   alive       0      0
3 qla2xxx                  sdg       SP A4     active  alive       0      0
3 qla2xxx                  sdi       SP B4     active  alive       0      0
4 qla2xxx                  sdk       SP A1     unlic   alive       0      0
4 qla2xxx                  sdm       SP B1     unlic   alive       0      0
4 qla2xxx                  sdo       SP A5     unlic   alive       0      0
4 qla2xxx                  sdq       SP B5     unlic   alive       0      0

Once PowerPath is installed and can access the LUNs presented to the host, create the filesystem. Format it as you would any ordinary storage device, but instead of /dev/sda, /dev/sdb, etc., use EMC's PowerPath pseudo-devices: /dev/emcpowera, /dev/emcpowerb, /dev/emcpowerc, and so on.
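As a sketch, formatting and mounting a pseudo-device might look like the following; the device name, ext3 filesystem type, and mount point are all examples, so adjust for your environment:

```shell
DEV=/dev/emcpowera   # example pseudo-device name
if [ -b "$DEV" ]; then
    mkfs.ext3 "$DEV"          # format just like any other block device
    mkdir -p /mnt/emclun
    mount "$DEV" /mnt/emclun
else
    echo "$DEV not present on this host"
fi
```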


EMC Naviseccli View Service Processor Cache Utilization

After connecting a host to an EMC SAN, it is very helpful to have Naviseccli installed on that host along with the Host Agent. Naviseccli allows you to connect to the Service Processors to gather information, among other things.

Dirty cache pages are pages of data in the cache that are waiting to be written to disk. You can monitor how full the cache is with the following commands, one per Service Processor.

[root@localhost bin]# ./naviseccli -h <IP/Hostname SPA> getcache -pdp
[root@localhost bin]# ./naviseccli -h <IP/Hostname SPB> getcache -pdp

If you receive the error "Security file not found. Already removed or check -secfilepath option.", issue the following commands to add security credentials so that you can run commands remotely.

[root@localhost bin]# ./naviseccli -User sysadmin -Password <password> -Scope 0 -h <IP/Hostname SPA> -AddUserSecurity
[root@localhost bin]# ./naviseccli -User sysadmin -Password <password> -Scope 0 -h <IP/Hostname SPB> -AddUserSecurity