Tag Archives: VMware

Poor Man’s vCPU & vRAM Right Size Recommendation Tool

VMware vCenter Operations Management Suite can be expensive. If you are like me and there is no budget for vCOPs, this script will give you vCPU & vRAM recommendations based on past virtual machine usage. The script connects to your vCenter, grabs historical performance data and provides recommendations that were designed around two vKernel whitepapers.

 

The script is simple to use; only the vCenter parameter is required to run with all defaults:

PoorMansRecommendations.ps1 -vCenter site1.local.domain

 

Specify additional authentication information and grab 60 days of past performance instead of the default 30 days:

PoorMansRecommendations.ps1 -vCenter site1.local.domain -Username fred -Password root -PastDays 60

 

Specify more samples for accuracy and use a larger ‘building block’ for memory recommendations:

PoorMansRecommendations.ps1 -vCenter site1.local.domain -PastDays 60 -MaxSamples 25000 -MemoryBuildingBlockMB 1024
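
To illustrate what a memory ‘building block’ could mean in practice, here is a minimal Python sketch that rounds a raw memory recommendation up to the next multiple of the block size. This is only my assumption about how such rounding might work, not the script's actual logic; the function name round_up_to_block and the sample values are hypothetical.

```python
import math


def round_up_to_block(recommended_mb: float, block_mb: int = 1024) -> int:
    """Round a raw memory recommendation up to a whole number of building blocks."""
    if block_mb <= 0:
        raise ValueError("block_mb must be positive")
    # Always hand out at least one block, even for tiny recommendations.
    blocks = max(1, math.ceil(recommended_mb / block_mb))
    return blocks * block_mb


# Hypothetical raw recommendation of 2300 MB with two different block sizes.
print(round_up_to_block(2300, 1024))  # 3072
print(round_up_to_block(2300, 512))   # 2560
```

A larger block gives coarser but tidier recommendations, while a smaller block tracks the measured usage more closely.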

 

When running the script interactively, a progress bar is displayed as it calculates recommendations per virtual machine:
Poor Man's Right Sizing

The results:

Poor Man's Recommendations Results

This should only be used as guidance, a point of reference, a conversation point or just a rough estimate. Each environment’s workload characteristics are unique, so please use your own judgment along with this data to come to a solution that is right for your environment.

Download the script: PoorMansRecommendations.ps1

Thanks for looking. Please leave any questions or comments below and have a great day!


VMware STS Clients Failed SSL Certificate of STS Service Cannot Be Verified

“Initialization of STS Clients failed. Root Cause: The SSL certificate of STS service cannot be verified” is an error that delayed our deployment of the vShield Manager.

VMware STS Clients Failed Error

During the configuration of the Lookup Service Information, we encountered this particular error. It is important to understand how the environment was designed when we hit this error and why it didn’t seem to make sense at first.

There are two sites, Site A and Site B, in a hybrid vCenter 5.1 configuration. Each site runs vCenter 5.5 Single Sign-On and the Web Client together on one dedicated virtual machine, SSO1 and SSO2 respectively. There are a total of 5 vCenter Servers at versions 5.1 U1/U2. Each vCenter is pointed at the vCenter 5.5 Single Sign-On and Web Client server for its own site/geographic region.

VMware Single Sign-On SSO Architecture

This model is fully supported by VMware per KB2059249 and has proven to be a better deployment model for the vCenter 5.1 product family than the initial release of Single Sign-On 5.1.

The vShield Manager was deployed at Site B and we used Site B’s SSO and Web Server address when configuring the Lookup Service. After some research, internet forums indicated that the SSO server certificate, chain and root certificates needed to be bundled into a single certificate and installed on the STS server. This did not make sense since no certificates were manually generated for use by the SSO servers. All SSO certificates were generated during installation and were self-signed by the VMware SSO installer.

VMware STS Clients Failed Error

While working with a co-worker to troubleshoot the issue above, it occurred to me to list all services that the SSO server sees, to determine which STS service the SSO server was using. I issued the following command on the SSO server:

ssolscli listServices https://cgvccore2.fqdn:7444/lookupservice/sdk

Output:

VMware STS Clients Failed Error Proof

The urn:sso:sts service was listed with Site A’s registered URL! It completely slipped my mind that there was only one STS server listed in any SSO instance. We updated the Lookup Service Information Host URL and the “Initialization of STS Clients failed. Root Cause: The SSL certificate of STS service cannot be verified” issue was resolved!

VMware STS Clients Failed Error Resolved

Note: This is a single point of failure, so it would be best to load balance the STS service. If a load-balanced model is not implemented initially, there are articles on how to update where the STS service points in the event of a failure.


vCenter 5.1 Single Sign-on (SSO) Unable to expose the remote JMX registry. Port value out of range: -1

VMware vCenter 5.1 Single Sign-On posed many problems from its introduction until VMware replaced it with the 5.5 version of Single Sign-On. If you are still required to use vCenter’s 5.1 Single Sign-On server and experience the “Unable to expose the remote JMX registry” or “Port value out of range: -1” error, the resolution is simple, but let’s first confirm that this is the issue by analyzing the catalina log.

The following is an example from C:\Program Files\VMware\Infrastructure\SSOServer\logs\catalina.2013-09-02.log.

02-Sep-2013 00:38:51.903 INFO [WrapperSimpleAppMain] com.springsource.tcserver.security.PropertyDecoder.<init> tc Runtime property decoder using memory-based key
02-Sep-2013 00:38:52.854 INFO [WrapperSimpleAppMain] com.springsource.tcserver.security.PropertyDecoder.<init> tcServer Runtime property decoder has been initialized in 960 ms
02-Sep-2013 00:38:56.364 INFO [WrapperSimpleAppMain] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-bio-7445"]
02-Sep-2013 00:38:56.396 INFO [WrapperSimpleAppMain] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-bio-7444"]
02-Sep-2013 00:38:56.396 INFO [WrapperSimpleAppMain] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-bio-7080"]
02-Sep-2013 00:38:56.396 INFO [WrapperSimpleAppMain] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-bio-7009"]
02-Sep-2013 00:38:57.784 SEVERE [WrapperSimpleAppMain] com.springsource.tcserver.serviceability.rmi.JmxSocketListener.init Unable to expose the remote JMX registry.
 java.lang.IllegalArgumentException: Port value out of range: -1
	at java.net.ServerSocket.<init>(ServerSocket.java:18
	... debug junk ...

In C:\Program Files\VMware\Infrastructure\SSOServer\conf\catalina.properties, towards the bottom you will find the following variables:

base.shutdown.port=7005
base.jmx.port=-1
ajp-vm.http.port=7080

Change base.jmx.port to 6969. The default value of -1 disables the JMX port, but it causes SEVERE messages in the Single Sign-On (SSO) log files.

base.shutdown.port=7005
base.jmx.port=6969
ajp-vm.http.port=7080
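
If you would rather script the change than edit the file by hand, the following Python sketch shows one way it might be done. It works on the text of a properties file and rewrites a single key while leaving every other line untouched. The function name set_property is my own, and the sample text simply mirrors the three lines shown above; back up the real catalina.properties before trying anything like this against it.

```python
def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with `key` set to `value`, other lines unchanged."""
    out = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        # Skip comments; only consider lines of the form name=value.
        if not stripped.startswith("#") and "=" in stripped:
            name = stripped.split("=", 1)[0].strip()
            if name == key:
                out.append(f"{key}={value}")
                found = True
                continue
        out.append(line)
    if not found:
        # Key was missing entirely, so append it at the end.
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


sample = "base.shutdown.port=7005\nbase.jmx.port=-1\najp-vm.http.port=7080\n"
print(set_property(sample, "base.jmx.port", "6969"), end="")
```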

For further detail about the base.jmx.port property, see http://pubs.vmware.com/vsphere-51/index.jsp?topic=%2Fcom.vmware.vsphere.install.doc%2FGUID-ABF63FAB-711C-4C8D-87D7-E6FB73B98425.html


EMC Avamar Error 10007 VMX file is Suspiciously Small Fix

EMC Avamar Error 10007 is an interesting error with a little history behind it. After we opened a support case with EMC, they indicated that error 10007, vmx file is suspiciously small, is a legacy error message from the vCenter 4.x era. According to the support technician, the VMX file used to be backed up through web services, and this all changed in vCenter 5.x. Our error 10007 was resolved by restarting the MCS services on the Avamar grid. The cached credentials and session state had become stale; the EMC Avamar grid kept trying to reuse that now-invalid session information, which caused the Avamar 10007 error.

This may seem like a strange fix, but it was the fix recommended by EMC support. VMware has published an article about a similar error, Backup job fails for all virtual machines after vSphere Data Protection 5.1 deployment (2038597), but that was not the particular error we were encountering.

The EMC Avamar Error message would look like:

2013-07-08 08:31:41 avvcbimage Error : vmx file  is suspiciously small (under 30 bytes), please examine the log on the Avamar Administrator for root cause analysis (Log #2)
2013-07-08 08:31:41 avvcbimage Error : Backup of VM metadata failed. (Log #2)
2013-07-08 08:31:45 avvcbimage Error : Avtar exited with 'code 163: externally cancelled' (Log #2)

The EMC Avamar proxy log would display the following messages:

avmproxy1:/usr/local/avamarclient/var-proxy-1 # egrep -i "sdk|reused" Default_Domain-1372926600107-e55a99675c5f5a260028ec81d88bf620d9678824-1016-vmimagel.log
2013-07-04 08:32:32 avvcbimage Info : Login(https://vcenter.domain.com:443/sdk) problem with reused sessionID='52f3ba9e-ff44-fa6d-ed6f-61b44fe35ec5' contacting data center 'Cleveland'.
2013-07-04 08:32:32 avvcbimage Warning : Problem logging into URL 'https://vcenter.domain.com:443/sdk' with session cookie.
2013-07-04 08:32:32 avvcbimage Info : Logging into URL 'https://vcenter.domain.com:443/sdk' with user 'DOMAIN\avamar' credentials.
2013-07-04 08:32:32 avvcbimage Info : Login(https://vcenter.domain.com:443/sdk) problem with reused sessionID='524408a5-443b-1d22-9878-6f5fe4de2816' contacting data center 'Cleveland'.
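
If you have many proxy logs to check, a small Python sketch like the one below could pull out the session IDs that failed on reuse, which is the same thing the egrep above looks for. The regular expression is based only on the log lines shown in this post, so treat it as an assumption about the format; the function name stale_session_ids is hypothetical.

```python
import re

# Matches the "problem with reused sessionID='...'" lines shown above.
REUSED = re.compile(r"problem with reused sessionID='([0-9a-f-]+)'")


def stale_session_ids(log_lines):
    """Collect, in order, the unique session IDs reported as failing on reuse."""
    ids = []
    for line in log_lines:
        match = REUSED.search(line)
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids


sample = [
    "2013-07-04 08:32:32 avvcbimage Info : Login(https://vcenter.domain.com:443/sdk) problem with reused sessionID='52f3ba9e-ff44-fa6d-ed6f-61b44fe35ec5' contacting data center 'Cleveland'.",
    "2013-07-04 08:32:32 avvcbimage Warning : Problem logging into URL 'https://vcenter.domain.com:443/sdk' with session cookie.",
    "2013-07-04 08:32:32 avvcbimage Info : Login(https://vcenter.domain.com:443/sdk) problem with reused sessionID='524408a5-443b-1d22-9878-6f5fe4de2816' contacting data center 'Cleveland'.",
]
for session_id in stale_session_ids(sample):
    print(session_id)
```

Any hits here would suggest the stale session condition described above.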

Resolution: Restart MCS Services on the EMC Avamar Grid.


vSA vSphere Storage Appliance Performance Benchmark Test

The article below goes into depth on my experience with VMware vSA Performance Benchmark Testing. I’ve tried to be as detailed as possible to give you a complete picture of my findings. I believe that there is a space where storage virtualization may thrive, but after my recent experience with the VMware vSA product, I am less than satisfied with the results, the manageability and most of all the performance. I believe storage virtualization needs a few more years to mature before it can be considered a serious candidate for small & remote office scenarios. This statement holds true for other two/three node storage virtualization technologies, including Falconstor’s storage virtualization.

VMware Version Information

VMware vCenter Server 5.1.0, 947673
VMware vStorage Appliance 5.1.3, 1090545
VMware ESXi 5.1 U1, HP OEM Bundle, 1065491 (VMware-ESXi-5.1.0-Update1-1065491-HP-5.50.26.iso)

HP ProLiant DL385 G2 Hardware Configuration
– 4 CPUs x 2.6 GHz
– Dual-Core AMD Opteron Processor 2218
– AMD Opteron Generation EVC Mode
– HP Smart Array P400, 512MB Cache, 25% Read / 75% Write
– RAID-5, 8x 72 GB 10K RPM Hard Drives
– HP Service Pack 02.2013 Firmware

vStorage Appliance Configuration
– 2 Node Cluster
– Eager Zero Full Format
– VMware Best Practices

IOZone Virtual Machine Configuration
– Oracle Linux 6.4 x86_64
– 2 vCPU
– 1 GB Memory
– 20 GB Disk, Thick Eager Zero Provisioned
– VMware Tools 9.0.5.21789 (build-1065307)

IOZone Test Parameters
/usr/bin/iozone -a -s 5G -o

-a   Used to select full automatic mode. Produces output that covers all tested file operations for record sizes of 4k to 16M for file sizes of 64k to 512M.

-s #   Used to specify the size, in Kbytes, of the file to test. One may also specify -s #k (size in Kbytes) or -s #m (size in Mbytes) or -s #g (size in Gbytes).

-o   Writes are synchronously written to disk. (O_SYNC). Iozone will open the files with the O_SYNC flag. This forces all writes to the file to go completely to disk before returning to the benchmark.

VMware ESXi/vSA Network Configuration

VMware vSA Architecture

IOZone Performance Benchmark Results

vSA Read Graph

vSA Stride Read Graph

vSA Random Read Graph

vSA Backward Read Graph

vSA Fread Graph

vSA Write Graph

vSA Random Write Graph

vSA Record Rewrite Graph

vSA Fwrite Graph

Download RAW Excel Data

Summary

The vSA performed far worse than the native onboard storage controller, which was expected given the additional layer of virtualization. I honestly expected better performance out of the 8-disk RAID-5 even without storage virtualization, since they were 10,000 RPM drives. On average, across all the tests, there is a 76.3% difference between the native storage and the virtualized storage! Wow! That is an expensive downgrade! I understand that the test bed was not using the latest and greatest hardware, but disk performance is generally limited by the spinning platter. I would be really interested in seeing the difference using newer hardware.

I believe performance depicts only a fraction of the entire picture. I have other concerns with storage virtualization at the moment, such as complexity and manageability. I found the complexity very frustrating while setting up the vSA; there are many design considerations and limitations with this particular storage virtualization solution, most of which were observed during the test trials. The vSA management interface is a Flash-based application which had its quirks and crashes as well. Crashes at a storage virtualization layer left me thinking that this would be a perfect recipe for data loss and/or corruption. In addition, a single instance could not manage multiple vSA deployments due to IP addressing restrictions, which was a must for the particular use case I was testing for.

For now, in my opinion, storage virtualization is not there yet for any production use. It has a lot of room to grow and I will certainly be interested in revisiting this subject down the road, since I believe in the concept.


VMware vMotion View Horizons Replica VDI

When working with a linked-clone replica in VMware View Horizons, you must unprotect the linked-clone base image before vMotioning it to another datastore. The following commands will enable you to unprotect the base image, vMotion it to the new datastore and finally reprotect it for continued use by VMware View Horizons.

1. Disable provisioning for the VMware View Pool.
2. Change settings in VMware View to reflect the new datastore to use.
3. Unprotect replica.

sviconfig -operation=UnprotectEntity -DsnName=<dsnname> -DbUsername=<dbusername> -DbPassword=<dbpassword> -VcUrl=https://<vcenterurl>/sdk -VcUsername=<username> -VcPassword=<password> -InventoryPath=//vm/VMwareViewComposerReplicaFolder/ -Recursive=true

4. vMotion replica.
5. Reprotect replica.

sviconfig -operation=ProtectEntity -DsnName=<dsnname> -DbUsername=<dbusername> -DbPassword=<dbpassword> -VcUrl=https://<vcenterurl>/sdk -VcUsername=<username> -VcPassword=<password> -InventoryPath=//vm/VMwareViewComposerReplicaFolder/ -Recursive=true

6. Re-enable provisioning.
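
Since the unprotect and reprotect commands differ only in the -operation value, the two could be generated from one helper. The Python sketch below builds the argument lists using only the flags shown above; it does not run anything. The helper name sviconfig_args and every value passed to it are hypothetical placeholders, and the inventory path is copied as it appears in this post.

```python
def sviconfig_args(operation, dsn_name, db_user, db_password,
                   vc_url, vc_user, vc_password, inventory_path):
    """Build the sviconfig argument list for a protect or unprotect operation."""
    if operation not in ("UnprotectEntity", "ProtectEntity"):
        raise ValueError(f"unexpected operation: {operation}")
    return [
        "sviconfig",
        f"-operation={operation}",
        f"-DsnName={dsn_name}",
        f"-DbUsername={db_user}",
        f"-DbPassword={db_password}",
        f"-VcUrl={vc_url}",
        f"-VcUsername={vc_user}",
        f"-VcPassword={vc_password}",
        f"-InventoryPath={inventory_path}",
        "-Recursive=true",
    ]


# Placeholder values only; substitute your own environment's details.
settings = dict(
    dsn_name="<dsnname>", db_user="<dbusername>", db_password="<dbpassword>",
    vc_url="https://<vcenterurl>/sdk", vc_user="<username>",
    vc_password="<password>",
    inventory_path="//vm/VMwareViewComposerReplicaFolder/",
)
for operation in ("UnprotectEntity", "ProtectEntity"):
    print(" ".join(sviconfig_args(operation, **settings)))
```

Keeping the settings in one place makes it harder to reprotect with different parameters than you unprotected with.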

Reference : http://kb.vmware.com/kb/1008704


vSphere Inventory Search 403 Query Service Failed Forbidden

The 403 error I encountered was tied to the vSphere Client login screen. If the “Use Windows Session Credentials” option is checked, it causes 403 errors when searching for a virtual machine.

The workaround is to type in the username and password you are authenticating as, which bypasses any saved session credentials.

There are many articles about the 403 Forbidden error when using the Search Inventory feature within the vSphere Client. They offer a range of different solutions and workarounds, including reinstalling vCenter! Reinstalling vCenter just didn’t sit well with me since that can be a long process depending on your environment and how your organization responds to certain changes.

Login to the query service failed. The server could not interpret the communication from the client. (The remote server returned an error: (403) Forbidden.)

403 Error vSphere Login Screenshot

I investigated the Inventory Service log file on the vCenter Server, located at:

       “C:\ProgramData\VMware\Infrastructure\Inventory Service\Logs\ds.log”

The following two entries stood out as related to this issue:

       “WARN com.vmware.vim.vcauthorization.impl.AuthorizationManagerImpl]
Unable to find user data for user: DOMAIN\User”

       “ERROR com.vmware.vim.vcauthorization.impl.PrincipalContextImpl]
Failed to get group memembership”


Root Cause:
After reaching out to support, it turns out that the issue occurs at login. If the “Use Windows Session Credentials” option is used, it can cause a 403 error when using the Inventory search.

Work Around: Type in the username and password, even if it is the same identity you are logging in as.

vSphere Login 403 Workaround Screenshot

VMware Support Suggested Permanent Fix: Upgrade to vCenter Server 5.1 Update 1.


VMware vSA vSphere Storage Appliance Installation Parameters

While recently troubleshooting an installation of VMware’s vSphere Storage Appliance, I uncovered two installation parameters which could be needed depending on your environment and installation.

The first one allows you to change the username and password that the vSA will use to connect to vCenter.

VMware-vsamanager.exe /v"VM_SHOWNOAUTH=1"

VMware vSA Install Screenshot

The other parameter allows you to specify the vCenter IP address or FQDN.

VMware-vsamanager.exe /v"VM_IPADDRESS=<fqdn/ip>"

VMware vSA Install Screenshot

… or the two can be combined:

VMware-vsamanager.exe /v"VM_SHOWNOAUTH=1 VM_IPADDRESS=<fqdn/ip>"

Failed to Start Migration Pre-copy Error 0xbad003f vMotion Migration Fix

“A general system error occurred: Failed to start migration pre-copy. Error 0xbad003f. Connection closed by remote host, possibly due to timeout.”
“A general system error occurred: Failed to start migration pre-copy. Error 0xbad004b. Connection reset by peer.”

Another issue that I recently came across was a live vMotion migration that would fail during the pre-copy, always at 10%. The error was one of the two quoted above:

VMware vCenter vSphere Event Log

I performed some basic troubleshooting such as a vmkping. I used the ping command and watched the response times remain consistent during the attempted vMotion migration. No packets were being lost, and I assumed there would be packet loss if there were an issue with Layer 3 IP addressing.

VMware ESXi vmkping

While still on the command line of the ESXi host, I decided to look for any ARP entries anyway, despite my reasoning, just to rule it out. I ran the following:

grep arp /var/log/vmkernel

I was wrong, there was another host on the network that had the same IP address!

VMware ESXi Log

I found a new IP address for my VMkernel interface, updated DNS, then updated the IP address on the ESXi host, and my issue was resolved!
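
The underlying idea of a duplicate IP check is simple: the same IP address should never be claimed by more than one MAC address. The Python sketch below shows that check in its most basic form. It is only an illustration and not what ESXi does internally; the function name find_duplicate_ips and the sample addresses are hypothetical, and you would have to collect the IP/MAC pairs yourself from whatever source you trust.

```python
from collections import defaultdict


def find_duplicate_ips(observations):
    """Return {ip: sorted MACs} for every IP claimed by more than one MAC."""
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Normalise case so the same MAC is not counted twice.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical (ip, mac) pairs; the first and last collide on the same IP.
sample = [
    ("10.0.0.21", "00:50:56:aa:bb:01"),
    ("10.0.0.22", "00:50:56:aa:bb:02"),
    ("10.0.0.21", "00:1B:21:cc:dd:03"),
]
print(find_duplicate_ips(sample))
```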


Increase Login Timeout vSphere Client – KB1002721

By default the client login timeout is 20 seconds; VMware KB 1002721 recommends bumping it up to 60 seconds. Below is the snippet of code to run against your VMware database to bump the login timeout from 20 seconds to 60 seconds.

Recently, there was an interesting issue after upgrading all our vCenter instances from 5.0 to 5.1. The most notable result of the upgrade was that logins started to fail. This occurred randomly and we could not find a particular reason for this behavior.

After the parameter is changed, the VMware vCenter Server service must be restarted for this parameter to take effect!

UPDATE [dbo].[VPX_PARAMETER]
SET [VALUE]='60'
WHERE [NAME]='client.timeout.normal'