Tag Archives: troubleshooting

VMware STS Clients Failed SSL Certificate of STS Service Cannot Be Verified

“Initialization of STS Clients failed. Root Cause: The SSL certificate of STS service cannot be verified” is an error that delayed our deployment of the vShield Manager.

VMware STS Clients Failed Error

We encountered this error while configuring the Lookup Service Information. It is important to understand how the environment was designed when we hit this error, and why it didn’t seem to make sense at first.

There are two sites, Site A and Site B, in a hybrid configuration: the vCenter Servers run 5.1, while Single Sign-On and the Web Client run 5.5. Each site has one dedicated virtual machine (SSO1 and SSO2) that hosts both vCenter 5.5 Single Sign-On and the Web Client. There are 5 vCenter Servers in total, at versions 5.1 U1/U2. Each vCenter points at the vCenter 5.5 Single Sign-On and Web Client server for its own site/geographic region.

VMware Single Sign-On SSO Architecture

This model is fully supported by VMware per KB2059249 and has proven to be a better deployment model for the vCenter 5.1 product family than the initial release of Single Sign-On 5.1.

The vShield Manager was deployed at Site B, and we used the address of Site B’s SSO and Web Client server when configuring the Lookup Service. Internet forums indicated that the SSO server’s certificate, the chain, and the root certificates needed to be bundled into a single certificate and installed on the STS server. This did not make sense, since no certificates were manually generated for use by the SSO servers. All SSO certificates were generated during installation and were self-signed by the VMware SSO installer.

VMware STS Clients Failed Error

While working with a co-worker to troubleshoot the issue above, it occurred to me to list all services that the SSO server sees, to determine which STS service the SSO server was using. I issued the following command on the SSO server:

ssolscli listServices https://cgvccore2.fqdn:7444/lookupservice/sdk

Output:

VMware STS Clients Failed Error Proof

The urn:sso:sts service was listed with Site A’s registered URL! It completely slipped my mind that there was only one STS server listed in any SSO instance. We updated the Lookup Service Information Host URL and the “Initialization of STS Clients failed. Root Cause: The SSL certificate of STS service cannot be verified” issue was resolved!

VMware STS Clients Failed Error Resolved

Note: This is a single point of failure; it would be best to load balance the STS service. If a load-balanced model is not implemented initially, there are articles on updating where the STS service points in the event of a failure.
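The check that uncovered the problem can be sketched in a few lines of Python: compare the host in the Lookup Service URL you configured with the host in the STS URL that listServices reports. This is only a sketch; the function name and the example.com URLs are hypothetical, and you would paste in the real URLs from your own environment.

```python
from urllib.parse import urlsplit


def same_host(lookup_service_url: str, sts_url: str) -> bool:
    """Return True when both URLs point at the same host.

    urlsplit() lowercases the hostname, so the comparison is
    case-insensitive. Ports and paths are ignored on purpose.
    """
    lookup_host = urlsplit(lookup_service_url).hostname
    sts_host = urlsplit(sts_url).hostname
    return lookup_host is not None and lookup_host == sts_host


if __name__ == "__main__":
    # Hypothetical URLs for illustration only.
    configured = "https://sso-site-b.example.com:7444/lookupservice/sdk"
    registered_sts = "https://sso-site-a.example.com:7444/sts"
    if not same_host(configured, registered_sts):
        print("Lookup Service host and STS host differ; check the Host URL.")
```

A mismatch does not prove a misconfiguration on its own, but in our case it was exactly the symptom: the STS service was registered against Site A while we were pointing at Site B.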


EMC PowerPath Internal Error Migrations May Be Pending Fix

A host-side migration between arrays can be a nerve-racking task, especially when you come across issues. Data loss is a constant fear in the back of your mind, as is the question of what your fail-back plan is should you need to execute it. During a PowerPath migration, I learned the hard way that a host-side copy of the boot-from-SAN LUN is NOT supported. After I set up the migration and issued the sync command, the Windows machine ground to a halt and then went offline.

After troubleshooting, it was clear that the EMC PowerPath Migration Enabler service needed to be disabled for the Windows machine to fully boot. Enabling the service after the host had booted would immediately cause the Windows host to become unresponsive, and a hard power cycle was the only fix.

I could not start the PowerPath Migration Enabler service to abort the session, since starting it would immediately freeze the server, and I could not uninstall PowerPath Migration Enabler, since there was a session pending. I was in a pickle!

EMC PowerPath PPME Removal Migration Pending

After a support ticket with EMC, the resolution was to manually remove the PowerPath Migration Enabler database and its keys within the registry. After performing these few deletions, you will be able to start the service successfully, without freezing your server and with no active sessions.

  1. Delete the UMD by deleting the files from C:\Program Files\EMC\PPME\db*.* 
  2. Delete all subkeys with the prefix “dm_” (but NOT dev_conf) under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\EmcPowerPath\KMD_*.
    The keys would be dm_ac, dm_control_io_to_clones, dm_funnel_io, dm_wc.
  3. Reboot.

 


YaBB SE 1.5.5 MySQL Unknown Column Fix

For retro purposes, I wanted to turn an old instance of YaBB SE 1.5.4 into a read-only version to look back on with friends for some kicks and laughs. I first upgraded from YaBB SE 1.5.4 to YaBB SE 1.5.5c. This was a small project to bring the forum up on a modern operating system, and most of the work was around the MySQL version. First I had to stand up a temporary virtual machine running CentOS 4.5 Linux. After that, I was able to restore my backups to the virtual machine, so the forum ran as it did in the early 2000s. An issue occurred when trying to run the MySQL 4.x queries on MySQL 5.x. PHP did not pose a problem, even though all functions within YaBB SE were built for PHP 4.x. Below are my findings on getting the MySQL 4.x queries to work properly on a MySQL 5.x instance. Please refer to the line number given for each file.

Error:
Unknown column ‘m.ID_MEMBER’ in ‘on clause’
File: /home/www/Sources/MessageIndex.php
Line: 269

Original Code:

$result = mysql_query("
			SELECT t.ID_LAST_MSG, t.ID_TOPIC, t.numReplies, t.locked, m.posterName, m.ID_MEMBER, IFNULL(mem.realName, m.posterName) AS posterDisplayName, t.numViews, m.posterTime, m.modifiedTime, t.ID_FIRST_MSG, t.isSticky, t.ID_POLL, m2.posterName as mname, m2.ID_MEMBER as mid, IFNULL(mem2.realName, m2.posterName) AS firstPosterDisplayName, m2.subject as msub, m2.icon as micon, IFNULL(lt.logTime, 0) AS isRead, IFNULL(lmr.logTime, 0) AS isMarkedRead
			FROM {$db_prefix}topics as t, {$db_prefix}messages as m, {$db_prefix}messages as m2
				LEFT JOIN {$db_prefix}members AS mem ON (mem.ID_MEMBER=m.ID_MEMBER)
				LEFT JOIN {$db_prefix}members AS mem2 ON (mem2.ID_MEMBER=m2.ID_MEMBER)
				LEFT JOIN {$db_prefix}log_topics AS lt ON (lt.ID_TOPIC=t.ID_TOPIC AND lt.ID_MEMBER=$ID_MEMBER)
				LEFT JOIN {$db_prefix}log_mark_read AS lmr ON (lmr.ID_BOARD=$currentboard AND lmr.ID_MEMBER=$ID_MEMBER)
			WHERE t.ID_TOPIC IN (" . implode(',', $topics) . ")
				AND m.ID_MSG=t.ID_LAST_MSG
				AND m2.ID_MSG=t.ID_FIRST_MSG
			ORDER BY $stickyOrder m.posterTime DESC") or database_error(__FILE__, __LINE__);

Modified Code:

$result = mysql_query("
			SELECT t.ID_LAST_MSG, t.ID_TOPIC, t.numReplies, t.locked, m.posterName, m.ID_MEMBER, IFNULL(mem.realName, m.posterName) AS posterDisplayName, t.numViews, m.posterTime, m.modifiedTime, t.ID_FIRST_MSG, t.isSticky, t.ID_POLL, m2.posterName as mname, m2.ID_MEMBER as mid, IFNULL(mem2.realName, m2.posterName) AS firstPosterDisplayName, m2.subject as msub, m2.icon as micon, IFNULL(lt.logTime, 0) AS isRead, IFNULL(lmr.logTime, 0) AS isMarkedRead
			FROM ({$db_prefix}topics as t, {$db_prefix}messages as m, {$db_prefix}messages as m2)
				LEFT JOIN {$db_prefix}members AS mem ON (mem.ID_MEMBER=m.ID_MEMBER)
				LEFT JOIN {$db_prefix}members AS mem2 ON (mem2.ID_MEMBER=m2.ID_MEMBER)
				LEFT JOIN {$db_prefix}log_topics AS lt ON (lt.ID_TOPIC=t.ID_TOPIC AND lt.ID_MEMBER=$ID_MEMBER)
				LEFT JOIN {$db_prefix}log_mark_read AS lmr ON (lmr.ID_BOARD=$currentboard AND lmr.ID_MEMBER=$ID_MEMBER)
			WHERE t.ID_TOPIC IN (" . implode(',', $topics) . ")
				AND m.ID_MSG=t.ID_LAST_MSG
				AND m2.ID_MSG=t.ID_FIRST_MSG
			ORDER BY $stickyOrder m.posterTime DESC") or database_error(__FILE__, __LINE__);

Error:
Unknown column ‘b.ID_LAST_TOPIC’ in ‘on clause’
File: /home/www/Sources/Recent.php
Line: 45

Original Code:

$request = mysql_query("
	SELECT m.posterTime, m2.subject, m.ID_TOPIC, t.ID_BOARD, m.posterName, t.numReplies, t.ID_FIRST_MSG
	FROM {$db_prefix}boards AS b, {$db_prefix}categories AS c
		LEFT JOIN {$db_prefix}topics AS t ON (t.ID_TOPIC=b.ID_LAST_TOPIC)
		LEFT JOIN {$db_prefix}messages AS m ON (m.ID_MSG=t.ID_LAST_MSG)
		LEFT JOIN {$db_prefix}messages AS m2 ON (m2.ID_MSG=t.ID_FIRST_MSG)
	WHERE c.ID_CAT=b.ID_CAT
		AND (FIND_IN_SET('$settings[7]', c.memberGroups) != 0 OR c.memberGroups='' OR '$settings[7]' LIKE 'Administrator' OR '$settings[7]' LIKE 'Global Moderator')
	ORDER BY m.posterTime DESC
	LIMIT 1;") or database_error(__FILE__, __LINE__);

Modified Code:

$request = mysql_query("
	SELECT m.posterTime, m2.subject, m.ID_TOPIC, t.ID_BOARD, m.posterName, t.numReplies, t.ID_FIRST_MSG
	FROM ({$db_prefix}boards AS b, {$db_prefix}categories AS c)
		LEFT JOIN {$db_prefix}topics AS t ON (t.ID_TOPIC=b.ID_LAST_TOPIC)
		LEFT JOIN {$db_prefix}messages AS m ON (m.ID_MSG=t.ID_LAST_MSG)
		LEFT JOIN {$db_prefix}messages AS m2 ON (m2.ID_MSG=t.ID_FIRST_MSG)
	WHERE c.ID_CAT=b.ID_CAT
		AND (FIND_IN_SET('$settings[7]', c.memberGroups) != 0 OR c.memberGroups='' OR '$settings[7]' LIKE 'Administrator' OR '$settings[7]' LIKE 'Global Moderator')
	ORDER BY m.posterTime DESC
	LIMIT 1;") or database_error(__FILE__, __LINE__);
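
Both fixes are the same change: wrap the comma-separated table list after FROM in parentheses so that the LEFT JOIN clauses can see every table in it. If you have more queries to convert, a rough Python sketch like the one below can apply that rewrite to a query string. The function name is my own, it only handles the first FROM ... LEFT JOIN in the string, and it is no substitute for reviewing each query by hand.

```python
import re

# Captures the table list between FROM and the first LEFT JOIN.
# [^()] keeps the pattern from matching a list that is already parenthesized.
_FROM_LIST = re.compile(
    r"(\bFROM\s+)([^()]+?)(\s+LEFT\s+JOIN\b)",
    re.IGNORECASE | re.DOTALL,
)


def parenthesize_comma_joins(sql: str) -> str:
    """Wrap a comma-separated FROM list in parentheses when a LEFT JOIN follows."""

    def fix(match: re.Match) -> str:
        tables = match.group(2)
        if "," not in tables:
            # A single table needs no parentheses.
            return match.group(0)
        return f"{match.group(1)}({tables}){match.group(3)}"

    return _FROM_LIST.sub(fix, sql, count=1)


if __name__ == "__main__":
    query = (
        "SELECT m.posterTime FROM boards AS b, categories AS c "
        "LEFT JOIN topics AS t ON (t.ID_TOPIC=b.ID_LAST_TOPIC) "
        "WHERE c.ID_CAT=b.ID_CAT"
    )
    print(parenthesize_comma_joins(query))
```

Running it twice is harmless: a list that is already wrapped in parentheses is left alone.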

vCenter 5.1 Single Sign-on (SSO) Unable to expose the remote JMX registry. Port value out of range: -1

VMware vCenter 5.1 Single Sign-On posed many problems from its introduction until VMware replaced it with the 5.5 version of Single Sign-On. If you still have to use the vCenter 5.1 Single Sign-On server and you see “Unable to expose the remote JMX registry” or “Port value out of range: -1”, the resolution is simple, but let’s first confirm this is the issue by analyzing the catalina log.

The following is an example from C:\Program Files\VMware\Infrastructure\SSOServer\logs\catalina.2013-09-02.log.

02-Sep-2013 00:38:51.903 INFO [WrapperSimpleAppMain] com.springsource.tcserver.security.PropertyDecoder.<init> tc Runtime property decoder using memory-based key
02-Sep-2013 00:38:52.854 INFO [WrapperSimpleAppMain] com.springsource.tcserver.security.PropertyDecoder.<init> tcServer Runtime property decoder has been initialized in 960 ms
02-Sep-2013 00:38:56.364 INFO [WrapperSimpleAppMain] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-bio-7445"]
02-Sep-2013 00:38:56.396 INFO [WrapperSimpleAppMain] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-bio-7444"]
02-Sep-2013 00:38:56.396 INFO [WrapperSimpleAppMain] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["http-bio-7080"]
02-Sep-2013 00:38:56.396 INFO [WrapperSimpleAppMain] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["ajp-bio-7009"]
02-Sep-2013 00:38:57.784 SEVERE [WrapperSimpleAppMain] com.springsource.tcserver.serviceability.rmi.JmxSocketListener.init Unable to expose the remote JMX registry.
 java.lang.IllegalArgumentException: Port value out of range: -1
	at java.net.ServerSocket.<init>(ServerSocket.java:18
	... debug junk ...

In C:\Program Files\VMware\Infrastructure\SSOServer\conf\catalina.properties, towards the bottom, you will find the following variables:

base.shutdown.port=7005
base.jmx.port=-1
ajp-vm.http.port=7080

Change base.jmx.port to 6969. The default value of -1 disables the JMX port, but it also causes the SEVERE messages in the Single Sign-On (SSO) log files.

base.shutdown.port=7005
base.jmx.port=6969
ajp-vm.http.port=7080

For further detail about the base.jmx.port property, see http://pubs.vmware.com/vsphere-51/index.jsp?topic=%2Fcom.vmware.vsphere.install.doc%2FGUID-ABF63FAB-711C-4C8D-87D7-E6FB73B98425.html
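
If you need to make the same change on more than one SSO server, the edit is easy to script. The sketch below is a hypothetical Python helper (the function name is mine) that rewrites one key in the text of a .properties file; it works on a string, so reading and writing catalina.properties, and taking a backup first, is left to you.

```python
def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with key set to value.

    Existing assignments of the key are replaced in place; comment
    lines are left untouched. If the key is absent it is appended.
    """
    lines = text.splitlines()
    found = False
    for i, line in enumerate(lines):
        if line.lstrip().startswith("#"):
            continue  # skip comments
        name, sep, _ = line.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            found = True
    if not found:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    original = "base.shutdown.port=7005\nbase.jmx.port=-1\najp-vm.http.port=7080\n"
    print(set_property(original, "base.jmx.port", "6969"), end="")
```

Remember that the SSO service has to be restarted before it picks up the new value.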


EMC Avamar Error 10007 VMX file is Suspiciously Small Fix

EMC Avamar Error 10007 is an interesting error with a little history behind it. After we opened a support case, EMC indicated that error 10007, “vmx file is suspiciously small”, is a legacy error message from the vCenter 4.x era. According to the support technician, the VMX file used to be backed up through web services; this changed in vCenter 5.x. Our error 10007 was resolved by restarting the MCS services on the Avamar grid. The cached credentials and session state had become stale, and the Avamar grid kept trying to reuse the now-invalid session information, which caused the Avamar 10007 error.

This may seem like a strange fix, but restarting MCS services on the Avamar grid resolved our issue. This resolution was the recommended fix from EMC support. The article, Backup job fails for all virtual machines after vSphere Data Protection 5.1 deployment (2038597), is published by VMware regarding a similar error but this was not the particular error which we were encountering.

The EMC Avamar Error message would look like:

2013-07-08 08:31:41 avvcbimage Error : vmx file  is suspiciously small (under 30 bytes), please examine the log on the Avamar Administrator for root cause analysis (Log #2)
2013-07-08 08:31:41 avvcbimage Error : Backup of VM metadata failed. (Log #2)
2013-07-08 08:31:45 avvcbimage Error : Avtar exited with 'code 163: externally cancelled' (Log #2)

The EMC Avamar proxy log would display the following messages:

avmproxy1:/usr/local/avamarclient/var-proxy-1 # egrep -i "sdk|reused" Default_Domain-1372926600107-e55a99675c5f5a260028ec81d88bf620d9678824-1016-vmimagel.log
2013-07-04 08:32:32 avvcbimage Info : Login(https://vcenter.domain.com:443/sdk) problem with reused sessionID='52f3ba9e-ff44-fa6d-ed6f-61b44fe35ec5' contacting data center 'Cleveland'.
2013-07-04 08:32:32 avvcbimage Warning : Problem logging into URL 'https://vcenter.domain.com:443/sdk' with session cookie.
2013-07-04 08:32:32 avvcbimage Info : Logging into URL 'https://vcenter.domain.com:443/sdk' with user 'DOMAIN\avamar' credentials.
2013-07-04 08:32:32 avvcbimage Info : Login(https://vcenter.domain.com:443/sdk) problem with reused sessionID='524408a5-443b-1d22-9878-6f5fe4de2816' contacting data center 'Cleveland'.

Resolution: Restart MCS Services on the EMC Avamar Grid.
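
To check several proxy logs for the same symptom, the egrep above can be approximated in Python. This is a sketch that assumes the “problem with reused sessionID='...'” wording shown in the log excerpt above; the function name is my own.

```python
import re

# Matches the session ID in lines such as:
#   ... problem with reused sessionID='52f3ba9e-...' contacting data center ...
SESSION_RE = re.compile(r"problem with reused sessionID='([0-9a-fA-F-]+)'")


def stale_session_ids(log_lines):
    """Return the distinct reused session IDs, in order of first appearance."""
    seen = []
    for line in log_lines:
        match = SESSION_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    import sys

    # Usage: python stale_sessions.py <proxy-log-file>
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
            for session_id in stale_session_ids(handle):
                print(session_id)
```

Any output at all suggests the proxy is trying to log in with a session that vCenter no longer accepts, which is what the MCS restart cleared up for us.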


VMware vMotion Horizon View Replica VDI

When working with a linked-clone replica in VMware Horizon View, you must unprotect the replica (the linked-clone base image) before vMotioning it to another datastore. The following steps unprotect the base image, vMotion it to the new datastore, and finally reprotect it for continued use by VMware Horizon View.

1. Disable provisioning for the VMware View Pool.
2. Change settings in VMware View to reflect the new datastore to use.
3. Unprotect replica.

sviconfig -operation=UnprotectEntity -DsnName=<dsnname> -DbUsername=<dbusername> -DbPassword=<dbpassword> -VcUrl=https://<vcenterurl>/sdk -VcUsername=<username> -VcPassword=<password> -InventoryPath=//vm/VMwareViewComposerReplicaFolder/ -Recursive=true

4. vMotion replica.
5. Reprotect replica.

sviconfig -operation=ProtectEntity -DsnName=<dsnname> -DbUsername=<dbusername> -DbPassword=<dbpassword> -VcUrl=https://<vcenterurl>/sdk -VcUsername=<username> -VcPassword=<password> -InventoryPath=//vm/VMwareViewComposerReplicaFolder/ -Recursive=true

6. Re-enable provisioning.

Reference : http://kb.vmware.com/kb/1008704
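
Since the unprotect and reprotect commands differ only in the -operation value, it can help to build both from one place so the other arguments cannot drift apart. The Python sketch below only assembles the argument list using the flags shown above; the function name is hypothetical, every value is supplied by the caller, and running the result (for example with subprocess.run on the View Composer server) is left out.

```python
def sviconfig_args(operation, dsn_name, db_username, db_password,
                   vc_url, vc_username, vc_password, inventory_path,
                   recursive=True):
    """Build the sviconfig argument list for UnprotectEntity or ProtectEntity."""
    if operation not in ("UnprotectEntity", "ProtectEntity"):
        raise ValueError(f"unsupported operation: {operation}")
    return [
        "sviconfig",
        f"-operation={operation}",
        f"-DsnName={dsn_name}",
        f"-DbUsername={db_username}",
        f"-DbPassword={db_password}",
        f"-VcUrl={vc_url}",
        f"-VcUsername={vc_username}",
        f"-VcPassword={vc_password}",
        f"-InventoryPath={inventory_path}",
        f"-Recursive={'true' if recursive else 'false'}",
    ]


if __name__ == "__main__":
    # Placeholder values, as in the commands above.
    common = dict(
        dsn_name="<dsnname>", db_username="<dbusername>",
        db_password="<dbpassword>", vc_url="https://<vcenterurl>/sdk",
        vc_username="<username>", vc_password="<password>",
        inventory_path="<inventorypath>",
    )
    print(" ".join(sviconfig_args("UnprotectEntity", **common)))
    print(" ".join(sviconfig_args("ProtectEntity", **common)))
```

Passing the list to subprocess.run without shell=True avoids quoting problems with passwords that contain special characters.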


Installing XAMPP + Xdebug on Oracle Linux 6.4 x86

This guide will show you how to install XAMPP with Xdebug (compiled) on a Red Hat/Oracle Linux 6.4 x86 installation in a few simple steps.

Log in as root, or su to root, to start. Let’s begin by making sure we have all the development tools necessary to compile the Xdebug library for XAMPP, and by bringing the system fully up to date.

yum update -y
yum groupinstall "Development Tools" -y

Grab the download links from ApacheFriends for XAMPP and use wget to get XAMPP and the Development Packages.

cd ~
wget -O xampp-linux-1.8.1.tar.gz "http://www.apachefriends.org/download.php?xampp-linux-1.8.1.tar.gz"
wget -O xampp-linux-devel-1.8.1.tar.gz "http://www.apachefriends.org/download.php?xampp-linux-devel-1.8.1.tar.gz"

Extract XAMPP and move it to its permanent location.

tar -xzvf xampp-linux-1.8.1.tar.gz
mv lampp/ /opt

Extract XAMPP Development libraries and copy the include directory into the base of the lampp directory for use with the compiler.

tar -xzvf xampp-linux-devel-1.8.1.tar.gz
cp -r lampp/include /opt/lampp/.

Use PECL to install Xdebug, which will compile the extension.

/opt/lampp/bin/pecl update-channels
/opt/lampp/bin/pecl install Xdebug

Edit the php.ini file to add the newly compiled Xdebug.

vi /opt/lampp/etc/php.ini

Add the following lines at the end of the php.ini configuration file.

zend_extension = "/opt/lampp/lib/php/extensions/no-debug-non-zts-20100525/xdebug.so"
xdebug.remote_enable = 1
xdebug.remote_handler = "dbgp"
xdebug.remote_host = "localhost"
xdebug.remote_port = 9000

Start/Restart XAMPP. Browse to the http://host/xampp/phpinfo.php page to ensure Xdebug was loaded properly.


vSphere Inventory Search 403 Query Service Failed Forbidden

The 403 error I encountered was tied to the vSphere Client login screen. If “Use Windows Session Credentials” is checked, searching for a virtual machine returns 403 errors.

The workaround is to type in the username and password you are authenticating as, which bypasses any saved session credentials.

There are many articles about the 403 Forbidden error when using the Search Inventory feature within the vSphere Client. They offer a range of different solutions and workarounds, including reinstalling vCenter! Reinstalling vCenter just didn’t sit well with me, since that can be a long process depending on your environment and how your organization responds to certain changes.

Login to the query service failed. The server could not interpret the communication from the client. (The remote server returned an error: (403) Forbidden.)

404 Error vSphere Login Screenshot

When investigating the log files on the vCenter Server for the Inventory Service located at:

       “C:\ProgramData\VMware\Infrastructure\Inventory Service\Logs\ds.log”

The following two errors stood out when related to this issue:

       “WARN com.vmware.vim.vcauthorization.impl.AuthorizationManagerImpl]
Unable to find user data for user: DOMAIN\User”

       “ERROR com.vmware.vim.vcauthorization.impl.PrincipalContextImpl]
Failed to get group membership”


Root Cause:
After reaching out to support, it turns out that the issue is at login. If the “Use Windows Session Credentials” option is used, it can cause a 403 error when using the Inventory search.

Work Around: Type in the username and password, even if it is the same identity you are logging in as.

vSphere Login 404 Workaround Screenshot

VMware Support Suggested Permanent Fix: Upgrade to vCenter Server 5.1 Update 1.


Failed to Start Migration Pre-copy Error 0xbad003f vMotion Migration Fix

“A general system error occurred: Failed to start migration pre-copy. Error 0xbad003f. Connection closed by remote host, possibly due to timeout.”
“A general system error occurred: Failed to start migration pre-copy. Error 0xbad004b. Connection reset by peer.”

Another issue I recently came across was a live vMotion problem where the migration would fail during the pre-copy, always at 10%. The error was one of the two messages quoted above.

VMware vCenter vSphere Event Log

I performed some basic troubleshooting with vmkping. I ran the ping command and watched the response times, which remained consistent during the attempted vMotion migration. No packets were being lost, and I had expected packet loss if there was an issue with Layer 3 IP addressing.

VMware ESXi vmkping

While still on the command line of the ESXi host, I decided to look for any ARP entries in the log anyway, despite my reasoning, just to rule the possibility out. I ran the following:

cat /var/log/vmkernel | grep arp

I was wrong: there was another host on the network that had the same IP address!

VMware ESXi Log

I found a new IP address for my VMkernel interface, updated DNS, then updated the IP address on the ESXi host, and my issue was resolved!


Increase Login Timeout vSphere Client – KB1002721

By default, the login timeout is 20 seconds; VMware KB 1002721 recommends increasing the client login timeout to 60 seconds. Below is the snippet of SQL to run against your vCenter database to raise the login timeout from 20 seconds to 60 seconds.

Recently, there was an interesting issue after upgrading all our vCenter instances from 5.0 to 5.1. The most notable result of the upgrade was that logins started to fail. This occurred randomly, and we could not find a particular reason for the behavior.

After the parameter is changed, the VMware vCenter Server service must be restarted for this parameter to take effect!

UPDATE [dbo].[VPX_PARAMETER]
SET [VALUE]='60'
WHERE [NAME]='client.timeout.normal'