All,
First of all, apologies if this is an inappropriate forum. I do not have a support contact for this issue. We are VMware customers, but our licenses are purchased through an academic reseller. Our concern is that this bug has crept into recent releases and perhaps gone unnoticed in QA. It's serious enough for us to consider moving to other hypervisor products.
I believe there may be a serious bug in ovftool's disk read code. We observe that VMs are created OK after upload or import, but disk contents are zero-filled. No error conditions are reported at any time. The bug only becomes apparent when an uploaded or imported VM fails to boot.
We have several research VMs with multiple LANs, which run FreeBSD 8.2 in VMware Fusion 5.0.2 on Mac OS X 10.8.2 Mountain Lion. I was also able to reproduce the bug with a fresh installation of vSphere ESXi 5.1 and ovftool 3.0.1.
Using the separately packaged ovftool, I can export the Fusion VMs by giving fully qualified paths to their VMX files. The bug appears to be in the VMDK read path. I exercised it in two use cases, Fusion OVA import and ESXi OVA upload, as follows:
- If I import an affected OVA into Fusion from the GUI, Fusion will invoke its own copy of ovftool 3.0.1 to perform the import. VMs are created with a sane VMX file. VMDKs are created, but their contents are zero-filled, and the VMDK contents in the OVA have been ignored.
- If I upload an affected OVA to ESXi, using ovftool and a vi:// target URL on the command line, the same bug is observed with ESXi. Again, the VMX is sane - we inspected this by SSHing into ESXi and manually reviewing the VMX on the VMFS data store with vi. The VMDK contents are zeroed on the VMFS datastore, and the VMDK contents in the OVA have been ignored.
ovftool seems, in some cases, to ignore the contents of the packaged VMDK files in OVAs, and those associated with VMX files from VMware products.
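One way to confirm the zero-fill symptom without booting the VM is to scan the resulting disk file for any non-zero byte. The following is only a minimal sketch, assuming the disk is available as a flat extent file (for example the `-flat.vmdk` file that sits next to the descriptor on a VMFS datastore); sparse or stream-optimized VMDKs carry headers and will never read as all zeros. The function name `is_zero_filled` is my own and is not part of any VMware tool.

```python
CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time to keep memory use flat


def is_zero_filled(path, chunk_size=CHUNK_SIZE):
    """Return True if every byte in the file is zero.

    An empty file also returns True, since it contains no non-zero byte.
    The path is assumed to point at a flat disk extent (hypothetical usage).
    """
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # Reached end of file without seeing a non-zero byte.
                return True
            # Count the zero bytes; any shortfall means real data is present.
            if chunk.count(0) != len(chunk):
                return False
```

Calling `is_zero_filled("/path/to/disk-flat.vmdk")` on a disk that was uploaded from a known-good guest should return False; a True result would match the behaviour described above.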
To date I have only been able to reproduce the issue with FreeBSD guests. The bug does not appear in all exported OVAs; however, affected VM and OVA combinations demonstrate it consistently.
- We first noticed this problem in October 2012, and have not been in a position to exercise it thoroughly until now.
- I can make an OVA available which demonstrates the issue. Please contact me privately to arrange. The files are typically 2GB in size.
Other observations:
- There is no problem with OVA generation. I have inspected the OVAs manually using tar, extracted the embedded VMDKs inside, converted them back to 2GB sparse format using vmware-vdiskmanager (or vmkfstools under ESXi), and loaded them into ESXi and Fusion respectively.
- Since ovftool was introduced, VMDKs must be uploaded manually using SFTP to avoid triggering the bug. The functionality in the vmware-vdiskmanager command which allows direct upload to ESXi servers has been removed.
- Casual inspection with DTrace under OS X reveals that ovftool seems to be reading the full OVA file contents. We initially thought there may have been a problem with VMDK upload in general, but other guests have not been affected.
- The serialized format VMDK contents in the OVA are consistent with the original VMDKs. We verified this in one case with MD5 checksums. This requires denying all writes during the test.
- Because our research VMs use ZFS in GPT partition containers, we also reproduced this bug with out-of-box, unmodified FreeBSD installations.
- We also noticed that Fusion's private copy of ovftool will break if ESXi-specific options are specified in $HOME/.ovftool, although this is a separate issue from the main bug.
- No source code is supplied for ovftool, so users are not able to fix this issue themselves.
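For anyone who wants to repeat the manual OVA inspection and checksum steps above, here is a rough sketch of the idea in Python. It relies only on the fact that an OVA is a tar archive; the function name `md5_of_ova_members` and the `.vmdk` suffix filter are my own choices. Note that the embedded disks are in the serialized format, so these digests will only match an original VMDK after both have been converted to the same format (for example with vmware-vdiskmanager, which this sketch does not do).

```python
import hashlib
import tarfile


def md5_of_ova_members(ova_path, suffix=".vmdk"):
    """Return {member_name: md5_hex} for archive members ending in suffix.

    The OVA is opened as a plain tar archive and each matching member is
    hashed in 1 MiB chunks, so nothing is extracted to disk.
    """
    digests = {}
    with tarfile.open(ova_path, "r") as tar:
        for member in tar:
            # Skip directories, links, and anything that is not a disk image.
            if not member.isfile():
                continue
            if not member.name.lower().endswith(suffix):
                continue
            md5 = hashlib.md5()
            stream = tar.extractfile(member)
            for chunk in iter(lambda: stream.read(1024 * 1024), b""):
                md5.update(chunk)
            digests[member.name] = md5.hexdigest()
    return digests
```

Running this against two exports of the same powered-off VM, or against an extracted and separately hashed copy of the embedded disk, gives a quick consistency check on the OVA side before suspecting the import step.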
Workarounds:
- We load the OVA into Oracle VirtualBox, and re-export the OVA from there.
- This is undesirable for many reasons, the main one being that it loses all the network binding information, which is critical to our work.
thanks,
Bruce