Create a vCenter Content Library using AWS S3 - Part 2

Gilles Chekroun
Lead NSX Systems Engineer - VMware Europe
10-April-2019: indexing script updated for SDDC version 1.6!
23-Sept-2018: UPDATE BELOW!
A few months ago, I wrote a first part about creating a VMware Cloud on AWS vCenter Content Library using AWS S3. It was not an ideal solution, since the indexing was done on the local machine.
Working together with Eric Yanping Cao and William Lam, we laid the foundation for having the complete Content Library in AWS S3 and indexing it directly in S3 without having to transfer any images locally.
Congratulations to Eric who did a fantastic job in python3 using boto3 to access the root S3 bucket and indexing it.

File Structure

Although AWS S3 is an object storage service, we can build a pseudo file structure with a root bucket and folders below it containing the various VM templates or ISO files.
NOTE: Make sure you don't have any spaces in the folder names!
For my tests I am using the following structure:

└── ContentLib
    ├── DSL4-4-10
    │   ├── DSL-4.4.10-disk1.vmdk
    │   ├──
    │   └── DSL-4.4.10.ovf
    ├── ISO
    │   ├── nanolinux64-1.3.iso
    │   └── ubuntu-16.04.4-server-amd64.iso
    ├── PhotonOS
    │   ├── photon-ova-disk1.vmdk
    │   ├── photon-ova.cert
    │   ├──
    │   └── photon-ova.ovf
    └── Small-vApp
        ├── Small-vApp-disk1.vmdk
        ├── Small-vApp-disk2.vmdk
        └── Small-vApp.ovf
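The "no spaces" note above matters because the indexing script builds URLs from the S3 key names. A quick pre-flight check in pure Python (a hypothetical helper, not part of the published script) could flag offending keys before indexing:

```python
# Hypothetical pre-flight check (not part of the published indexing script):
# flag any S3 object key that contains a space, since spaces in folder
# names break the URLs generated for the subscribed library.

def keys_with_spaces(keys):
    """Return the subset of S3 object keys whose path contains a space."""
    return [k for k in keys if " " in k]

keys = [
    "ContentLib/PhotonOS/photon-ova.ovf",
    "ContentLib/My Templates/win10.ovf",   # offending folder name
]
print(keys_with_spaces(keys))  # → ['ContentLib/My Templates/win10.ovf']
```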

AWS S3 permissions

Since we are going to subscribe to that library in S3, we need to grant read access to the root bucket. This access can be fine-tuned with specific restrictions if needed, following the AWS examples here.
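As a starting point, a bucket policy along these lines grants anonymous read on the library objects (the bucket name is a placeholder; in practice you would restrict the principal, source IP or VPC endpoint as shown in the AWS examples):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowContentLibraryRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-content-lib-bucket/ContentLib/*"
    }
  ]
}
```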

Script usage 

NOTE: With the latest release of VMC 1.5, there are some improvements to the Content Library. The python script below has been updated to include:
1. A new --skip-cert option to skip the .cert file for OVF templates (true by default).
2. contentVersion and generationNum tags in the item JSON.
The generationNum matches the last-modified date of the S3 object. If the S3 object changes, generationNum is updated and contentVersion is incremented by 1.
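The versioning rule just described can be sketched in plain Python (a simplified model of the behaviour; the field names mirror this post's description, not necessarily the script's internals):

```python
# Simplified model of the versioning rule described above:
# generationNum tracks the S3 object's last-modified timestamp, and
# contentVersion increments by 1 whenever that timestamp changes.

def update_item_version(item, s3_last_modified_epoch):
    """Update an item dict in place; return True if the item changed."""
    if item.get("generationNum") != s3_last_modified_epoch:
        item["generationNum"] = s3_last_modified_epoch
        item["contentVersion"] = str(int(item.get("contentVersion", "0")) + 1)
        return True
    return False

item = {"name": "photon-ova", "generationNum": 1537600000, "contentVersion": "1"}
update_item_version(item, 1537686400)   # the object was re-uploaded
print(item["contentVersion"])           # → "2"
```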
The rest is unchanged.


If not already done, install the AWS CLI following the instructions here.
Run the aws configure command and enter your Access key, Secret key and region.
This will create the ~/.aws/config and ~/.aws/credentials files. Make sure they are present.
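After running aws configure, the two files typically look like this (the values below are obviously placeholders for your own keys and region):

```ini
# ~/.aws/config
[default]
region = eu-west-1

# ~/.aws/credentials
[default]
aws_access_key_id = AKIA...
aws_secret_access_key = ...
```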

Then, download the python script here. It uses Python 3, so make sure you have the proper version; if not, use virtualenv.

python3 <script-name>.py -n test -t s3 -p gchek-s3-bucket/ContentLib

-n / --name: library name
-t / --type: storage type of the library, either local or s3 (default: local)
-p / --path: storage path of the library; for an S3 bucket the pattern is <bucket-name>/<folder-path>
--etag: whether an etag will be generated for each file (default: true)


1. Supports the S3 bucket root as library path, i.e. <bucket-name>/

2. If a child folder under the library folder contains only ISO images, the script will create a folder for each ISO image and place the item JSON for that image there. The item JSON and image location are linked properly in the S3 bucket.

3. Each ISO item name will be the name of the ISO image.

4. If such an ISO image file is deleted later, the previously created ISO item JSON folder will be deleted as well.
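Points 2 and 3 above can be sketched as pure key manipulation (a simplified model with hypothetical names; the real script does this against the live bucket with boto3):

```python
# Simplified model of the ISO handling described above: for a folder that
# contains only .iso files, plan one sub-folder per image, each holding
# that image's item JSON. Pure string/key manipulation, no S3 calls.
import posixpath

def plan_iso_items(folder, keys):
    """Map each ISO image key directly under `folder` to the sub-folder
    that will hold its generated item.json (item name = image name)."""
    plan = {}
    for key in keys:
        head, name = posixpath.split(key)
        if head == folder and name.lower().endswith(".iso"):
            stem = name[: -len(".iso")]
            plan[key] = posixpath.join(folder, stem, "item.json")
    return plan

keys = [
    "ContentLib/ISO/nanolinux64-1.3.iso",
    "ContentLib/ISO/ubuntu-16.04.4-server-amd64.iso",
]
for iso, item in plan_iso_items("ContentLib/ISO", keys).items():
    print(iso, "->", item)
```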

Open vCenter

From the main Menu, select Content Libraries
Add a new Content Library
Give it a name
Subscribe to the "lib.json" from S3 indexing

This link is the URL from the "lib.json" file created by the script
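For an S3-hosted library, that URL is simply the public object URL of lib.json. In the virtual-hosted S3 style it can be built like this (the region below is an assumption for illustration; the bucket and prefix match this post's example layout):

```python
def lib_json_url(bucket, region, prefix):
    """Build the public S3 object URL for the library's lib.json
    (virtual-hosted style; the bucket must allow read access)."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{prefix}/lib.json"

print(lib_json_url("gchek-s3-bucket", "eu-west-1", "ContentLib"))
# → https://gchek-s3-bucket.s3.eu-west-1.amazonaws.com/ContentLib/lib.json
```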
Skip the authenticity warning, select the Workload Datastore and click Finish
Note the ZERO byte size of the new Content Library
Click on the newly created Library and note the 3 template folders (still ZERO bytes). At this time, the Type is "Unknown" because no VMs have been deployed using these templates yet.
And the 2 ISO files

Deploy a new VM from this Template

Give it a new name, select the Compute resources, accept the Licence agreement, select the Workload Datastore, choose the network to plug into and click Finish
Note that at this stage, the template is synced on the local storage.
We can now Power-on our Photon-OS VM
Once the VM is deployed, we can save storage in the Content Library and "un-sync" or delete the library item without damaging the VM.
Acknowledge the warning about the Library item deletion and check the final Content Library stays at ZERO Bytes.


  1. I was able to use this method in a VMware Cloud on AWS environment; it does pop up an error during sync, but that doesn't prevent the OVFs from deploying. This has saved me a massive amount of time destroying and redeploying SDDCs in validation testing in a VMC on AWS with Horizon environment (using Windows images and other appliances). I was also able to lock down the S3 bucket to specific IP access and the VPC, and I highly recommend it.

    I really appreciate the information in this article as it's extremely useful!

    Here are some of my lessons learned that might help others.
    1) For uploading the OVFs to the S3 bucket, I highly recommend using the AWS CLI, as I had multiple issues uploading via the GUI with such large files (it was slower and files would fail to upload).
    2) The Content Library didn't recognize some OVAs (for appliances like F5 and Horizon UAG); you must convert the OVA to OVF and then upload it to the Content Library.
    3) You can upload more images to the bucket even while connected to the Content Library; just re-run the script after all your changes are done, then sync the Content Library and all is well :)

  2. There is one part of this process that is not clear and when I tried to recreate the solution it failed at this point. When setting up the Subscribed Content Library the process fails due to S3 authentication issues.

    How/where do I configure the S3 access_id and access_key on the vCenter, so that these credentials are used during the Content Library creation?

    This process is documented earlier in the blog in relation to the indexing script, but it is not made clear how this is configured on the vCenter.

    1. The S3 Content Library is indexed with the python script. At the vCenter level, you subscribe to the "lib.json" file created by the script. The S3 bucket should allow read access from the VMC-attached VPC so you can use the ENI. Look at the "AWS S3 permissions" paragraph and the link there to adapt this to your needs and environment.

  3. As a tip: ova/***.ova was recognized correctly in my case



