You can create a Hadoop Template virtual machine using a customized version of the CentOS 6.x operating system that includes VMware Tools for CentOS 6.x.

You can create a Hadoop Template virtual machine using CentOS 6.2 or 6.4 Linux as the guest operating system, into which you install VMware Tools for CentOS 6.x in combination with a supported Hadoop distribution. This lets you build the Hadoop Template virtual machine from your organization's preferred operating system configuration. When you provision Big Data clusters from the customized template, VMware Tools for CentOS 6.x is present in every virtual machine created from the Hadoop Template virtual machine.

If you create Hadoop Template virtual machines with multiple cores per socket, the number of vCPUs that you specify in the CPU settings must be a multiple of the number of cores per socket. For example, if the virtual machine uses two cores per socket, the vCPU count must be an even number such as 4, 8, or 12. If you specify an odd number, cluster provisioning or CPU resizing fails.
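
The cores-per-socket rule can be checked before you change the CPU settings. The following is a minimal sketch, not part of Big Data Extensions; the function name valid_vcpu_count is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the vCPU count is a
# positive multiple of the cores-per-socket value.
valid_vcpu_count() {
    vcpus=$1
    cores_per_socket=$2
    [ "$vcpus" -gt 0 ] && [ "$cores_per_socket" -gt 0 ] &&
        [ $((vcpus % cores_per_socket)) -eq 0 ]
}

# Example: report which candidate vCPU counts work with two cores per socket.
for n in 4 5 8 12; do
    if valid_vcpu_count "$n" 2; then
        echo "$n vCPUs: ok"
    else
        echo "$n vCPUs: not a multiple of 2"
    fi
done
```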

Deploy the Big Data Extensions vApp. See Deploy the Big Data Extensions vApp in the vSphere Web Client.

Obtain the IP address of the Serengeti Management Server.

Locate the VMware Tools version that corresponds to the ESXi version in your data center.

1

Create a virtual machine template with a 20 GB thin-provisioned disk, and install CentOS 6.x.

a

Download the CentOS 6.x installation package from www.centos.org to a datastore.

b

From vCenter Server, create a virtual machine template with a 20 GB thin-provisioned disk, and select CentOS 6 (64-bit) as the Guest OS.

c

Right-click the virtual machine, select CD Device, and select the datastore ISO file for your CentOS version.

d

Under Device Status, select Connected and Connect at power on, and click OK.

e

From the console window, install the CentOS 6.x operating system using the default settings.

You can select the language and time zone that you want the operating system to use, and you can reduce the size of the swap partition (for example, to 500 MB) to save disk space. Big Data Extensions does not use the swap partition, so you can safely reduce its size.

The new operating system installs into the virtual machine template.

2

After you have installed the operating system into the virtual machine template, verify that you have network connectivity by running the ifconfig command. This procedure assumes the use of Dynamic Host Configuration Protocol (DHCP).

If IP address information appears, skip to step 4.

If no IP address information appears, which is the case when DHCP is not yet configured for the network interface, continue with step 3.
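
The decision in this step can also be scripted. The following sketch assumes that ifconfig prints IPv4 addresses either as "inet addr:..." (the older net-tools layout) or as "inet ..." (the newer layout); the function name has_ipv4_address is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads ifconfig-style output on standard input and
# succeeds when at least one non-loopback IPv4 address is present.
has_ipv4_address() {
    # Print the address from each "inet" line (IPv6 "inet6" lines do not match),
    # drop loopback addresses, and succeed if anything is left.
    sed -n 's/^[[:space:]]*inet \(addr:\)\{0,1\}\([0-9.]*\).*/\2/p' |
        grep -v '^127\.' | grep -q .
}

# Example usage on the template virtual machine (requires the ifconfig command):
# if ifconfig | has_ipv4_address; then
#     echo "IP address found, skip to step 4"
# else
#     echo "No IP address, continue with step 3"
# fi
```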

3

Configure the network.

a

Using a text editor, open the /etc/sysconfig/network-scripts/ifcfg-eth0 file.

b

Locate the following parameters and set them as shown.

DEVICE=eth0
ONBOOT=yes
BOOTPROTO=dhcp

c

Save your changes and close the file.

d

Restart the network service by running the following commands.

sudo service network stop
sudo service network start

e

Verify that you have connectivity by running the ifconfig command.
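
If you prefer to script the edits in this step instead of using a text editor, the following sketch shows one way to do it. The function names set_ifcfg_param and configure_dhcp are hypothetical, and the sketch assumes the file uses plain KEY=VALUE lines.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in a configuration file, replacing an
# existing KEY= line or appending one if the key is absent.
set_ifcfg_param() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        # Write through a temporary file, then copy back so the original
        # file keeps its ownership and permissions.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Apply the three settings from this step to the given interface file.
configure_dhcp() {
    set_ifcfg_param "$1" DEVICE eth0 &&
    set_ifcfg_param "$1" ONBOOT yes &&
    set_ifcfg_param "$1" BOOTPROTO dhcp
}

# Example usage (run as root on the template virtual machine):
# configure_dhcp /etc/sysconfig/network-scripts/ifcfg-eth0
```

Other lines in the file, such as HWADDR, are left unchanged.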

4

Install the JDK 6u31 package (jdk-6u31-linux-x64-rpm.bin).

a

From the Oracle® Java SE 6 Downloads page, download jdk-6u31-linux-x64-rpm.bin and copy it to the virtual machine template's root folder.

b

Make the installer file executable.

chmod a+x jdk-6u31-linux-x64-rpm.bin

c

Run the following command to install the JDK in the /usr/java/default directory.

./jdk-6u31-linux-x64-rpm.bin

d

Edit the /etc/environment file and add the line JAVA_HOME=/usr/java/default.

This sets the JAVA_HOME environment variable to the /usr/java/default directory.

e

Restart the virtual machine.
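
After the restart, you can confirm that the JDK settings took effect. The following sketch is one possible check; the function name java_home_ok is hypothetical, and the check assumes that the JDK places its launcher at bin/java under the JAVA_HOME directory.

```shell
#!/bin/sh
# Hypothetical check: succeeds when the given environment file defines
# JAVA_HOME and the directory it names contains an executable bin/java.
java_home_ok() {
    env_file=$1
    # Take the last JAVA_HOME entry and drop any surrounding double quotes.
    java_home=$(sed -n 's/^JAVA_HOME=//p' "$env_file" | tail -n 1 | tr -d '"')
    [ -n "$java_home" ] && [ -x "$java_home/bin/java" ]
}

# Example usage on the template virtual machine:
# if java_home_ok /etc/environment; then
#     echo "JAVA_HOME is set correctly"
# else
#     echo "JAVA_HOME is missing or does not point to a JDK"
# fi
```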

5

Install VMware Tools for CentOS 6.x.

a

Right-click the CentOS 6 virtual machine in Big Data Extensions, then select Guest > Install/Upgrade VMware Tools.

b

Log in to the virtual machine and mount the CD-ROM to access the VMware Tools installation package.

mkdir /mnt/cdrom
mount /dev/cdrom /mnt/cdrom
mkdir /tmp/vmtools
cd /tmp/vmtools

c

Run tar xf to extract the VMware Tools package tar file from the mounted CD-ROM into the current directory.

tar xf /mnt/cdrom/VMwareTools-*.tar.gz

d

Make vmware-tools-distrib your working directory, and run the vmware-install.pl script.

cd vmware-tools-distrib
./vmware-install.pl

Press Enter to finish the installation.

e

Remove the temporary /tmp/vmtools directory that is left behind as an artifact of the installation process.

rm -rf /tmp/vmtools
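
The extraction in this step relies on a wildcard matching a single package file. The following sketch makes that assumption explicit before you run tar; the function name find_tools_tarball is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the VMware Tools tar file under the
# given mount point, failing unless exactly one match exists.
find_tools_tarball() {
    mount_point=$1
    # Expand the wildcard into the function's own positional parameters.
    set -- "$mount_point"/VMwareTools-*.tar.gz
    if [ $# -eq 1 ] && [ -f "$1" ]; then
        printf '%s\n' "$1"
    else
        echo "expected exactly one VMwareTools-*.tar.gz in $mount_point" >&2
        return 1
    fi
}

# Example usage on the template virtual machine, after mounting the CD-ROM
# and creating /tmp/vmtools:
# tarball=$(find_tools_tarball /mnt/cdrom) && tar xf "$tarball" -C /tmp/vmtools
```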

6

(Optional) You can create a snapshot to use for recovery operations. In the vSphere Web Client, right-click the virtual machine and select Snapshot > Take Snapshot.

7

Deploy the Big Data Extensions vApp. See Deploy the Big Data Extensions vApp in the vSphere Web Client.

8

Run the installation scripts to customize the local lib with CentOS 6.x.

a

Download the scripts from https://deployed_serengeti_server_IP/custos/custos.tar.gz, where deployed_serengeti_server_IP is the IP address of the Serengeti Management Server.

b

Make the virtual machine template directory your working directory.

c

Run tar xf to extract the tar file.

tar xf custos.tar.gz

d

Run the installer.sh script specifying the /usr/java/default directory path.

./installer.sh /usr/java/default
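
The substeps above can be combined into a short command sequence. The following sketch assumes that curl is available in the virtual machine; the function name custos_url is hypothetical, and 192.0.2.10 is a placeholder for the IP address of your Serengeti Management Server.

```shell
#!/bin/sh
# Hypothetical helper: build the download URL from the management server address.
custos_url() {
    printf 'https://%s/custos/custos.tar.gz\n' "$1"
}

# Example usage on the template virtual machine.
# The --insecure option is an assumption for servers that present a
# self-signed certificate; omit it if the certificate is trusted.
# curl --insecure --remote-name "$(custos_url 192.0.2.10)"
# tar xf custos.tar.gz
# ./installer.sh /usr/java/default
```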

9

Remove the /etc/udev/rules.d/70-persistent-net.rules file to prevent the eth number from increasing during the clone operation.

If you do not remove the /etc/udev/rules.d/70-persistent-net.rules file, virtual machines cloned from the template cannot get an IP address. If you later power on the template virtual machine to make changes, you must remove this file again before you shut it down.
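
Because the file must be removed every time before the template is shut down, a small wrapper can help avoid forgetting it. The following is a sketch only; the function name remove_persistent_net_rules is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: delete the persistent-net rules file if it exists and
# confirm that it is gone. The path argument defaults to the location named
# in this step.
remove_persistent_net_rules() {
    rules=${1:-/etc/udev/rules.d/70-persistent-net.rules}
    rm -f "$rules" && [ ! -e "$rules" ]
}

# Example usage (run as root immediately before shutting down the template):
# remove_persistent_net_rules && shutdown -h now
```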

10

Shut down the virtual machine.

11

If you created a snapshot as described in Step 6, you must delete it. In the vSphere Web Client, right-click the virtual machine, select Snapshot > Snapshot Manager, select the serengeti-snapshot, and click Delete.

12

In the vSphere Web Client, edit the template settings, and deselect (uncheck) all devices, including the CD-ROM and floppy disk.

13

Replace the original Hadoop Template virtual machine with the custom CentOS-enabled virtual machine that you have just created.

a

Move the original Hadoop Template virtual machine out of the vApp.

b

Drag the new template virtual machine that you just created into the vApp.

14

Log in to the Serengeti Management Server as the user serengeti, and restart the Tomcat service.

sudo service tomcat restart

Restarting the Tomcat service enables the custom CentOS virtual machine template, making it your Hadoop Template virtual machine.

If you intend to modify or update the Hadoop Template virtual machine operating system in the future, you must remove the serengeti-snapshot that is created automatically each time you shut down and restart the virtual machine. See Maintain a Customized Hadoop Template Virtual Machine.