I've recently needed to use virtualisation more and more to complete some project work. To get up to speed with the technologies currently available, I've been working my way through VCP certification books. This post contains some of my research notes for the VCP-310 exam.
Note that while this post is about VMware, much of the same functionality is also supported by the open source equivalent Xen (though the methods for interfacing with and configuring these features mightn't be as slick at this stage) or by its enterprise equivalent, XenServer. I may post my research notes on Xen, but I feel it's more likely that I'll contribute to the project in another way; as it stands, a lot of the documentation out there is already of a high standard (though slightly fragmented).
Note that the VI client/VirtualCentre is not the only method of monitoring and configuring VMware. It's possible to use the Service Console to achieve many basic tasks.
Introducing VMware Infrastructure 3
Virtualisation is the process of adding an additional layer between hardware and software so that multiple instances of any operating system can be installed on the same set of hardware (desktop or server). Control between the top software layer and the underlying hardware is handled by software known as a 'hypervisor'.
Why You Need Virtualisation
It allows for consolidation of hardware, reduced power and cooling costs, faster provisioning of new machines, and simpler backup and disaster recovery.
Types of Virtualisation
Bare Metal - ESX/ESXi, XenServer
Host-Based Virtualisation - VMware Workstation/Server, MS Virtual PC, VirtualBox (MS Hyper-V is often listed here too, though strictly speaking it is a bare metal hypervisor)
Application Virtualisation - Cameyo
Storage Virtualisation - StarWind, OpenFiler, FreeNAS
Virtual Machine Overview
Hardware is essentially 'virtual' and can be added/removed/configured at will through the hypervisor
*.vmx - VMware hardware configuration/specification details are supplied in this file
*.vmdk - VMware file which acts as the equivalent of a physical hard drive. It may be split into smaller files for compatibility with the underlying OS/filesystem (mainly a concern for Type 2 hypervisors on older OSs/filesystems, of course)
Isolation - one VM is independent of another
Encapsulation - VMs are contained within files
Hardware independence - VMs are unaware of the actual hardware. Interaction is conducted through the hypervisor
Compatibility - standard x86/x64 architecture
Simulation and Emulation
Simulation - subset of the real thing (flight simulator)
Emulation - attempt to port hardware to software (SNES emulator)
Virtualisation - install software on virtual hardware (hypervisor)
Virtual and Physical Machine Comparison
Physical - underutilise resources, hardware bound, replication complex
Virtual - not hardware bound, standard x86 environment, collection of files can be easily managed
Why VMware Infrastructure 3?
Mainly enterprise features such as VMotion and Distributed Resource Scheduler (DRS), which allow automated migration of VMs from one host to another when resource disparities and performance problems exist on the current host but not on another
VMware Infrastructure 3 Suite
ESX Vs ESXi - the main difference is that ESX uses a Service Console (SC) as its management environment, whereas ESXi relies on a separate management utility but has a much smaller memory footprint (32MB)
Virtual Symmetric Multi-Processing (SMP) - lets a single VM use multiple virtual CPUs, scheduled across both logical and physical processors
VirtualCentre - control ESX from a Windows application
VMotion - allows live, transparent movement of VMs from one ESX host to another
Storage VMotion - allows live, transparent movement of a VM's files from one datastore to another
Update Manager - handles patching of host as well as VMs
Converter - convert physical machines to virtual machines
High Availability - restart a VM on a different ESX host
Distributed Resource Scheduler (DRS) - automated load balancing of VMs across the hosts in a cluster
Consolidated Backup (CB) - snapshot-based backup of VMs via a backup proxy server, offloading the work from the ESX host
Virtual Machine Filesystem (VMFS) - a lightweight filesystem designed for storing VMs under ESX/ESXi
The VMkernel
Proprietary kernel made by VMware that acts as a bare metal hypervisor/resource regulator.
The Service Console
- Apache Tomcat Web Server
- Firewall
- SSH access
- SNMP agents
The VI Client
Use this Windows application to connect to an ESX host and/or Virtual Centre.
Planning, Installing, and Configuring ESX
Minimum Hardware Requirements
These change from one version to another. Please consult the VMware website for more up-to-date details.
Disk Partitioning
Maximum of 4 primary partitions. Using extended partitions can increase this number. The maximum number of partitions on an IDE disk is 63, while the maximum on a SCSI disk is 15.
/boot - boot files
/ - root of the SC OS
swap - swap partition
/var/log - log files
VMFS-3 - VM's are stored here
vmkcore - dump information is stored here after a system crash or Purple Screen of Death (PSOD)
Installation Using a CD-ROM
Basically identical to a Linux distribution installation. Graphical and Text modes are available.
Post-Installation Configurations
VI Client - install it by browsing to the IP address of the web server on the ESX/ESXi host and downloading the client
SSH - use the VI client to create a user account if required. If remote root access is required then modify the relevant option (PermitRootLogin) in /etc/ssh/sshd_config
SC Memory Allocation - use the VI client to increase the memory allocation from the default of 272MB (the recommendation is between 272-800MB)
NTP Client - use the VI client
Troubleshooting Installation
Hardware Issues and Misconfigurations - check hardware compatibility/reliability and verify the configuration
Purple Screen of Death - CPU/memory problems are the most likely causes. The dump file can be sent to VMware to aid analysis and troubleshooting
Diagnostic Data Collection - look for recent changes (hardware, software, or environmental), and any console errors. Create a diagnostic data dump by using the VI client
Licensing VMware Infrastructure 3
Two main components that need to be licensed:
- ESX
- VirtualCenter
ESX Server Licensing
Foundation - starter edition with no enterprise features
Standard - foundation edition + HA
Enterprise - all standard + enterprise features
Host Vs Server Licensing Mode
- Evaluation Mode
- Serial Number
- License Server (Server Based, C:\Program Files\VMware\VMware License Server\Licenses)
- Host License File (Host Based, /etc/vmware/vmware.lic)
VirtualCenter Licensing
VirtualCenter Foundation - up to 3 ESX hosts
VirtualCenter - up to 200 ESX hosts
How the License Key Works
Per Processor - all VI3 software licenses per processor except for VirtualCentre
Per Instance - traditional one licence per machine, only VirtualCentre uses this form
License Server
Similar in concept to a Windows licensing server, if you've ever used one. Best practice is to install it on the same system where you have installed VirtualCentre.
Working with the License Server
You can change the location where license files are stored if need be. Restart the License Server service in order to re-read the license files, or else use the 'VMware License Server Tools'. The license server uses the following two processes/ports:
lmgrd.exe - 27000, TCP
vmwarelm.exe - 27010, TCP
Use 'netstat' to verify that these processes are listening.
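As an extra sanity check, a small sketch like the following could confirm that something is answering on those license server ports from another machine; the host name here is a made-up placeholder.

```python
import socket

# Ports used by the VMware license server processes (lmgrd.exe and vmwarelm.exe).
LICENSE_PORTS = [27000, 27010]
LICENSE_HOST = "vc-server.example.local"  # hypothetical VirtualCentre/license server host

def port_open(host, port, timeout=3):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for port in LICENSE_PORTS:
    state = "listening" if port_open(LICENSE_HOST, port) else "not reachable"
    print("{}:{} is {}".format(LICENSE_HOST, port, state))
```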
Losing the License Server
If the license server goes down there is a 14 day grace period during which you can perform maintenance. Use the HA capability of ESX to deal with redundancy issues. VirtualCentre is not impacted by a license server outage as it uses a cached version of the license file.
Virtual Networking Options
Obviously, if you have VMs you'll need some way to 'bridge the gap' between physical and virtual networking hardware. You can achieve this through virtual switches and virtual NICs.
What Are Virtual Switches?
Virtual equivalent of physical switches. They allow for an isolated VM network, which affords the opportunity to create a DMZ, or to provide fault tolerance and HA.
Virtual Switches:
- are software objects on the VMkernel of every ESX host
- can have between 8-1016 ports
- can be serviced by one or more physical NICs
- virtual NICs have unique MACs just like physical NICs
- allow for connectivity via 802.1Q (known as VLAN tagging)
- can support port groups or connection types
Comparing Physical and Virtual Switches
Similarities:
- both have MAC address tables
- both check each frame's MAC address destination upon receiving it
- both forward frames to one or more ports
- both avoid unnecessary frame deliveries
Differences:
- STP not required/supported on VS
- inter VS connectivity not possible
- forwarding data table is unique to each VS
- VS isolation prevents loops (hence, STP not required)
Types of Virtual Switches
Internal Virtual Switch - no interaction with exterior physical networks
Single Adapter Virtual Switch - interaction with exterior physical networks via single NIC
Multiple Adapter Virtual Switch - interaction with exterior physical networks via multiple NICs
Types of Virtual Switch Ports
Basically switch ports which share the same connection type.
Service Console - allows for communication to/from the Service Console. vswif0 is automatically associated with vSwitch0. Consider multiple SC ports or an SC NIC team for redundancy
VMkernel - used by the VMkernel for technologies such as VMotion, iSCSI, and NAS/NFS
Virtual Machine - connects virtual to physical network
VLANs in Virtual Networking
Similar in concept to physical VLANs, but in virtual networking VLANs can also be used to group virtual machines together on the same virtual switch.
Trunk Ports
Same conceptually as trunking of VLANs on physical switches.
802.1Q VLAN Tagging
Same conceptually as tagging of VLANs on physical switches.
Virtual Switch Policies
General - total number of ports
Security - promiscuous mode, MAC address changes, forged transmits (port security feature on Cisco switches)
Traffic Shaping - average/peak bandwidth and burst size
NIC Teaming - fault tolerance, redundancy, load balancing
Load Balancing
Route Based on Originating Virtual Port ID
Route Based on Source MAC hash
Route Based on IP Hash (a conceptual sketch of hash-based uplink selection follows this list)
Explicit Failover Order
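To make the IP-hash policy a little more concrete, here is a rough conceptual sketch of how hash-based uplink selection might pick a physical NIC for a given flow. The XOR-and-modulo scheme is a simplification for illustration only, not VMware's actual implementation.

```python
import ipaddress

def pick_uplink(src_ip, dst_ip, uplinks):
    """Pick a physical NIC for a flow by hashing source and destination IPs.

    Conceptual only: the idea is that the same src/dst pair always lands on the
    same uplink, while different pairs spread across the NIC team.
    """
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return uplinks[(src ^ dst) % len(uplinks)]

team = ["vmnic1", "vmnic2"]
print(pick_uplink("192.168.10.5", "192.168.20.9", team))
print(pick_uplink("192.168.10.5", "192.168.20.10", team))
```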
Network Failover Detection
Link Status - detects cable connection
Beacon Probing - detection via heartbeat
Notify Switches
Performance tweak that notifies physical switches so their lookup tables update quickly when:
- a failover occurs and a virtual NIC begins to use a new physical NIC to communicate
- a new NIC is added to a NIC team
Failback
Adjusts how failed NICs behave once they recover: whether they become active again or remain in standby mode.
Explicit Failover Order
Control order in which NICs failover.
Networking Maximums
Storage Operations
Currently four forms offered:
- local storage
- fibre channel
- iSCSI
- Network Attached Storage (NAS)
Transfers can either occur at:
block-level - similar to accessing local storage. iSCSI and FC are good examples of this
file-level - similar to network-based shares such as NFS or SMB
Fibre Channel
The most efficient, reliable, and best-performing of all the storage options, but also the most expensive. A high-speed protocol which allows for transport between nodes at up to 8Gb/s.
ESX extends FC capabilities by allowing SAN Boot, VMFS Datastores, Enterprise Features, and allowing VMs access to Raw LUNs.
FC SAN Architecture
Host Bus Adapters - similar to a NIC in that it has a unique identifier, known as a World Wide Name (WWN), which identifies the adapter to the Fibre Channel network
Fibre Channel Switches - also known as the 'fabric'. Similar in function to network switches but also provide security (zoning) as well
Logical Unit Numbers (LUNs) - logical units of storage presented to hosts, typically carved from a group of disks
Storage Systems - the actual collection of disks ready to be designated into LUNs
Storage Processor - the brain that creates LUNs, implements security, and controls access to LUNs
Masking
Uses HBAs or SCSI controllers to hide LUNs from the view of the OS. This is particularly important for certain types of OS that try to write data to every single LUN they come across (Windows will write a signature), which could lead to data corruption if the LUN is being used by multiple systems.
Zoning
Equivalent of VLANs in the storage world.
Hard Zoning - implemented at FC switch level. Prevents physical access to any device that is not a member of the zone. More secure option.
Soft Zoning - security through obfuscation of the relevant ports. Access is still possible by addressing the physical address directly
FC Addressing
vmhba0(physical label and number):1(target number):23(LUN number):4(partition number)
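A tiny, purely illustrative sketch of splitting that addressing scheme into its parts:

```python
from collections import namedtuple

FCAddress = namedtuple("FCAddress", "adapter target lun partition")

def parse_fc_address(address):
    """Split an address such as 'vmhba0:1:23:4' into adapter, target, LUN and partition."""
    adapter, target, lun, partition = address.split(":")
    return FCAddress(adapter, int(target), int(lun), int(partition))

print(parse_fc_address("vmhba0:1:23:4"))
# FCAddress(adapter='vmhba0', target=1, lun=23, partition=4)
```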
Internet Small Computer System Interface (iSCSI)
Basically SCSI commands over TCP/IP (typically Ethernet). Cheaper than FC, and with Ethernet speeds already at 10GigE the performance disparities are rapidly disappearing. Like FC, ESX extends iSCSI capabilities by allowing SAN Boot, VMFS Datastores, Enterprise Features, and allowing VMs access to Raw LUNs.
iSCSI Addressing
iqn (iSCSI Qualified Name).2011-01 (year and month when the organisation registered a valid domain/subdomain).com.company (reversed domain name):ipstor (optional alias identifying the target)
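A quick sketch that assembles an IQN from those parts; the domain and alias below are made-up examples.

```python
def build_iqn(year_month, domain, alias=None):
    """Build an iSCSI Qualified Name: iqn.<yyyy-mm>.<reversed domain>[:<alias>]."""
    reversed_domain = ".".join(reversed(domain.split(".")))
    iqn = "iqn.{}.{}".format(year_month, reversed_domain)
    return "{}:{}".format(iqn, alias) if alias else iqn

print(build_iqn("2011-01", "company.com", "ipstor"))
# iqn.2011-01.com.company:ipstor
```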
Software Initiator
Basically, a driver which allows access to SCSI targets. When configuring the software initiator's parameters, there are two options to select from:
- one VS with two port groups
- two VS
You also need to open TCP port 3260 on the SC firewall.
Dynamic Discovery
Use IP address/Port to discover available LUNs.
CHAP Authentication
Authentication is optional: use a password/secret to control access to LUNs.
Hardware Initiator
Better performance than the software initiator and takes some load off of the ESX host. Allows booting of the ESX host from a SAN LUN and supports Static Discovery
Network Attached Storage (NAS)
ESX Features on NFS Datastores
VMotion, DRS, HA, and VCB.
Configuring NFS Datastores
The following parameters can be configured:
Share name, subnet (which subnets can access the share), sync, rw, no_root_squash (root access enabled) - e.g. an export line of the form /vmfs/nfs 192.168.1.0/24(rw,sync,no_root_squash)
Virtual Machine File System
Ultra-lightweight file system with minimal overhead. File locking (as opposed to volume locking) allows for higher performance on concurrent access. File locking is managed via a metadata file, and the entire partition is locked while the metadata file is locked. Hence, LUN sizing is critical in order to reduce the possibility of I/O becoming a performance bottleneck.
Extending a Datastore
Maximum VMFS volume size is 2TB. This can be overcome through 'extents', of which there can be up to 32. You can add extents but can't remove them non-destructively (for the entire datastore) at this stage. Note that the metadata file is stored in the first extent; losing this file will cause data loss across all other extents.
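A back-of-the-envelope check of the resulting ceiling, assuming the 2TB-per-extent and 32-extent limits above:

```python
TB = 1024 ** 4  # bytes in a terabyte (binary)

max_extent_size = 2 * TB   # maximum size of a single VMFS-3 extent
max_extents = 32           # maximum number of extents per datastore

max_datastore = max_extent_size * max_extents
print("Maximum VMFS-3 datastore size: {:.0f} TB".format(max_datastore / TB))  # 64 TB
```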
Multipathing
Basically allows for redundant paths to LUNs. Two strategies:
Fixed - explicitly dictate the preferred path; ESX returns to it once it becomes available again after a failure
Most Recently Used (MRU) - keep using the most recently used path, and only switch to the other/older path when the current one fails (no automatic failback)
Administration with VirtualCentre
Allows you to manage ESX from a Windows application.
Planning and Installing VC
Along with enterprise features it also provides Update Manager, Converter Enterprise, and Guided Consolidation. Depending on configuration, various network ports may need to be opened on the system's firewall. You need to install: database server, license server, VC server, VI client.
VC Blueprint
Core Services - VM provisioning, task scheduling, event logging, etc...
Distributed Services - VMotion, HA, DRS, etc...
Additional Services - Converter, Updater, etc...
Database Interface
ESX Host Management
Active Directory Interface
Virtual Infrastructure Application Programming Interface (VI API) and Virtual Infrastructure Software Development Kit (VI SDK)
Hardware Prerequisites for VirtualCentre
Designing a Functional VC Inventory
- folders
- datacentres
- clusters
Administration with VirtualCentre
VMware Infrastructure Client Tabs
- inventory (details of your current network)
- scheduled tasks (overnight restarts, etc...)
- events (alarm messages go here; first point of call for troubleshooting)
- administration (role, session, license, log management, etc...)
- maps (Visio-like network diagram)
- consolidation (consolidate physical into VM)
Lockdown Mode
Blocks direct access to ESX hosts; access is only allowed via VC.
Plug-ins
Update Manager and Converter Enterprise are the two base plug-ins for VirtualCentre. Many more have since been created by third parties.
Virtual Machine Operations
VM Defined
See top of this post.
Virtual Hardware
Based on the Intel 440BX chipset with an NS338 SIO chip
VM Files
*.vmx: VM configuration file
*.vmdk: virtual disk descriptor (information about the HDD file)
*-flat.vmdk: the actual data of the virtual disk
*.log: log file
*.nvram: BIOS of the VM
*.vswp: VM swap file
*.vmsd: details of the snapshots of a VM
Creating a VM
No explanation required if you've ever used any virtualisation software before.
Understanding VMWare Tools
Please see previous post from this blog.
Templates
Guest OS Customisation
Windows-Based OS Guest Customisation
To customise a Windows installation, drop the sysprep files in the following directory:
C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCentre\Sysprep\
Linux Guest Customisation
- Computer Name
- Domain Name
- DHCP/IP Settings
- DNS
Deploying VM
- Create new
- Deploy from Template
- Clone
Managing VMs
Cold Migration
Move VMs while they are powered off. Allows you to move the VM's files as well as moving the VM to a new host.
What are Snapshots?
A snapshot captures a VM's settings, memory, and disk state at a point in time.
*#-delta.vmdk - files that record changes from the base disk for this VM; # is sequential, starting from 1
*#.vmdk - snapshot description
*#.vmsn - state of the VM's memory
(a small sketch for recognising these files follows below)
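A rough sketch of recognising these snapshot-related files by name; the filenames are hypothetical and the patterns simply follow the naming noted above.

```python
import re

# Patterns based on the naming convention above (VM name, then a sequence number).
PATTERNS = {
    "delta disk (changes since snapshot)": re.compile(r".+-\d+-delta\.vmdk$"),
    "snapshot disk descriptor":            re.compile(r".+-\d+\.vmdk$"),
    "snapshot memory state":               re.compile(r".+\d+\.vmsn$"),
}

files = ["webserver-000001-delta.vmdk", "webserver-000001.vmdk", "webserver-Snapshot1.vmsn"]
for name in files:
    kinds = [kind for kind, pattern in PATTERNS.items() if pattern.match(name)]
    print(name, "->", kinds or ["unrecognised"])
```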
VMware Converter Enterprise
- P2V and V2V
- third-party VMs to ESX VMs
- restore VCB images to ESX VMs
- export ESX VMs to other formats
- customise VC VMs
Converter Enterprise Components
- Server (initiates actual conversion process)
- CLI (allows conversion tasks to be run from the command line)
- Agent (prepares physical machine for conversion)
- Client Plug-in (modifies VI GUI and enables Converter Enterprise Features)
Cloning
Converter Enterprise is capable of the following:
- Hot Cloning
- Cold Cloning
- System Reconfiguration
- Remote Cloning
- Local Cloning
Cloning Modes
Volume based - useful when resizing disks. Supported via hot and cold cloning.
Disk based - exact copy of disk. Only supported via cold cloning.
Guided Consolidation
Ideally used in an SME with about 100 physical servers. Only Windows systems can be discovered and analysed at this stage. Both physical and virtual machines can be detected. It relies on the 'Capacity Planner' and 'Converter' services to run.
Discovery and Analysis
You need a user account with certain privileges for this to work:
- member of local administrators group on VC server
- Log on as Service user right
- Read access to AD
- administrator rights on target machines
VMware Infrastructure Security and Web Access
VI Security Model:
User and group - accounts allowed to log in
Role - a named set of privileges
Privilege - an individual action that can be allowed
Permission - assigned to an object in the inventory; pairs a user/group with a role and grants that user/group the right to interact with the object according to the role's privileges
VC Security
Local or Domain (in relation to Windows authentication credentials)
ESX Server Security
Being Linux-based, there is the root account as well as a 'vpxuser' account, which is used by VirtualCentre to send commands to the ESX host.
Web Access
IP address of ESX host or VC.
Managing VMware Infrastructure Resources
VM CPU and Memory Management
Limit - maximum resources a VM may consume
Reservation - minimum required for a VM to work properly
Shares - relative weighting used to divide CPU and memory among VMs when resources are under contention (a small sketch of share-based allocation follows this list)
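To illustrate how shares behave during contention, here is a minimal sketch that divides a contended resource among VMs in proportion to their share values; all the numbers are made up.

```python
def allocate_by_shares(total_mhz, shares):
    """Divide a contended resource among VMs in proportion to their shares."""
    total_shares = sum(shares.values())
    return {vm: total_mhz * s / total_shares for vm, s in shares.items()}

# Three VMs competing for 6000 MHz of CPU during contention.
print(allocate_by_shares(6000, {"web": 2000, "db": 1000, "test": 1000}))
# {'web': 3000.0, 'db': 1500.0, 'test': 1500.0}
```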
VMotion
Migration from one ESX host to another transparently. Prerequisites for it to work properly include:
- access to all datastores on which the VM is configured
- virtual switches are labelled the same
- access to the same physical networks
- compatible CPUs
- GigE network connection
Distributed Resource Scheduler (DRS)
Automated load balancing of CPU/memory resources within a DRS cluster.
DRS Automation - works in three modes: manual (suggestions only), partially automated (DRS places VMs when they are powered on but only makes suggestions afterwards), and fully automated (VC handles everything). The migration threshold has 5 levels, which dictate how aggressively DRS acts, from 'Most Conservative' (migrate only when required) to 'Aggressive' (begin the migration process if there is even the smallest possibility of a performance enhancement).
DRS Cluster Validity
Green - resource pool is fine for resources
Yellow - resource pool is overcommitted
Red - DRS cluster or HA rules have been violated which means the cluster is 'invalid'
DRS Rules
Allows you to reduce the chance of a single point of failure or increase performance by setting rules so that certain VMs are kept together on the same host (affinity) or are kept apart on separate hosts (anti-affinity).
Monitoring VMware Infrastructure Resources
Resource Optimisation Concepts
The two primary issues in performance optimisation when managing VMs are vCPU and vMemory.
Virtual CPU
1-4 vCPUs per VM. A Hardware Execution Context (HEC) is the equivalent of a thread on a physical CPU.
Hyperthreading
Feature that was introduced with the Intel Pentium 4 chip and has been used since. It allows multiple threads to be executed simultaneously on the same CPU. However, when CPU usage is already high, contention issues may come into play; at a certain point there are diminishing returns from using HT.
vCPU Load Balancing
The VMkernel is responsible for scheduling the vCPUs and the SC. The SC always remains scheduled on CPU 0 (the first HEC), while the others are rescheduled every 20 milliseconds.
Virtual Memory
VMkernel uses:
- transparent memory page sharing (identical read-only memory pages across VMs are backed by a single physical page)
- balloon-driver/vmmemctl (VMware Tools driver inside guests which allows the host to reclaim unused allocated memory from a guest when under strain)
- VMkernel swap (method of last resort for memory; just like real swapping there is a performance hit. The swap file is created/deleted on VM startup/shutdown, and its size is the difference between the VM's memory limit and reservation - a small sizing sketch follows this list)
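Since the swap file is sized as the difference between the VM's memory limit and its reservation, a quick worked example with made-up numbers:

```python
def vm_swap_file_size(limit_mb, reservation_mb):
    """VMkernel swap file size = memory limit minus memory reservation."""
    return max(limit_mb - reservation_mb, 0)

# A VM with a 2048MB limit and a 512MB reservation gets a 1536MB swap file.
print(vm_swap_file_size(2048, 512), "MB")
```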
Monitoring Virtual Machines and Hosts
- CPU
- Memory
- Disk
- Network
Monitoring with Alarms
When certain thresholds are reached you can perform actions such as running a script, sending an email, sending an SNMP trap, or even an SMS.
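As an illustration of the 'run a script' style of alarm action, here is a minimal sketch that emails a notification when a threshold is crossed; the SMTP host and addresses are placeholders, not anything VMware provides.

```python
import smtplib
from email.message import EmailMessage

SMTP_HOST = "mail.example.local"        # placeholder mail relay
ALERT_FROM = "vc-alarms@example.local"  # placeholder sender
ALERT_TO = "oncall@example.local"       # placeholder recipient

def send_alarm(host, metric, value, threshold):
    """Email a simple alert when a monitored metric exceeds its threshold."""
    if value <= threshold:
        return
    msg = EmailMessage()
    msg["Subject"] = "ALARM: {} on {} at {} (threshold {})".format(metric, host, value, threshold)
    msg["From"] = ALERT_FROM
    msg["To"] = ALERT_TO
    msg.set_content("{} on {} is {}, above the configured threshold of {}.".format(metric, host, value, threshold))
    with smtplib.SMTP(SMTP_HOST) as smtp:
        smtp.send_message(msg)

send_alarm("esx01", "CPU usage (%)", 92.0, 85.0)
```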
Backup and High Availability
In theory, similar to physical backup techniques.
Backup Scenarios
- backup agent within the VM
- backup the actual VM files (try to keep data and OS separate for obvious reasons)
Host Backup Options
- service agent in SC
- imaging software for the ESX host
VMware Consolidated Backup
Means of file- or image-level backup using snapshots. The snapshot is then copied to a location where the backup proxy server can back it up.
- transparent, live backups
- backup load is moved away from ESX host
- backup agent is optional since VMware Tools provide backup functionality as well
High Availability
Ensures that VMs from failed hosts can be restarted on other hosts. (Fault tolerance, by contrast, ensures that VMs can be accessed uninterruptedly in the event of host failure.) HA uses a heartbeat every 15s to determine host failure.
Virtual Machine Failure Monitoring
VMware Tools use a heartbeat every 20s to determine VM failure. HA then uses this to determine whether a restart of the actual VM is required.
HA Configuration Prerequisites
- VirtualCentre
- DNS Resolution
- Access to shared storage
- Access to same network
Service Console Redundancy
VMware HA will warn you if there is no SC redundancy. Either use port groups on different virtual switches or use a NIC team on a single SC port. Certain ports will need to be open in order to achieve HA and state synchronisation on the SC virtual switch.
Host Failover Capacity Planning
Common sense should prevail here, but it is likely that financial constraints will have an enormous impact on the level of redundancy that you can provide.
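As a rough illustration of the capacity trade-off, a deliberately crude sketch that checks whether a cluster can still hold its VMs' memory reservations after losing a given number of (the largest) hosts; all the figures are invented.

```python
def can_tolerate_failures(hosts_mb, vm_mb, host_failures):
    """Check whether the remaining hosts can still hold all VM memory reservations
    after the given number of largest hosts fail. A deliberately crude model."""
    surviving = sorted(hosts_mb)[:-host_failures] if host_failures else hosts_mb
    return sum(surviving) >= sum(vm_mb)

hosts = [32768, 32768, 32768, 32768]          # four hosts with 32GB each
vms = [2048] * 40                             # forty VMs reserving 2GB each
print(can_tolerate_failures(hosts, vms, 1))   # True  - can lose one host
print(can_tolerate_failures(hosts, vms, 2))   # False - cannot lose two
```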
Host Isolation
Determines what happens to running VMs when their host becomes isolated from the network:
Leave Powered On
Power Off
Use Cluster Setting
Virtual Machine Recovery Priority
High - restarted first
Medium - default
Low - restarted last
Use Cluster Setting - cluster setting
Disabled - VM doesn't power on
Clustering
Cluster-in-a-box - all VMs on one ESX host
Cluster-across-boxes - VMs spread across multiple ESX hosts with central storage