With the advent of 'Cloud Computing', security and compliance issues have become more pertinent. Below are some of the major security standards that are currently in use in both commercial and non-commercial contexts. Note that individual industry or company policies may or may not take precedence.
Tuesday, December 20, 2011
Saturday, December 10, 2011
If you've ever used Vyatta before you've probably noticed its very Cisco-ish syntax even though it clearly has a Linux heritage. I've been working on a project that has required a better understanding of how this is ultimately achieved. After a preliminary search, Quagga, Bird and Xorp came into my sights, as well as a number of other 'younger' projects.
Initially, Bird/Xorp configuration seems to be a lot less readable than Quagga's but simpler in that it relies on only one file. Edge to Quagga, then Bird and finally Xorp. Research indicates that Xorp may have been the default open source routing engine prior to Quagga. Whitepapers suggest that open source routing software on commodity hardware is able to achieve similar or superior speeds/performance to that of proprietary solutions at a fraction of the cost. The learning curve is reduced by the use of syntax similar to the Cisco/JUNOS CLI and, of course, by the open nature of the routing protocols. Documentation for all options seems to be reasonable.
"zebra is an IP routing manager. It provides kernel routing table updates, interface lookups, and redistribution of routes between different routing protocols.", http://www.quagga.net/docs.php
# and ! used to mark comments in configuration files.
Telnet directly to port 2601 (telnet to higher ports to configure/monitor the other protocols) and configure/monitor using a Cisco-like CLI. As with the Cisco/JUNOS CLI, modes (enable, global config, interface, etc...), commands (show, no, router, etc...), and auto-completion of commands are also supported.
Most major protocols are catered for except for proprietary ones such as IGRP and EIGRP. For example, RIP (v1/v2), RIPng (handles IPv6), OSPF (v2/v3, the latter of which handles IPv6), BGP, and even IS-IS are supported.
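As a rough aid for the 'higher ports' mentioned above, here is a small shell sketch that maps each Quagga daemon to its default VTY port. The port numbers are the stock defaults as I know them (they are not taken from these notes and can be overridden in each daemon's configuration), and the function name quagga_vty_port is made up for illustration.

```shell
#!/bin/sh
# Assumption: default VTY ports of a stock Quagga install.
# quagga_vty_port is a hypothetical helper name, not part of Quagga.
quagga_vty_port() {
    case "$1" in
        zebra)  echo 2601 ;;
        ripd)   echo 2602 ;;
        ripngd) echo 2603 ;;
        ospfd)  echo 2604 ;;
        bgpd)   echo 2605 ;;
        ospf6d) echo 2606 ;;
        isisd)  echo 2608 ;;
        *)      echo "unknown daemon: $1" >&2; return 1 ;;
    esac
}

# Print the telnet command for a daemon rather than opening a session.
daemon=ospfd
echo "telnet localhost $(quagga_vty_port "$daemon")"
```

Printing the command instead of running it keeps the sketch harmless on machines where Quagga isn't installed.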
Cisco style access-list command and complete support for IPv6 available.
If you're familiar with the Cisco CLI then you'll be completely at home with the Quagga interface. Commands are identical in most cases. In fact, it's a fairly good way of revising for CCNP BSCI and other Cisco certification exams if you don't have access to actual equipment or simulation software.
Although I haven't attempted to use it, 'SMUX' apparently allows you to reference information from the various Quagga services by providing a bridge between SNMP and Quagga. A more apt description would be that, "SMUX is the snmp multiplexing protocol (RFC 1227). It can be used by an snmp agent to query variables maintained by another user-level process."
While I was doing all this I decided to 'clean up' a number of other anomalies on my system. For example, 'kdump' was not starting on boot. The obvious solution would be to simply remove the kdump package but I just wanted to make it work. Thought it was just a missing-configuration issue. Went through uncommenting the default options but it still wasn't starting. Went through the relevant 'init' file. Noticed that it required a 'crashkernel' parameter (e.g. crashkernel=128M@128M) in order for it to work (/proc/cmdline contains the kernel boot parameters if you're curious. While other /proc files can be written to, this file did not respond to chmod or to being written to, even as root). Ultimately, the only way to test is to modify the kernel boot parameters via /boot/grub/menu.lst. However, I then noticed that the kernel wasn't configured with this option available. Hence, boot was not possible. Had to update the kernel and kernel source (required for my work with other packages). Thereafter, kdump startup was possible (if you're curious, "echo 1 > /proc/sys/kernel/sysrq" followed by "echo c > /proc/sysrq-trigger" can be used to create a kernel crash situation).
yum update kernel
yum update kernel-headers
yum update kernel-devel
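Since the kdump problem came down to a missing 'crashkernel' boot parameter, a quick check of the running kernel's command line can save some head scratching. The following is only a sketch: has_crashkernel is a name I've made up, and it only looks for the parameter, it doesn't validate the memory reservation itself.

```shell
#!/bin/sh
# has_crashkernel is a hypothetical helper: succeeds if the given kernel
# command line string contains a crashkernel= parameter.
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# /proc/cmdline holds the boot parameters of the running kernel.
cmdline=$(cat /proc/cmdline 2>/dev/null || true)
if has_crashkernel "$cmdline"; then
    echo "crashkernel parameter present"
else
    echo "crashkernel parameter missing; kdump may refuse to start"
fi
```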
Once again, noticed that performance when MTU is 1392 is much better than 1500 during download of packages via yum. Upgrade of kernel 'broke' my VirtualBox setup though. Obviously, suspected kernel module issues so went to /usr/src/vbox_host-* and had to re-run 'make' to recompile and re-register the relevant kernel modules. Noticed later on that there was an option for the vboxdrv init file (setup) which was 'more correct' though. Used this and VirtualBox startup was all good again. It was a bit easier with VMware Server though. Kernel modules are automatically re-compiled/re-registered and set up on startup.
Following is useful if you're interested in learning more about kernel dump analysis,
While completing all of the above, I remembered previous work regarding PKI certificates to set up OpenVPN. Looked for mkcert.sh/CA.pl and found them but also noticed tinyca2 and openvas-mkcert. tinyca2 is a very simple GUI based application for managing certificates while openvas-mkcert is CLI based. Haven't tried using these certificates with OpenVPN as yet though. Will experiment another time.
Friday, December 9, 2011
I recently picked up some old but still very enjoyable games (even though their graphics mightn't be up to today's standards). These games included RAC Rally Championship (DOS), Theme Hospital (Windows 95), and Rise of Nations (Windows XP). This post will go through the steps required to get them working under Linux. Note that I'm assuming that you have installation discs for all relevant programs.
The first game I tried was RAC Rally Championship. I consider it to be one of the better rally arcade/simulator games prior to Colin McRae Rally and DiRT. To get it running, first install 'DOSBox' (it's a DOS emulator for various different platforms). Start up the 'dosbox' environment from the Linux CLI or via the relevant GUI menu option, then run the following series of commands.
- mount c /media/disk -t cdrom (mount your installation disc to C:)
- mount d ~/rally (mount a folder from your Linux home folder called rally to D: to give the installer a place to place its files. Assumes you have already created the folder of course)
- C: (go to root directory C:)
- install.exe (run the installer file. When prompted install it to D:\)
- cd d:\rally (when installation has completed change to the installation directory)
- ral.exe (run the game)
One thing you'll definitely notice is that while the games are still quite enjoyable they'll often feel quite 'eerie'. Imagine existing in the world of 'True Colour' (32-bit and millions of colours) and then suddenly finding yourself being transported to a world where only 256 colours exist.
The next game we'll attempt to set up and run is 'Theme Hospital'. Similar in concept to 'Zoo Tycoon', it has you running a hospital, attempting to find a happy medium between healing patients and fulfilling more tangible/financial goals. I discovered that while installation was quite easy (insert the disc, mount it, then run the setup/installation file via 'wine'), getting everything running perfectly wasn't.
First I needed to get DirectX installed. Instructions are available online for this, which involve downloading a redistributable installation file from the Microsoft Download Centre. I opted to use my Tiger Woods 08 DVD to get the DirectX installation file. While the game worked, sound was not functional. Using 'winecfg' indicated that I was getting 'write' errors to the sound device file and testing obviously produced no sound.
I read online that a lot of others were having issues with the 'PulseAudio' sound daemon. I also read that removing the package/s could result in 'odd' issues with their desktop installation. Like others, I discovered that killing the pulseaudio daemon doesn't actually stop it because it respawns by default. Altering the relevant pulse/client.conf file should have changed this behaviour, whether done to the core configuration in /etc or via a user's local .pulse setup, but it didn't, which indicated there must be another point of configuration as well. I disabled it through the 'Gnome Startup Preferences' menu option/s. At that point, the error/s disappeared but switching between the different sound daemons (ESound, Pulse, OSS, ALSA) still didn't produce success even though I had installed all relevant 'wine to sound daemon' plugins. Finally, I shut down everything sound related except the core 'PulseAudio' daemon and the test/game sound seemed to work, even though others have indicated that using ALSA seems to be the best/easiest solution.
While the installer for 'Rise of Nations' went well I've experienced many 'known issues'. Among these: having to use both mouse buttons to navigate through the menu system (though the left mouse button seems to work perfectly fine in the actual game itself), having some regular graphical anomalies (this only occurred with Rise of Nations but not with Rise of Nations - Thrones and Patriots), a temporary black screen or stall on startup (not a real stall. Use the space bar to reach the main menu), and having no sound. Attempted to switch DirectX libraries between builtin/native as directed elsewhere online to get around these issues but switching to native actually caused an exception to occur on startup of the game so I have since gone back to the program defaults. I also believe that some of these problems may be version related so I'll update at a later time.
As an aside, I recently scratched one of my game software discs (the game would only run through the install process part of the way before succumbing to read errors) but had run out of disc repair solution. I've since discovered that 'toothpaste' actually works quite well as a repair agent since it's a light/mild abrasive. It tends to blend the scratch in with its surroundings rather than polishing the disc though.
I recalled the early days of optical drive technology. There were often (and there still are, though less drastic) differences in error detection/correction quality on different drives (I remember a disc that was completely unreadable in one drive but perfectly readable in another). I switched from an onboard OptiArc drive to an external Plextor drive and disc reading seemed to be perfect.
I decided it was time for me to make an ISO backup of the disc (allowed under existing law). While the backup disc can be used to install the program, there are mechanisms which prevent it from being used as a startup/game disc (even via emulation software using just the ISO file). There are obviously ways around this type of detection but it's always a game of cat and mouse between those who create copyright protection mechanisms and those who attempt to defeat them.
Tuesday, December 6, 2011
A particular project that I've been working on has called upon the need for an IPS/IDS as well as vulnerability detection solutions. Well known solutions that fill this particular void that I've come across in the wild are Snort, OpenVAS, Nessus, Suricata, and Tiger (obviously I've looked at many other systems as well and have played around with some of my rules but I'll leave that for another post). Below are some of my research notes.
OpenVAS is basically a fork of Nessus, created after that project went commercial/proprietary. It works but is somewhat rough around the edges. Installation was via YUM repository. Thereafter, it's a case of fetching plugins and configuring them and then using the web based or desktop based client in order to conduct scanning.
Note that the first time it is started, openvas-scanner can take a substantial amount of time to run. Following are 'timed' startups of the openvas-scanner service: the first set of figures is with the cache already in place and the second is the initial run before caching. Obviously, it's better to leave the '/var/cache/openvas/' folder alone after the initial caching.
real - 0m21.924s
user - 0m16.962s
sys - 0m0.694s
real - 11m19.809s
user - 2m16.576s
sys - 4m3.631s
Funnily enough, there is no counter to indicate progress of processing scripts on launch (I used 'ls /var/cache/openvas | wc -l' to watch the progress of the openvas-scanner service startup. Should incorporate something like this in the init script) so it's quite possible that you could still be waiting for the service to start several minutes later, which brings me to the 'watch' command (I used to use a BASH infinite loop combined with a sleep command somewhere in between to achieve the same thing). It allows you to run a command periodically, for an infinite amount of time, so that you can view the progress of a particular command. However, according to the 'man' page, the 'watch' command apparently can't stop conditionally. I may rectify this one day?
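The 'loop plus sleep' approach can be extended so that it does stop conditionally. Below is a minimal sketch of that idea; poll_until and cache_has_at_least are names I've invented, and any file-count threshold you pass in is a guess you'd have to tune for your own plugin set.

```shell
#!/bin/sh
# poll_until INTERVAL MAX_TRIES COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND every INTERVAL seconds and returns 0
# as soon as it succeeds, or 1 after MAX_TRIES unsuccessful attempts.
poll_until() {
    interval=$1
    max_tries=$2
    shift 2
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    return 1
}

# cache_has_at_least DIR COUNT
# Succeeds once DIR contains at least COUNT entries.
cache_has_at_least() {
    count=$(( $(ls "$1" 2>/dev/null | wc -l) ))
    [ "$count" -ge "$2" ]
}
```

For example, 'poll_until 5 200 cache_has_at_least /var/cache/openvas 20000' would check every 5 seconds and give up after 200 attempts (the 20000 figure is purely illustrative).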
Like Nessus, OpenVAS is also dependent on publicly available plugins/rules in order to scan for vulnerabilities.
Had download issues when downloading the feed. Presume that this is more likely related to the wireless connection than to site/application level issues. Have noticed that when using 3G for larger downloads, broken downloads are more likely. Believe that it could have to do with the variable nature of 3G connections with regards to variable delay/latency (possibly caused by network congestion) leading to timeouts. Have dealt with it previously by using the resumption feature of a download manager, otherwise have had to use wget -c. Reminds me of Path MTU issues that I have previously come by in other work. Routed around the issue by using an online proxy but another solution that I've tried is simply restarting the process, hoping that the route would have reset itself at some point along the way (ifconfig INTERFACE mtu MTUSIZE if you're curious about changing the MTU size for a NIC. I've found that an MTU of 1392 is quite useful.).
multitail is a useful tool if you're having issues and are unsure of the source of the problem. If you're working with a particular service with a special folder and logrotate then it's as simple as running multitail *.log
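If you want to test whether a particular MTU actually survives the path, the usual trick is to ping with fragmentation prohibited and a payload sized to fit. The sketch below only does the arithmetic and prints a command: it assumes IPv4 without options (20 byte header) plus the 8 byte ICMP header, the '-M do' flag is from the Linux iputils version of ping, and example.com is just a placeholder host.

```shell
#!/bin/sh
# icmp_payload_for_mtu is a hypothetical helper.
# Assumption: 20 byte IPv4 header (no options) + 8 byte ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

mtu=1392
# Print (rather than run) the probe command; -M do forbids fragmentation.
echo "ping -M do -c 3 -s $(icmp_payload_for_mtu "$mtu") example.com"
```

If the pings fail at one size but succeed at a slightly smaller one, something along the path has a lower MTU than the size you were testing.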
If you're new to it run the following in order to check things after setup/configuration.
Slick web based interface.
OpenVAS web access on, https://localhost:9392
Note that gsd (desktop client) seems to have a long way to go before reaching the state of the OpenVAS web interface.
May need to run the following in order to properly register all plugins after extracting to relevant folder, /var/lib/openvas/plugins.
greenbone-security-desktop (gsd) is the desktop client scanning interface. Looks dated but does the job.
The following is for lazy people.
for i in /etc/init.d/openvas*; do service "$(basename "$i")" start; done
Obviously, one of the 'original' vulnerability scanners, which has since spawned forks and commercial offerings.
Need to obtain plugin updates via website,
Note that the documentation may be somewhat outdated. Need to obtain a serial code in order to download/update the feed.
nessus-fetch --register SERIAL
This may take a long time. Using iptraf I calculated the total file transfer to be around 25MB.
Put plugins in /var/lib/nessus/plugins/
Restart nessusd service to load plugins. Can take quite a while. Possible to have a snack/drink during this time. Thereafter, you need to run the following in order to login to the client and run scans.
service nessusd start
Startup the nessus client
Run the scan from within the client
May need to restart the system. Seemed to have memory allocation issues when restarting the nessusd service alone and then loading up the client. No details in relevant logfiles. May investigate further at some point by increasing logging. Research indicates that it may be version related.
Change the parameters from within the client, conduct the scan and hopefully there will be no issues. Otherwise, rectify as required.
Obviously, being a fork of Nessus the OpenVAS project shares many similar aspects/components including setup and configuration.
You'll need to download rules/config files from,
You may have strange errors during loading of rulesets. Have thought about attempting to fix/patch the issues myself but will leave for when I have more time due to the number of them. The project is still young though.
Under Fedora you'll need to change the startup script /etc/init.d/suricata to reference the correct configuration and rules files. The service quite simply will not work otherwise.
If you're lazy (or you don't have an Internet connection) you can just 'touch' the relevant required files in order to get things up and running.
As stated previously, basically it won't start without a downloaded set of rules/configuration set. You can get around this on some installations either using the official 'rules' or using rules that have previously been created by the 'community'. Logs to /var/log/suricata/
Basically a collection of scripts that look for known issues that may exist on Linux/UNIX systems. Using a cron job, periodic scans are run on the host with an email being sent to the sysadmin of the server in question. If you have difficulty understanding some of the cryptic codes then there is the tigexp command which, when combined with the relevant code in the email, will provide a more human readable explanation of the security vulnerability in question. Details are also sent to /var/log/tiger for easier referencing and/or parsing.
Like Nessus, metasploit has commercial forks as well as the original open source version. Basically, another network mapping and vulnerability scanning engine; however, there are also mechanisms through which to exploit known vulnerabilities and to alter existing means of exploitation.
Even though it uses a single binary for installation the process of downloading and installing metasploit was anything but easy. As with the case of OpenVAS I've been having download issues with larger files. This time the download would actually stop at about 71%. Multiple attempts at resuming downloads via browser and wget were unsuccessful. Wireshark analysis indicated that PDU fragmentation wasn't occurring properly because TCP segments were somehow being lost. Moreover, when resumptions were successful somehow the download was extending past the anticipated initial download file size which ultimately meant a corrupt file. Cutting the file down to the correct size didn't work (use the cut command to get down to the correct byte size file). What did work funnily enough though, was using VIM and deleting sections at the end of file and then resuming the download from the modified file (manual defragmentation/disassembly?).
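For what it's worth, trimming an over-long partial download back to a known byte count can also be done from the shell. Note that 'cut' selects bytes within each line rather than across the whole file, so this sketch uses 'head -c' instead. The helper names are made up, the demonstration runs on a throwaway file rather than a real download, and whether a resumed download (e.g. with wget -c) ends up valid still depends on the kept bytes being correct.

```shell
#!/bin/sh
# trim_to_size FILE BYTES OUT (hypothetical helper):
# copy the first BYTES bytes of FILE into OUT.
trim_to_size() {
    head -c "$2" "$1" > "$3"
}

# file_size FILE: print the size of FILE in bytes.
file_size() {
    echo $(( $(wc -c < "$1") ))
}

# Demonstration on a 100 byte throwaway file, trimmed to 71 bytes.
tmp=$(mktemp)
printf '%0100d' 0 > "$tmp"
trim_to_size "$tmp" 71 "$tmp.trimmed"
echo "trimmed size: $(file_size "$tmp.trimmed") bytes"
rm -f "$tmp" "$tmp.trimmed"
```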
Word of warning, NeXpose can be a bit discretionary when it comes to what flavours of Linux it can be installed on. In the past, I've gotten and have seen others get around this issue by altering the relevant /etc/*-release file to something more 'pleasing'. However, on this occasion I decided to go for an installation under Ubuntu in a VM instead (you may be interested to know that VMware Server may change your partition layout when expanding virtual drive capacity. partprobe and parted are two lesser known commands you may be curious about if you need to do this). Note that you may need to sign up for a serial code depending on whether you use the community/commercial version of metasploit.
In the past I've used Snort as part of already complete (software security) solutions and have experimented somewhat with it but it's only now that I've really had to gain a better understanding of the technology. Just like a conventional firewall and/or security device it's made up of a set of rules. Rules are generally derived from rules that have been generated by professionals/enthusiasts who have since found common ground with regards to particular security issues that they/most people may be facing.
"Snort can be configured in three main modes: sniffer, packet logger, and network intrusion detection. In sniffer mode, the program will read network packets and display them on the console. In packet logger mode, the program will log packets to the disk. In intrusion detection mode, the program will monitor network traffic and analyze it against a ruleset defined by the user. The program will then perform a specific action based on what has been identified."
Due to its relative maturity and popularity, many different tools and largely compatible alternatives are available for easier analysis of results as well as for creating rules.
Saturday, December 3, 2011
I've recently been required to use virtualisation increasingly in order to complete some project work. In order to get up to speed regarding the technologies currently available I've been working my way through VCP certification books. In this post are some of my research notes for the VCP-310 Exam.
Note that while this post is for VMware, much of the same functionality is also supported by the Open Source equivalent Xen (though the methods for interfacing and configuring these features mightn't be as slick at this stage) or through its enterprise equivalent XenServer. I may post my research notes regarding Xen but I feel it's more likely that I may contribute towards the project in another way and as it stands a lot of the documentation out there is already of a high standard (though slightly fragmented).
Note that the VI client/VirtualCentre is not the only method of monitoring/configuration for VMware. It's possible to use the Service Console to achieve many basic tasks.
Introducing VMware Infrastructure 3
Virtualisation is the process of adding an additional layer between hardware and software to allow multiple instances of any Operating System to be installed on the same set of hardware (desktop or server) through the concept of virtualisation of hardware. Control between the top software layer and the underlying hardware is handled through software known as a 'Hypervisor'.
Why You Need Virtualisation
It allows for consolidation of hardware, reduced power consumption,
Types of Virtualisation
Bare Metal - ESX/ESXi, XenServer
Host-Based Virtualisation - VMware Workstation/Server, MS Virtual PC, MS Hyper-V, VirtualBox
Application Virtualisation - Cameyo
Storage Virtualisation - StarWind, OpenFiler, FreeNAS
Virtual Machine Overview
Hardware is essentially 'virtual' and can be added/removed/configured at will through Hypervisor
*.vmx - VMware hardware configuration/specification details are supplied in this file
*.vmdk - VMware file which contains the hard drive equivalent of a physical hard drive. It may be split into smaller files to allow for compatibility with the underlying OS/filesystem (if we are talking about Type 2 hypervisors and older OSs/filesystems of course)
Isolation - one VM is independent of another
Encapsulation - VM's are contained inside of files
Hardware independence - VM's are unaware of actual hardware. Interaction is conducted through Hypervisor
Compatibility - standard x86/x64 architecture
Simulation and Emulation
Simulation - subset of the real thing (flight simulator)
Emulation - attempt to port hardware to software (SNES emulator)
Virtualisation - install software on virtual hardware (hypervisor)
Virtual and Physical Machine Comparison
Physical - underutilise resources, hardware bound, replication complex
Virtual - not hardware bound, standard x86 environment, collection of files can be easily managed
Why VMware Infrastructure 3?
Mainly enterprise features such as VMotion, Distributed Resource Scheduler (DRS) which allows for automated switching of VM's from one host to another if there are resource disparities and performance problems exist on the current host but not another
VMWare Infrastructure 3 Suite
ESX Vs ESXi - main difference is that ESX uses a Service Console (SC) as its management system unlike ESXi which relies on a separate management utility but has a smaller memory footprint (32MB)
Virtual Symmetric Multi-Processing (SMP) - virtual CPU's using both logical and physical
Virtual Centre - control ESX from a Windows application
VMotion - allows live, transparent movement of VM's from one ESX host to another
Storage VMotion - allows live, transparent movement of VM files from one ESX host to another
Update Manager - handles patching of host as well as VMs
Converter - convert physical to virtual
High Availability - restart a VM on a different ESX host
Distributed Resource Scheduler (DRS) -
Consolidated Backup (CB) -
Virtual Machine Filesystem (VMFS) - a lightweight filesystem designed for any VM's under ESX/i
Proprietary kernel made by VMWare that acts as a bare metal hypervisor/resource regulator.
The Service Console
- Apache Tomcat Web Server
- SSH access
- SNMP agents
The VI Client
Use this Windows application to connect to an ESX host and/or Virtual Centre.
Planning, Installing, and Configuring ESX
Minimum Hardware Requirements
These change from one version to another. Please consult the VMware website for more up to date details.
Maximum of 4 primary partitions. Using extended partitions can increase this number. The maximum number of partitions on an IDE disk is 63 while the maximum number for a SCSI disk is 15.
/boot - boot files
/ - root of the SC OS
swap - swap partition
/var/log - log files
VMFS-3 - VM's are stored here
vmkcore - dump information is stored here after a system crash or Purple Screen of Death (PSOD)
Installation Using a CD-ROM
Basically identical to a Linux distribution installation. Graphical and Text modes are available.
VI Client - installation by downloading the client from the IP address of the web server on the ESX/i host
SSH - use the VI client to create a user account if required. If remote root access is required then modify the relevant option in /etc/ssh/sshd_config
SC Memory Allocation - use the VI client to increase the default memory allocation from 272MB upwards (recommendation is between 272-800MB)
NTP Client - use the VI client
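On the SSH item above: the 'relevant option' in a standard OpenSSH sshd_config is PermitRootLogin (I'm assuming the Service Console ships a standard OpenSSH here). The sketch below reads the effective value from a config file; permit_root_login is a made-up helper name and the sample file contents are invented for the demonstration.

```shell
#!/bin/sh
# permit_root_login FILE (hypothetical helper):
# print the first uncommented PermitRootLogin value in FILE.
# sshd uses the first value it finds for a keyword, hence the 'exit'.
permit_root_login() {
    awk 'tolower($1) == "permitrootlogin" { print $2; exit }' "$1"
}

# Demonstration against an invented sample file.
sample=$(mktemp)
printf '%s\n' '#PermitRootLogin yes' 'PermitRootLogin no' > "$sample"
echo "PermitRootLogin is: $(permit_root_login "$sample")"
rm -f "$sample"
```

On a real host you would point it at /etc/ssh/sshd_config instead; the SSH daemon needs to be restarted before any change takes effect.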
Hardware Issues and Misconfigurations - hardware compatibility/reliability and misconfiguration
Purple Screen of Death - CPU/memory problems are the most likely causes. The dump file can be sent to VMware to aid analysis and troubleshooting
Diagnostic Data Collection - look for recent changes (hardware, software, or environmental), and any console errors. Create a diagnostic data dump by using the VI client
Licensing VMware Infrastructure 3
Two main components that need to be licensed:
ESX Server Licensing
Foundation - starter edition with no enterprise features
Standard - foundation edition + HA
Enterprise - all standard + enterprise features
Host Vs Server Licensing Mode
- Evaluation Mode
- Serial Number
- License Server (Server Based, C:\Program Files\VMware\VMware License Server\Licenses)
- Host License File (Host Based, /etc/vmware/vmware.lic)
VirtualCenter Foundation - up to 3 ESX hosts
VirtualCenter - up to 200 ESX hosts
How the License Key Works
Per Processor - all VI3 software is licensed per processor except for VirtualCentre
Per Instance - traditional one license per machine; only VirtualCentre uses this form
Similar in concept to Windows licensing server if you've ever used one. Best practice is to install it on the same system where you have installed VirtualCentre.
Working with the License Server
You can change the location where license files are stored if need be. Restart the License Server service in order to re-read the license files or else use 'VMware License Server Tools'. The license server uses the following two processes/ports.
lmgrd.exe - 27000, TCP
vmwarelm.exe - 27010, TCP
Use 'netstat' to verify that these processes are listening.
Losing the License Server
If the license server goes down there is a 14 day grace period during which you can perform maintenance. Use the HA ability of ESX in order to deal with redundancy issues. VirtualCentre is not impacted by a license server outage as it uses a cached version of the license file.
Virtual Networking Options
Obviously, if you have VM's you'll need some way to 'bridge the gap' between physical and virtual networking hardware. You can achieve this through Virtual Switches as well as Virtual NICs.
What Are Virtual Switches?
Virtual equivalent of physical switches. Allow for isolated VM networks, which would afford the opportunity to create a DMZ, or provide fault tolerance and HA.
- are software objects on the VMkernel of every ESX host
- can have between 8-1016 ports
- can be serviced by one or more physical NICs
- virtual NICs have unique MACs just like physical NICs
- allow for connectivity via 802.1Q (known as VLAN tagging)
- can support port groups or connection types
Comparing Physical and Virtual Switches
- both have MAC address tables
- both check each frame's MAC address destination upon receiving it
- both forward frames to one or more ports
- both avoid unnecessary deliverables
- STP not required/supported on VS
- inter VS connectivity not possible
- forwarding data table is unique to each VS
- VS isolation prevents loops (hence, STP not required)
Type of Virtual Switches
Internal Virtual Switch - no interaction with exterior physical networks
Single Adapter Virtual Switch - interaction with exterior physical networks via single NIC
Multiple Adapter Virtual Switch - interaction with exterior physical networks via multiple NICs
Type of Virtual Switch Ports
Basically switch ports which share the same connection type.
Service Console - allows for communication to/from the Service Console. vswif0 is automatically associated with vSwitch0. Think about multiple SC ports and SC NIC team for redundancy
VMkernel - allows for configuration for technologies such as VMotion, iSCSI, NAS/NFS
Virtual Machine - connects virtual to physical network
VLANs in Virtual Networking
Similar in concept to physical VLANs but virtual VLANs can also group virtual VLANs together.
Same conceptually as trunking of VLANs on physical switches.
802.1Q VLAN Tagging
Same conceptually as tagging of VLANs on physical switches.
Virtual Switch Policies
General - total number of ports
Security - promiscuous mode, MAC address changes, forged transmits (port security feature on Cisco switches)
Traffic Shaping - average/peak bandwidth and burst size
NIC Teaming - fault tolerance, redundancy, load balancing
Route Based on Originating Virtual Port ID
Route Based on Source MAC hash
Route Based on IP Hash
Explicit Failover Order
Network Failover Detection
Link Status - detects cable connection
Beacon Probing - detection via heartbeat
Performance tweak to allow for quicker update of physical switch lookup tables.
- physical NIC failover when a virtual NIC begins to use a new physical NIC to communicate
- a new NIC is added to a NIC team
Adjust how failed NICs come back. Whether they will become active again or return to standby mode.
Explicit Failover Order
Control order in which NICs failover.
Currently four forms offered:
- local storage
- fibre channel
- Network Attached Storage (NAS)
Transfers can either occur at:
block-level - similar to accessing local storage. iSCSI and FC are good examples of this
file-level - similar to network based shares such as SMB
Most efficient, reliable, and best performing but also most expensive of all storage options. High speed protocol which allows for transport between nodes at up to 8 Gbit/s.
ESX extends FC capabilities by allowing SAN Boot, VMFS Datastores, Enterprise Features, and allowing VMs access to Raw LUNs.
FC SAN Architecture
Host Bus Adapters - similar to NIC in that it has a UUID known as a World Wide Name (WWN) which identifies this adapter to Fibre Channel Networks
Fibre Channel Switches - also known as 'fabric'. Similar in function to network switches but also provide security as well
Logical Unit Numbers - group of disks
Storage Systems - actual collection of disks ready to be designated into LUNs
Storage Processor - brain that creates LUNs, implements security, and controls access to LUNs
Uses HBAs or SCSI controllers to hide LUNs from view of OS. This is particularly important for certain types of OS that seek to write data to every single LUN they come across (Windows will write a signature) which could lead to data corruption if the LUN is being used by multiple systems.
Equivalent of VLANs in the storage world.
Hard Zoning - implemented at FC switch level. Prevents physical access to any device that is not a member of the zone. More secure option.
Soft Zoning - security through obfuscation of relevant ports. Access still possible via directly accessing the physical address
vmhba0(physical label and number):1(target number):23(LUN number):4(partition number)
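The four-part naming scheme above is easy to pull apart in code. A minimal sketch, assuming the path is always written as adapter:target:LUN:partition; the class and function names are hypothetical.

```python
from typing import NamedTuple

class VmhbaPath(NamedTuple):
    adapter: str    # physical label and number, e.g. "vmhba0"
    target: int     # target number
    lun: int        # LUN number
    partition: int  # partition number

def parse_vmhba(path):
    """Split a path such as 'vmhba0:1:23:4' into its four fields."""
    parts = path.split(":")
    if len(parts) != 4 or not parts[0].startswith("vmhba"):
        raise ValueError(f"not a four-part vmhba path: {path!r}")
    adapter, target, lun, partition = parts
    return VmhbaPath(adapter, int(target), int(lun), int(partition))

print(parse_vmhba("vmhba0:1:23:4"))
```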
Internet Small Computer System Interface (iSCSI)
Basically SCSI commands over Ethernet. Cheaper than FC and with Ethernet speeds already at 10GigE performance disparities are rapidly disappearing. Like FC, ESX extends iSCSI capabilities by allowing SAN Boot, VMFS Datastores, Enterprise Features, and allowing VMs access to Raw LUNs.
iqn (iSCSI Qualified Name).2011-01 (year and month when organisation registered a valid domain/subdomain).com.company (reversed domain):ipstor (alias which is optional and represents)
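To make the naming format concrete, here is a small sketch that splits an IQN into the pieces described above. The regular expression is a simplification written for this post rather than a full validator, and the function name is hypothetical.

```python
import re

# iqn.<year>-<month>.<reversed domain>[:<alias>]
IQN_PATTERN = re.compile(
    r"^iqn\.(\d{4})-(\d{2})\.([a-z0-9][a-z0-9.-]*)(?::(.+))?$"
)

def parse_iqn(name):
    """Break an IQN into registration date, domain, and optional alias."""
    match = IQN_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognisable IQN: {name!r}")
    year, month, reversed_domain, alias = match.groups()
    # The domain is stored reversed (com.company), so flip it back.
    domain = ".".join(reversed(reversed_domain.split(".")))
    return {"year": int(year), "month": int(month),
            "domain": domain, "alias": alias}

print(parse_iqn("iqn.2011-01.com.company:ipstor"))
```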
Basically, a driver which allows access to SCSI targets. When configuring the software initiator's parameters, there are two options to select from:
- one VS with two port groups
- two VS
You also need to open TCP port 3260 on the SC firewall.
Use IP address/Port to discover available LUNs.
Use password/secret authentication to access LUNs (or not).
Better performance than software initiator and takes some load off of the ESX host. Allows booting of ESX host off of SAN LUN and Static Discovery
Network Attached Storage (NAS)
ESX Features on NFS Datastores
VMotion, DRS, HA, and VCB.
Configuring NFS Datastores
Following parameters can be configured.
Share name, subnet (which subnets can access the share), sync, rw, no_root_squash (root access enabled)
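On a Linux NFS server these parameters typically end up as one line in /etc/exports. The helper below just formats such a line; the share path and subnet are made-up values and the function name is hypothetical.

```python
def exports_line(share, subnet, options=("rw", "sync", "no_root_squash")):
    """Format one /etc/exports entry: <share> <subnet>(<opt>,<opt>,...)."""
    return f"{share} {subnet}({','.join(options)})"

# Hypothetical share and subnet, purely for illustration.
print(exports_line("/vol/vmstore", "192.168.10.0/24"))
```

Note that no_root_squash is what lets the ESX host write to the share as root, so it is usually worth limiting the subnet as tightly as possible.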
Virtual Machine File System
Ultra lightweight file system with minimal overhead. File locking as opposed to volume locking to allow for higher performance on concurrent access. File locking management is via metadata file. Entire partition is locked when metadata file is locked. Hence, LUN sizing is critical in order to reduce the possibility of I/O becoming a performance bottleneck.
Extending a Datastore
Maximum VMFS volume size is 2TB. Can be overcome through 'extents' of which there can be 32. You can add extents but can't remove them non-destructively (to the entire datastore) at this stage. Note that metadata file is stored in first extent. Losing this file will cause data loss across all other extents.
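A quick bit of arithmetic on those limits, assuming every extent can itself be as large as the 2TB volume limit (that assumption is mine; the function name is hypothetical).

```python
import math

MAX_EXTENT_TB = 2   # per-volume limit quoted above
MAX_EXTENTS = 32    # number of extents quoted above

def extents_needed(target_tb):
    """How many 2TB extents a datastore of target_tb terabytes requires."""
    needed = max(1, math.ceil(target_tb / MAX_EXTENT_TB))
    if needed > MAX_EXTENTS:
        raise ValueError("target exceeds what 32 extents can provide")
    return needed

print(MAX_EXTENT_TB * MAX_EXTENTS)  # upper bound in TB under this assumption
print(extents_needed(5))
```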
Basically allowing for redundant paths to LUNs. Two strategies:
Fixed - explicitly dictate path options
Most Recently Used (MRU) - keep using the most recently used path and switch to another path only when it fails
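The difference between the two strategies shows up when a failed path comes back. The sketch below assumes that Fixed fails back to its preferred path as soon as it is available, while MRU stays wherever it last ended up; path names and the function are hypothetical.

```python
def next_path(policy, current, preferred, alive):
    """Pick the path to a LUN given the set of paths currently alive."""
    if not alive:
        raise RuntimeError("no paths to the LUN are alive")
    if policy == "fixed":
        # Fixed: return to the preferred path whenever it is available.
        if preferred in alive:
            return preferred
        return current if current in alive else sorted(alive)[0]
    if policy == "mru":
        # MRU: stay on the current path until it fails, never fail back.
        return current if current in alive else sorted(alive)[0]
    raise ValueError(f"unknown policy: {policy!r}")

# Path A fails, then recovers.
for policy in ("fixed", "mru"):
    after_failure = next_path(policy, "A", "A", {"B"})
    after_recovery = next_path(policy, after_failure, "A", {"A", "B"})
    print(policy, after_failure, after_recovery)
```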
Administration with VirtualCentre
Allows you to manage ESX from a Windows application.
Planning and Installing VC
Along with enterprise features it also provides Update Manager, Converter Enterprise, and Guided Consolidation. Depending on configuration various network ports may need to be opened on the system's firewall. Need to install: database server, license server, VC server, VI client.
Core Services - provision scheduler, events logging, etc...
Distributed Services - VMotion, HA, DRS, etc...
Additional Services - Converter, Updater, etc...
ESX host management
Active Directory Interface
Virtual Infrastructure Application Programming Interface (VI API) and Virtual Infrastructure Software Development Kit (VI SDK)
Hardware Prerequisites for VMware Centre
Designing a Functional VC Inventory
Administration with VirtualCentre
VMware Infrastructure Client Tabs
- inventory (details of your current network)
- scheduled tasks (overnight restarts, etc...)
- events (alarms messages go here, first point for troubleshooting)
- administration (role, session, license, log management, etc...)
- maps (Visio like network diagram)
- consolidation (consolidate physical into VM)
Block direct access to ESX hosts. Only allow via VC.
Update Manager and Converter Enterprise are two base plug-ins for VCentre. Many have since been created by third parties.
Virtual Machine Operations
See top of this post.
Based on Intel 440BX and NS338 SIO chipset
*.vmx: VM configuration file
*.vmdk: all information about HDD file
*.log: log file
*.nvram: BIOS of VM
*.vswp: VM swapfile
*.vmsd: details of snapshots of VM
Creating a VM
No explanation required if you've ever used any virtualisation software before.
Understanding VMware Tools
Please see previous post from this blog.
Guest OS Customisation
Windows-Based OS Guest Customisation
To customise Windows installation drop files in following directory.
C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCentre\Sysprep\
Linux Guest Customisation
- Computer Name
- Domain Name
- DHCP/IP Settings
- Create new
- Deploy from Template
Move VMs while powered off. Allows you to move the VM files as well as moving the VM to a new host.
What are Snapshots?
Settings, memory, and disk states.
*#-delta.vmdk - files that register changes from base for this VM. # sequential starting from 1
*#.vmdk - snapshot description
*#.vmsn - state of the memory for this VM
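When poking around a VM's directory it can help to label files by their suffix. A minimal sketch using only the suffixes listed in this post; the function name and example file names are hypothetical, and it does not try to tell a base disk from a snapshot descriptor.

```python
# Checked in order, so the more specific "-delta.vmdk" comes before ".vmdk".
FILE_ROLES = [
    ("-delta.vmdk", "snapshot delta (changes from base)"),
    (".vmdk", "virtual disk information"),
    (".vmx", "VM configuration"),
    (".log", "log file"),
    (".nvram", "VM BIOS"),
    (".vswp", "VM swapfile"),
    (".vmsd", "snapshot details"),
    (".vmsn", "snapshot memory state"),
]

def describe(filename):
    """Return the role of a VM file based on its suffix."""
    lowered = filename.lower()
    for suffix, role in FILE_ROLES:
        if lowered.endswith(suffix):
            return role
    return "unknown"

for name in ("web1-delta.vmdk", "web.vmx", "web1.vmsn", "notes.txt"):
    print(name, "->", describe(name))
```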
VMWare Converter Enterprise
- P2V and V2V
- third-party VMs to ESX VMs
- restore VCB images to ESX VMs
- export ESX VMs to other formats
- customise VC VMs
Converter Enterprise Components
- Server (initiates actual conversion process)
- CLI (carries out commands issued by VI client or CLI)
- Agent (prepares physical machine for conversion)
- Client Plug-in (modifies VI GUI and enables Converter Enterprise Features)
Converter Enterprise is capable of the following:
- Hot Cloning
- Cold Cloning
- System Reconfiguration
- Remote Cloning
- Local Cloning
Volume based - useful when resizing disks. Supported via hot and cold cloning.
Disk based - exact copy of disk. Only supported via cold cloning.
Ideally used in SME with about 100 physical servers. Only Windows systems can be discovered and analysed at this stage. Both physical and virtual machines can be detected. It relies on the 'Capacity Planner' and 'Converter' service to run.
Discovery and Analysis
You need a user account with certain privileges to work:
- member of local administrators group on VC server
- Log on as Service user right
- Read access to AD
- administrator rights on target machines
VMware Infrastructure Security and Web Access
VI Security Model:
User and group - accounts allowed to login
Role - set of permissions applicable to a role
Privilege - an allowed action for a user
Permission - right assigned to an object in the inventory and grants a user/group the right to interact with that object according to existing role/privileges
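One way to picture how these four terms fit together: a role is a named set of privileges, and a permission ties a user or group to an inventory object through a role. The role, privilege, user, and object names below are all invented for illustration.

```python
# Role -> set of privileges (hypothetical names).
ROLES = {
    "vm-operator": {"vm.power_on", "vm.power_off"},
    "read-only": set(),
}

# Permission = (user or group, inventory object, role).
PERMISSIONS = [
    ("alice", "VM:web01", "vm-operator"),
    ("bob", "VM:web01", "read-only"),
]

def is_allowed(user, obj, privilege):
    """True if some permission gives the user a role holding the privilege."""
    return any(
        principal == user and target == obj and privilege in ROLES[role]
        for principal, target, role in PERMISSIONS
    )

print(is_allowed("alice", "VM:web01", "vm.power_on"))
print(is_allowed("bob", "VM:web01", "vm.power_on"))
```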
Local or Domain (in relation to Windows authentication credentials)
ESX Server Security
Being Linux there is the root account as well as a 'vpxuser' account which is used to send ESX commands.
IP address of ESX host or VC.
Managing VMware Infrastructure Resources
VM CPU and Memory Management
Limit - maximum resources a VM may consume
Reservation - minimum required for VM to work properly
Shares - relative priority that decides how memory and CPU time are divided between VMs when they compete for resources
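A simplified picture of shares under contention: the contended capacity is split in proportion to each VM's share value. This sketch deliberately ignores limits and reservations, and the VM names, share values, and function name are hypothetical.

```python
def divide_by_shares(capacity_mhz, shares):
    """Split contended CPU capacity in proportion to each VM's shares."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one VM must hold shares")
    return {vm: capacity_mhz * value / total for vm, value in shares.items()}

# A VM with twice the shares gets twice the slice when the host is busy.
print(divide_by_shares(4000, {"db": 2000, "web": 1000, "test": 1000}))
```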
Migration from one ESX host to another transparently. Prerequisites for it to work properly include:
- access to all datastores on which the VM is configured
- virtual switches are labelled the same
- access to the same physical networks
- compatible CPUs
- GigE network connection
Distributed Resource Scheduler (DRS)
Automated load balancing of CPU/memory resources with a DRS cluster.
DRS Automation - works in three modes: manual (suggests only), partially automated (DRS places VMs automatically at power-on but only suggests subsequent migrations), and fully automated (VC handles everything). Works on 5 levels which dictate the level of performance enhancement, from 'Most Conservative' (migrate only when required) to 'Aggressive' (begin the migration process if there is even the smallest possibility of performance enhancement).
DRS Cluster Validity
Green - resource pool is fine for resources
Yellow - resource pool is overcommitted
Red - DRS cluster or HA rules have been violated which means the cluster is 'invalid'
Allows you to reduce the chances of a single point of failure or increase performance by setting rules so that certain VMs should always run on separate hosts (anti-affinity) or should always be kept together and migrated as a group (affinity).
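A sketch of how such rules could be checked against a given placement: "together" groups must share one host and "apart" groups must not share any. The function, VM names, and host names are hypothetical and this is not how DRS itself is implemented.

```python
def rule_violations(placement, together=(), apart=()):
    """List the rule groups that the current VM-to-host placement breaks."""
    broken = []
    for group in together:
        # Keep-together rule: every VM in the group must be on one host.
        if len({placement[vm] for vm in group}) > 1:
            broken.append(("together", tuple(group)))
    for group in apart:
        # Keep-apart rule: no two VMs in the group may share a host.
        hosts = [placement[vm] for vm in group]
        if len(set(hosts)) < len(hosts):
            broken.append(("apart", tuple(group)))
    return broken

placement = {"web1": "esx01", "web2": "esx01", "app": "esx02", "db": "esx01"}
print(rule_violations(placement,
                      together=[("app", "db")],
                      apart=[("web1", "web2")]))
```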
Monitoring VMware Infrastructure Resources
Resource Optimisation Concepts
Primary two issues with performance optimisation with regards to managing VM's are vCPU and vMemory.
1-4 vCPU per host. Hardware Execution Context (HEC) is the same as a thread on a
Feature that was brought about by the Intel Pentium 4 chip and has been used since. Allows for multiple threads to be executed simultaneously on the same CPU. However, when CPU usage is already high contention issues may come into play. At a certain point there are diminishing returns for using HT.
vCPU Load Balancing
VMkernel is responsible for scheduling vCPUs and SC. SC always remains scheduled on CPU 0/first HEC while others are re/scheduled every 20 milliseconds.
- transparent memory page sharing (same memory pages that are only being read share the same page)
- balloon-driver/vmmemctl (VMware tools driver inside guests which allows others to use its unused allocated memory when under strain)
- VMkernel swap (method of last resort for memory but, unlike real memory, there will be a performance hit. Swap file is created/deleted on start/shutdown of VM. Swap file size is the difference between VM's memory limit and reservation.)
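The swap file sizing rule in the last point is simple subtraction. A tiny sketch, with a hypothetical function name and example values:

```python
def swap_file_mb(limit_mb, reservation_mb):
    """Swap file size = memory limit minus memory reservation."""
    if reservation_mb > limit_mb:
        raise ValueError("reservation cannot exceed the limit")
    return limit_mb - reservation_mb

# A VM limited to 2048MB with 512MB reserved needs a 1536MB swap file.
print(swap_file_mb(2048, 512))
```

Reserving all of a VM's memory therefore shrinks its swap file to nothing, at the cost of tying that memory up on the host.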
Monitoring Virtual Machines and Hosts
Monitoring with Alarms
When certain thresholds are reached you can perform actions such as run a script, send an email, send an SNMP trap, or even SMS.
Backup and High Availability
In theory, similar to physical backup techniques.
- backup agent within the VM
- backup the actual VM files (try to keep data and OS separate for obvious reasons)
Host Backup Options
- service agent in SC
- imaging software for the ESX host
VMware Consolidated Backup
Means of file or image level backup using snapshots. Snapshot is then copied to a location where the backup proxy server can back it up.
- transparent, live backups
- backup load is moved away from ESX host
- backup agent is optional since VMware Tools provide backup functionality as well
Ensures that VMs from failed hosts can be restarted on other hosts. Fault tolerance ensures that VMs can be accessed uninterruptedly in the event of host failure. Uses a heartbeat every 15s to determine host failure.
Virtual Machine Failure Monitoring
VMware Tools use a heartbeat every 20s to determine VM failure. This technology then uses this to determine if a restart of the actual VM is required.
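Heartbeat-based failure detection boils down to counting missed intervals. In the sketch below the 20s interval comes from the text, but the number of misses tolerated before a restart is purely an assumption of mine, as are the function names.

```python
def missed_heartbeats(last_seen, now, interval):
    """Number of whole heartbeat intervals elapsed since the last beat."""
    return max(0, int((now - last_seen) // interval))

def needs_restart(last_seen, now, interval=20.0, allowed_misses=3):
    """Flag a VM once it has missed allowed_misses beats (assumed value)."""
    return missed_heartbeats(last_seen, now, interval) >= allowed_misses

print(needs_restart(last_seen=0.0, now=59.0))  # two beats missed so far
print(needs_restart(last_seen=0.0, now=60.0))  # third beat missed
```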
HA Configuration Prerequisites
- DNS Resolution
- Access to shared storage
- Access to same network
Service Console Redundancy
VMware HA will warn you if there is no SC redundancy. Either use port groups on different VS or use NIC team to a single SC port. You'll require certain ports to be open in order to achieve HA and state synchronisation on SC VS.
Host Failover Capacity Planning
Common sense should prevail here but it is likely that financial constraints will have an enormous impact on the level of redundancy that you can provide.
Take certain steps in case of VM isolation.
Leave Powered On
Use Cluster Setting
Virtual Machine Recovery Priority
High - restarted first
Medium - default
Low - restarted last
Use Cluster Setting - cluster setting
Disabled - VM doesn't power on
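The priority settings above amount to a sort order with two special cases. A sketch of that ordering, with hypothetical VM names, setting strings, and function name:

```python
ORDER = {"high": 0, "medium": 1, "low": 2}

def restart_order(vms, cluster_default="medium"):
    """Return VM names in restart order; disabled VMs are left out."""
    resolved = []
    for name, priority in vms.items():
        if priority == "use-cluster-setting":
            priority = cluster_default  # inherit the cluster-wide value
        if priority == "disabled":
            continue                    # never powered on after a failure
        resolved.append((ORDER[priority], name))
    return [name for _, name in sorted(resolved)]

vms = {"db": "high", "web": "use-cluster-setting",
       "test": "disabled", "batch": "low"}
print(restart_order(vms))
```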
Cluster-in-a-box - all VMs on one ESX host
Cluster-across-boxes - VMs spread across multiple ESX hosts with central storage
Physical-to-virtual cluster - mix of VM and physical hardware