Static ipmitool binary for use on ESXi 5.1

There’s a great tutorial over here for compiling a static ipmitool binary to run on ESXi servers. You can grab my copy from GitHub.

I built mine on a CentOS 4.8 box after my ESXi host’s iDRAC would no longer let me log in and needed to be reset. So, I threw ipmitool.static onto the host and gave the iDRAC a cold reset. Works great now.

First, copy it over (after enabling SSH on the host)

scp ipmitool.static <user>@<esxi-host>:/scratch

Then run it from the host CLI

~ # /scratch/ipmitool.static mc reset cold
Sent cold reset command to MC

iDRAC RAC0218 Error with web GUI and racreset

RAC0218 is a pretty well-known firmware bug with the iDRACs – or at least it will be once you own a few iDRAC7-equipped servers. Some of the errors you may see:

When attempting to log in to the web GUI:

RAC0218: The maximum number of user sessions is reached

Or when attempting to get info or run racreset with racadm:

racadm racreset
ERROR: Unable to perform requested operation

I’ve gotten this to resolve by pressing and holding the blue (i) button on the front of the server, but we don’t always have people onsite to perform this action. So how can you reset via SSH to the host?

I decided to take a stab at this using ipmitool via SSH to the host with the iDRAC problems. First, you’ll need to install ipmitool:

sudo yum install ipmitool        # CentOS / RHEL
sudo apt-get install ipmitool    # Ubuntu / Debian

Next, we need to load the ipmi kernel module so that the ipmi0 device appears in /dev and ipmitool can talk to the local iDRAC. Afterwards, make sure the ipmi device actually shows up.

sudo modprobe ipmi_devintf
# sudo ls -alh /dev/ | grep ipmi
crw-rw----.  1 root root    247,   0 Sep 23 11:02 ipmi0

Now you should (hopefully) be able to issue a cold reset to the iDRAC. NOTE that a ‘reset warm’ will not work; it has to be cold.

# sudo ipmitool mc reset cold
Sent cold reset command to MC

The reset takes a few minutes, and upon completion you should be able to log in.
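Since the reset takes a few minutes, a small polling helper can tell you when the controller answers again. This is only a sketch: wait_until is a hypothetical helper name, and the attempt count and delay in the usage comment are arbitrary. It assumes that "ipmitool mc info" failing is a good enough sign that the iDRAC is still restarting.

```shell
# wait_until ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY seconds after each failure.
# Returns 0 as soon as COMMAND succeeds, 1 if it never does.
# The ( ) body runs in a subshell, so the helper variables stay local.
wait_until() (
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            exit 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    exit 1
)

# Example (assumed values): poll the local controller every 10s, up to 30 times.
# wait_until 30 10 sudo ipmitool mc info && echo "iDRAC is responding again"
```

Passing the probe as a command keeps the loop generic, so the same helper could wrap a ping or a racadm call instead.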

Some preventative measures you may want to take:

  • Upgrade the firmware – apparently this was fixed in 1.56.55. Here’s the latest version.
  • Regularly reboot your iDRACs (at least every other week) – not ideal at all but if you have a server with racadm installed you can script it easily
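The scripted reset mentioned in the last bullet might look something like the following. Treat it as a rough sketch under a few assumptions: the racadm path is the one from my racadm installs, the remote form (-r, -u, -p) is the same one used in the racadm posts, and the host list file (idracs.txt in the usage comment) and the function names are hypothetical. DRY_RUN=1 only prints what would be run.

```shell
# Path to racadm as found on my installs; override RACADM if yours differs.
RACADM=${RACADM:-/opt/dell/srvadmin/sbin/racadm}

# reset_idrac ADDRESS USER PASSWORD
# Runs a remote racreset against one iDRAC, or prints the command if DRY_RUN=1.
reset_idrac() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$RACADM -r $1 -u $2 -p **** racreset"
    else
        # </dev/null keeps racadm from swallowing the host list on stdin
        "$RACADM" -r "$1" -u "$2" -p "$3" racreset </dev/null
    fi
}

# reset_all HOSTFILE USER PASSWORD
# HOSTFILE holds one iDRAC address per line; blank lines are skipped.
reset_all() (
    while IFS= read -r host; do
        if [ -n "$host" ]; then
            reset_idrac "$host" "$2" "$3"
        fi
    done < "$1"
)

# Example (hypothetical file name and credentials):
# DRY_RUN=1 reset_all idracs.txt root calvin
```

Run from cron on the box that has racadm installed, this covers the "every other week" reboot without anyone having to remember it.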

Install racadm on Ubuntu / Debian for Dell iDRAC

This is very similar to the CentOS 6 / RHEL setup. I pulled the following from the doc here.

First, add the Dell repo that provides the srvadmin packages:

# echo 'deb http://linux.dell.com/repo/community/deb/latest /' | sudo tee -a /etc/apt/sources.list.d/linux.dell.com.sources.list

Next you need to download the OMSA repo key and add it to apt to verify the packages:

# gpg --keyserver pool.sks-keyservers.net --recv-key 1285491434D8786F
# gpg -a --export 1285491434D8786F | sudo apt-key add -

Now, update apt

sudo apt-get update

Install srvadmin-idrac7 to provide racadm

# sudo apt-get install srvadmin-idrac7

The install appends the new srvadmin/sbin directory to $PATH (like the CentOS install), so you will need to log out and back in for the change to take effect. Or, you can run the racadm utility using the full path (most likely /opt/dell/srvadmin/sbin/racadm).

     **********************************************************
     After the install process completes, you may need
     to log out and then log in again to reset the PATH
     variable to access the Dell OpenManage CLI utilities
     **********************************************************
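If you want scripts to work both before and after the re-login, a tiny lookup helper is one option. This is just a sketch: find_racadm is a hypothetical name, and the fallback path is the one from my install, so adjust it if yours landed somewhere else.

```shell
# Print the racadm to use: whatever is on PATH, else the assumed install path.
find_racadm() {
    if command -v racadm >/dev/null 2>&1; then
        command -v racadm
    else
        echo /opt/dell/srvadmin/sbin/racadm
    fi
}

# Example (address and credentials are placeholders):
# "$(find_racadm)" -r 192.168.0.120 -u root -p calvin getsysinfo
```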

Now you can run the racadm utility

# /opt/dell/srvadmin/sbin/racadm -r 192.168.0.120 -u root -p calvin help

Security Alert: Certificate is invalid - self signed certificate
Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.

 help [subcommand]    -- display usage summary for a subcommand
 arp                  -- display the networking ARP table
 clearasrscreen       -- clear the last ASR (crash) screen
 closessn             -- close a session
 clrraclog            -- clear the RAC log
 clrsel               -- clear the System Event Log (SEL)
 config               -- Deprecated: modify RAC configuration properties
 coredump             -- display the last RAC coredump
 coredumpdelete       -- delete the last RAC coredump
 eventfilters         -- Alerts configuration commands
 fwupdate             -- update the RAC firmware
 get                  -- display RAC configuration properties
 getconfig            -- Deprecated: display RAC configuration properties
 getled               -- Get the state of the LED on a module.
 getniccfg            -- display current network settings
 getraclog            -- display the RAC log
 getractime           -- display the current RAC time
 getsel               -- display records from the System Event Log (SEL)
 getsensorinfo        -- display system sensors
 getssninfo           -- display session information
 getsvctag            -- display service tag information
 getsysinfo           -- display general RAC and system information
 gettracelog          -- display the RAC diagnostic trace log
 getuscversion        -- display the current USC version details
 getversion           -- display the current version details
 ifconfig             -- display network interface information
 inlettemphistory     -- inlet temperature history operations
 krbkeytabupload      -- upload kerberose keytab file to the RAC
 lclog                -- LCLog operations
 frontpanelerror      -- hide LCD errors - color amber to blue
 netstat              -- display routing table and network statistics
 ping                 -- send ICMP echo packets on the network
 ping6                -- send ICMP echo packets on the network
 racdump              -- display RAC diagnostic information
 racinternalversion   -- Deprecated: Internal command for version exchange
 racreset             -- perform a RAC reset operation
 racresetcfg          -- restore the RAC configuration to factory defaults
 remoteimage          -- make a remote ISO image available to the server
 serveraction         -- perform system power management operations
 set                  -- modify RAC configuration properties
 setled               -- Set the state of the LED on a module.
 setniccfg            -- modify network configuration properties
 sshpkauth            -- manage SSH PK authentication keys on the RAC
 sslcertupload        -- upload an SSL certificate to the RAC
 sslcertdelete        -- delete an SSL certificate on the iDRAC
 sslcertdownload      -- download an SSL certificate from the RAC
 sslcertview          -- view SSL certificate information
 sslcsrgen            -- generate a certificate CSR from the RAC
 sslEncryptionStrength -- Display or modify the SSL Encryption strength.
 sslkeyupload         -- upload an SSL key to the RAC
 sslresetcfg          -- resets the web certificate to default and restarts the web server.
 testemail            -- test RAC e-mail notifications
 testtrap             -- test RAC SNMP trap notifications
 testalert            -- test RAC SNMP - FQDN trap notifications
 traceroute           -- print the route packets trace to network host
 traceroute6          -- print the route packets trace to network host
 usercertupload       -- upload an user certificate to the DRAC
 usercertview         -- view user certificate information
 vflashpartition      -- manage partitions on the vFlash SD card
 vflashsd             -- perform vFlash SD Card initialization
 vmdisconnect         -- disconnect Virtual Media connections
 vmkey                -- Deprecated: perform vFlash operations
 license              -- License Manager commands
 debug                -- Field Service Debug Authorization facility commands
 raid                 -- Monitoring and Inventory of H/W RAID connected to the server.
 hwinventory          -- Monitoring and Inventory of H/W NICs connected to the server.
 nicstatistics        -- Statistics for NICs connected to the server.
 fcstatistics         -- Statistics for FCs connected to the server.
 update               -- Platform Update of the devices on the server
 jobqueue             -- Jobqueue of of the jobs currently scheduled
 systemconfig         -- Backup &/or Restore of iDRAC Config and Firmware

 Groups

idRacInfo            -- Information about iDRAC being queried
cfgRemoteHosts       -- Properties for configuration of the SMTP server
cfgUserAdmin         -- Information about iDRAC users
cfgEmailAlert        -- Parameters to configure e-mail alerting capabilities
cfgSessionManagement -- Information of the session Properties
cfgSerial            -- Provides configuration parameters for the iDRAC
cfgOobSnmp           -- Configuration of the SNMP agent and trap capabilities
cfgRacTuning         -- Configuration for various iDRAC properties.
ifcRacManagedNodeOs  -- Properties of the managed server OS
cfgRacSecurity       -- Configure SSL certificate signing request settings
cfgRacVirtual        -- Configuration Properties for iDRAC Virtual Media
cfgActiveDirectory   -- Configuration of the iDRAC Active Directory feature
cfgLDAP              -- Configuration properties for LDAP settings
cfgLdapRoleGroup     -- Configuration of role groups for LDAP
cfgLogging           -- Group Description for group cfgLogging
cfgStandardSchema    -- Configuration of AD standard schema settings
cfgIpmiSerial        -- Properties to configure the IPMI serial interface
cfgIpmiSol           -- Configuration the SOL capabilities of the system
cfgIpmiLan           -- Configuration the IPMI over LAN of the system
cfgIpmiPef           -- Configuration the platform event filters
cfgServerPower       -- Provides power management features
cfgServerPowerSupply -- Provides information related to the power supplies
cfgVFlashSD          -- Configure the properties for the vFlash SD card
cfgVFlashPartition   -- Configure partitions on the vFlash SD Card
cfgUserDomain        -- Configure the Active Directory user domain names
cfgSmartCard         -- Properties to access iDRAC using a smart card
cfgServerInfo        -- Configuration of first boot device
cfgSensorRedundancy  -- Configure the power supply redundancy
cfgLanNetworking     -- Parameters to configure the iDRAC NIC
cfgStaticLanNetworking -- Parameters to configure the iDRAC NIC
cfgNetTuning         -- Group Description for group cfgNetTuning
cfgIPv6LanNetworking -- Configuration of the IPv6 over LAN networking
cfgIPv6StaticLanNetworking -- Configuration of the IPv6 over LAN networking
cfgIPv6URL           -- Configuration of the iDRAC IPv6 URL.

For Help on configuring the properties of a group - racadm help config

-----------------------------------------------------------------------

Install racadm on CentOS 6 / RHEL 6 for Dell iDRAC

Almost every tutorial I’ve looked at tells you to install the racadm utility by downloading the OMSA tarball and installing all the RPMs, etc. It’s a lot easier to install from the srvadmin repository. If you’re using Ubuntu, click here. For CentOS/RHEL, here’s how:

First, wget and run the repo install script from Dell. This will make the srvadmin packages available to yum.

# wget -q -O - http://linux.dell.com/repo/hardware/latest/bootstrap.cgi | bash

Next, you can list all the srvadmin packages and pick what you want. In my case, all I need is racadm for iDRAC7, so I’ll just install that to keep it simple.

# yum -y install srvadmin-idrac7

You will see this message after it installs, meaning you will need to log out and back in to use plain old racadm from the CLI. Otherwise, you’ll need to use the full path, since $PATH is not updated until you log out and log in again.

     **********************************************************
     After the install process completes, you may need
     to log out and then log in again to reset the PATH
     variable to access the Server Administrator CLI utilities

     **********************************************************

At this point, whether you log out or not, racadm is available via its full path. On my CentOS 6 installation, it was in /opt/dell/srvadmin/sbin/.

# sudo /opt/dell/srvadmin/sbin/racadm -r 192.168.0.120 -u root -p calvin help
Security Alert: Certificate is invalid - self signed certificate
Continuing execution. Use -S option for racadm to stop execution on certificate-related errors.

Rsyslog Not Connecting to Splunk on Non-Standard Port

I was pulling my hair out the other day trying to figure out why some server logs weren’t shipping to our Splunk server. netstat -antp | grep 3514 yielded nothing, so this server was not even connecting to Splunk, and a tcpdump captured no packets. Disabling iptables made no difference. It took me a little longer to realize that SELinux was the culprit: I needed to add a context for the non-standard syslog port I was using (3514 in this case).

First, I installed policycoreutils-python to get the semanage tool.

yum -y install policycoreutils-python

Next, I added the context to allow port 3514 and restarted rsyslog

semanage port -a -t syslogd_port_t -p tcp 3514
sudo service rsyslog restart
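To confirm the label actually took, you can filter the output of semanage port -l. The helper below is a sketch of a stricter check than a plain grep for 3514 (which would also match a port such as 35140). port_has_type is a hypothetical name, it assumes the usual three-column listing (type, protocol, comma-separated ports), and it does not expand port ranges like 6000-6020.

```shell
# port_has_type TYPE PROTO PORT
# Reads `semanage port -l` output on stdin and succeeds only if PORT is listed
# for exactly that SELinux TYPE and PROTO. Runs in a subshell to keep vars local.
port_has_type() (
    want_type=$1
    want_proto=$2
    want_port=$3
    awk -v t="$want_type" -v p="$want_proto" -v n="$want_port" '
        $1 == t && $2 == p {
            for (i = 3; i <= NF; i++) {
                gsub(/,/, "", $i)      # strip the list separators
                if ($i == n) found = 1
            }
        }
        END { exit !found }
    '
)

# Example:
# sudo semanage port -l | port_has_type syslogd_port_t tcp 3514 && echo "3514 is labelled"
```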

Once I verified it worked on one server, I added this bit to my rsyslog Puppet class

        # semanage is provided by policycoreutils-python
        package { 'policycoreutils-python':
                ensure => 'installed',
        }

        # Label TCP port 3514 as a syslog port unless it is already listed
        exec { 'semanage_syslog':
                command => 'semanage port -a -t syslogd_port_t -p tcp 3514',
                unless  => "semanage port --list | /bin/grep '3514'",
                path    => ['/usr/bin', '/sbin', '/bin', '/usr/sbin'],
                require => Package['policycoreutils-python'],
                notify  => Service['rsyslog'],
        }

Migrate VMware VM (.vmdk) to KVM (.qcow2)

This migration was done on a CentOS 6.5 host. I’m sure Ubuntu is not a whole lot different other than the network configs and some package names.

First, install libvirt, kvm and other necessary packages.

yum -y install kvm qemu-kvm python-virtinst libvirt libvirt-python virt-manager libguestfs-tools bridge-utils

Next, I created a new directory because of the way my partitioning was done, but you could just copy the existing .vmdk to the libvirt images directory.

sudo mkdir /your/vm/path
sudo cp -av /path/to/old/vm.vmdk /your/vm/path

Once it’s copied, convert it to .qcow2 for KVM compatibility

sudo qemu-img convert /your/vm/path/vm.vmdk -O qcow2 /your/vm/path/vm.qcow2
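If you have more than one disk to convert, the step above wraps up nicely in a couple of small functions. This is a sketch, not what I ran: qcow2_path and convert_vmdk are hypothetical names, -f vmdk states the source format explicitly instead of letting qemu-img probe it, and qemu-img info is there only to eyeball the result.

```shell
# Derive the .qcow2 path for a given .vmdk path (pure string manipulation).
qcow2_path() {
    printf '%s\n' "${1%.vmdk}.qcow2"
}

# Convert one disk image next to the original, then show what was produced.
convert_vmdk() {
    qemu-img convert -f vmdk -O qcow2 "$1" "$(qcow2_path "$1")" &&
        qemu-img info "$(qcow2_path "$1")"
}

# Example:
# sudo sh -c '. ./convert.sh; convert_vmdk /your/vm/path/vm.vmdk'   # convert.sh is a hypothetical file holding the functions
```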

Now you will need to configure a bridge so virtual networking will work as needed. Since your NIC ports will become slaves to the bridge, this config will now be the one with all the IP configuration.

To avoid any issues on startup, I first removed NetworkManager (on CentOS). Before you remove it, make sure your existing interface(s) are not set to NM_CONTROLLED=yes.

Next, create the bridge config, /etc/sysconfig/network-scripts/ifcfg-br0 on CentOS (replace as needed with your LAN information):

DEVICE=br0
TYPE=Bridge
BOOTPROTO=static
ONBOOT=yes
IPADDR=10.201.202.30
NETMASK=255.255.255.0
DELAY=0
GATEWAY=10.201.202.1
DNS1=10.201.202.1
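If you are setting up several hosts, the same bridge config can be generated rather than typed. A minimal sketch, assuming the standard CentOS 6 location /etc/sysconfig/network-scripts/ifcfg-br0 for the file; write_bridge_cfg is a hypothetical helper name, and the values in the usage comment are simply the ones from my config above.

```shell
# write_bridge_cfg FILE IPADDR NETMASK GATEWAY DNS
# Writes a static-IP bridge definition for br0 to FILE (overwrites it).
write_bridge_cfg() {
    cat > "$1" <<EOF
DEVICE=br0
TYPE=Bridge
BOOTPROTO=static
ONBOOT=yes
IPADDR=$2
NETMASK=$3
DELAY=0
GATEWAY=$4
DNS1=$5
EOF
}

# Example (run as root):
# write_bridge_cfg /etc/sysconfig/network-scripts/ifcfg-br0 \
#     10.201.202.30 255.255.255.0 10.201.202.1 10.201.202.1
```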

For a seamless transition to the bridge (without your session being dropped), I first bring up the new bridge

sudo ifup br0

After that, we will re-configure your NIC port(s) to be slaves to the bridge. Delete all the lines in your eth0/em0 config except for the following:

TYPE=Ethernet
ONBOOT=yes
HWADDR=[YOUR MAC]
BRIDGE=br0

Then you can restart networking to see your changes take full effect

sudo service network restart

Now that our networking and virtual disk are ready, we can start libvirtd and spin up the new VM. Amend the various parameters in the virt-install command to your needs. In this case, I was migrating a Windows 7 VM. The --import at the end is important, as it tells virt-install to use the existing virtual disk instead of running an installer.

service libvirtd start
chkconfig libvirtd on

# install the vm with virt-install (--ram and -r are the same option, so it is given once)
virt-install --connect qemu:///system --virt-type kvm \
    -n win64 --ram 1024 --vcpus=1 \
    --os-type=windows --os-variant=win7 \
    --disk path=/your/vm/path/vm.qcow2,device=disk,format=qcow2 \
    --vnc --noautoconsole --import

Now you can connect to your VM with VNC, SSH, RDP or whatever you have configured.

Londiste3 queue position lost error

I recently encountered a londiste3 replication error (in a multi-master environment) where the status command noted that the queue position had been lost.

# sudo londiste3 /opt/skytools-3.1.5/etc/remote_slave.ini status
Queue: facility1   Local node: remote_slave
master (root)
  |                           Tables: 1/0/0
  |                           Lag: 4m32s, Tick: 12080717, NOT UPTODATE
  +--: remote_slave (leaf)
                              Tables: 0/1/0
                              Lag: 1d22h50m5s, Tick: 11995419
                              ERR: remote_slave: Lost position: batch 11995443..11995444, dst has 11995419

The resolution was fairly simple. I used the --reset option on the worker to reset the queue position on the remote site and then issued wait-sync to get the table queue moving again.

# sudo londiste3 /opt/skytools-3.1.5/etc/remote_slave.ini worker --reset
2015-06-17 15:54:46,307 1206 INFO Resetting queue tracking on dst side
# sudo londiste3 /opt/skytools-3.1.5/etc/remote_slave.ini status
Queue: facility1   Local node: remote_slave
master (root)
  |                           Tables: 1/0/0
  |                           Lag: 7m19s, Tick: 12080717, NOT UPTODATE
  +--: remote_slave (leaf)
                              Tables: 0/1/0
                              Lag: 1d22h52m51s, Tick: 11995443

# sudo londiste3 /opt/skytools-3.1.5/etc/remote_slave.ini wait-sync
2015-06-17 15:58:35,715 1619 INFO Waiting until all tables are in sync
2015-06-17 15:58:35,959 1619 INFO 1/1 table(s) to copy
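
Since the leaf had been lagging for almost two days before I noticed, a check that flags this condition from cron or a monitoring job seems worthwhile. The following is only a sketch: has_lost_position is a hypothetical helper that looks for the "Lost position" error line exactly as it appears in the status output above, and I would alert on it rather than reset automatically.

```shell
# Succeeds if the londiste3 status text on stdin reports a lost queue position.
has_lost_position() {
    grep -q 'ERR: .*Lost position'
}

# Example:
# sudo londiste3 /opt/skytools-3.1.5/etc/remote_slave.ini status | has_lost_position \
#     && echo "remote_slave has lost its queue position"
```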