SNMP server not reporting disk size

I was setting up snmpd on an Ubuntu box, and noticed that it was reporting weird numbers for a couple of XFS filesystems I had set up for Minio.

An snmpwalk showed values like this:

HOST-RESOURCES-MIB::hrStorageType.51 = OID: HOST-RESOURCES-TYPES::hrStorageFixedDisk
HOST-RESOURCES-MIB::hrStorageType.52 = OID: HOST-RESOURCES-TYPES::hrStorageFixedDisk
...
HOST-RESOURCES-MIB::hrStorageDescr.51 = STRING: /minio/disk1
HOST-RESOURCES-MIB::hrStorageDescr.52 = STRING: /minio/disk2
...
HOST-RESOURCES-MIB::hrStorageAllocationUnits.51 = INTEGER: 0 Bytes
HOST-RESOURCES-MIB::hrStorageAllocationUnits.52 = INTEGER: 0 Bytes
...
HOST-RESOURCES-MIB::hrStorageSize.51 = INTEGER: 0
HOST-RESOURCES-MIB::hrStorageSize.52 = INTEGER: 0
...
HOST-RESOURCES-MIB::hrStorageUsed.51 = INTEGER: 0
HOST-RESOURCES-MIB::hrStorageUsed.52 = INTEGER: 0

So it was reporting the existence of the disks, but with all-zero values. The root ext4 filesystem showed up fine; was it something to do with XFS?

Turns out the answer was no: it was the permissions of the /minio directory that the filesystems were mounted under. I figured this out when I noticed that df -h showed the disks when I was running as root:

/dev/sdb1        60G  461M   60G   1% /minio/disk1
/dev/sdc1        60G  461M   60G   1% /minio/disk2

But when running as non-root, such as the Debian-snmp user, df -h didn't show the disks at all.
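
You can reproduce that check without a full login as the user, assuming the stock Ubuntu setup where the agent runs as Debian-snmp:

sudo -u Debian-snmp df -h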

Turns out I had been too strict with the permissions on /minio, which I had originally set to

drwx------   4 minio-user root       4096 Feb 14 14:57 minio

But that, apparently and to my surprise, prevented snmpd from being able to read the size and usage information for the mounts under that directory. Changing it to 0755 fixed the problem, and I just made sure the mountpoints themselves had stricter permissions:

drwxr-x--- 3 minio-user root 24 Feb 27 13:08 disk1
drwxr-x--- 3 minio-user root 24 Feb 27 13:08 disk2
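
In command form, the fix was something like:

chmod 0755 /minio
chmod 0750 /minio/disk1 /minio/disk2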

Automatically restarting Percona XtraDB cluster

I've been experimenting with Percona XtraDB Cluster, and found that by default it requires manual intervention to restart the cluster from an all-nodes-down state when the nodes were gracefully shut down. The docs talk about identifying which node has safe_to_bootstrap: 1 in its /var/lib/mysql/grastate.dat file, and on that node starting the mysql@bootstrap service instead of just plain mysql.
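
For reference, the grastate.dat on a node that was shut down cleanly looks something like this (the uuid and seqno here are just placeholders):

# GALERA saved state
version: 2.1
uuid:    <cluster-uuid>
seqno:   <last-committed-seqno>
safe_to_bootstrap: 1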

Looking at a file and acting on what's found seems like something that could be automated, so here's my take for an Ubuntu 22.04 setup:

On each node (yay Ansible!) I added this script as /usr/local/sbin/choose-mysql-service.sh

#!/bin/bash

GRASTATE="/var/lib/mysql/grastate.dat"

service="mysql"

# Start a different service if grastate.dat is present
# with safe_to_bootstrap: 1
#
if [ -f "$GRASTATE" ]; then
    if grep --quiet "^safe_to_bootstrap: 1" "$GRASTATE"; then
        service="mysql@bootstrap"
    fi
fi

echo "Starting $service"
systemctl start "$service"
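
The script needs to be executable, of course:

chmod 0755 /usr/local/sbin/choose-mysql-service.sh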

Then I added a one-shot systemd unit to execute at boot time, as /etc/systemd/system/choose-mysql-service.service

[Unit]
Description=Choose MySQL service
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/choose-mysql-service.sh
RemainAfterExit=true

[Install]
WantedBy=multi-user.target

And then disabled the default mysql service and enabled my new unit with:

systemctl daemon-reload
systemctl disable mysql
systemctl enable choose-mysql-service

So now when the OS boots, instead of just blindly trying to start mysql, it looks at grastate.dat: if that has safe_to_bootstrap: 1 it starts mysql@bootstrap instead, and otherwise falls back to the default of starting mysql.

I also shared this on the Percona Forum; look for feedback there.

UFW and LXC/LXD on Ubuntu 22.04

I recently set up a new Ubuntu server with LXC containers. At first it all went great, but when I later enabled UFW, things got flaky. Looking at /var/log/syslog I saw UFW was blocking lots of traffic from inside the containers.

Also, when restarting a container, it wouldn't get one of the bridged 10.x.x.x IP addresses.

After Googling a bit, I found the magic commands in this discussion:

ufw allow in on lxdbr0
ufw route allow in on lxdbr0
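
You can confirm the new rules are in place with:

ufw status verbose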

In hindsight, I think it would have been better to enable ufw before doing anything else with the new install; that way the problems would have been obvious right away, rather than it being a "geez, it was working before" type of situation.

Decrease snmpd logging level on Ubuntu 18.04

I recently updated some servers from Ubuntu 16.04 to 18.04, and found that the snmpd daemon was generating way too many log entries in /var/log/syslog - one for every SNMP query coming from our monitoring system.

In older Ubuntus I had edited /etc/default/snmpd to change the SNMPDOPTS line to have a different -Ls parameter, but it seems that on the new Ubuntu, the systemd service for snmpd doesn't use that defaults file at all. A comment on this serverfault question gave me a clue on how to fix it in systemd - I thought I'd elaborate here.

If you run

systemctl cat snmpd.service

you'll see the current service file:

# /lib/systemd/system/snmpd.service
[Unit]
Description=Simple Network Management Protocol (SNMP) Daemon.
After=network.target
ConditionPathExists=/etc/snmp/snmpd.conf

[Service]
Environment="MIBSDIR=/usr/share/snmp/mibs:/usr/share/snmp/mibs/iana:/usr/share/snmp/mibs/ietf:/usr/share/mibs/site:/usr/share/snmp/mibs:/usr/share/mibs/iana:/usr/share/mibs/ietf:/usr/share/mibs/netsnmp"
Environment="MIBS="
Type=simple
ExecStartPre=/bin/mkdir -p /var/run/agentx
ExecStart=/usr/sbin/snmpd -Lsd -Lf /dev/null -u Debian-snmp -g Debian-snmp -I -smux,mteTrigger,mteTriggerConf -f
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

I wanted to override the ExecStart line with something different. To do that, run

systemctl edit snmpd.service

This brings up your default editor with a blank file. I entered these new lines:

# Override default "-Lsd" parameter to "-LSwd" to decrease logging level
[Service]
ExecStart=
ExecStart=/usr/sbin/snmpd -LSwd -Lf /dev/null -u Debian-snmp -g Debian-snmp -I -smux,mteTrigger,mteTriggerConf -f

The first ExecStart= line is a bit odd; without it you get an error:

snmpd.service: Service has more than one ExecStart= setting, which is only allowed for Type=oneshot services. Refusing.

so the first line 'clears' the setting before processing your own version.

Save your file (systemctl edit stores it as /etc/systemd/system/snmpd.service.d/override.conf), and then run service snmpd restart to have it take effect. If you re-run systemctl cat snmpd.service, you should now see:

# /lib/systemd/system/snmpd.service
[Unit]
Description=Simple Network Management Protocol (SNMP) Daemon.
After=network.target
ConditionPathExists=/etc/snmp/snmpd.conf

[Service]
Environment="MIBSDIR=/usr/share/snmp/mibs:/usr/share/snmp/mibs/iana:/usr/share/snmp/mibs/ietf:/usr/share/mibs/site:/usr/share/snmp/mibs:/usr/share/mibs/iana:/usr/share/mibs/ietf:/usr/share/mibs/netsnmp"
Environment="MIBS="
Type=simple
ExecStartPre=/bin/mkdir -p /var/run/agentx
ExecStart=/usr/sbin/snmpd -Lsd -Lf /dev/null -u Debian-snmp -g Debian-snmp -I -smux,mteTrigger,mteTriggerConf -f
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/snmpd.service.d/override.conf
# Override default "-Lsd" parameter to "-LSwd" to decrease logging level
[Service]
ExecStart=
ExecStart=/usr/sbin/snmpd -LSwd -Lf /dev/null -u Debian-snmp -g Debian-snmp -I -smux,mteTrigger,mteTriggerConf -f

which is a combination of the default service along with your override.

If you have other servers you want to copy your /etc/systemd/system/snmpd.service.d/override.conf file to, you need to run

systemctl daemon-reload
service snmpd restart

to have it take effect.
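
Something like this sketch handles a remote server in one go (otherhost is a placeholder, and I'm assuming the drop-in directory doesn't already exist there):

ssh otherhost mkdir -p /etc/systemd/system/snmpd.service.d
scp /etc/systemd/system/snmpd.service.d/override.conf otherhost:/etc/systemd/system/snmpd.service.d/
ssh otherhost 'systemctl daemon-reload && systemctl restart snmpd'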

VM Serial Console part 2

Fooling around a bit more with accessing a VM's serial console from a KVM hypervisor with

virsh console mymachine

I found one thing that doesn't carry over from the host to the VM is the terminal window size, so if you try to use something like vim through the console connection, it seems to assume an 80x25 or so window, and when vim exits your console is all screwed up.

It looks like a serial connection doesn't have an out-of-band way of passing that info the way telnet or ssh does, so you have to set it manually. You can discover your settings on the host machine with

stty size

which should show something like:

60 142

On the VM, the same command probably shows

0 0

Zero rows and columns; no wonder it's confused. Fix it by setting the VM to the same rows and columns as the host with something like:

stty rows 60 columns 142

and you're in business.
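
As a small convenience, this one-liner run on the host prints the exact command to paste into the VM:

stty size | awk '{ print "stty rows " $1 " columns " $2 }'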

Enabling VM serial console on stock Ubuntu 10.04 server

So I've been running Ubuntu 10.04 server virtual machines on a host running KVM as the hypervisor, and thought I should take a look at accessing the VM's console from the host, in case there's a problem with the networking on the VM.

The host's libvirt definition for the VM shows a serial port and console defined with

<serial type='pty'>
  <source path='/dev/pts/1'/>
  <target port='0'/>
  <alias name='serial0'/>
</serial>
<console type='pty' tty='/dev/pts/1'>
  <source path='/dev/pts/1'/>
  <target type='serial' port='0'/>
  <alias name='serial0'/>
</console>

and within the stock Ubuntu 10.04 server VM, dmesg | grep ttyS0 shows:

[    0.174722] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[    0.175027] 00:05: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A

So the virtual hardware is all set up on both ends, but ps aux | grep ttyS0 doesn't show anything.

We need a process listening on that port. To do that, create a file named /etc/init/ttyS0.conf with these contents:

# ttyS0 - getty
#
# This service maintains a getty on ttyS0 from the point the system is
# started until it is shut down again.

start on stopped rc RUNLEVEL=[2345]
stop on runlevel [!2345]

respawn
exec /sbin/getty -L 38400 ttyS0 xterm-color

and then run

initctl start ttyS0
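
You can check that the getty came up with:

initctl status ttyS0

which should report something like "ttyS0 start/running, process 1234".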

Back on the host machine, run virsh list to find the name or ID number of your VM, and then

virsh console <your-vm-name-or-number>

to connect. Hit return and you should see a login prompt.

Customizing cloned Ubuntu VMs

I was playing with creating and cloning Ubuntu virtual machines the other day, and got to the point where I had a nicely set up reference image that I could just copy to fire up additional VMs in a pretty usable state.

There are a few things within a cloned VM that you'd want to change if you were going to keep the new instance around, such as the hostname, SSH host keys, and disk UUIDs. I threw together a simple shell script to take care of these things automatically.

#!/bin/sh
#
# Updates for cloned Ubuntu VM
#

#
# Some initial settings cloned from the master
#
ROOT=/dev/vda1
SWAP=/dev/vdb1
LONG_HOSTNAME=ubuntu.local
SHORT_HOSTNAME=ubuntu

if [ -z "$1" ]
then
    echo "Usage: $0 <new-hostname>"
    exit 1
fi

# 
# Update hostname
#
shorthost=`echo $1 | cut -d . -f 1`
echo $1 >/etc/hostname
hostname $1
sed -i -e "s/$LONG_HOSTNAME/$1/g" /etc/hosts
sed -i -e "s/$SHORT_HOSTNAME/$shorthost/g" /etc/hosts

#
# Generate new SSH host keys
#
rm /etc/ssh/ssh_host_*
dpkg-reconfigure openssh-server

#
# Change root partition UUID
#
OLD_UUID=`blkid -o value $ROOT | head -n 1`
NEW_UUID=`uuidgen`
tune2fs -U $NEW_UUID $ROOT
sed -i -e "s/$OLD_UUID/$NEW_UUID/g" /etc/fstab /boot/grub/grub.cfg

#
# Change swap partition UUID
#
OLD_UUID=`blkid -o value $SWAP | head -n 1`
NEW_UUID=`uuidgen`
swapoff $SWAP
mkswap -U $NEW_UUID $SWAP
swapon $SWAP
sed -i -e "s/$OLD_UUID/$NEW_UUID/g" /etc/fstab

#
# Remove udev lines that would force the new MAC address to show up as eth1
#
sed -i -e "/PCI device/d"     /etc/udev/rules.d/70-persistent-net.rules
sed -i -e "/SUBSYSTEM==/d" /etc/udev/rules.d/70-persistent-net.rules

echo "UUID and hostname updated, udev nic lines removed,  be sure to reboot"

I'd then run it on the cloned machine with something like

update_clone.sh mynewmachine.foobar.com

This is somewhat particular to my specific master VM, in that it expects one disk dedicated to root and one dedicated to swap, and that the VM was created with ubuntu.local as the hostname. Hopefully though it'll give you some ideas about what to look for and how to script those changes.

Setting up a PXE environment for OS installations

If you're fooling around with various OSes, installing them by first burning CDs or DVDs gets to be a drag - and you end up with piles of old discs that just go into a landfill. Sure, there are rewritable discs, but they wear out and get scratched eventually. USB memory sticks can be painful too - sometimes difficult to create, and with different BIOSes having different levels of support.

A slick way to go is to set yourself up to do PXE (Preboot eXecution Environment) installations over a network. Most network cards have had PXE support included for many years now. If you have a machine handy that can act as a simple server, you can have an environment where you boot a machine, select the OS you want to install from a menu, and everything will just be pulled over your local network.

There are plenty of writeups on how to PXE install Ubuntu from an Ubuntu server, or FreeBSD from a FreeBSD server - but to make things more interesting and explicit I'll go cross-platform and talk about deploying Ubuntu Server 11.04 from a FreeBSD 8.2 server, and try to make it general enough so that later on we can add other OSes to the menu such as CentOS or OpenBSD.

Requirements

PXE booting a machine requires two basic services be present on your network:

  • DHCP - to assign the booted machine an IP address and tell it what "network bootstrap program" (NBP) to fetch from a TFTP server

  • TFTP (Trivial FTP - not to be confused with regular FTP) serves up the initial boot files

OSes such as Ubuntu or CentOS require a third service:

  • HTTP Server - serves up the bulk of the OS install files.

PXELINUX

For the Network Bootstrap Program, we'll use PXELINUX, which is available as part of the SYSLINUX project. The name SYSLINUX is a bit misleading in that it's not actually Linux, but rather a collection of bootloaders that are often used with Linux, and capable of loading other OSes as well. Think of it as something more along the lines of GRUB than an actual Linux distro.

To start off with, I'll create a /tftpboot directory, download syslinux-4.04.tar.gz from here, extract and copy two files we want:

mkdir /tftpboot
fetch http://www.kernel.org/pub/linux/utils/boot/syslinux/syslinux-4.04.tar.gz
tar xzvf syslinux-4.04.tar.gz
cp syslinux-4.04/core/pxelinux.0 /tftpboot
cp syslinux-4.04/com32/menu/menu.c32 /tftpboot

We're done with the syslinux download now, so you could clean it up if you want with:

rm -rf syslinux-4.04*

Next, create a configuration directory

mkdir /tftpboot/pxelinux.cfg

and in that directory create a file named default with these initial contents:

DEFAULT menu.c32
PROMPT 0
TIMEOUT 200

LABEL local
    MENU LABEL Local Boot
    LOCALBOOT 0

That should be enough to get us a barebones menu when we PXE boot a machine, with a single option to boot off the local hard disk (we'll get to Ubuntu later).

Enable TFTP

TFTP is already included in FreeBSD; we just need to make sure it's enabled.

In /etc/inetd.conf make sure this line has the default # removed from the front (so it's not commented out)

tftp   dgram   udp     wait    root    /usr/libexec/tftpd      tftpd -l -s /tftpboot

In /etc/rc.conf, make sure inetd is enabled, adding if necessary:

inetd_enable="YES"

Depending on what you had to do above, start, or reload the inetd daemon with:

service inetd start

or

service inetd reload

Check that the machine is now listening on UDP port 69:

sockstat | grep :69

See if you can fetch the NBP using the tftp utility (assuming your server's IPv4 address on the network where you'll be doing PXE boots is 10.0.0.1):

cd /tmp
tftp 10.0.0.1
tftp> get /pxelinux.0
tftp> quit
rm pxelinux.0

If it works you should have seen something like:

Received 26443 bytes during 0.1 seconds in 53 blocks

Tweak DHCP Server

For this part I'm assuming you're running an ISC dhcpd server (if not, we'll have to cover that in another post). You basically just need to add two lines to /usr/local/etc/dhcpd.conf telling a client what server to use for TFTP and what NBP to fetch:

next-server 10.0.0.1;
filename "/pxelinux.0";

On my server, I just wanted to do this on one particular subnet, so there's a chunk that looks something like this now:

subnet 10.0.0.0 netmask 255.255.255.0 
    {
    range 10.0.0.127 10.0.0.250;
    option routers 10.0.0.1;

    next-server 10.0.0.1;
    filename "/pxelinux.0";
    }

Restart dhcpd

service isc-dhcpd restart
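
If dhcpd isn't set to start at boot yet, also enable it in /etc/rc.conf; for the ISC dhcpd port the variable is typically:

dhcpd_enable="YES"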

Give it a try

On your client machine, you may have to poke around in the BIOS to enable PXE booting. You'll have to figure out this part for yourself. If you can select your Network Card as the boot device, and everything else is working right, you should see a simple menu something like this:

[Screenshot: Initial success - the barebones PXE boot menu]

OK! We're at the "Hello World" stage: we know the client and server are doing the bare minimum necessary for PXE to function at all. Time to move on to the good stuff.

Ubuntu Server 11.04

For this next step, I'll assume you've downloaded an ISO into, say, /foo/ubuntu-11.04-server-amd64.iso. The specific version shouldn't matter too much, so if you want to do 10.04 LTS or something else, it should all be about the same.

Mount the ISO image, so we can copy a couple files into /tftpboot and share the rest with a web server.

mkdir -p /iso_images/ubuntu-11.04-server-amd64
mount -t cd9660 /dev/`mdconfig -f /foo/ubuntu-11.04-server-amd64.iso` /iso_images/ubuntu-11.04-server-amd64
mkdir /tftpboot/ubuntu-11.04-server-amd64
cp /iso_images/ubuntu-11.04-server-amd64/install/netboot/ubuntu-installer/amd64/linux /tftpboot/ubuntu-11.04-server-amd64
cp /iso_images/ubuntu-11.04-server-amd64/install/netboot/ubuntu-installer/amd64/initrd.gz /tftpboot/ubuntu-11.04-server-amd64
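
Side note: when you're eventually done with the ISO, the mount and its md device can be cleaned up (mdconfig -l shows the unit number; I'm assuming md0 here):

umount /iso_images/ubuntu-11.04-server-amd64
mdconfig -d -u 0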

So now our /tftpboot directory has these five files underneath it:

pxelinux.0
pxelinux.cfg/default
menu.c32
ubuntu-11.04-server-amd64/linux
ubuntu-11.04-server-amd64/initrd.gz

To the /tftpboot/pxelinux.cfg/default file append

LABEL ubuntu-11.04-server-amd64-install
    MENU LABEL Ubuntu 11.04 Server AMD64 Install
    kernel ubuntu-11.04-server-amd64/linux
    append vga=788 initrd=ubuntu-11.04-server-amd64/initrd.gz

Try PXE booting your client again. This time you'll have "Ubuntu 11.04 Server AMD64 Install" as one of your choices. Select that, cross your fingers, and if all goes well, in a few seconds you should see:

[Screenshot: Initial success - the Ubuntu installer starting up]

and you can go through and answer the initial questions about the install.

If you're OK with pulling the bulk of the OS over the internet from the official Ubuntu mirrors, it should work although it might be slow. Since we have a nice server sitting on our LAN with a copy of the ISO, we should setup to use that and do a much faster install.

Web Server

For this example, I'll assume nginx has been installed as the webserver (any one will do though, so if you've already got apache installed - that'll work fine too).
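
If nginx was only just installed, enable and start it the usual FreeBSD way first:

echo 'nginx_enable="YES"' >> /etc/rc.conf
service nginx start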

The default nginx install uses /usr/local/www/nginx as its docroot; let's put a symlink to our mounted ISO image in there:

ln -s /iso_images/ubuntu-11.04-server-amd64 /usr/local/www/nginx

and also put a minimal Debian Installer "preseed" file in there that'll help things along by telling the installer to use our webserver for the installation packages. Create a text file named /usr/local/www/nginx/ubuntu-11.04-server-amd64.txt with these contents:

d-i mirror/country string manual
d-i mirror/http/hostname string 10.0.0.1
d-i mirror/http/directory string /ubuntu-11.04-server-amd64
d-i mirror/http/proxy string

Check that you can fetch that file with the URL: http://10.0.0.1/ubuntu-11.04-server-amd64.txt
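
For example, from the server itself:

fetch -o - http://10.0.0.1/ubuntu-11.04-server-amd64.txt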

Edit the /tftpboot/pxelinux.cfg/default file and append

url=http://10.0.0.1/ubuntu-11.04-server-amd64.txt

to the end of the append line of our Ubuntu section, so it now looks like:

LABEL ubuntu-11.04-server-amd64-install
    MENU LABEL Ubuntu 11.04 Server AMD64 Install
    kernel ubuntu-11.04-server-amd64/linux
    append vga=788 initrd=ubuntu-11.04-server-amd64/initrd.gz url=http://10.0.0.1/ubuntu-11.04-server-amd64.txt

Try PXE booting the Ubuntu install again. You'll still get some initial questions about language and keyboard (we can deal with those in another post), but you shouldn't be asked about mirrors - the installer will know to pull files from your local webserver.

Go through the install on the client while watching the /var/log/nginx-access.log file on the server; you'll see the installer fetching all kinds of files, so you'll know it's all working.

You're in business

So at this point you've got yourself a working PXE installation environment and can do a basic Ubuntu server install.

By adding a few more parameters to your seed file and the PXE configuration you can eliminate some of the installer questions. I'll probably write about that in another post, but if you want to figure it out yourself, check out the Ubuntu Installation Guide - Appendix B. Automating the installation using preseeding

There are so many things you can do with the PXE menus, kernel options, and so on - it can't all be covered in one place. But hopefully you've now got a good starting point, with all the basic services in place and working.

KVM Networking

Still playing with KVM (Kernel-based Virtual Machine), this time checking out some networking features. I've been running Ubuntu 8.04 LTS Server (Hardy Heron), both as the host and as a VM on that host. Networking is setup to use a bridge.

KVM offers different emulated NICs. I took a quick look at running iperf between the VM and the host (roughly as sketched below the list), and got these speeds for a few select NIC models:

  • RTL-8139C+ (the default): ~210 Mb/sec
  • Intel e1000 (somewhat recommended here): ~330 Mb/sec
  • virtio: ~700 Mb/sec
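
The measurements were along these lines (a sketch: iperf 2 syntax, server on the host, client in the VM; substitute your host's IP):

iperf -s              # on the host
iperf -c <host-ip>    # in the VM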

The thing about virtio, though, is that it doesn't work when the VM's RAM is set to 4GB. So I guess you can have fast networking or lots of memory, but not both.
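
For what it's worth, picking the NIC model happens in the libvirt domain XML: the interface definition takes a model element, something like this (the bridge name here is just an example):

<interface type='bridge'>
  <source bridge='br0'/>
  <model type='virtio'/>
</interface>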

Playing with KVM and LVM on Linux

I'm still experimenting with Ubuntu 8.04 Server (Hardy Heron), and have switched from Xen to KVM (Kernel-based Virtual Machine). Xen worked well on a little test machine I had, but when I tried it on a brand-new Supermicro server, it turned out to have a problem with the Intel NIC. Since it seems Ubuntu is recommending KVM over Xen, and the server supports hardware virtualization, I figured I'd give it a try.

One big difference is that KVM does full emulation, which means any disk space you give it from LVM (Logical Volume Manager) will be a full virtual disk, with a partition table. It's a little more complicated to access filesystems within the virtual disk than it was with Xen, so I wanted to jot down some notes here, mostly for myself, on how to do that.

If I've created a logical volume named /dev/myvg/test_vm and installed another Linux on it with a single ext3 filesystem (/dev/sda1 from the point of view of the VM) and some swap space (/dev/sda5), it can be accessed when the VM isn't running with the help of the kpartx utility...

kpartx -av /dev/myvg/test_vm

would read the partition table on the virtual disk and create:

/dev/mapper/myvg-test_vm1 
/dev/mapper/myvg-test_vm2 
/dev/mapper/myvg-test_vm5

Then you can

mount /dev/mapper/myvg-test_vm1 /mnt

to mess with the VM's /dev/sda1. To clean things up when finished, run:

umount /mnt
kpartx -d /dev/myvg/test_vm

Snapshots

If you want to look at the contents of a running VM's disks (perhaps for backing it up) you can use LVM snapshots. For example:

lvcreate --snapshot --size 1G --name test_snap /dev/myvg/test_vm
kpartx -av /dev/myvg/test_snap
mount /dev/mapper/myvg-test_snap1 /mnt
   .
   (play with VM's /dev/sda1 in /mnt)
   .
umount /mnt
kpartx -dv /dev/myvg/test_snap
lvremove /dev/myvg/test_snap