Troubleshooting custom cron scripts

I often have trouble when creating my own scripts to be run by cron on Linux servers. On most of the servers that I administer, I use a homegrown backup script (based on these excellent instructions) that I place in /etc/cron.daily. After upgrading a server to Debian Etch recently I found that my backup script, named backup-script.sh, was no longer working.

After stumbling around in the dark for a while, I found a useful tool to use when trying to troubleshoot cron jobs:

run-parts --list /etc/cron.daily

This command lists the valid scripts in a directory. I noticed that backup-script.sh was the only script in the directory that was missing from the list. After checking that ownership and permissions on all of the scripts were the same (using ls -l), it clicked that the only difference between my backup script and the other valid scripts was that the backup script’s name ended with the extension .sh. After removing that, I used run-parts --list again and my backup script was now included in the list of valid files. Sweet! And a definite relief given how important reliable backups are!
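
For anyone who wants to check a directory ahead of time, here is a rough sketch of that naming rule. The Debian run-parts man page says that, by default, file names must consist entirely of letters, digits, underscores, and hyphens, which is why the dot in .sh got my script skipped. The function names valid_run_parts_name and list_skipped are made up for this example, and the check only looks at names, not at permissions or anything else run-parts cares about.

```shell
#!/bin/sh
# Sketch: list file names that Debian's run-parts would skip by default
# because they contain characters other than letters, digits, "_" and "-".
# valid_run_parts_name and list_skipped are hypothetical helper names.

LC_ALL=C
export LC_ALL

valid_run_parts_name() {
    case "$1" in
        # Empty names, or names with any character outside the allowed set
        ''|*[!A-Za-z0-9_-]*) return 1 ;;
        *) return 0 ;;
    esac
}

list_skipped() {
    for path in "$1"/*; do
        [ -e "$path" ] || continue      # directory empty or missing
        name=${path##*/}                # strip the directory part
        valid_run_parts_name "$name" || printf '%s\n' "$name"
    done
}

# Defaults to /etc/cron.daily when no directory is given
list_skipped "${1:-/etc/cron.daily}"
```

Anything it prints is a name that run-parts --list would leave out on naming grounds alone.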

Adventures in netbooting, part 2

Back at the ALP office and therefore back to work on netbooting. After making sure that TFTP was up and running, I moved on to downloading the netbootable OS that I’ll be using on my network. I decided to go with debirf because I like and am familiar with Debian, and because folks I know put it together and I trust their work (and appreciate their willingness to help me out if I find myself in a bind!). I was a bit confused as to what I needed to get in order to netboot their streamlined version of Debian, but I decided to go with their pre-built disk rescue kernel and initramfs. However, I ran into a whole bunch of problems there and couldn’t figure out how to get debirf going. I plan to give it a try in the future, but for the time being I’ve switched to using RIP (Rescue Is Possible) instead. I downloaded their easy-to-set-up PXE files, unzipped them into my tftpboot directory, and netbooted my first computer. Very exciting!

Now - on to creating the image that I want to clone onto my computers via netboot.

ETA: dkg helped me figure out my debirf problems. I was stuck because I thought I needed a debirf-specific pxelinux.0 file, but I don’t. I successfully got a copy of pxelinux.0, what dkg describes as the earliest stage of the pxelinux bootloader, from the syslinux package:

aptitude install syslinux
cp /usr/lib/syslinux/pxelinux.0 /var/lib/tftpboot/

Then I set up my pxelinux.cfg/default file as so:

DISPLAY boot.txt

DEFAULT debirf

LABEL debirf
kernel debirf/vmlinuz-2.6.22-3-486
append vga=normal initrd=debirf/debirf-rescue_lenny_2.6.22-3-486.cgz --

PROMPT 1
TIMEOUT 0

And my boot.txt file as so:

- BOOT MENU -
============

debirf

And now I’m netbooting debirf quite nicely!
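
If something goes wrong at this stage, a quick sanity check is to confirm that every file the configuration refers to is actually sitting under the TFTP root. The sketch below assumes the layout implied above: everything lives under /var/lib/tftpboot, and boot.txt and the debirf/ directory sit alongside pxelinux.0. The function name missing_netboot_files is made up for this example.

```shell
#!/bin/sh
# Sketch: report which of the files named in the pxelinux setup above are
# missing from a TFTP root. Assumes all paths are relative to the directory
# that holds pxelinux.0. missing_netboot_files is a hypothetical helper name.

missing_netboot_files() {
    root=$1
    for rel in \
        pxelinux.0 \
        pxelinux.cfg/default \
        boot.txt \
        debirf/vmlinuz-2.6.22-3-486 \
        debirf/debirf-rescue_lenny_2.6.22-3-486.cgz
    do
        # Print the relative path of anything that is not a regular file
        [ -f "$root/$rel" ] || printf '%s\n' "$rel"
    done
}

# Defaults to /var/lib/tftpboot when no directory is given
missing_netboot_files "${1:-/var/lib/tftpboot}"
```

No output means all five files were found; it says nothing about whether the TFTP server itself is reachable from the client.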

Adventures in netbooting, part 1

(need to write about why I’m doing this)

The first thing that I found out was that I’ll need to be running a DHCP server and a TFTP server in order to set up a netboot environment. I assumed that there must be a DHCP server running somewhere on our network and figured that our Linksys cable/DSL router was taking care of it; if so, I’d need to set up a DHCP server on the Linux box and have it take over responsibilities from the router.

I checked out the settings on the router, a Linksys BEFSR41v4. After logging into the router’s control panel, the DHCP info was front and center. The router was indeed providing DHCP services, distributing IP addresses in a range from 192.168.1.100 to 192.168.1.149. I decided to set up my new Linux-box based DHCP server with a range from 192.168.1.150 to 192.168.1.199 to avoid any overlap with old IP addresses after the switch.

Now that I knew what the current DHCP situation was, I turned to setting up something new. dkg suggested checking out dnsmasq, which “is designed to provide DNS and, optionally, DHCP, to a small network.” Its simplification of DNS sounded particularly appealing to me, since I’m kinda intimidated by the idea of setting up my own full-blown nameservers (and also don’t have all the time in the world!)

First I started up a screen session, which I try to remember to do every time I’m logging into a Linux box remotely and doing anything approaching tricky. (Read more about the screen command here and here).

Since I’m running Debian Etch and dnsmasq is part of the Debian distro, I was able to use apt-get to easily install:

sudo apt-get install dnsmasq

libdbus-1-3 was automatically installed as a dependency, resolvconf was suggested and dbus was recommended; I took note of those last two in case I did find I needed them down the road. dnsmasq downloaded, installed and started up on my server. The dnsmasq setup documentation says that “a machine which already has a DNS configuration (ie one or more external nameservers in /etc/resolv.conf and any local hosts in /etc/hosts) can be turned into a nameserver simply by running dnsmasq, with no options or configuration at all.” All that remains is to tell the other machines on the network to use the machine running dnsmasq as a nameserver. And it worked! Sweet.

Next came setting up the DHCP server. The dnsmasq setup documentation wasn’t as helpful on this point, but luckily the dnsmasq.conf configuration file is really easy to read and understand. I found the required setting, uncommented it to enable the DHCP server, edited a few other settings and saved the config file, restarted dnsmasq, disabled DHCP on my router, and forced my laptop to renew its DHCP lease to see what happened.
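
For reference, the DHCP setting in question is dhcp-range. A minimal sketch of what that line looks like for the range I chose is below; the 12h lease time is an assumed value for illustration, not necessarily what I ended up using.

```
# dnsmasq.conf fragment (sketch)
# Enable the DHCP server and hand out addresses from
# 192.168.1.150 through 192.168.1.199.
# The 12h lease time is an assumption for this example.
dhcp-range=192.168.1.150,192.168.1.199,12h
```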

First I celebrated - my IP address was being assigned in the correct range, I could connect to the server, and the nameserver was clearly working (I could ping the server’s hostname, which is found only in /etc/hosts on the server and not on my laptop). However, my celebration was cut short because I couldn’t reach the outside world (for instance, google.com), only machines on my LAN. After re-enabling DHCP on the router and doing some fruitless googling, I decided to take a look at dnsmasq.conf again, this time more patiently and carefully. Lo and behold, I found this key setting:

# Override the default route supplied by dnsmasq, which assumes the
# router is the same machine as the one running dnsmasq.
dhcp-option=3,192.168.1.1

This is what the setting looks like now that it’s properly configured. Before, this option was commented out. dhcp-option can be used to change all sorts of DHCP configurations that are set by default by dnsmasq. The “3” refers to the option that I was changing, which is the router option. 192.168.1.1 is the IP address of the office router, which is indeed not the same machine as the one running dnsmasq. Properly configuring that setting made everything work like buttah. Beautiful. My return to the config file also revealed some settings that are important when running Samba on the same machine as dnsmasq, so I got those squared away as well.

Now that dnsmasq was working properly for DHCP and DNS services, I had to get TFTP going. I’d read that TFTP was included in dnsmasq, but couldn’t get it working. I have a hunch that it just isn’t available in dnsmasq as it’s installed automatically via apt for Debian. Maybe I’m wrong - someone should let me know if I am - but regardless, I decided to set up TFTP separately, following the directions in this handy guide to setting up netbooting.
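
One way to test that hunch, sketched below, is to look at the compile-time options that dnsmasq prints with --version. The sketch assumes the installed build lists either TFTP or no-TFTP there; a build that predates the TFTP feature would list neither, which the check also reports as “not found”. The function name has_tftp is made up for this example. If TFTP does turn out to be compiled in, the relevant dnsmasq.conf options are enable-tftp and tftp-root.

```shell
#!/bin/sh
# Sketch: check whether a dnsmasq build reports TFTP support.
# has_tftp is a hypothetical helper name; it reads the output of
# `dnsmasq --version` on stdin and succeeds only if "TFTP" appears as a
# standalone word (so "no-TFTP" does not count).

has_tftp() {
    grep -Eq '(^|[[:space:]])TFTP([[:space:]]|$)'
}

# Only run the check if dnsmasq is actually installed on this machine
if command -v dnsmasq >/dev/null 2>&1; then
    if dnsmasq --version | has_tftp; then
        echo "TFTP support is compiled in"
    else
        echo "TFTP support not found in this build"
    fi
fi
```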

And with that, I ran out of time! Luckily, much of the work is done. Next up - configuring PXE and getting the kernels up for the Linux distros I want to netboot and making sure that all works. And after that, I get to embark on a whole other adventure: setting up the first of the eight new PCs, creating an image of that first machine, then using my new netboot environment to clone the image onto the other computers.