Custom Debian Kernel Build without “make menuconfig”
I maintain a cluster of MySQL servers. As I describe below, Linux has some memory management issues which cause our MySQL servers to swap when they really should not. So my idea was to upgrade the kernel to the Debian Etch Testing 2.6.25 kernel… but patched with Rik van Riel’s Split LRU patch (check it out: http://www.surriel.com/node/6).
Also… I don’t feel any need to compile a super efficient static kernel with only the modules we use. I’m OK with the Debian default initrd module loading kernel.
So what I ended up doing was first installing Debian’s kernel :
apt-get install linux-image-2.6.25-2-amd64
Then I grab the Debian kernel source (Linux kernel patched by Debian) :
apt-get source linux-image-2.6.25-2-amd64
which creates these files in the current directory :
linux-2.6-2.6.25 linux-2.6_2.6.25-7.diff.gz linux-2.6_2.6.25-7.dsc linux-2.6_2.6.25.orig.tar.gz
Patch the kernel with the split-LRU patch :
cd linux-2.6-2.6.25
patch -p1 < ../linux-2.6.25-splitlru.patch
I move the changelog so that make-kpkg doesn’t complain and quit :
cd debian
mv changelog changelog.fu
cd ..
Here’s the secret sauce… I’m lazy and don’t want to do a “make menuconfig”
because I’m happy enough with the way Debian configures the kernel + initrd module loading.
So I copy the Debian config from the kernel we just installed :
cp /boot/config-2.6.25-2-amd64 .config
Edit the Makefile and put a unique id in EXTRAVERSION; this'll show up in 'uname -a' :
e.g.
EXTRAVERSION = spinn3r.r1
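If you rebuild often, the same edit can be scripted. Here is a minimal shell sketch, assuming the VERSION, PATCHLEVEL, SUBLEVEL and EXTRAVERSION assignments sit near the top of the Makefile as they do in a 2.6 tree. The helper names `set_extraversion` and `kernel_release` are made up for illustration; they are not part of the kernel tree or of make-kpkg.

```shell
# Hypothetical helpers, not part of the kernel tree or of make-kpkg.

# set_extraversion MAKEFILE VALUE
# Rewrites the EXTRAVERSION line. VALUE must not contain '|' or '&',
# because it is pasted straight into the sed replacement.
set_extraversion() {
    sed "s|^EXTRAVERSION[[:space:]]*=.*|EXTRAVERSION = $2|" "$1" > "$1.new" &&
        mv "$1.new" "$1"
}

# kernel_release MAKEFILE
# Prints VERSION.PATCHLEVEL.SUBLEVEL followed directly by EXTRAVERSION,
# using the first assignment found for each variable.
kernel_release() {
    awk -F'[ \t]*=[ \t]*' '
        ($1 == "VERSION" || $1 == "PATCHLEVEL" ||
         $1 == "SUBLEVEL" || $1 == "EXTRAVERSION") && !($1 in val) {
            val[$1] = $2
        }
        END {
            printf "%s.%s.%s%s\n", val["VERSION"], val["PATCHLEVEL"],
                                   val["SUBLEVEL"], val["EXTRAVERSION"]
        }
    ' "$1"
}
```

Run from the top of the source tree, `set_extraversion Makefile spinn3r.r1` followed by `kernel_release Makefile` shows the base version string before any local version suffix from the config is added. EXTRAVERSION is appended without a separator, so a leading `-` or `.` in the value is a common way to keep the result readable.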
I was building on a dual core machine so I set CONCURRENCY_LEVEL
to tell Debian's make-kpkg to run two compile jobs in parallel.
However if I were doing this often I'd use DistCC (http://code.google.com/p/distcc/).
export CONCURRENCY_LEVEL=2
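On machines with a different core count the value can be derived instead of hard-coded. A small sketch, assuming `getconf _NPROCESSORS_ONLN` is available (it is on glibc systems); the `cpu_count` helper is a made-up name and falls back to 1 if the lookup fails.

```shell
# cpu_count: print the number of online CPUs, falling back to 1.
# (Hypothetical helper; getconf _NPROCESSORS_ONLN is assumed to exist.)
cpu_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # empty or non-numeric answer
    esac
    echo "$n"
}

CONCURRENCY_LEVEL=$(cpu_count)
export CONCURRENCY_LEVEL
```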
Build the kernel and create a Debian package with a custom revision :
fakeroot make-kpkg --initrd --us --uc --revision=spinn3r kernel_image
After a while that'll create a .deb file in the kernel directory's parent directory,
which you can install with dpkg -i file.deb
Linux Swap Issue
I no longer have to use this workaround… since we patched our kernel.
The current 2.6 Linux Kernels seem to have some swap issues.
The Linux Kernel really likes to swap MySQL out to disk.
If for example you do a :
cat /proc/swaps
Often times on MySQL servers, with a data-set which can easily fit within memory,
swap is reported to be in use even though it should not.
Additionally sometimes too much swap space is reported used.
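To put a number on it, the "Used" column of /proc/swaps can be summed. A minimal sketch, assuming the usual five-column layout (Filename, Type, Size, Used, Priority) with sizes in kilobytes; `swap_used_kb` is a made-up helper name.

```shell
# swap_used_kb: read /proc/swaps-formatted text on stdin and print the
# total of the "Used" column in kilobytes. Prints 0 when no swap is active.
swap_used_kb() {
    awk 'NR > 1 { used += $4 } END { print used + 0 }'
}

# Typical use on a live system:
#   swap_used_kb < /proc/swaps
```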
Here are some related links on the subject :
- Re: Ability to limit or disable page caching?
- Using O_DIRECT on Linux and INNODB to Fix Swap Insanity
- MySQL and the Linux swap problem
- Should you have your swap file enabled while running MySQL ?
Here’s my work-around for this situation :
I create two swap files :
dd if=/dev/zero of=/swap01 bs=1MB count=34000
dd if=/dev/zero of=/swap02 bs=1MB count=34000
mkswap /swap01
mkswap /swap02
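The rotation script further down refuses to run unless each swap file is larger than total memory, so it is worth checking the size right after creating the files. A rough sketch of that check, assuming the standard `MemTotal:` line in /proc/meminfo; both helper names are made up for illustration.

```shell
# mem_total_kb: read /proc/meminfo-formatted text on stdin and print
# the MemTotal value in kB.
mem_total_kb() {
    awk '$1 == "MemTotal:" { print $2; exit }'
}

# swap_file_big_enough SIZE_BYTES MEM_KB
# Succeeds (exit status 0) when the file is strictly larger than RAM.
swap_file_big_enough() {
    [ $(( $1 / 1024 )) -gt "$2" ]
}

# Typical use on a live system (wc -c is slow but portable for file size):
#   swap_file_big_enough "$(wc -c < /swap01)" "$(mem_total_kb < /proc/meminfo)"
```

With GNU dd, `bs=1MB count=34000` writes 34,000,000,000 bytes, which is 33,203,125 KiB, so these files pass the check on a machine with up to roughly 31.6 GiB of RAM.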
Now I can run my Perl script to rotate the active swap file between /swap01 and /swap02.
If /swap01 is active, then the Perl script does this :
swapon /swap02
swapoff /swap01
This causes the pages written to swap to be reloaded into memory and ensures I’m not using swap. MySQL shouldn’t get swapped in the first place but I feel this is a pretty good workaround. I run this little Perl script from cron every half hour. Notice the locking… :
#!/usr/bin/perl
use strict;
use warnings;
use LockFile::Simple qw(lock trylock unlock);

# set to 1 to silence progress messages
my $quiet = 0;
my $lock = '/var/lock/rotate_swap.lock';
# true only while this process holds the lock
my $have_lock = 0;

Main();

sub rs_lock
{
    die "already locked\n" unless trylock($lock);
    $have_lock = 1;
    print "acquired lock\n" unless $quiet;
}

sub rs_unlock
{
    # never release a lock that another instance holds
    return unless $have_lock;
    unlock($lock);
    $have_lock = 0;
    print "released lock\n" unless $quiet;
}

sub err
{
    my $msg = shift;
    print "$msg\n";
    # remove lock (only if we hold it)
    rs_unlock();
    exit 1;
}

sub Main
{
    # each swap file maps to the one that should replace it
    my %swap = ('/swap01' => '/swap02', '/swap02' => '/swap01');

    # total physical memory in KiB, as reported by 'free'
    my $freecmd_output = `free`;
    my $totalmem;
    if ($freecmd_output =~ m/Mem:\s+(\d+)/)
    {
        $totalmem = $1;
    }
    else
    {
        err("'free' cannot determine total memory");
    }

    # verify that both swap files exist and are larger than total memory
    foreach my $file (sort keys %swap)
    {
        my $size = (stat($file))[7];
        err("swap file: $file does not exist") unless defined $size;
        my $swap_size = $size / 1024;    # bytes -> KiB
        if ($swap_size > $totalmem)
        {
            print "swap file: $file size: $swap_size is greater than total mem: $totalmem\n" unless $quiet;
        }
        else
        {
            err("swap file: $file size: $swap_size is not greater than total mem: $totalmem");
        }
    }

    # grab a mutex for this swap rotation: /var/lock/rotate_swap.lock
    # (taken outside the eval so a failed trylock never triggers an unlock)
    rs_lock();

    eval
    {
        # make sure at least ONE swap unit is up and running...
        my $status = `cat /proc/swaps`;
        unless ($status =~ /Priority\n.+/)
        {
            err("No swap units available!");
        }

        # the first entry after the header line is the current swap
        my @line = split(/\n/, $status);
        my @field = split(/\s+/, $line[1]);
        my $current_swap = $field[0];

        # determine the TARGET swap file
        my $target_swap = defined($swap{$current_swap}) ? $swap{$current_swap} : '/swap01';
        print "current swap $current_swap\n" unless $quiet;

        # attempt to enable the target swap file
        unless (system("swapon $target_swap") == 0)
        {
            err("swapon failed for : $target_swap");
        }
        print "enabled swap $target_swap\n" unless $quiet;

        # attempt to disable the previously active swap
        unless (system("swapoff $current_swap") == 0)
        {
            err("swapoff failed for : $current_swap");
        }
        print "disabled swap $current_swap\n" unless $quiet;
    }; # end eval {...
    my $eval_error = $@;    # save it before anything can clobber $@

    # unlock
    rs_unlock();

    if ($eval_error)
    {
        ### catch block
        die "caught unexpected error: $eval_error\n";
    }
}
__END__
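Stripped of the locking and sanity checks, the target-selection step in the script above boils down to a few lines. Here is a minimal shell sketch of just that step; `active_swap` and `next_swap` are made-up helper names, and the /proc/swaps layout (one header line, then one line per active swap) is assumed to match what the Perl script parses.

```shell
# active_swap: read /proc/swaps-formatted text on stdin and print the
# first active swap device or file; prints nothing if there is none.
active_swap() {
    awk 'NR == 2 { print $1 }'
}

# next_swap CURRENT: print the swap file to enable next. Anything other
# than /swap01 (including an unrelated partition) maps to /swap01,
# which mirrors the Perl script's fallback.
next_swap() {
    case "$1" in
        /swap01) echo /swap02 ;;
        *)       echo /swap01 ;;
    esac
}

# The rotation itself needs root, so it is only shown as a comment:
#   current=$(active_swap < /proc/swaps)
#   target=$(next_swap "$current")
#   swapon "$target" && swapoff "$current"
```

Note the order: the new swap file is enabled before the old one is disabled, so the machine is never left without swap. Unlike the Perl script, this sketch does no locking and does not stop when no swap is active at all.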
Megaraid Nagios Plugins
I wrote three small, useful Perl scripts, Nagios NRPE plugins to be precise. They monitor the LSI RAID controller’s cache policy, its patrol read status (since we never want a patrol read to affect performance) and its Battery Backup Unit status.
As a sys admin I often write small programs like these. I want to start writing custom stuff like this in Python. Python seems to be a very clean looking language with not a lot of syntactic sugary layers. It seems to have a clean try catch syntax like Java. Anyway here’s check_megaraid_cachepolicy.pl :
#!/usr/bin/perl
use strict;
use warnings;

my $cmd = 'sudo /usr/local/sbin/MegaCli64 -LDGetProp -Cache -Lall -aALL';
my $output = `$cmd`;

# debugging aid: dump the raw MegaCli output (safe to remove)
open(my $fh, '>', '/tmp/fu') || die "$!\n";
print $fh "output: $output\n";
close $fh;

if ($output =~ /Cache Policy:WriteBack, ReadAheadNone, Direct, No Write Cache if bad BBU/)
{
    print "OK: MegaRAID cache policy is ok\n";
    exit 0;
}
else
{
    print "WARNING: MegaRAID cache policy is not ok\n";
    exit 1;
}
check_megaraid_patrolread.pl :
#!/usr/bin/perl
use strict;
use warnings;

my $cmd = 'sudo /usr/local/sbin/MegaCli64 -AdpPR -Info -aALL';
my $output = `$cmd`;

if ($output =~ /Patrol Read Mode: Disabled/)
{
    print "MegaRAID patrol read mode is disabled\n";
    exit 0;
}
else
{
    print "WARNING: MegaRAID patrol read mode is NOT disabled\n";
    exit 1;
}
check_megaraid_bbustatus.pl:
#!/usr/bin/perl
use strict;
use warnings;

my $cmd = 'sudo MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL';
my $output = `$cmd`;
$output = '' unless defined $output;

# Fully Discharged : No
# Fully Charged : Yes
if ($output =~ /Fully Charged : Yes/)
{
    print "OK: BBU Fully Charged\n";
    exit 0;
}
if ($output =~ /Fully Discharged : Yes/)
{
    print "WARNING: MegaRAID cache BBU is Fully Discharged\n";
    exit 1;
}
if ($output =~ /Fully Charged : No/)
{
    print "WARNING: BBU Not fully charged or discharged\n";
    exit 1;
}

# nothing matched (e.g. MegaCli64 failed): report UNKNOWN instead of
# falling through with exit status 0
print "UNKNOWN: could not determine BBU status\n";
exit 3;
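All three plugins follow the Nagios plugin convention: print one status line and exit with 0 for OK, 1 for WARNING, 2 for CRITICAL or 3 for UNKNOWN. Here is a rough shell sketch of the BBU check built around that convention. The `check_bbu_output` function name is made up, and the matched strings are copied from the regexes above, so the spacing in real MegaCli output may differ.

```shell
# Standard Nagios plugin exit codes (CRITICAL, 2, is not used here).
STATE_OK=0
STATE_WARNING=1
STATE_UNKNOWN=3

# check_bbu_output TEXT: print one status line for the given MegaCli
# output and return the matching exit code. The first pattern that
# matches wins, in the same order as the Perl script.
check_bbu_output() {
    case "$1" in
        *"Fully Charged : Yes"*)
            echo "OK: BBU fully charged"
            return $STATE_OK ;;
        *"Fully Discharged : Yes"*)
            echo "WARNING: MegaRAID cache BBU is fully discharged"
            return $STATE_WARNING ;;
        *"Fully Charged : No"*)
            echo "WARNING: BBU not fully charged"
            return $STATE_WARNING ;;
        *)
            echo "UNKNOWN: could not parse BBU status"
            return $STATE_UNKNOWN ;;
    esac
}

# In a real plugin (assumes sudo access to MegaCli64, as in the post):
#   check_bbu_output "$(sudo MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL)"
#   exit $?
```

Keeping the parsing in a function that takes text makes it possible to test the plugin against saved MegaCli output without touching the controller.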
On the Nagios server I add these three entries into our service config file :
define service{
    hostgroup_name          megaraid_servers
    service_description     megaraid cache policy
    # notifications_enabled 0
    check_command           check_nrpe_1arg!check_megaraid_cachepolicy
    use                     generic-service
}
define service{
    hostgroup_name          megaraid_servers
    service_description     megaraid patrolread
    # notifications_enabled 0
    check_command           check_nrpe_1arg!check_megaraid_patrolread
    use                     generic-service
}
define service{
    hostgroup_name          megaraid_servers
    service_description     megaraid bbu status
    notifications_enabled   0
    check_command           check_nrpe_1arg!check_megaraid_bbustatus
    use                     generic-service
}
The cluster nodes run the Nagios NRPE server, which is configured to run certain Nagios health check plugins locally and send the results to the server, thus waking me up at 3am with an e-mail to my cellphone.
/etc/nagios/nrpe.cfg:
command[check_megaraid_cachepolicy]=/usr/lib/nagios/custom-plugins/check_megaraid_cachepolicy.pl
command[check_megaraid_patrolread]=/usr/lib/nagios/custom-plugins/check_megaraid_patrolread.pl
command[check_megaraid_bbustatus]=/usr/lib/nagios/custom-plugins/check_megaraid_bbustatus.pl
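Each NRPE entry follows the same pattern: the command name in brackets, then the path to a plugin of the same name. If more checks get added, the entries can be generated rather than typed. A small sketch; the `nrpe_command_line` helper is a made-up name, and the plugin directory is the one used in the config above.

```shell
# Hypothetical helper: print one nrpe.cfg command definition for a
# plugin named <check>.pl that lives in PLUGIN_DIR.
PLUGIN_DIR=/usr/lib/nagios/custom-plugins

nrpe_command_line() {
    printf 'command[%s]=%s/%s.pl\n' "$1" "$PLUGIN_DIR" "$1"
}

# Emit the three entries used in this post.
for check in check_megaraid_cachepolicy check_megaraid_patrolread check_megaraid_bbustatus; do
    nrpe_command_line "$check"
done
```

The command name in brackets is what the `check_nrpe_1arg!...` argument on the Nagios server has to match.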