Category: Software

  • procps-ng 3.3.3 released

    [Image: top in colour mode]

    This weekend procps-ng version 3.3.3 was tagged and released for distribution.  There have been many patches and fixes involved in this release as we move from a static, rarely changing codebase to something that is easier to maintain and build on various architectures.  The good thing is I’m down to 1 or 2 patches in the Debian archive, which is a big change from the 30 or 40-odd I used to carry.  By the sole metric of getting that number down, the project fork has been a success.

     

    I found some post-release bugs, and these had more to do with the various options being turned on or off than with what you’d see if you did the basic ./configure && make.  One of them was in how the version numbers are defined in git, but it would only appear if certain files were older than others (such as aclocal.m4 versus config.h.in).  Others only showed up when certain features were turned on.  The make check doesn’t see any of this because it uses the default configure flags.

    One annoying thing about the autotools is conditionally installing man pages. This is where you don’t or cannot compile a binary, so you don’t install the corresponding man page.  The automake documentation is of course obscure about this, but I cannot see a way of distributing a man page without also installing it, so we have this fiddle where a file goes into dist_man_MANS or EXTRA_DIST depending on whether the binary is built.

    Interestingly, there has been some bike-shedding around Fedora-land (see the link below) regarding the name procps-ng versus procps.  Debian is lucky that we have different upstream and package names (though that is not ideal), so apt-get install procps still gives you procps.  There has also been discussion about merging procps-ng into util-linux, which won’t be happening in the short term.

  • PHP floats and locales

    I recently had a bug report in JFFNMS that the SLA checks were failing with bizarre calculations, things like 300% disk drive utilization.  Briefly, JFFNMS is written in PHP; it checks values that come out of rrdtool and makes various comparisons, such as whether you have used more than 80% of your disk or whether there have been too many errors.

    The logs showed strange input variables coming in; all were integers below 10.  I don’t know of many 1 or 3 kB sized disk drives. What was going on?  I ran an rrdtool fetch command on the relevant file and got output of something like 1,780000e+07, which for an 18GB drive seemed ok. Notice the comma: in this locale that’s a decimal point… hmm.

    In lib/api.rrdtool.inc.php there is this line around the rrdtool_fetch area:

    $value[] = ($line1[$i]=="nan")?0:(float)$line1[$i];

    A quick check showed that my 1,7…e+07 was coming back as 1.  We had a float conversion problem.  Or more specifically, PHP has a float conversion problem.  I built a small check script like the following:

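    // Use a locale whose decimal point is a comma
    setlocale(LC_NUMERIC,'pl_PL.UTF-8');
    $linfo = localeconv();
    $pi='3,14';
    // String-to-float cast versus float-to-string conversion
    print "Decimal is \"$linfo[decimal_point]\". Pi is $pi and ".(float)($pi)."\n";
    print "Half is ".(1/2)."\n";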

    Which gave the output of:

    Decimal is ",". Pi is 3,14 and 3
    Half is 0,5

    So… PHP is saying that the decimal point is a comma, and it uses it for output, BUT if a string comes in with a comma, it’s not a decimal point. Really?? Are they serious here?  I tried various combinations and could not make it parse correctly.

    The fix was made easier for me because I know rrdtool fetch only outputs values in scientific notation. That means if there is a string with a comma, then it must be a decimal point, as it could never be used for a thousands mark.  By using str_replace to replace any comma with a period, the code worked again. It no longer needs the locale to be set correctly, nor does the locale of the shell where rrdtool is run have to match the locale in PHP.
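
    The same substitution can be sketched outside PHP. The shell function below is only an illustration, not the JFFNMS code; the name normalize_decimal is made up for this example, and it relies on the same assumption as above: the input is in scientific notation, so a comma can only ever be a decimal mark.

```shell
#!/bin/sh
# Hypothetical helper: turn a locale-formatted decimal comma into a period.
# Only safe when the input can never contain a thousands separator,
# e.g. values printed in scientific notation.
normalize_decimal() {
    printf '%s\n' "$1" | tr ',' '.'
}

# Sample value of the kind described above
normalize_decimal '1,780000e+07'
```

    A value that already uses a period passes through unchanged, so the function does not care which locale produced it.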

  • JFFNMS 0.9.3 1st release candidate

    I have been putting a lot of testing into JFFNMS lately.  I have been very lucky to have had someone with the time and patience to try out various sub versions and give me access to their results.

    The end result of all this testing is a much, much less buggy JFFNMS.  There have been a stack of problems with caching results, for example, where status would not be updated or, even worse, the status of one device affected another.

    The poller parent scheduler had a problem too, where it would almost always sit in the first child, starving the others of work, which slowed things down. The scheduler is now a lot fairer across the children, giving a speed-up. I’ve heard of speed-ups of 15x for this one change alone.

    I also had a curious bug where, if a device was set to not gather state, it still did and created events but not alerts.  This meant your event table was spammed with interface-down events, even for interfaces you know are down and have turned state checking off for.  0.9.3 now does it the right way.

    The first RC is now uploaded and can be found at https://sourceforge.net/projects/jffnms/files/jffnms%20RC/ to try out.

    I’m a little worried that the pollers now run too fast and could overwhelm the usually crummy control stack found in network devices for parsing SNMP.  I’m interested to hear how people find it.

  • JFFNMS 0.9.2 Released

    [Image: JFFNMS Interfaces and Events]

    JFFNMS version 0.9.2 was released today, both as an upstream tar.gz file and a new Debian package.  This version fixes some bugs, including making sure it works with PHP 5.4.

    The biggest change in PHP 5.4 is that call-time pass-by-reference is gone.  Previously you could call a function like myfunc(&$blah); which would pass a reference to $blah and not a copy of the item itself. Now the function definition has to declare that it takes the parameter by reference, as in function myfunc(&$blah), rather than each caller deciding at call time.

     

  • psmisc 22.16 Released

    psmisc version 22.16 was released today.  It is a bugfix release that basically fixes a problem around strings in C.  Process name lengths are only supposed to be 16 characters long, so a 17 byte buffer is ok, until you have processes with brackets, which means the string is 18 characters.

    The next wrinkle is that at times the brackets are stripped out so matches fail because the lengths don’t quite line up. You’ll see this with the Debian 22.15-2 version of psmisc where killall won’t find long-named processes.

    So, 22.16 fixes all that.

    Test Processes

    It really shows that psmisc needs a set of tests like procps already has. The difficulty with both is that it’s not simple in the DejaGNU framework to make test processes. These are not the programs within the package but other processes that the programs can work on.  There really needs to be an equivalent to touch for processes just for this sort of thing.  Creating processes is rather simple; the tricky part is ensuring they go away, or that they die with certain signals.
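
    As a rough idea of what a "touch for processes" could look like outside DejaGNU, here is a small shell sketch. It is not part of psmisc or procps; the function names are invented for this example, sleep is just a stand-in for a process that does nothing, and it only covers the "make sure they go away" half of the problem, not dying with particular signals.

```shell
#!/bin/sh
# Hypothetical test helpers: start throwaway processes and guarantee
# they are cleaned up when the script exits.
TEST_PIDS=''

spawn_test_process() {
    # sleep stands in for "a process that simply exists"
    sleep 300 >/dev/null 2>&1 &
    LAST_TEST_PID=$!
    TEST_PIDS="$TEST_PIDS $LAST_TEST_PID"
}

cleanup_test_processes() {
    for pid in $TEST_PIDS; do
        kill "$pid" 2>/dev/null
        # Reap the child so it does not linger as a zombie
        wait "$pid" 2>/dev/null
    done
    TEST_PIDS=''
}

# Run the cleanup however the script ends
trap cleanup_test_processes EXIT

spawn_test_process
echo "test process running as PID $LAST_TEST_PID"
```

    A test would call spawn_test_process, point killall or pstree at the new PID, and leave the cleanup to the trap.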

  • VMware Player on Debian

    For various reasons, having vmware running on my desktop would be kind of useful.  VMware provide a Free (as in beer) version of their software called VMware Player. I downloaded the file VMware-Player-4.0.2-591240.x86_64.bundle off their website and tried to build it.

    It failed to build. Given my previous lack of success with VMware server, I wasn’t too surprised.  What was surprising was it wasn’t too hard to fix it.  The problem was that the vmnet module would not compile and that was due to three things:

    • net device ops no longer has set_multicast_list (in netif.c)
    • the linux module header needs to be included to define THIS_MODULE
    • skb_frag_t has been redefined and needs an adjustment

    The patch is only a few lines and means I can compile vmware on my Debian sid computer running kernel 3.2.0-1.

    vmnet.patch

    To use it, you will need to find where the modules are built; for me it is /usr/lib/vmware/modules/source. Then:

    1. mv vmnet.tar vmnet2.tar
    2. tar xf vmnet2.tar
    3. patch -p0 < vmnet.patch
    4. tar cf vmnet.tar vmnet-only

    With that you can run the player which will try to build the modules and you’re done!

     

     

  • Unlucky sometimes

    Sometimes life throws little curves at you to see if you are still awake. Today has been one of those days.

    fglrx is (apparently) fixed

    I’ve had a long-running problem with fglrx on my laptop.  The problem stems from the closed-source ATI drivers on one of those laptops that has both an ATI and an Intel graphics chip. It means I am basically using the slow Intel chip only.  This morning I had had enough, so I backed up my home directory and started to rebuild the laptop with Debian 6.0.3.

    So I kicked off the very slow process of reformatting the crypto drive (it has taken 5 hours and is still going), let it gurgle on its merry way and started to read my email.  One of the emails said that my bug about fglrx not working was closed; apparently it is fixed.  If I had read that 10 minutes earlier, a simple ‘apt-get install fglrx-driver’ would perhaps have fixed it; oh well.

    My problem now is: do I move to the latest driver and hope their fix is my fix, or leave it with some ancient version?  My preference is the former; I only hope it works!

    psmisc 22.15 and buffer overflows

    psmisc has a program called pstree which prints the set of processes in a tree fashion.  It hasn’t changed much for quite a while.  I released version 22.15 and the Debian package 22.15-1.  In 22.15-1 I also adopted the hardening CFLAGS as suggested for procps.

    I was a little surprised to receive a bug report of important severity.  The report said a buffer overflow had been introduced in 22.15-1, but no relevant code had changed.  The compiler options had done their job and stopped a buffer from being overflowed.

    But where exactly was the overflow?  Running gdb on pstree quickly showed that it was line 267 of pstree.c which uses strcpy().  That function set off warning bells. The relevant code is:

        PROC *new;
    
        if (!(new = malloc(sizeof(PROC)))) {
            perror("malloc");
            exit(1);
        }
        strcpy(new->comm, comm);

     

    Now comm is the short command name you find in /proc/<pid>/stat.  It is fixed in the kernel at 16 characters.  The PROC structure has this field as 17 characters long, one extra for the NUL.  I went and checked the Linux source and yes, it is still 16 characters long.  The clue was in the name of the program that it died on.

    #6  new_proc (comm=0x6111b0 "{console-kit-dae}", pid=1571, uid=0)
        at pstree.c:267

     

    That string is 17 characters long. The problem is that 16 characters is for the name only. If the name is in brackets or braces, then that 16 character limit doesn’t apply.  The buffer overflow bug has been there for a long time, but only with the compiler flags did it become visible.
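
    The arithmetic is easy to check by hand, but as a quick sketch in shell (the variable names are mine; the 17-byte field size is the one quoted above for the PROC structure):

```shell
#!/bin/sh
# Compare the length of the offending name with the 17-byte comm field.
COMM_FIELD_SIZE=17            # size of the comm field in PROC, per the text above
name='{console-kit-dae}'      # the string gdb showed being copied

needed=$(( ${#name} + 1 ))    # characters plus the terminating NUL
echo "name is ${#name} characters, needs $needed bytes, field holds $COMM_FIELD_SIZE"
if [ "$needed" -gt "$COMM_FIELD_SIZE" ]; then
    echo "overflow by $(( needed - COMM_FIELD_SIZE )) byte(s)"
fi
```

    So strcpy() writes one byte past the end of the field, which is exactly the kind of thing the hardening flags are there to catch.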

    Given that you need to read names out of the /proc filesystem, and that if someone can fiddle with that you have bigger problems, it doesn’t seem to be too much of an issue.  It should be fixed (and is in Debian 22.15-2), but it is a nice example of the compiler catching bad things.

     

  • MediaServer with Rygel


    Like a lot of people, I have one of those set-top TV boxes that can record TV shows at set times.  I made sure that I could get at the files (using an FTP server in this case) and that the files were some sort of common standard (MPEG 4 TS).  I also have a bunch of mp3 music files.

    That’s fine when I’m on the desktop because the files are local.  I wanted to make these available to anyone in the household. DLNA seemed to be a reasonably OK way of doing this; the problem was how to get it working in Linux.

    A lot of the problem is that it is hard to find a DLNA-only server.  Sure, MythTV could do it, but it needs a TV tuner or a lot of fiddling around.  XBMC can also do it, but it needs to be running a GUI.  I even tried mediatomb but could not get the thing to compile, as the library calls to mozjs were all using deprecated functions. I just wanted a daemon that served stuff, nothing more: no fancy UI, no need for X, just file-serving goodness.

    Rygel is almost that.  You could say it is a user server much like a torrent client/server.  The nice thing is you can fiddle around with rygel so it becomes close to a real server.  This is how I did it.

    First, I made a rygel user with a home directory, but disabled login. I don’t like programs running as root if they don’t need it, and rygel doesn’t need it.  The home directory needs to be writable by the rygel user too, otherwise the program doesn’t work too well. I use /var/local/rygel as its home.

    For the configuration, copy /etc/rygel.conf to ~rygel/.config/rygel.conf.  This is the configuration file for rygel. I disabled all of the modules except MediaExport. Make sure you disable Tracker, otherwise MediaExport will not work. Tracker is only useful for real users who are logged in and have dbus etc. going, which this user certainly does not.

    I made a simple rygeld file in /usr/local/sbin which basically starts the program, forks and grabs the PID to write to a pidfile. This means it is easier to track the program in the init scripts.

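    #!/bin/sh
    #
    # Rygel daemon handling: start rygel as the rygel user and record its PID
    RYGEL='/usr/bin/rygel'
    RYGEL_ARGS=''
    # Run rygel in the background as the rygel user, logging to its home directory
    su -s /bin/sh -c "nohup $RYGEL $RYGEL_ARGS > /var/local/rygel/rygel.log 2>&1 &" rygel
    EXIT_CODE=$?
    if [ "$EXIT_CODE" -ne 0 ] ; then
            # return is only valid inside a function, so exit instead
            exit "$EXIT_CODE"
    fi
    # Find the rygel process that shares this script's process group
    PGROUP=`ps --no-headers -o pgrp $$`
    PID=`pgrep -g $PGROUP -f $RYGEL`
    echo $PID > /var/run/rygel.pid
    exit 0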

    In case you were wondering, the pgrp finds the process group, so the pgrep finds the right rygel process: the one that has the same process group as the starting shell.

    The init script is a standard init script, except that the --exec flag checks for /usr/bin/rygel but the start line starts /usr/local/sbin/rygeld.  This is because we want to kill the real rygel process but start it with the script.

    This setup works rather well. You do get some messages in the logfile about dbus not working but it is harmless. I tried disabling the mpris and external plugins but no matter what flag or configuration file option I tried, they would always try to start and fail with no dbus.

    Rygel is a reasonably light-weight way of serving media to your home network. It sits at about 200 MB virtual with 16 MB resident and uses almost no CPU when idle.

     

  • procps 3.3.2 Debian packages

    Following up from the upstream release of a new procps, the Debian packages have also been updated. This upload has a significant change in that, I hope, procps is now multi-arch compliant. To make this happen, the libprocps library is now in its own package, separate from the binaries. It also means that if you have programs not from procps that link to this library, they are now broken. I put in a Breaks: line for the three I know about (xmem, guymager and open-vm-tools), which will need a recompile with a small tweak to the control file and link statements.

    As suggested in the multi arch implementation wiki page, I tested the libprocps0-dev package by compiling something against it, in this case another Debian package xmem. Doing this was very useful for teasing out some bugs on the dev package itself that did not appear while linking the library to the procps binaries.

    In short, the new procps has a lot fewer patches than the old ones, and the next version will have fewer still as I have already included the current changes in the upstream git repository. The main differences are now:

    • freebsd linux version is read from a file, not from uts
    • includes use __restrict, not the automake-defined restrict, which may not be present in third-party packages
    • libncursesw conditionally linked with watch for 8bit watch

    The three are really bugs, especially the last, which is why the patches will disappear next release.

  • rpath bites me again


    It’s funny, in a bad way, when certain sorts of bugs come back to bite you.  People may remember the fun Debian had with rpath issues in the late 90s, where some binaries wanted to set the path of where their libraries were located.  After much controversy, the consensus, at least in Debian (with some specific exceptions), was that RPATH is bad and you shouldn’t use it.  I lived through that “fun” and remember hacking libtool or autoconf or some such file.

    The horror of those times came back to haunt me today while working on procps 3.3.2 packages.  lintian was suddenly complaining about binary-or-shlib-defines-rpath for all my binaries.  I’ve not seen rpath set for many years, and as I’m also part of the upstream and never saw it set there, it took me a while to work out what was going on.

    Certainly, my binaries were setting the RPATH; it wasn’t lintian getting the wrong idea.  You can see this with objdump -p myfile | grep RPATH. If you get a hit, you have RPATH set; an empty result means no RPATH. libtool was definitely setting the rpath, with

    cc -Iproc -g -O2 -o $progdir/$file pmap.o  ./proc/.libs/libprocps.so \
     -Wl,-rpath -Wl,/home/csmall/debian/procps/procps/proc/.libs \
     -Wl,-rpath -Wl,//usr/lib/x86_64-linux-gnu

     

    That’s not one rpath setting but two, for double insult. Configure flags such as --disable-rpath wouldn’t work. I flicked back to a pure upstream archive and there was no rpath there in either the compile flags or objdump. There was obviously some sort of side-effect going on.

    The problem, after much testing, was in the debian/rules file:

            ./configure \
              --build=$(DEB_BUILD_GNU_TYPE) \
              --enable-watch8bit --enable-w-from \
              --prefix=/ \
              --mandir=$${prefix}/usr/share/man \
              --libdir=$${prefix}/usr/lib/$(DEB_HOST_MULTIARCH)

     

    The problem is above, and all it was was a “/”. With the prefix set to /, ${prefix}/usr/lib/… expands to //usr/lib/…. libtool does some checking to see if the library directory exists and, if it doesn’t, silently turns on rpath. It’s not smart enough to know that //usr/lib/x86_64-linux-gnu is the same as /usr/lib/x86_64-linux-gnu and so it enables rpath. Removing the slash before the usr fixed the problem.

    I’d say this is bad behaviour on libtool’s part. It should know that // is the same as /, along with all the other alternative ways of specifying paths.  Hopefully this post will stop someone else wasting their time like I did.
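
    To show how little it would take, here is a shell sketch of the sort of normalisation I mean. This is not what libtool does internally; squeeze_slashes is a name made up for the example. It treats a leading // the same as /, which is true on Linux although POSIX leaves exactly two leading slashes implementation-defined.

```shell
#!/bin/sh
# Hypothetical helper: collapse runs of slashes so that two spellings
# of the same directory compare equal as strings.
squeeze_slashes() {
    printf '%s\n' "$1" | sed 's|//*|/|g'
}

a='//usr/lib/x86_64-linux-gnu'
b='/usr/lib/x86_64-linux-gnu'

# A plain string comparison sees two different paths
if [ "$a" = "$b" ]; then
    echo "raw: same"
else
    echo "raw: different"
fi

# After squeezing the slashes they match
if [ "$(squeeze_slashes "$a")" = "$(squeeze_slashes "$b")" ]; then
    echo "normalised: same"
else
    echo "normalised: different"
fi
```

    The raw comparison reports the paths as different and the normalised one reports them as the same, which is the whole bug in two lines.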

     
