Articles


Link Filter Drupal Module

Here's yet another URL Link Filter for Drupal.
Versions:
Drupal 7: linkfilter-7.x-1.3beta.zip [Not tested - looking for someone to help with testing]
Drupal 6: linkfilter-6.x-1.2.zip
Drupal 4 and 5: linkfilter-4.x-5.x-1.1.zip

The goal for this filter is to be somewhat like the URL filter included with Drupal, with the additional requirement that it be independent of both the Drupal installation directory and the domain, so that the URLs in Drupal nodes don't have to be re-edited when a Drupal site is moved to a different sub-directory or a different domain. Additionally, it allows link text to be specified for the URL, and it preserves the input characters as much as possible, performing no or minimal HTML entity conversions. Finally, it distinguishes various kinds of links with classes, which can be used to display link icons for specific links. If the link filter tag points to an internal Drupal node, a class containing the type of the node is generated, for example class="linkfilter-drupal-node-image", which can be used to show distinguishing icons based on Drupal node type. This site uses this filter, and the link icons are displayed based on the class generated by the filter: external links (linkfilter-urlfull class), images (linkfilter-drupal-node-acidfree or linkfilter-drupal-node-image class), and mailto links (linkfilter-mailto class).

Link filter tags of the form [l:URL text] in the input text are replaced with a link to the given URL, which can be a Drupal link, an external web link, or a local non-Drupal link. Prefixes representing the site URL and the Drupal directory are added, as appropriate:
1) The site URL is prefixed if the URL begins with a / character
2) No prefix is added if the URL has a : in it, as in http: or ftp: etc
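
The two prefix rules listed above can be sketched in Python as follows. This is only an illustration, not the module's PHP code: the names LINK_TAG, resolve_url, and find_links are hypothetical, the order in which the rules are tested is an assumption, using the URL as the link text when none is given is an assumption, and URLs that match neither rule are returned unchanged because the rule for them is not shown here.

```python
import re

# Hypothetical pattern for a [l:URL text] tag; the link text is optional.
LINK_TAG = re.compile(r"\[l:([^\s\]]+)(?:\s+([^\]]*))?\]")


def resolve_url(url, site_url):
    """Apply the two listed prefix rules; other URLs pass through unchanged."""
    if url.startswith("/"):
        # Rule 1: prefix the site URL.
        return site_url.rstrip("/") + url
    if ":" in url:
        # Rule 2: the URL carries a scheme such as http: or ftp:, add no prefix.
        return url
    # Assumption: the handling of other URL forms is not covered by this sketch.
    return url


def find_links(text, site_url):
    """Return a list of (resolved_url, link_text) for each tag in text."""
    links = []
    for match in LINK_TAG.finditer(text):
        url, label = match.group(1), match.group(2)
        links.append((resolve_url(url, site_url), label if label else url))
    return links


if __name__ == "__main__":
    sample = "See [l:/about About us] and [l:http://example.org]."
    for href, label in find_links(sample, "http://example.com"):
        print(href, "->", label)
```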

Fedora Core 7 Install Notes

Notes on things to look out for working with Fedora Core installs.

perl CGI scripts hanging
The current Fedora Core 7 perl package is perl-5.8.8-27 [Jan 2008]. This perl package contains version 3.15 of CGI.pm, which has a problem handling POST_MAX. Any HTTP POST larger than the POST_MAX value will cause the CGI script to peg the CPU at the $q = new CGI statement and not terminate for a very long time. After the CGI script is automatically terminated, an HTTP 500 Internal Server Error status code is returned to the requester.

This problem exists in all distributions using Perl 5.8.8, and is not limited to Fedora Core 7.

This issue is very hard to track down. Most of the time, the CGI scripts will work fine. When they do fail, there will invariably be nothing more than a single line in the Web server log stating that the job took around 300 seconds and that an HTTP 500 Internal Server Error message was sent back. The failures occur at all hours, with no pattern to their timing. If you do manage to see this rare failure happening, the CPU will show that all free CPU time is being used by the CGI script - but the script has no logging or database activity, which indicates that it locked up very early in the script. That will eventually lead to looking at the $q = new CGI statement, and to the fix.

The fix is simple - upgrade to CGI.pm version 3.21 or later. But making this stick requires some additional configuration, because while CGI.pm can be updated on its own, it is also bundled with perl. To keep a future yum update of perl 5.8.8 from restoring CGI.pm to version 3.15, either prevent perl from being updated, or keep two copies of CGI.pm around and change the scripts to load the right one.

Show blocked hosts on web

This script uses PHP and MySQL to create a web page that lists all the blocked hosts.

It uses an IP-to-country mapping table to show country flags.

To see this working visit tanchaz.hu/blockhosts/

That page also includes a link to download the software.

Drupal MySQL Performance Problem

A very small site started running into performance problems - some pages were taking too long to load, and certain MySQL queries were taking 5000 to 6000 milliseconds or longer and being killed because of resource limits set on the hosting computer. The pages affected were the watchdog log display pages - one of which is the Menu -> administer page when logged in as the administrator, which displays data from the watchdog table.

This seemed odd - for a site with fewer than 200 nodes and very low traffic, there should be no performance issue, and no single database query should be taking as long as 6000 milliseconds.

So the options were to increase the time limit for queries, or to spend the time debugging the problem.

Drupal is very feature rich, and this may have negative impact on performance, but in this case, it turned out to be a database issue. The Drupal site has many performance related tips, including a subsection on Tuning MySQL for Drupal.

After looking around in the database for the site, it was discovered that the overhead for the watchdog table was over 40 times the size of its actual data: the table occupied 176 MiB, of which 172 MiB was overhead, leaving only about 4 MiB of live data. Running optimize on this table got the size down to under 4 MiB and the overhead to 0, and made the queries much faster - way below the 6000 millisecond time limit. The administer and log display pages now render much faster as well.

One question remains - why did removing overhead fix the query times?

strftime in Python

Time has never been easy to work with, and while it is best to use UTC to store time, it is necessary to use local time zones when displaying time to the user.

With regard to displaying local timezones using strftime and the format specifications %Z for the timezone name and %z for the RFC 2822 conformant [+-]hhmm display, Perl and PHP work fine, at least on Fedora FC5. Python 2.4 does not yet have support for %z, and the best support in Python for timezones is in the basic time module; the enhanced datetime module has no built-in support for time zones.

That takes care of strftime, what about strptime? Unfortunately, that is a topic that is even more convoluted, so try all variations out before you use it on any system.

Here's a strftime support summary for Python date/time users:

  • Use the datetime module if you don't need any timezone handling. The standard library comes with no support for any timezones, not even local timezones, so the datetime module is useless for those who need time zone information. It can be done of course, but requires coding up your own timezone routines.
  • Use the time module if you can live with just %Z and don't need %z. The Python 2.4 time module always prints the wrong value +0000 for %z, even while it gets %Z correct.
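  • Need %z, the RFC 2822 conformant time display? This requires writing your own code, since Python does not support it. Using the basic time module, here's example code on how to get these values:
    import time

    # t is the time to display, in seconds since the epoch (for example, time.time())
    lt = time.localtime(t)
    if lt.tm_isdst > 0 and time.daylight:
        # daylight saving time is in effect: use the DST name and offset
        tz = time.tzname[1]
        utc_offset_minutes = - int(time.altzone/60)
    else:
        tz = time.tzname[0]
        utc_offset_minutes = - int(time.timezone/60)
    # hours come from the float division (truncated by %d), minutes from the modulo
    utc_offset_str = "%+03d%02d" % (utc_offset_minutes/60.0, utc_offset_minutes % 60)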

    Note: in the utc_offset_str computation, the 60.0 float in the / operation is necessary so that the hours value is truncated toward 0 by the %d format, instead of being floored toward negative infinity by integer division. For example, a -90 minute offset should be -0130 and not -0230.
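
As a variation on the snippet above, the sign can be handled separately from the digits, which avoids depending on how negative numbers are rounded. This is a minimal sketch, not part of the original code: the function names format_utc_offset and local_utc_offset_minutes are made up for illustration.

```python
import time


def format_utc_offset(offset_minutes):
    """Format a UTC offset given in minutes as [+-]hhmm."""
    sign = "-" if offset_minutes < 0 else "+"
    # Work on the absolute value so hours and minutes are never negative.
    hours, minutes = divmod(abs(offset_minutes), 60)
    return "%s%02d%02d" % (sign, hours, minutes)


def local_utc_offset_minutes(t=None):
    """Return the local UTC offset in minutes at time t (default: now)."""
    lt = time.localtime(t)
    if lt.tm_isdst > 0 and time.daylight:
        return -(time.altzone // 60)
    return -(time.timezone // 60)


if __name__ == "__main__":
    print(format_utc_offset(-90))
    print(format_utc_offset(local_utc_offset_minutes()))
```

Because the sign is taken from the original value, an offset such as -30 minutes is formatted as -0030 rather than losing its sign.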

Here's the sample code and the output from Perl, PHP, and Python, for two example strftime calls:

Updating Drupal

Updating Drupal using the standard instructions is very time consuming - you have to turn off modules/themes, update settings, and reinstall modules/themes.

But many users find that for updates within the same major release, for example the 4.7.x series, a simpler upgrade can work. The official Drupal install instructions do not allow this, and the Drupal folder structure does not help either: user-installed modules/themes live in the same folder as the Drupal files (it would be good to have these separate), and there is no easy way to try out a new release before switching to it on a running site.

Given all that, here's how an update can be done fast - it is very important to read the UPGRADE.txt Drupal document first, do all backups, and be ready to restore quickly if things don't work.
No guarantees on this method, but it has been known to work.

Assume that the old Drupal install is in current/ and the new one is in new/

  1. Extract the new Drupal release into the new/ directory.
  2. Copy all files and directories that exist only in your current install over to this new/ directory. See the script "update.sh" below, which does this.
  3. Update the config files - sites/default
  4. Log in to the current Drupal as admin
  5. Backup: rename current/ to current.backup/
  6. Rename the new/ dir to current/
  7. Run current/update.php from Drupal and, on success, log off the admin user and test it out.

And here's the update.sh script - edit the $OLD variable, run this from inside the new/ directory, and redirect the output to update.run. Take a look at the commands in update.run, and then run them:

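#!/bin/sh
# Path to the old install, as seen from inside new/ - edit before running.
OLD="current/"

echo "# Current Directory: $(pwd)"

# Walk the old install and print a command for every path missing from new/.
# Reading line by line keeps file names with spaces intact.
(cd "$OLD" && find . -print) | while IFS= read -r i
do
    if [ ! -e "$i" ]
    then
        if [ -d "$OLD/$i" ]
        then
            echo "mkdir -p -v \"$i\""
        fi
        if [ -f "$OLD/$i" ]
        then
            # -n (no-clobber) replaces -i --reply=no, which newer GNU cp rejects
            echo "cp -p -v -n \"$OLD/$i\" \"$(dirname "$i")\""
        fi
    fi
done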
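
The same dry-run idea can be sketched in Python for those who prefer to inspect the plan as data before copying anything. This is an illustrative sketch only: the function name plan_update is hypothetical, and it only reports what is missing from the new tree without creating or copying anything.

```python
import os


def plan_update(old_root, new_root):
    """Return (dirs_to_create, files_to_copy), relative to old_root.

    Only paths that exist under old_root but not under new_root are listed.
    """
    dirs, files = [], []
    for current, subdirs, filenames in os.walk(old_root):
        rel = os.path.relpath(current, old_root)
        for name in subdirs:
            path = os.path.normpath(os.path.join(rel, name))
            if not os.path.exists(os.path.join(new_root, path)):
                dirs.append(path)
        for name in filenames:
            path = os.path.normpath(os.path.join(rel, name))
            if not os.path.exists(os.path.join(new_root, path)):
                files.append(path)
    return sorted(dirs), sorted(files)


if __name__ == "__main__":
    # Assumes the layout described above, with current/ and new/ side by side.
    missing_dirs, missing_files = plan_update("current", "new")
    for d in missing_dirs:
        print("mkdir", d)
    for f in missing_files:
        print("copy ", f)
```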

Web provider changes umask, Gallery stops working

So, I maintain a few web sites and one site uses Gallery software. This worked fine until recently - when a user tried to create a new album, it failed with an error about being unable to create lock files [ Error: Could not open lock file (/..../public_html/albums/album01/photos.dat.lock) for writing! ]

That was a somewhat misleading error, but - in the end, this turned out to be a hosting provider issue, and not a Gallery issue.

Turns out the hosting provider advertently or inadvertently set the default umask for processes under Apache to 0111. This umask removes all execute permissions from new files and directories created by scripts run under Apache.

Gallery keeps the default umask, so it inherited the 0111 umask, and when it tried to create a directory with permissions 0700, it in fact got a directory with a permission of 0600 - read, write, but no execute. Of course, without execute permission, a directory is not of much use - you cannot move into that directory or create files in it, so basically things will start erroring out from that point on. Software could be written to handle this - maybe always do a chmod after a mkdir? But that is a different discussion.

It did not take too long to find this out, but getting this resolved at the hosting provider took a while - explaining umask, mkdir, and directory behavior. I guess that is the first reaction of technical support - they must get so many false reports that when a real problem comes up, they have to take some time! [Though I am happy with the provider - they were at least engaged and responded fast with questions, and in the end, they resolved this pretty quickly.] Add to this, all morning today the Gallery site was inaccessible - so I could not search the forums for this issue. In any case, this was a new problem, not previously posted on the Web, nor mentioned at the Gallery site.
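
The permission arithmetic above, and the "chmod after mkdir" idea, can be illustrated with a short Python sketch. Gallery itself is written in PHP, so this is only an illustration of the mechanism; the function names effective_mode and mkdir_exact are made up here.

```python
import os


def effective_mode(requested, umask):
    """Mode a new file or directory actually gets: requested bits minus umask bits."""
    return requested & ~umask


def mkdir_exact(path, mode=0o700):
    """Create a directory and then force its mode, regardless of the umask."""
    os.mkdir(path, mode)   # subject to the process umask
    os.chmod(path, mode)   # chmod is not affected by the umask


if __name__ == "__main__":
    # 0700 requested under umask 0111 loses the execute bit.
    print(oct(effective_mode(0o700, 0o111)))
```

With a umask of 0111, effective_mode(0o700, 0o111) gives 0o600, which matches the directory permissions described above.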

GoDaddy heavy handed in shutting down domains

nodaddy.com tells a scary story of how Go Daddy went in and disabled a site, for what seems to be totally unjustified reasons, and totally insufficient attempts made to

What is worse is that Go Daddy continues to insist they did the right thing, compounding the significance of this issue.

My domains are registered with Go Daddy, and I was hoping for a better response from them - maybe say it was a mistake and that they now have a new process in place to handle this, but it looks like that is not going to happen.

Need to think hard about whether this is a good domain registrar - though it looks like it is not that easy to find any registrar that is good in this respect, at least in the US.

PHP regular expression issue

Fun with perl vs php regular expression handling.

PHP has a regular expression search and replace function, preg_replace. It is supposed to use the standard greedy matching for patterns: for example, A* means zero or more A characters, and it matches the longest string of A's at that point.

So, if you want to match zero or more slash (/) characters at end of a string the pattern to use is: /*$

And, if we want to replace zero or more slash characters at end of a string with a single slash character, the perl code is:   s!/*$!/!

Full example in perl:

foreach ('path/to', 'path/to//', 'path/to///', 'path/to////') {
    my $s = $_;
    $s =~ s!/*$!/!;
    print "string: $_ changed to $s\n";
}

Output is: 
string: path/to changed to path/to/
string: path/to// changed to path/to/
string: path/to/// changed to path/to/
string: path/to//// changed to path/to/

And the above output is all correct.

Equivalent PHP code does not work, using PHP version 5.1.6.

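The following PHP code fails:  $t = preg_replace('!/*$!', '/', $s);  It ends up with one / if the input has 0 or 1 / characters, but with two / characters, instead of a single /, if the input has 2 or more / characters. So it looks as if the match was not greedy but, for some reason, was split into two matches, each of which was replaced with a single / character. A more likely explanation is that preg_replace replaces every match by default, as s!/*$!/!g would in Perl: after the greedy match consumes the trailing slashes, an empty match remains at the very end of the string, and it is replaced too.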
The workaround for this is to specify a limit count of 1 to preg_replace, which makes the code work fine. preg_match seems to work fine, only preg_replace has this problem.
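
Python's re module can be used to reproduce the effect outside PHP. This is only an analogy, not PHP code, and it assumes Python 3.7 or later, where re.sub also replaces an empty match that directly follows a non-empty one. The function names are made up for illustration.

```python
import re

TRAILING_SLASHES = re.compile(r"/*$")


def replace_all(path):
    """Replace every match, like a replace with no limit."""
    return TRAILING_SLASHES.sub("/", path)


def replace_first(path):
    """Replace only the first match, like passing a limit of 1."""
    return TRAILING_SLASHES.sub("/", path, count=1)


def normalize(path):
    """Regex-free alternative: strip trailing slashes, then add exactly one."""
    return path.rstrip("/") + "/"


if __name__ == "__main__":
    for s in ("path/to", "path/to//", "path/to///", "path/to////"):
        print(s, replace_all(s), replace_first(s), normalize(s))
```

Under the stated assumption, replace_all leaves two slashes on inputs that already end in slashes, while replace_first and normalize always end with a single slash.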

Here's the PHP code, and its output, showing the failure:

<?php
foreach (array('path/to', 'path/to//', 'path/to///', 'path/to////') as $s) {
     # preg_match('!/*$!', $s, $matches); // works fine, is greedy - $matches[0] is zero or all slashes
     # $t = preg_replace('!/*$!', '/', $s, 1); // works, limit of 1 helps

Video recording, editing, creating DVD

Nothing like fiddling with MPEG packets on a rainy Saturday afternoon!

This post will be periodically updated, until a reasonably easy, scripted list of steps is documented, on how to make a DVD out of video recorded on a Linux system.

TV Receiver and MPEG2 Encoder: Hauppauge WinTV-PVR-150 (MCE Edition)
This is supposed to be for Windows Media Center Edition only, so it will not install on any other Windows operating system, but it works fine on a Linux computer! There is something amazing about that sentence - will not work on Windows, works on Linux! How far has Linux come...

Software: dvdauthor, avidemux2, mkisofs, growisofs, ivtv-drivers, xine, etc
And running on a Fedora FC5 Linux system.

The goals of the steps are to use scripts to save MPEG encoded video, and then perform simple editing - cut out portions not needed, and create a simple DVD structure. Avoid transcoding of video - sure, it is technically possible to get lower bit rates from higher bit rate video, but the quality reduction using transcoding is pretty drastic (possibly because it is very complex with many possible ways to do this), so best to capture at rates desired, and make sure no intermediate step involves transcoding.

Procedure:

  1. Record video as needed, using the script shown elsewhere here - copy video for a given duration. In the scheduled command, use ivtvctl commands to set the bitrate and tune to the correct channel. Choose a DVD-compatible bitrate; for example, I use 6 Mb/s CBR for capturing NTSC Standard-Definition video.
  2. Load up the clip in avidemux2, and cut out all ads or portions that are not needed. The avidemux2 pages have good tips on how to make cuts that will allow the video frames to be just copied - place both the A and B markers on I-Frames of the MPEG stream, and then cut.
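
Since the bitrate chosen at capture time decides how much video fits on a disc, a rough estimate can be computed as below. This is a back-of-the-envelope sketch under stated assumptions: a nominal single-layer DVD capacity of 4.7e9 bytes, no allowance for multiplexing or filesystem overhead, and an audio bitrate parameter that is purely illustrative. The function name minutes_per_disc is made up.

```python
# Assumption: nominal single-layer DVD capacity in bytes (decimal gigabytes).
DVD_CAPACITY_BYTES = 4.7e9


def minutes_per_disc(video_mbps, audio_kbps=0.0, capacity_bytes=DVD_CAPACITY_BYTES):
    """Approximate recording time in minutes at a constant bitrate."""
    total_bits_per_second = video_mbps * 1e6 + audio_kbps * 1e3
    seconds = capacity_bytes * 8 / total_bits_per_second
    return seconds / 60


if __name__ == "__main__":
    # 6 Mb/s CBR video, as used in step 1; audio is ignored here.
    print("%.0f minutes" % minutes_per_disc(6))
```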