Blogs

Cheney and Bush - Guilty

What everyone already knew is now official - Cheney and Bush, along with help from John Yoo @ dick-i-pedia and others have turned the US Constitution on its head, and have dragged the US into the company of tyrannical regimes.

Cartoons say it well:
World is watching you
Cheney is the mastermind, Bush was "just following orders"

Articles:
The Banality of Bush White House Evil states the core truth - "Torture was a tool in the campaign to exploit 9/11 so that fearful Americans would support a war that had nothing to do with Al Qaeda."
Torture was useless - Ali Soufan, a former F.B.I. agent who questioned Abu Zubaydah in 2002, points out that the terrorist operative provided important intelligence under traditional interrogation methods. This is a point on which the Cheney strategy right now is to raise a lot of confusion.

It is sad, shameful, and even pitiable to see some radio and TV talking heads try to defend this Cheney/Bush policy - torture is something the US should never do, and it is never necessary. It was downright cowardly of Cheney (who took five draft deferments himself) and Rumsfeld types to allow the rank-and-file in the army to stand trial and lose jobs over the Guantanamo and Abu Ghraib incidents when Cheney created such policies in the first place. The army did what it was told - the policy was set by the White House itself, and the White House should be the one condemned, not the people in the army.

Drupal is a lot of trouble

This site uses Drupal. Drupal has turned into a nightmare. It was fine when there was a single 4.x version out there, but soon after 4.x, there was 5.x. Then 6.x. Upgrading from an older version is near impossible.

There always was the assumption that some amount of coding would be required by anyone running a Drupal site. But be prepared - you will be hacking modules left and right to get anything running. At this time, one has to question whether the amount of hacking required to get things to run is worth it. Maybe all CMSes have this problem, but Drupal certainly is a poster child for impossible-to-ever-upgrade software.

The problem occurs because Drupal changes the API every release, adds new incompatible features, and modules and themes become unusable. And since modules and themes are merely someone's weekend project, it can be months or years before a module becomes compatible with the newer Drupal version.
Drupal core does not have image handling or spam fighting capabilities, so even a basic site will need to use external modules. Add things like forums, automatic aliases, and FAQs, and the site depends on a large collection of non-core modules.

The advantage of Drupal is that it is extensively customizable, and has a wide range of modules. This is exactly the same thing that makes a Drupal site near-impossible to upgrade. Once a site is up and starts to depend on a bunch of modules, rest assured that when a new Drupal version comes out quite a few required modules will not make it to that new version!

Drupal core does get upgraded without problems. But Drupal itself has become super-bloated. Web hosts that worked fine with Drupal 4.7 will not support Drupal 6.x because of heavily increased memory and CPU requirements.

Spam Email Counts

Is email on the way out? That is probably not yet an easy question, but the amount of spam seems to be holding steady, with periodic bursts of spam email storms.

Here are some graphs of spam at one of my mailboxes. This is for a very public email address. The spam detection is using spamassassin which runs under procmail with a customized whitelist and blacklist. Over the few years I've used this, there have been only 1-2 false positives for spam (of course, detection of false positives is not easy since this requires digging through 100s of spam messages, but I have no reason to believe that false positives are more prevalent). There have been quite a few false negatives - messages that are spam, but missed by spamassassin. These are usually around 1%-10% of the total detected spam messages, which is low enough that the graphs below are still useful to show the trend of spam message counts.
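The per-day counts behind graphs like these can be sketched in a few lines of Python. This is only an illustration, not the script used for this site: it assumes the mail has already passed through SpamAssassin, which adds an "X-Spam-Flag: YES" header to messages it classifies as spam, and the function name count_spam_by_day is made up for this example.

```python
import email
from collections import Counter
from email.utils import parsedate_to_datetime


def count_spam_by_day(messages):
    """Return a Counter mapping YYYY-MM-DD to the number of flagged messages.

    Hypothetical helper: the day is taken from each message's own Date
    header, in whatever time zone that header states.
    """
    counts = Counter()
    for msg in messages:
        # SpamAssassin marks detected spam with "X-Spam-Flag: YES".
        if msg.get("X-Spam-Flag", "").strip().upper() != "YES":
            continue
        date_header = msg.get("Date")
        if not date_header:
            continue
        try:
            day = parsedate_to_datetime(date_header).date().isoformat()
        except (TypeError, ValueError):
            continue  # skip unparseable Date headers, which are common in spam
        counts[day] += 1
    return counts


if __name__ == "__main__":
    sample = email.message_from_string(
        "Date: Mon, 06 Sep 2010 10:00:00 +0000\nX-Spam-Flag: YES\n\nbody\n")
    print(count_spam_by_day([sample]))
```

The same function accepts the messages of a standard-library mailbox.mbox object, so it could be pointed at whatever folder procmail files the spam into. False negatives are not counted, since they never receive the flag.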

[Image: 2010 Spam Counts]
The Spam Counts images are updated periodically, usually every day, to include data of the previous complete 24-hour period.
[This image is no longer updated - the last counts are for September 2010. I no longer pull all email; there are only 1-3 non-spam emails and 80-100 spam messages per day. I will move all email use to one of the publicly available web sites, have started using text chat more, and may move to voice chat in future too. Email is no longer very useful for home use.]

Fedora Core 7 Install Notes

Notes on things to look out for when working with Fedora Core installs.

perl CGI scripts hanging
The current Fedora Core 7 perl package is perl-5.8.8-27 [Jan 2008]. This perl package contains version 3.15 of CGI.pm, which has a problem handling POST_MAX. Any HTTP POST larger than the POST_MAX value will cause the CGI script to peg the CPU at the $q = new CGI statement and not terminate for a very long time. After the CGI script is automatically terminated, an HTTP 500 Internal Server Error status code is returned to the requester.

This problem exists in all distributions using Perl 5.8.8, and is not limited to Fedora Core 7.

This issue is very hard to track down. Most of the time, the CGI scripts will work fine. When they do fail, there will invariably be nothing more than a single line in the Web server log stating that the job took around 300 seconds and that an HTTP 500 Internal Server Error message was sent back. The failures occur at all hours, with no pattern to their timing. If you do manage to see this rare failure happening, the CPU will show that all free CPU time is being used by the CGI script - but the script has no logging or database activity, which indicates that it locked up very early. That will eventually lead to looking at the $q = new CGI statement, and to the fix.

The fix is simple - upgrade to CGI.pm version 3.21 or later. Making this stick requires some additional configuration, because while CGI.pm can be updated on its own, it is also bundled with perl. To avoid a future yum update restoring CGI.pm to version 3.15 when perl 5.8.8 is updated, either prevent perl from being updated, or keep two copies of CGI.pm around and change the scripts to load the right one.
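Since a later package update can silently bring back the old module, it may be worth checking the installed CGI.pm version from a cron job. Here is a rough sketch in Python, assuming perl is on the PATH and Python 3.7 or later is available; the function names are made up for this example, and 3.21 is simply the version named above.

```python
import subprocess
from decimal import Decimal

FIXED_VERSION = Decimal("3.21")  # first CGI.pm version named above as safe


def cgi_pm_needs_upgrade(version_text):
    """Return True if the given CGI.pm version is older than FIXED_VERSION.

    Perl module versions are decimal numbers; an underscore (as in
    3.16_01) marks a developer release and is ignored when comparing.
    """
    return Decimal(version_text.strip().replace("_", "")) < FIXED_VERSION


def installed_cgi_pm_version(perl="perl"):
    """Ask the given perl binary which CGI.pm version it loads."""
    result = subprocess.run(
        [perl, "-MCGI", "-e", "print $CGI::VERSION"],
        capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    version = installed_cgi_pm_version()
    if cgi_pm_needs_upgrade(version):
        print("CGI.pm %s is older than %s" % (version, FIXED_VERSION))
```

If two copies of CGI.pm are kept around, the perl argument (or the PERL5LIB environment variable) should match what the CGI scripts themselves use, otherwise the check looks at the wrong copy.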

Public Libraries and Audio Book Downloads

Public Libraries in the US have now started offering audio book downloads. For example, in my local library, the books The Black Swan: The Impact of the Highly Improbable, The Dork of Cork, Candide, and many others are available for online borrowing as an MP3 download. There is a limit to the number of audio books that can be checked out and downloaded, and each book is licensed for playback for a certain number of days only.

Visit your local library web site to see if they offer NetLibrary downloads. Available at most Public Libraries in the US and UK, and many other countries too.

These specific audio books are Microsoft DRM protected, so no Apple iPod support.

Audio books are a great invention given the amount of time spent commuting stuck in a car, or waiting at bus stations, train stations or airports. MP3 player user interfaces have not caught up well enough with this use, though. While it is great that books can be played on extremely tiny flash MP3 players, these players don't yet offer good bookmarking capabilities; only a single pause/resume position is offered for all content on the MP3 device. Listening to books would be a much better experience with multiple bookmarks per book...

The Black Swan: The Impact of the Highly Improbable

The Black Swan: The Impact of the Highly Improbable
by Nassim Nicholas Taleb

Read the First Chapter at New York Times. Throw out the Gaussian! In with the Power Law??
Excerpt:
The central idea of this book concerns our blindness with respect to randomness, particularly the large deviations: Why do we, scientists or nonscientists, hotshots or regular Joes, tend to see the pennies instead of the dollars?

The book is written in a very confrontational style, as if the author were emailing or posting on Usenet: a very harsh tone at times towards economists and others, and a very professorial tone in laying traps that the author uses to berate reviewers for missing his clues! But taken with a hint of humor (which may have been the original intent) it is quite entertaining. Ignoring all such exclamations, and the times when the author seems to take things to the extreme to make a point (are people really that taken with the Gaussian?), the technical parts of the book are very illuminating, and the central premise of the existence of Black Swans is certainly important. The author certainly is an interesting character: in another article, the simple question of Street Charity is turned into a long-winded response bordering on the incomprehensible, in this Freakonomics - Street Charity Quorum!

The Dork of Cork

The Dork of Cork
by Chet Raymo

With humor, the story describes the life of Frank Bois, a dwarf who is obsessed with all things beautiful. Reviews and Comments available at the Amazon.com web site.

The book starts with ‘Begin with beauty’, and continues with tales of interesting personalities and their travails. The ending, though, is a major let-down, completely different from the flow created by the rest of the story.

Candide

Candide
by Voltaire

As I was listening to this book on my MP3 player, I was stunned at the story - it seemed unbelievable. Then I read the historical background to the story, and it started making sense and became a great, fantastic tale. A Google search yields many reviews, and many sites have the whole book online; here's one: Literature Network - Candide.

Excerpts:

...
Master Pangloss ... "It is demonstrable," said he, "that things cannot be otherwise than as they are; for as all things have been created for some end, they must necessarily be created for the best end.
...
"Now we are upon this subject," said Candide, "do you think that the earth was originally sea, as we read in that great book which belongs to the captain of the ship?"
"I believe nothing of it," replied Martin, "any more than I do of the many other chimeras which have been related to us for some time past."
"But then, to what end," said Candide, "was the world formed?"
"To make us mad," said Martin.
...

This book is also available in MP3 Audio Download format. Unfortunately, this is DRM protected and requires support for Microsoft DRM (so no Apple iPod support). Available at most Public Libraries in the US and UK. Visit your library web site to see if they offer NetLibrary downloads.

Drupal MySQL Performance Problem

A very small site started running into performance problems - some pages were taking too long to load, and certain MySQL queries were taking 5000 to 6000 milliseconds or more and being killed because of resource limits set on the hosting computer. The pages affected were the watchdog log display pages - one of which is the Menu -> administer page when logged in as the administrator, which displays data from the watchdog table.

This seemed odd - for a site with less than 200 nodes, and very low traffic, there should be no performance issue, and no single database query should be taking as long as 6000 milliseconds.

So the options were to increase the time limit for queries, or to spend the time debugging the problem.

Drupal is very feature rich, and this may have negative impact on performance, but in this case, it turned out to be a database issue. The Drupal site has many performance related tips, including a subsection on Tuning MySQL for Drupal.

After looking around in the database for the site, it was discovered that the overhead for the watchdog table was over 40 times the size of its actual data! The table size was 176MiB, and the overhead was 172MiB. Running optimize (MySQL's OPTIMIZE TABLE) on this table got the size down to under 4MiB and the overhead to 0, and made the queries much faster - way below the 6000 millisecond time limit. The administer and log display pages now render much faster than before.
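To spot this kind of bloat before it starts killing queries, the numbers reported by MySQL's SHOW TABLE STATUS can be checked with a small script. The sketch below is only an illustration: the function names and the threshold are made up, and it assumes (as the figures above suggest, and as is the case for MyISAM tables) that the reported size includes the overhead, so live data is the size minus Data_free.

```python
MIB = 1024 * 1024


def tables_worth_optimizing(status_rows, min_ratio=1.0):
    """Yield (name, overhead_ratio) for tables dominated by reclaimable space.

    status_rows: dicts carrying the Name, Data_length, Index_length and
    Data_free columns of SHOW TABLE STATUS (sizes in bytes).
    min_ratio: hypothetical threshold; 1.0 means overhead >= live data.
    """
    for row in status_rows:
        overhead = row["Data_free"] or 0
        total = (row["Data_length"] or 0) + (row["Index_length"] or 0)
        live = max(total - overhead, 1)  # guard against dividing by zero
        ratio = overhead / live
        if ratio >= min_ratio:
            yield row["Name"], ratio


def optimize_statement(table_name):
    """Build the OPTIMIZE TABLE statement for one table."""
    return "OPTIMIZE TABLE `%s`" % table_name.replace("`", "``")


if __name__ == "__main__":
    # Sizes from the post; the data/index split is invented for the example.
    rows = [
        {"Name": "watchdog", "Data_length": 176 * MIB,
         "Index_length": 0, "Data_free": 172 * MIB},
        {"Name": "node", "Data_length": 1 * MIB,
         "Index_length": 1 * MIB, "Data_free": 0},
    ]
    for name, ratio in tables_worth_optimizing(rows):
        print("%s: overhead is %.0fx live data" % (name, ratio))
        print(optimize_statement(name))
```

In practice the rows would come from running SHOW TABLE STATUS through whatever MySQL client library is at hand; that part is left out here because it depends on the hosting setup.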

One question remains - why did removing overhead fix the query times?

strftime in Python

Time has never been easy to work with, and while it is best to use UTC to store time, it is necessary to use local time zones when displaying time to the user.

With regard to displaying local timezones using strftime and the format specifications %Z for the timezone name and %z for the RFC-2822 conformant [+-]hhmm display, Perl and PHP work fine, at least on Fedora FC5. Python 2.4 does not yet have support for %z, and the best timezone support in Python comes from the basic time module; the enhanced datetime module has no built-in support for time zones.

That takes care of strftime, what about strptime? Unfortunately, that is a topic that is even more convoluted, so try all variations out before you use it on any system.

Here's a strftime support summary for Python date/time users:

  • Use the datetime module if you don't need any timezone handling. The standard library comes with no support for any timezones, not even local timezones, so the datetime module is useless for those who need time zone information. Timezone handling can be added, of course, but it requires coding up your own timezone routines.
  • Use the time module if you can live with just %Z and don't need %z. The Python 2.4 time module always prints the wrong value +0000 for %z, even while it gets %Z correct.
  • Need %z, the RFC-2822 conformant time display? This requires writing your own code, Python does not support this. Using the basic time module, here's example code on how to get these values:
    import time
    t = time.time()           # seconds since the epoch
    lt = time.localtime(t)
    if lt.tm_isdst > 0 and time.daylight:
        tz = time.tzname[1]
        utc_offset_minutes = - int(time.altzone/60)
    else:
        tz = time.tzname[0]
        utc_offset_minutes = - int(time.timezone/60)
    # format the sign separately so negative offsets come out right
    sign = utc_offset_minutes < 0 and "-" or "+"
    hours, minutes = divmod(abs(utc_offset_minutes), 60)
    utc_offset_str = "%s%02d%02d" % (sign, hours, minutes)

    Note: in the utc_offset_str computation, the sign is formatted separately from the hours and minutes so that negative offsets are not rounded toward negative infinity; for example, a -90 minute offset should be -0130 and not -0230.
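The edge cases are easier to see with the offset formatting pulled out into a function. This is a sketch for current Python 3 rather than 2.4, and the names format_utc_offset and local_utc_offset_minutes are made up for this example. As a cross-check, the standard library's email.utils.formatdate with localtime=True produces a complete RFC-2822 date whose last field is the same style of offset.

```python
import time
from email.utils import formatdate


def format_utc_offset(offset_minutes):
    """Render a UTC offset given in minutes as [+-]hhmm (hypothetical helper)."""
    sign = "-" if offset_minutes < 0 else "+"
    # Work on the magnitude so that hours and minutes never round
    # toward negative infinity.
    hours, minutes = divmod(abs(offset_minutes), 60)
    return "%s%02d%02d" % (sign, hours, minutes)


def local_utc_offset_minutes(t):
    """Offset of local time from UTC, in minutes, at epoch time t."""
    lt = time.localtime(t)
    if lt.tm_isdst > 0 and time.daylight:
        return -int(time.altzone / 60)
    return -int(time.timezone / 60)


if __name__ == "__main__":
    now = time.time()
    print(format_utc_offset(local_utc_offset_minutes(now)))
    # Standard-library cross-check: a full RFC-2822 date in local time.
    print(formatdate(now, localtime=True))
```

Offsets smaller than an hour are the case most likely to go wrong: -30 minutes has to come out as -0030, which only works if the sign is decided before the hours are computed.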

Here's the sample code and the output from Perl, PHP, and Python, for two example strftime calls: