Akismet to the Rescue

Any web site that’s been online for more than a few seconds seems to attract spammers, especially if it supports comment posting or other user-generated content. The magnitude of the problem seems to be higher when the site is using a well-known software package such as WordPress. My blog is no exception and the problem has become steadily worse as the amount of content on my site has increased.


Netbook Aweigh

Have you ever wanted to bugger off somewhere semi-remote to just get away from people? I know I have. However, my idea of enjoying such a location does not involve sitting around staring at the scenery or slogging through random tracts of back country bush for hours on end. Rather, I want to enjoy the serenity of the location but actually do something at the same time.

Since one of my hobbies is writing, I decided that it would be a great way to “decompress” if I could head off to a scenic locale and spend a few hours writing stories or what have you. So why don’t I? Well, my handwriting is barely legible at best and there is little point writing something if nobody can read it later. And that doesn’t even account for hand cramps and so on.

I could take a manual typewriter or something similar along and use that, but those things are clunky, slow, and take a lot of effort to operate. For someone used to computer keyboards, that is a deal breaker. And, if cramping from writing long hand is a problem, a manual typewriter is no better.

That leaves portable electronics. The problem there is that most portable devices with decent battery life are too small to be useful for typing, while the ones well suited for typing tend to have battery life measured in the low single digits of hours. That means I must compromise.

Fortunately, such a compromise is available. A reasonable netbook does not cost a fortune. It has a keyboard that is large enough to actually use, if not as efficiently as a full keyboard. It has a screen that is large enough to read text. And a decent one tends to have a battery that lasts upwards of eight hours when power drains such as 802.11 are disabled. Even better, should it be necessary, a full sized keyboard can be attached by USB.

To this end, I just acquired an ASUS Eee PC 1005PE. While the screen is smaller than I’d like and the keyboard is also less than ideal, it is a workable compromise between a full sized notebook with maybe three hours battery life and a mobile device like an iPhone. With the correct software installed, it works a charm for what I plan to do with it.

It’s clear from my recent experiences with my new Eee PC that a netbook is not a substitute for a full sized notebook or desktop. However, I can now finally see why such devices sell. There is always a compromise and the netbook makes a different compromise to attain a reasonable combination of battery life, portability, and functionality.

ACLs, PHP, and suPHP

At my day job, I maintain shared web servers. One of the biggest problems for shared web servers is how to protect different users from each other. In particular, one wants to prevent one user’s files from being read by another user on the same server. This turns out to be a non-trivial problem on a standard Linux server using traditional tools.

MongoDB

As some of you may be aware, I’ve been working on a scheme for cataloguing my book collection for as long as I’ve known what a book catalogue is, which is somewhere on the order of a quarter century now. I’ve started numerous times and even managed to create something useful in some instances. However, my efforts have always been somewhat lacking for various reasons.

In the early days, the lack of capability of the computer systems I had at my disposal was a severe limiting factor. I employed some amusing tricks to get things working in a manner that would not limit me too much. Some of the tricks I employed were less than amusing.

Eventually, I discovered relational databases and thought that they might be the solution to the data storage problem. However, as I tinkered with them, I realized that my goals were fundamentally incompatible with the relational paradigm. Oh, it isn’t that the data cannot be expressed relationally so much as that the data doesn’t naturally fit into the rigid schema which relational databases usually require. While a non-schematic data structure can be implemented in a relational database, the benefits of using such a database pretty much evaporate when one does so.

Finally, I discovered the NoSQL “movement” which came into its own in 2009. I spent some time reading about the various NoSQL schemes out there and stumbled across MongoDB which seems to fit the needs of my book catalogue perfectly. MongoDB does not enforce any sort of schema on objects stored in the database and it uses a relatively simple variation on JSON to store data. This makes it trivial to store random extra information about a book (or an author if need be) without having to resort to the somewhat complicated gymnastics a relational database requires.
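The appeal is easy to see even without MongoDB itself. Here is a minimal Python sketch of the same idea (all titles and field names are invented): two documents in one collection carry completely different fields, and a simple equality query still works across both.

```python
# A "books" collection as schema-free documents, MongoDB-style.
# Every title and field name below is made up for illustration.
books = [
    {"title": "The Hobbit", "author": "J. R. R. Tolkien", "year": 1937},
    {"title": "Collected Ghost Stories", "editor": "Someone Else",
     "contents": ["Story One", "Story Two"], "signed": True},
]

def find(collection, query):
    # Rough stand-in for a MongoDB equality query: a document matches
    # if every key in the query exists with the given value; documents
    # that simply lack a field are skipped, not treated as errors.
    return [doc for doc in collection
            if all(doc.get(k) == v for k, v in query.items())]

print(find(books, {"year": 1937})[0]["title"])   # The Hobbit
```

With the real thing, dictionaries shaped like these go straight into pymongo’s insert_one, and the query dictionary into find, with no schema declared anywhere.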

I will concede that the MongoDB scheme is noticeably slower for many operations. However, most of the operations that are slow are either a result of my own relative ineptitude with MongoDB or they are operations that are not performed regularly. Even so, the flexibility offered by MongoDB is well worth a speed trade off. And we’re talking about going from no noticeable time taken to perform a task to maybe a few seconds. This is hardly a great hardship.

So, those of you out there looking to solve problems which suffer from data that is not strictly regular are strongly encouraged to look into MongoDB and its cousins.

mod_rewrite-fu

Over the past few years, I’ve become increasingly impressed by mod_rewrite. It allows one to do extraordinarily complex things with Apache. Today, I bashed up against a problem which turned out to have a fairly elegant solution, though it looked like a lost cause at first: implementing a very large batch of redirections without resorting to hundreds of RedirectMatch directives. This is something that ought to be trivial but turns out to be slightly less than obvious.

First, I figured I’d use a text map to hold the redirection list. So I added the following to the VirtualHost configuration:

RewriteMap redirs txt:/path/to/redirs/file

Then I added the necessary stuff in .htaccess:

RewriteEngine On
RewriteBase /
RewriteRule ^(.*)$ ${redirs:$1|$1} [L]

That turned out not to work. The keys in my text map had leading / characters, and I also realized the redirection URLs were already URL encoded, which calls for the NE (no escape) flag so mod_rewrite doesn’t encode them a second time. So I needed to do the following instead:

RewriteEngine On
RewriteBase /
RewriteRule ^(.*)$ ${redirs:/$1|$1} [L,NE]

It also occurred to me that I should probably be somewhat polite and check if there is a redirection before doing it, so I updated the .htaccess as follows:

RewriteEngine On
RewriteBase /
RewriteCond ${redirs:%{REQUEST_URI}} !=""
RewriteRule ^(.*)$ ${redirs:/$1} [L,NE]

That turns out to be approximately what every other example of this particular gimmick looks like. And it mostly works. Except if the URL to be redirected contains spaces. The text map format separates keys from values with whitespace, so a key containing a literal space can never match a lookup.

I spent the better part of an hour muddling around in documentation and failing at my google-fu before I twigged to something I had read several times in the mod_rewrite documentation without its utility ever sinking in. I thought, maybe I can apply a map to the lookup key in the redirs map. And mod_rewrite provides a handy little escape function. A little tinkering yielded the following in the VirtualHost configuration:

RewriteMap redirs txt:/path/to/redirs/file
RewriteMap encode int:escape

and in the .htaccess file:

RewriteEngine On
RewriteBase /
RewriteCond ${redirs:${encode:%{REQUEST_URI}}} !=""
RewriteRule ^(.*)$ ${redirs:/${encode:$1}} [L,NE]

Using the final result above, one need only ensure that spaces in the names to be redirected are encoded as %20 in the redirs file. So far, it is working perfectly and I only need to update the redirs file to change the list of redirections.
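For completeness, the redirs file itself is nothing exotic: a plain text file of whitespace-separated key/value pairs, one pair per line, with # starting a comment. A hypothetical example consistent with the rules above (all paths invented):

```
# key: request path with leading / and spaces stored as %20,
# matching the ${redirs:/${encode:$1}} lookup
# value: the target, already URL encoded (hence the NE flag)
/old-page           /moved/new-page
/my%20old%20page    /moved/new%20page
```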

GNU Build System

I just had my first real experience with the GNU build system, otherwise known as automake and autoconf.

I have been working on an operating system for the Coco (an old 8-bit machine). This led me to create a cross-assembler to do the work. This is something like re-inventing the wheel, but I figured I would learn something by doing so and then I would only have me to bitch out if the assembler failed to work. Amazingly enough, I actually got the assembler to the point where it is basically usable by me, so I thought, why not release it to the world?

With the concept of releasing it came a problem. I need to build a package (tarball) to distribute it in. I also need something relatively familiar for people to build it with. And I don’t want to do a lot of work later if portability issues arise. Since I rather like the way GNU software builds, I settled on using autoconf and automake.

Initially, I’m just using it as a packaging scheme and a build system so that I don’t have to manually maintain Makefiles. And, amazingly enough, it’s working very well for that purpose. And now that I’ve got the package “autoconfiscated”, it will be easier later to add various bits and bobs to the build system for portability, among other things.
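For anyone curious what “autoconfiscating” a small C project looks like, it can come down to two short files. This is only a generic sketch with invented file and source names, not the actual build files from my assembler:

```
# configure.ac
AC_INIT([myasm], [1.0])
AM_INIT_AUTOMAKE([foreign])
AC_PROG_CC
AC_CONFIG_FILES([Makefile])
AC_OUTPUT

# Makefile.am
bin_PROGRAMS = myasm
myasm_SOURCES = main.c symtab.c output.c
```

From there, running autoreconf --install generates the familiar configure script and make dist rolls the distributable tarball.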

Overall, I’m quite pleased with my first foray into a real autoconf/automake package. It only took me about four hours total to take the code I wanted to release as version 1.0 and transform it into an autoconf/automake style source tree. And that time included some futzing around importing the code into a subversion repository and attaching the GNU General Public License to it.

For interested parties, you can find information in my Coco section or download it from my downloads section.

Update 2014-03-21:

A while back, I actually discussed ultimately removing autoconf and automake from lwtools. Anyone looking to add autoconf and automake to their project should read that post. It may change your mind.

Coco Project

I started a while back writing an assembler for 6809 and 6309 CPUs. While there are lots of decent assemblers for those CPUs available, none of them behave quite the way I want them to. The ones that handle forward references in a useful manner don’t support macros. The ones that support macros and forward references target a specific operating system. Many suffer phasing errors when forward references are present which is, to say the least, annoying. The one I had been using seems to handle most of the situations well enough but it doesn’t have macros and it is not open source so I can’t add them.

The upshot of all of that is that I decided to write my own assembler that will support the features I want. Today, I have it assembling a real source file from one of my projects almost correctly, with only two bytes wrong (on indexed instructions) which will, of course, require debugging. But the rest of the code is completely correct.

There’s still a long way to go to get it to be truly useful, including adding a LOADM file format for the output and implementing all the pseudo operations correctly. But, hey, it produces semi-pretty program listings.

The State of Printing in Linux

I just bought a new printer today because my old one has some sort of electronics fault (it will only print one or two pages after a complete cold reset and reinstall of the printer on the host computer). I picked an HP Color LaserJet CP1518ni because the box said it supports PCL and PostScript, which pretty much guaranteed that it would work on my Linux box. It also connects via the network or by USB. And it happened to be on sale for about 25% off.

So I got the thing home, scanned the "read this first" documents, removed all the extraneous shipping material, and connected it up to my computer. At this point, I expected a manual step or ten would be required to get the printer working.

Now, I should state at this point that I am running Ubuntu 8.04 on this particular computer. Thus, the installation instructions for Windows or Mac are useless. I only needed the printed instructions to know how to physically set the printer up. I fully expected some trials and tribulations to get things working.

Okay, so I powered up my monitor and the printer. While the printer was doing some interminable initialization task, I noticed a thing pop up on my computer screen saying it had detected the printer and installed a driver for it. The driver was for a different model but I thought, hey, it may well be correct. Once the printer finished its initializing and calibrating, I brought up the printer management system and printed a test page. And, wonder of wonders, it printed.

There was only one minor thing I needed to fix. The printer configuration defaulted to A4 paper which nobody uses in North America. I hear tell that most of the rest of the world does but that makes no difference to me since the readily available paper here is Letter size. Well, I tooled through the settings in the printer configuration tool and found the setting. I changed it, applied it, and printed another test page. And it worked.

So, the entire process of installing the printer was basically painless and involved no command line magic or Mad Skillz of any kind. The only problem was that, even though my time zone is North American, the paper size defaulted to the wrong thing. Still, even with A4 paper selected, the printer would be usable for the most part.

Granted, a different printer might not have worked so well. Still, the fact that any printer did is indicative of great progress in usability for Linux. I expect I would have had to jump through a few hoops with the CD that came with the printer for Windows or Mac, yet my Linux installation had a driver already and did almost all of the setup for me. Just a year or so ago, this would not have happened.

So, kudos to the Ubuntu team and all the other developers who contribute to the software that makes up Ubuntu. Keep up the good work and pretty soon we’ll have all the features of Windows and then some. And it will be implemented better.

Compiling vs Interpreting

I’ve recently started work on a program to catalogue my books and book-like items. After a large amount of thought, I settled on a GUI application even though I knew nothing at all about writing a GUI application. Once I settled on that, I did some research into programming languages and came up with PHP5 using php-gtk2.

Now programmer types might be thinking that a scripting language is a bad choice for a GUI application. I would have said the same thing. After some experimentation, I have come to the conclusion that PHP was the right choice for a couple of reasons. One is that I know the language so I do not have to put a lot of extra effort into learning it. Another major reason is that it is a high level language which will not require that I do a lot of fiddly coding under the hood just to glue everything together with string handling and so on. Having done a lot of that sort of thing with C and C++ in the past, I have no desire to do it for anything this substantial.

Once I looked at the GUI toolkits, I realized that making a GUI is about the same in any language once you get past the various incantations each language requires to instantiate a widget. Likewise between the various toolkits. Since GTK has a binding for PHP called php-gtk, I decided to use that one.

It was a daunting task initially. There are thousands of pages of documentation on programming GTK. So where did I start? Eventually, I found an entry point and got a few simple test programs running, and I’m well on my way to actually accomplishing something.

What I have learned in the past few weeks is that it doesn’t really matter that the program is interpreted. It only has to run as fast as I operate, which is substantially slower than the work horse on my table. Responsiveness is sufficient in the interpreted state. Also, when you get down to it, the intermediate parsing step happens once during application startup, and we’re used to applications taking a few seconds to start, so what does that matter?

Still, it would be cool to have my program not require a lot of external things like the PHP interpreter to run. Or, for that matter, a bunch of bash-fu to make sure things are kosher before launching the PHP code. This is where Roadsend comes in. They produce a free PHP compiler whose PHP5 support has just entered testing, and they are working on php-gtk2 support. This means I will be able to compile my program into a native application and not have to worry about wonky PHP settings in the future. There will be no dependencies on the Zend PHP implementation.

This does mean, however, that I need to stick to a relatively small subset of PHP extensions. This is not likely going to be much of a restriction anyway since most PHP extensions are somewhat special purpose. As long as I have the core of the language and the standard library, I should be fine.

There is another consideration, though. The way an interpreted project is constructed is often somewhat different to the way a compiled project is constructed. This means that there may be some code structure differences. Still, that’s relatively minor.

Even with the option of a compiled version, I think I’ll stick to the Zend PHP interpreter until I get the project actually working. Then I can monkey with things and see if the Roadsend compiler is the right choice. They do say that premature optimization is the root of all evil. That may be a bit strong but the basic point does hold.

Hardware Fun

I had the opportunity today to play with a dual Xeon 2.6GHz server with 12GB of RAM and a 3ware RAID controller. Now that’s a cool bit of gear. The only hitch was that I had to break into it physically to do so. I had to remove screws that were designed to be unremovable to get inside the system, and then I had to reset the CMOS data without the benefit of a jumper. Of course, none of that is particularly hard since removing the battery will reset the CMOS data and there are plenty of tools to operate on damaged screws. It just took a bit of doing.

Of course, it’s much easier to operate on hardware that hasn’t been locked down for whatever reason, but where’s the fun in that?

Before anyone asks, there was nothing illegal about breaking into this particular server. It was merely a prerequisite for installing a new operating system on it.