I decided recently to completely rebuild the backend logic of my web site. Over the years, I’ve noticed a number of problems with managing my site. I’ve had a web site since long before dynamic content and PHP were cool. As a result, much of my content is a set of static pages. That’s all well and good, but it does create a number of problems.
The first major problem is what to do when reorganizing the site. Since I like to keep the old links functional with redirects, I end up creating a bunch of rewrite rules in Apache or placeholder PHP pages or what have you. That’s a real pain to do and easy to mess up. It would be much easier to maintain a single lookup table that is consulted for links that no longer exist, with an appropriate redirect generated whenever a match is found. That can be accomplished using mod_rewrite or with some sort of PHP script in a 404 handler.
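As a rough illustration of the lookup-table idea, here is a minimal sketch in Python (not the PHP or mod_rewrite setup mentioned above). The table contents, the `resolve_missing` name, and the choice of a 301 status are all assumptions made for the example.

```python
# Hypothetical lookup table mapping retired paths to their current locations.
REDIRECTS = {
    "/old/about.html": "/about/",
    "/photos.html": "/album/",
}


def resolve_missing(path, redirects=REDIRECTS):
    """Return (status, headers) for a path that nothing else handled."""
    target = redirects.get(path)
    if target is None:
        # No entry in the table: this really is a dead link.
        return 404, {}
    # 301 tells browsers and search engines that the move is permanent.
    return 301, {"Location": target}
```

Keeping every old link in one table means a reorganization only needs new entries added in one place, not a new rewrite rule or placeholder page per moved file.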
When rearranging a site, one also wants to update all the internal links to avoid having them go through the redirection process or, worse, hit a 404 error. This problem is not so easy to solve, as it requires either a periodic validation of the site or preprocessing every page to check for outdated links and rewrite them. Neither is really practical with plain static pages.
A content management system can solve the above two problems without too much difficulty. However, most content management systems rely on a database server in the backend. That’s overkill for content that almost never changes, especially for someone proficient with a text editor and file transfer methods. That said, it is conceivable to create a content management system using static files on the file system which then preprocesses the files for links and so on. In fact, that is exactly what I have done, along with a caching scheme so that pages don’t get reparsed every time they are loaded.
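The caching scheme could look something like the following Python sketch, which assumes the cache is a file stored next to the source and that freshness is judged by comparing modification times. The function name and the `render` callback are hypothetical; the post does not say how the real cache is stored or invalidated.

```python
import os


def render_cached(source_path, cache_path, render):
    """Return the rendered page, re-rendering only when the cache is stale."""
    try:
        fresh = os.path.getmtime(cache_path) >= os.path.getmtime(source_path)
    except FileNotFoundError:
        # No cache file yet. A missing source file is reported by open() below.
        fresh = False

    if fresh:
        with open(cache_path, encoding="utf-8") as f:
            return f.read()

    # Cache is missing or older than the source: parse again and store it.
    with open(source_path, encoding="utf-8") as f:
        html = render(f.read())
    with open(cache_path, "w", encoding="utf-8") as f:
        f.write(html)
    return html
```

Here `render` stands in for whatever preprocessing the page needs, such as rewriting outdated links. It runs once per change to the source file, not once per request.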
Another problem content management systems often have is that the pages they serve appear to be modified every time they are loaded. This is not ideal for web site traffic volumes, as search engines and web browsers have no way of knowing the page really hasn’t been modified. My system uses the modification time of the source file to set the Last-Modified header in the response. It also handles the If-Modified-Since request header and returns the 304 Not Modified status when appropriate. This means browser caching can behave normally.
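For the curious, the conditional request handling can be sketched in a few lines of Python. This is only an illustration of the idea, not the site's actual code: the `conditional_headers` name and its return shape are made up for the example, and only the standard library is used.

```python
import os
from email.utils import formatdate, parsedate_to_datetime


def conditional_headers(source_path, if_modified_since=None):
    """Return (status, headers) based on the source file's modification time."""
    # HTTP dates have one-second resolution, so drop the fractional part.
    mtime = int(os.path.getmtime(source_path))
    headers = {"Last-Modified": formatdate(mtime, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            since = None  # unparseable header: treat it as absent
        if since is not None and mtime <= since:
            # The client's copy is still current, so no body needs to be sent.
            return 304, headers

    return 200, headers
```

A browser that received the Last-Modified value on its first visit sends it back as If-Modified-Since on the next one, and gets a 304 with no page body as long as the source file is untouched.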
Many of my pages also have a last modified time in the footer. This has historically been updated manually by me when the page is updated. This is not ideal, however, since it requires me to remember to update it. Instead, the new system provides a substitution that adds a standard footer along with a modification time based on the file’s modification time. After all, the computer is very good at identifying such bits of information.
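A footer substitution of this kind might be sketched as follows in Python. The `{{footer}}` placeholder, the footer markup, and the date format are all invented for the example, since the real marker and layout are not described here.

```python
import os
import time

# Hypothetical placeholder that a page author would put where the footer goes.
FOOTER_TOKEN = "{{footer}}"


def expand_footer(text, source_path):
    """Replace the footer placeholder with a footer stamped from the file's mtime."""
    mtime = os.path.getmtime(source_path)
    # Format in UTC so the result does not depend on the server's time zone.
    stamp = time.strftime("%Y-%m-%d", time.gmtime(mtime))
    footer = f'<p class="footer">Last modified: {stamp}</p>'
    return text.replace(FOOTER_TOKEN, footer)
```

Since the date comes from the file system, editing and uploading a page is enough to keep its footer accurate.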
So I now have a content management system implemented that handles static pages. I even arranged for Apache to call it for every request within a particular folder. That means nothing is served unless the content system recognizes the request. The content system has full control. This prevents anyone from seeing the underlying file structure that implements the site. That is not sufficient, however.
My site also has a number of sections which have well defined behaviour or require more dynamic operation. These are such things as my blog or the photo album section. To make it possible to add such sections at any time without having to modify the global site code, I have created a module system. The main content system examines the requested item, figures out a module name for the contents, and attempts to pass the request off to that module. If the module exists, it handles the request. In fact, the static pages are implemented as a module in the content system.
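The dispatch step can be pictured with a small Python sketch. It assumes the module name is simply the first segment of the requested path and that modules are plain functions in a dictionary; both are guesses made for illustration, as are the module names and return values.

```python
def static_module(rest):
    """Hypothetical stand-in for the static page module."""
    return f"static page: {rest}"


def blog_module(rest):
    """Hypothetical stand-in for a blog module."""
    return f"blog entry: {rest}"


# Registry of available modules, keyed by module name.
MODULES = {"pages": static_module, "blog": blog_module}


def dispatch(path, modules=MODULES):
    """Pass the request to the module named by the first path segment."""
    name, _, rest = path.strip("/").partition("/")
    handler = modules.get(name)
    if handler is None:
        # No such module: the caller decides what to do next.
        return None
    return handler(rest)
```

With this shape, adding a photo album section would mean registering one more entry in `MODULES`, with no change to `dispatch` itself.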
And, to top things off, if no module handles the request, it is passed on to a final step that looks up the requested item in a list and determines if a redirection is needed. If a match is found, a redirection is sent back to the browser. If that last-ditch effort fails, a 404 error is returned.
It’s a very ambitious project. So far, I have implemented the framework for static pages and have started converting existing static pages to it. I still have the blog, photo album, etc., to implement before I can bring the new version of the site live. Even with the amount of work involved, I think it is well worth the effort, even if there is no real visible change as far as the site users are concerned. Who knows, maybe I’ll share the site code once I have it all working.