Rewrite

April 2003

About a year ago, i bought adamish.co.uk and decided to make a home page. Unfortunately i didn't know much about HTML then, and i introduced many faults and flaws into the original design. Many design inconsistencies have also crept into this site. Now that i have a bit of time on my hands, i have decided to redo the whole site. i shall call the remake process the "road map".

All of this goes to show that it is easier to do the design first!

Things to do

XHTML

In essence, XHTML pages are written as correct XML according to a DTD provided by the W3C. See www.w3c.org for details.

Advantages?: XHTML is easier for browsers to parse, which makes it suitable for PDAs and mobile phones. It is also easier to verify for correctness. XHTML is easy to extend, although most users do not use its extensibility.

Differences?: Almost identical to strict HTML 4.01, except that all tags must be in lowercase and must be closed, i.e. <tag> </tag>. Empty tags like img must be written as <tag attribute1="" ... attributeN=""/>. Note the forward slash before the last >. Documents should also start with an <?xml ?> declaration at the beginning.

Conclusion: It would be quite easy to convert to XHTML, since my coding style already uses lowercase and mostly closed tags. Adding a header to each file would be a trivial scripting matter.
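A minimal sketch of such a header script, assuming the header is just an XML declaration and that the pages are encoded as iso-8859-1. Both the function name add_xml_header and the declaration text are assumptions made for illustration.

```shell
#!/bin/bash
# Prepend an XML declaration to a page unless it already starts with one.
# The declaration text and the encoding are assumptions for illustration.
add_xml_header() {
    local file=$1
    local decl='<?xml version="1.0" encoding="iso-8859-1"?>'
    # Skip files that already begin with an XML declaration.
    if head -n 1 "$file" | grep -q '^<?xml'; then
        return 0
    fi
    # Write the declaration followed by the original content, then swap it in.
    { printf '%s\n' "$decl"; cat "$file"; } > "$file.tmp" && mv "$file.tmp" "$file"
}
```

It could be run over a directory with something like: for f in *.html; do add_xml_header "$f"; done. Running it twice is harmless, because files that already have the declaration are left alone.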

Logging system

Update Jan 2004: Home-brew logging system has now been replaced with awstats.

Although the hosting company i use provides a logging system, it is not particularly good and does not provide the addresses of visitors. i have now upgraded most of my site to use standard header and footer inc files. i hope to insert code into one of these files to record requests for each page, along with the referer, IP, host name, date and time etc.

i think i'll use some kind of XML structure, since this will be easy for a web interface to parse. Times will be stored as UNIX timestamp integers; this will allow dates to be converted into any format with ease.
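As a rough illustration of the idea, here is a sketch that writes one request as a single-line XML element and converts a stored timestamp back into a readable date. The element name request, its attribute names, and the helper names are all hypothetical, not the format actually used on this site, and format_time assumes GNU date.

```shell
#!/bin/bash
# Emit one log record as a single-line XML element.
# Element and attribute names are hypothetical, not the site's real format.
xml_escape() {
    # Escape the characters that are special inside XML attribute values.
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' \
                           -e 's/>/\&gt;/g' -e 's/"/\&quot;/g'
}

log_entry() {
    local time=$1 host=$2 page=$3 referer=$4
    printf '<request time="%s" host="%s" page="%s" referer="%s"/>\n' \
        "$time" "$(xml_escape "$host")" "$(xml_escape "$page")" \
        "$(xml_escape "$referer")"
}

# Convert a stored timestamp into a readable UTC date (GNU date syntax).
format_time() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}
```

Storing the raw integer and formatting it only on display means the same log can be shown in any date format without touching the stored data.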

i also hope to include a mechanism to record how long is spent on each page, by looking at the time until the next request from the same address. Erroneous results generated by this process will be filtered out by applying some sensible heuristic.

July 13th: Logging system finished. It provides logging of all pages apart from the log system itself. It records the referer, date, time, hostname and the page being requested. Where possible, it also calculates the time spent on each page from the difference in time between requests from the same host.
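The time-spent calculation can be sketched roughly as follows. This assumes the log has been flattened to "timestamp host page" lines sorted by time, and it uses a 1800 second cut-off as the filtering heuristic. Both the input format and the cut-off are assumptions for illustration, not what the real system does.

```shell
#!/bin/bash
# Read "timestamp host page" lines (sorted by time) on stdin and print
# "host page seconds" for each page whose duration can be estimated.
# The 1800-second default cut-off is an assumed heuristic.
time_spent() {
    awk -v max_gap="${1:-1800}" '
    {
        ts = $1; host = $2; page = $3
        if (host in last_ts) {
            gap = ts - last_ts[host]
            # A long gap probably means the visitor left and came back later,
            # so it is not counted as time spent reading the previous page.
            if (gap <= max_gap)
                print host, last_page[host], gap
        }
        # Remember this request so the next one from the same host can use it.
        last_ts[host] = ts
        last_page[host] = page
    }'
}
```

The last page each host requests never gets a duration, since there is no later request to measure against.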

July 17th: Completed a search string retrieval system. This provides a list of search strings extracted from the HTTP_REFERER data. It is quite simplistic but works for most search engines. It works in the following way:
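A minimal sketch of the general idea, assuming the search engine passes the query in a q= parameter of the referer URL (as google does). This is an illustration in shell with a hypothetical function name, not the exact code used on the site.

```shell
#!/bin/bash
# Extract the search string from a referer URL.
# Assumption: the query lives in a "q=" parameter of the URL.
search_string() {
    local referer=$1 query
    # Pull out the raw (still URL-encoded) value of q=, if present.
    query=$(printf '%s\n' "$referer" | sed -n 's/.*[?&]q=\([^&#]*\).*/\1/p')
    [ -n "$query" ] || return 1
    # Decode: "+" means a space, "%XX" is a hex-encoded byte.
    query=${query//+/ }
    printf '%b\n' "${query//%/\\x}"
}
```

For example, search_string 'http://www.google.com/search?hl=en&q=adam+ish' picks out the q= value and decodes it. Engines that use a different parameter name would need their own pattern.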

Convert all pages to PHP

i wrote the following script to rename all the .html files as .php files. The script also changes any .html references inside the files to .php. This will hopefully keep the hyperlinks intact.

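#!/bin/bash

# Rewrite .html links to .php in every page, then rename the .html files.
find . -type f \( -name '*.php' -o -name '*.html' \) | while IFS= read -r i
do
        # Replace every .html reference on a line (g), not just the first.
        sed -e 's/\.html/.php/g' "$i" > tmp
        mv tmp "$i"
        # Only .html files need renaming; .php files keep their name.
        case "$i" in
                *.html) mv "$i" "${i%.html}.php" ;;
        esac
done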

i also included the following line in my .htaccess file to redirect requests for html pages to the php page of the same name. This will help to stop broken links from search engines or other external sources. i will remove this redirect after google updates its index.

RedirectMatch 301 (.*)\.html$ http://www.ghostofashark.co.uk$1.php

Server side includes for headers and footers

i used the php include() function to include a header.inc and a footer.inc file. This will help to maintain consistency. Instead of having a predesigned header navigation bar, the navigation bar is computed on demand by php code in the header.inc file.

i also included a head.inc file in the <head></head> section of each file. This allows for more standardisation. A no-robots META tag is also included to prevent certain sections of the site from being archived by google. This includes my blog, and my music and film listings.

Style sheet

i have improved my style sheet in an attempt to make it more logical and less complex.

i define the core HTML tags, i.e. body, h1, h2, h3, ul etc. i turn off borders for img.

To make text clearer to read, i now use justified sans-serif paragraphs that are limited to 40% page width.

p tags are indented by 5%; all other block tags (div, ul, ol etc.) are indented by 10%. Global body margins are 10% on both sides.

There are also the following additional classes.

For tables, i defined th. To make long lists in tables more readable, i provided classes for odd, even, and highlighted rows.

i also now include the style sheet as an inline style section in the head of every file. This saves a separate request for the style sheet, which should help to reduce the server load, and keeps each page self-contained.