14 @ oct

Personalized 404 error page thanks to a simple shell script

With a few commands, I search the website content for terms found in the non-existent URI.
I also provide a sitemap, represented as a tree, thanks to [my hacked lstree](Hacked_lstree.html).

Here is the code used to retrieve the keywords from the URI:

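    uri=${REQUEST_URI}
    # Drop the .html extension, then treat every underscore as a path separator
    uri=$(echo "$uri" | sed 's/\.html//')
    uri=$(echo "$uri" | sed 's/_/\//g')

    # The URI starts with "/", so field 1 is empty: start at field 2
    # (assumed here; the initialization is not part of this excerpt)
    ntag=2
    while :
    do
      tag=$(echo "$uri" | cut -d '/' -f "$ntag")
      ntag=$((ntag+1))

      # Stop at the first empty field
      [ "x$tag" = "x" ] && break

      echo "<br/><b style=\"font-size: 0.8em\">Search for the \"$tag\" keyword:</b><br/>"
      # List the HTML files that contain the keyword outside of <meta> lines
      for url in $(grep -i -R -- "$tag" . | grep -v '<meta' | cut -d ':' -f 1 | uniq | grep '\.html')
      do
        echo "<a href=\"/$url\">$url</a><br/>"
      done
    done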

It is based on the filesystem. Keywords are extracted from the requested URI. These keywords are then searched for in the HTML files, but matches in meta tags are ignored: meta keywords are generally set for a whole website, so a keyword like “Linux” or “bsd” would be too generic and return too many results, as it is present in the meta keywords tags of my website’s HTML pages.
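
To make the two steps easier to try outside of a web server, here is a minimal sketch that splits them into two functions and runs them against a throwaway directory. The function names `extract_keywords` and `search_keyword`, the sample pages and the sample URI are all hypothetical and only serve the illustration; the sketch also assumes a `grep` that supports `-R`, as the script above does.

```shell
#!/bin/sh
# Sketch only: extract_keywords and search_keyword are hypothetical helper
# names, not part of the original script.

# Print one keyword per line for a requested URI such as /linux_setup.html:
# drop a trailing .html, split on "_" and "/", discard empty fields.
extract_keywords() {
  printf '%s\n' "$1" | sed 's/\.html$//' | tr '_/' '\n\n' | grep -v '^$'
}

# List the .html files under directory $2 that contain keyword $1
# (case-insensitive, fixed string) on a line that is not a <meta> tag.
search_keyword() {
  grep -i -R -F -- "$1" "$2" | grep -v '<meta' | cut -d ':' -f 1 \
    | grep '\.html$' | sort -u
}

# Demo on two sample pages that share the same site-wide meta keywords.
demo_dir=$(mktemp -d)
printf '<meta name="keywords" content="linux, bsd">\n<p>Installing Linux</p>\n' \
  > "$demo_dir/install.html"
printf '<meta name="keywords" content="linux, bsd">\n<p>About me</p>\n' \
  > "$demo_dir/about.html"

for kw in $(extract_keywords "/linux_setup.html")
do
  echo "keyword: $kw"
  search_keyword "$kw" "$demo_dir"
done

rm -rf "$demo_dir"
```

In this demo, the "linux" keyword should list only `install.html`: `about.html` mentions it solely in its meta tag, which is filtered out. The "setup" keyword matches nothing. Using `-F` treats the keyword as a fixed string rather than a regular expression, which seems safer for text taken from a URI, although the output would still need HTML escaping before being sent to a browser.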