14 @ oct
Personalized 404 error page thanks to a simple shell script
With a few commands, I search the website content for the terms found in the non-existent URI.
I also provide a sitemap, represented as a tree, thanks to [my hacked lstree](Hacked_lstree.html).
Here is the code used to retrieve the keywords from the URI:
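    # Excerpt from the script; $tag and $ntag are presumably initialized before this point.
    uri=${REQUEST_URI}
    uri=$(echo "$uri" | sed 's/\.html//')   # drop the .html extension
    uri=$(echo "$uri" | sed 's/_/\//g')     # treat every underscore as a path separator

    until [ "x$tag" = "x" ]
    do
        # Take the next '/'-separated field of the URI as a keyword
        tag=$(echo "$uri" | cut -d '/' -f "$ntag")
        ntag=$((ntag+1))

        [ "x$tag" = "x" ] && break

        echo "<br/><b style=\"font-size: 0.8em\">Search for the \"$tag\" keyword:</b><br/>"
        # List the .html files that mention the keyword outside of <meta> lines
        for url in $(grep -i -R -- "$tag" . | grep -v '<meta' | cut -d ':' -f 1 | uniq | grep '\.html')
        do
            echo "<a href=\"/$url\">$url</a><br/>"
        done
    done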
The script relies on the filesystem. Keywords are extracted from the requested URI and then searched for in the HTML files. Lines containing meta tags are ignored, because meta keywords are generally set for the whole website: a keyword such as “Linux” or “bsd” would be too generic and return too many results, since it appears in the meta keywords tags of the HTML pages on my website.
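
For illustration, the two steps described above (keyword extraction, then a search that skips meta lines) can be sketched as two small functions. This is not the script running on the site, only a minimal sketch: the names `extract_keywords` and `search_keyword` are hypothetical, only a trailing `.html` is stripped, and a `grep` that supports `-R` (GNU or BSD) is assumed.

```shell
#!/bin/sh

# Hypothetical helper: split a request URI into keywords, one per line.
extract_keywords() {
    printf '%s\n' "$1" |
        sed -e 's/\.html$//' |   # drop a trailing .html
        tr '/_' '\n\n' |         # '/' and '_' both separate keywords
        grep -v '^$'             # skip empty fields (e.g. before the leading '/')
}

# Hypothetical helper: list the .html files under directory $2 that
# mention keyword $1 on a line that is not a <meta> line.
search_keyword() {
    grep -i -R -F -- "$1" "$2" |   # -F: treat the keyword as a literal string
        grep -v '<meta' |          # ignore matches found in meta tags
        cut -d ':' -f 1 |          # keep only the file name
        grep '\.html$' |           # keep only HTML pages
        sort -u                    # one line per file
}
```

Calling `extract_keywords "$REQUEST_URI"` and feeding each resulting line to `search_keyword "$tag" .` would reproduce the loop shown earlier. Passing `-F` to `grep` is a design choice of this sketch: it keeps characters such as `.` or `*` in the URI from being interpreted as a regular expression.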