Getting your archives into search engines like Google

Note: The solutions described here might not work on your system.

Nucleus creates archives dynamically on request. Archive URLs then have the form index.php?archive=2001-09&blogid=1. Unfortunately, Google and other search engines don't like to index pages with a question mark in the URL, or with too many arguments, because their spiders might get trapped going too deep.

Two solutions are listed below. Neither is guaranteed to work, however: whether they work depends on the webserver configuration.

  1. Fancy URLs
  2. mod_rewrite

Fancy URLs

Nucleus v2.0 has a new option in the global settings 'URL mode'. Setting it to 'Fancy URL' mode, and performing the steps below, will make your URLs look like http://example.org/item/1234 instead of http://example.org/index.php?itemid=1234. Search engines like these URLs better.

Installation steps:

  1. Copy all files from the /extra/fancyurls directory (except for index.html) to your main Nucleus directory (that's where your index.php and action.php files are)
  2. Check whether you already have a .htaccess file on your server (most FTP programs don't show hidden files by default, so don't upload the new one without checking). If you do, download the old one first, copy the contents of the new .htaccess file (from the fancyurls folder) into it, and upload the result.
  3. Edit the fancyurls.config.php file so that $CONF['Self'] points to your main directory.
    NOTE: this time, and only this time, the URL should NOT end in a slash
  4. Also edit the $CONF['Self'] variable in your index.php, if you don't want to end up with index.php/item/1234 URLs when visitors arrive via index.php
  5. Enable 'Fancy URLs' in the Nucleus admin area (nucleus management / edit settings)
  6. Off you go!
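
The note in step 3 about the trailing slash can be illustrated with a small Python sketch. It assumes that links are built by joining the base URL and the path with a slash; the function fancy_item_url is hypothetical and is not Nucleus's actual code.

```python
def fancy_item_url(base_url, item_id):
    """Build a Fancy-URL style item link (hypothetical helper, for illustration only)."""
    # Assumption: the base URL and the path are simply joined with a slash.
    return f"{base_url}/item/{item_id}"

# Base URL without a trailing slash, as step 3 asks for.
print(fancy_item_url("http://example.org", 1234))

# Base URL with a trailing slash: the link ends up with a double slash.
print(fancy_item_url("http://example.org/", 1234))
```

Under that assumption, the first call gives the intended http://example.org/item/1234, while the second gives a URL containing "//item".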

If it doesn't work (e.g. you receive an Internal Server Error): bad luck... Remove the files again (don't forget the hidden .htaccess file) and reset the 'URL mode' setting in the admin area.

mod_rewrite

This second possible solution will only work on servers running Apache, and only if your host allows you to use mod_rewrite. What we will do is 'disguise' the archives as regular HTML pages.

Create a file called .htaccess (leading dot!) with the following contents:

# Map the disguised .html names onto the real index.php URLs
RewriteEngine On
RewriteRule ^archive-([0-9]+)-([0-9]+)-([0-9]+)\.html$ index.php?archive=$2-$3&blogid=$1
RewriteRule ^item-([0-9]+)\.html$ index.php?itemid=$1
RewriteRule ^archivelist-([a-z]+)\.html$ index.php?archivelist=$1

Now upload this file to the directory that contains index.php and config.php. Open your browser and try to open archive-1-2001-09.html. If it works, continue to read. If you get a 500 error (internal server error), it does not work on your server, so delete the .htaccess file.
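
To check which internal URL a disguised file name should map to, the three rules can be approximated in Python. This is only a sketch: the patterns are rewritten in Python regex syntax with the dot escaped and the match anchored at both ends, the function name rewrite is made up for illustration, and Apache's own behaviour remains the final authority.

```python
import re

# Python approximation of the three rewrite rules.
# Group references use \1, \2, \3 instead of Apache's $1, $2, $3.
RULES = [
    (re.compile(r"^archive-([0-9]+)-([0-9]+)-([0-9]+)\.html$"),
     r"index.php?archive=\2-\3&blogid=\1"),
    (re.compile(r"^item-([0-9]+)\.html$"),
     r"index.php?itemid=\1"),
    (re.compile(r"^archivelist-([a-z]+)\.html$"),
     r"index.php?archivelist=\1"),
]

def rewrite(path):
    """Return the internal URL for a disguised path, or None if no rule matches."""
    for pattern, target in RULES:
        match = pattern.match(path)
        if match:
            return match.expand(target)
    return None

print(rewrite("archive-1-2001-09.html"))   # index.php?archive=2001-09&blogid=1
print(rewrite("item-1234.html"))           # index.php?itemid=1234
print(rewrite("index.php"))                # None
```

Note that the blog id comes first in the file name but last in the real URL, which is why the first rule swaps the order of its groups.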

Now all you have to do is change the link to your blog archives to archivelist-shortblogname.html (where shortblogname is the short name of your blog) and make the following change to your archivelist item template:

<a href="archive-<%blogid%>-<%year%>-<%month%>.html">...</a>
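
As a rough illustration of what this template line produces, the following Python sketch substitutes example values for the three template variables. The function render_archive_link is hypothetical, and the two-digit month is an assumption based on the 2001-09 example above; Nucleus does the real substitution itself.

```python
TEMPLATE = '<a href="archive-<%blogid%>-<%year%>-<%month%>.html">...</a>'

def render_archive_link(blogid, year, month):
    """Fill in the template variables (hypothetical helper, for illustration only)."""
    values = {
        "blogid": str(blogid),
        "year": f"{year:04d}",
        "month": f"{month:02d}",  # assumption: month is written with two digits
    }
    link = TEMPLATE
    for name, value in values.items():
        link = link.replace(f"<%{name}%>", value)
    return link

print(render_archive_link(1, 2001, 9))
# <a href="archive-1-2001-09.html">...</a>
```

The resulting file name, archive-1-2001-09.html, is the same one used to test the rewrite rules.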

And now, wait until Google comes spidering again...

Tips & Suggestions