Friday, July 11, 2008

Use Magnolia and Google Webmaster Tools to restructure your website

I am in the process of relaunching our website. We have rewritten most of the content and changed much of the site structure; in addition, the top-level "en" directory has been dropped, since all content is in English anyway for the time being. In other words, not a single URL of the previous site will still work on the new site.

How many broken links will this change result in? If you do it right, none.

Here is what I did:

1. Register for Google Webmaster Tools.
If you already have a Google account, just add the Tools to it; otherwise register first.

2. Verify your site
Add your site to the Webmaster Tools. To make sure that you have the right to do anything with a website, Google will request proof. The easiest way is to use the option where you create a new page with a special name, like "google14b6bc12345b6e7d.html". This is trivial with Magnolia: simply copy the name, log into AdminCentral, click "create page", paste the name and activate the page. Back at Google Webmaster Tools, click "verify now" and you are done.

Do not delete the page after that: if you delete the verification page, Google will notice and request a new verification, so you will need to keep it. To make that less annoying, you can set up a URI redirect (as described below), so the page doesn't show up in your AdminCentral tree. Another option is to log into the public instance instead of the author instance and create the page there; you will probably forget about it, but should it ever be deleted you can always create a new one.

3. Now check which, and how many, pages point to you

Check the section "Pages with external links". This will give you a list of the pages on your site that are linked from elsewhere, and tell you how many links point to each page. This makes it easy to see which pages are most important. I focused on any page that has more than 10 links pointing to it.
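If the list is long, the "more than 10 links" cut-off is easy to apply in a few lines of Python. This is only a sketch: the link counts and two of the three paths below are made up for illustration, and I am assuming you have typed or pasted the list in by hand rather than pulled it from any Google API.

```python
# Hypothetical (path, inbound link count) pairs copied by hand from the
# "Pages with external links" list. The numbers are invented.
external_links = [
    ("/en/about-magnolia.html", 42),
    ("/en/index.html", 17),
    ("/en/contact.html", 3),
]


def important_pages(links, threshold=10):
    """Return pages with more than `threshold` inbound links, most-linked first."""
    selected = [(path, count) for path, count in links if count > threshold]
    return sorted(selected, key=lambda item: item[1], reverse=True)


for path, count in important_pages(external_links):
    print(f"{count:4d}  {path}")
```

The pages that survive the filter are the ones that deserve a mapping of their own.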

Now you know which pages are the important ones. But what to do about it? Enter Magnolia's "virtualURIMapping". This functionality allows you to map pretty much anything to anything else. I will only use pretty basic stuff, but you can even use regular expressions in your redirect config.
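To give an idea of what a regular-expression rule could do for a change like our dropped "en" directory, here is the rewrite expressed in plain Python. It is not Magnolia configuration, the function name `drop_en_prefix` and the example path are hypothetical, and it assumes the rest of the path stays the same, which was not true for most of our pages.

```python
import re

# Matches any path under the old top-level "en" directory and captures the rest.
OLD_EN_PATH = re.compile(r"^/en/(.*)$")


def drop_en_prefix(path):
    """Rewrite /en/<rest> to /<rest>; leave every other path untouched."""
    return OLD_EN_PATH.sub(r"/\1", path)


print(drop_en_prefix("/en/products/overview.html"))
print(drop_en_prefix("/home/3-6.html"))
```

A single pattern like this covers every old URL at once, which is why regular expressions are attractive when only a prefix has changed.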

4. Magnolia's virtualURIMapping
In our case, we have created a custom module that holds the extensions we needed for the new site, and I will add the virtualURIMapping configuration there. You will find an example in the module "adminInterface", but virtualURIMappings can be declared in any module configuration. (Note: at the moment, a newly created virtualURIMapping folder is not detected automatically, so you need to restart your instance. Once it is there, you can add mappings to your heart's content without restarting.)

Now all you need to do is add a mapping for each of the "important" previously available URLs, as shown in Google Webmaster Tools under "Pages with external links". Typically you will want to copy an existing node (example above: the 3-6 node) and paste it instead of creating the entries manually. This is a fast and straightforward process: right-click on a contentNode, select "copy" from Magnolia's context menu, and left-click on the parent folder to paste a copy into the folder.

Then copy the path from Google's "external links" page and redirect or forward it to the new location. In the end, this will look similar to our setup:

In the image above you see three different ways to use the mapping:
  1. You can "forward" or "redirect". A forward is handled server-side and can be thought of as a virtually mirrored page; in other words, the client will not know that /3-6.html is really /home/3-6.html. A redirect, by contrast, sends the client to the new URL.
  2. You can map anything below a folder, like I did for "/en/*". The "*" is the wildcard here, matching anything.
  3. You can map a specific page, like I did for "/en/about-magnolia.html". Note that here the fromURI includes the ".html" ending. This also means that e.g. "/en/about-magnolia.print" will not be redirected by this mapping.
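The three cases can be modelled in a few lines of Python. This is a toy model of the behaviour described in the list, not Magnolia's implementation. The targets "/about.html" and "/home.html" are hypothetical stand-ins, since only the /3-6.html forward target is given here, and the "first match in list order wins" rule is a choice made for this sketch.

```python
from fnmatch import fnmatchcase

# (fromURI pattern, action, target). Only the first target comes from the post;
# the other two targets are hypothetical placeholders.
MAPPINGS = [
    ("/3-6.html", "forward", "/home/3-6.html"),
    ("/en/about-magnolia.html", "redirect", "/about.html"),
    ("/en/*", "redirect", "/home.html"),
]


def resolve(path, mappings=MAPPINGS):
    """Return (action, target) for the first matching pattern, or None."""
    for pattern, action, target in mappings:
        # fnmatchcase treats "*" as "match anything", including slashes.
        if fnmatchcase(path, pattern):
            return action, target
    return None


for old_path in ("/3-6.html", "/en/about-magnolia.html",
                 "/en/about-magnolia.print", "/download.html"):
    print(old_path, "->", resolve(old_path))
```

In this model "/en/about-magnolia.print" misses the specific-page rule because of its ending, and is only caught because the "/en/*" wildcard sits behind it as a catch-all.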
There are many more options for Magnolia's virtualURIMapping, but these are some of the most commonly used. You can see it in action once we launch the 3.6 site, scheduled for Monday the 21st. Just try and see where it takes you!
