Your Guide to Website Design and Management

Canonicalization Prevention Guide

"For people who make websites" - A List Apart Magazine explores the design, development, and meaning of web content, with a special focus on web standards and best practices.
HTML Validator is a Mozilla extension that adds HTML validation inside Firefox and Mozilla. While you browse, an icon in the status bar shows the number of errors on the current HTML page. The details of the errors are shown when you view the HTML source of the page.

The extension is based on Tidy and OpenSP. Both algorithms were originally developed by the Web Consortium W3C. Both are embedded inside Mozilla/Firefox and perform the validation locally on your machine, without sending HTML to a third-party server.
This project aims to create an archive of user contributed clip art that can be freely used.
Starting at the beginning, this reference explains everything you need to know about using core JavaScript. It assumes you have the following basic background: a general understanding of the Internet and the World Wide Web and a good working knowledge of HTML. An excellent resource.
Edit your images on the fly online with Splashup, a web-based image editor that integrates with Flickr, Facebook, and Picasa. Splashup offers a surprising array of image editing tools, far beyond the usual crop of resize and contrast; you can also edit multiple images, play with filters and layers, use a variety of brushes, and more. Splashup is one of the best in a long line of online image editors, e.g., Picnik, Pixoh, and Resizr, to name just a few. [Lifehacker Annotation]
This website will let you:
  • Create an XML sitemap that can be submitted to Google to help it crawl your website better.
  • Create a text sitemap to submit to Yahoo.
  • Create a ROR sitemap, which is an independent XML format for any search engine.
  • Generate an HTML site map to allow human visitors to easily navigate on your site.
Clearspring's free Launchpad widget builder lets you easily turn your website's content into a widget that site visitors can use to place your content on all the major social media sites (MySpace, Facebook, Google, hi5, Live, Yahoo, WordPress, Blogger, etc.). The service also provides tracking and analysis.
This site features online tools for changing, modifying, and converting text and HTML, designed to save you time when making web pages or preparing text for web publication. If you've ever needed to capitalize sentences or convert line breaks to <p> or <br /> then this site can save you needless manual labor. There are other useful tools as well, like the one to uncompress HTML to make it readable and the ones to uppercase or lowercase text. Basically, it covers the most common tasks that someone who works in an office or does freelance web development might encounter. Most of the tools have been created using JavaScript, so you should be able to change large amounts of text: the processing is done on your computer instead of being limited by a server script.
You've downloaded and configured your Apache server and are ready to move on to the next project. Can it really be left to fend for itself in a darkened room?

Yes. To some degree, anyway. On the other hand, completely ignoring your Apache installation would be foolhardy.
The Wikipedia entry for Sender Policy Framework (SPF).
The Wikipedia entry for DomainKeys.

Canonicalization, a computer science term, is used, especially by Google, to refer to the process of picking the best URL when several URLs lead to the same content. From an SEO perspective, it is important to avoid creating duplicate content, because it can have negative effects on search engine rankings.

Specific code snippets to resolve the problem in different situations have been widely discussed, but there is no single place that lists the different methods. User wige at WebProWorld created a forum thread, Canonicalization Prevention Guide, to list ways of eliminating the two main types of canonicalization problem: www vs non-www duplication, and / vs /index.html duplication.

I reprint the code snippets for using .htaccess and PHP below, but recommend you visit the forum thread for more detailed information or if you run into any problems.

Using .htaccess

Requirements: Apache Server with mod_rewrite enabled

www vs non-www
Add the following code after the RewriteEngine on directive:

RewriteCond %{HTTP_HOST} !^www\.yourdomain\.com [NC]
RewriteRule (.*) http://www.yourdomain.com/$1 [R=301,L]

Or, alternatively:

RewriteCond %{HTTP_HOST} ^yoursite\.com [NC]
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [L,R=301]

Remove /index.html or /index.php from requests for the root of a folder
Add the following code after the RewriteEngine on directive:

RewriteRule ^index\.(php|html)$ http://www.yourdomain.com/ [R=301,L]
RewriteRule ^(.*)/index\.(php|html)$ http://www.yourdomain.com/$1/ [R=301,L]

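Or, alternatively, match against the original request line so that only URLs the browser actually asked for are redirected: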

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html?\ HTTP/
RewriteRule ^(.*)index\.html?$ http://www.yoursite.com/$1 [R=301,L]

Using PHP

You can accomplish the same canonicalization goals in PHP instead of .htaccess by adding a code snippet to the beginning of your scripts. The following code must be the first thing in the script, before any output is sent to the browser, because header() cannot send headers once output has started. Note that this code should work even if there is an internal mod_rewrite or other URL mapping or aliasing in place, since $_SERVER['REQUEST_URI'] holds the URL the browser originally requested.

<?php
if ($_SERVER['HTTP_HOST'] != 'www.yourdomain.com') {
   // First correct the domain issue
   // (the third argument of header() is the HTTP status code)
   header('Location: http://www.yourdomain.com'.$_SERVER['REQUEST_URI'], true, 301);
   exit();
}
// Case-insensitive match for a trailing index file
if (preg_match('#/index\.(html|htm|php)$#i', $_SERVER['REQUEST_URI'])) {
   // Then correct the directory root issue
   $redirect = 'http://www.yourdomain.com'.preg_replace('#/index\.(html|htm|php)$#i', '/', $_SERVER['REQUEST_URI']);
   header('Location: '.$redirect, true, 301);
   exit();
}
?>
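
The two checks above can also be folded into one function that works out the redirect target without touching any headers, which makes the logic easy to try from the command line. The following is only a sketch under a few assumptions: the function name canonical_redirect_target and its parameters are hypothetical and do not come from the forum thread, the canonical host is the same www.yourdomain.com placeholder, and, unlike the snippet above, it sets the query string aside before looking for a trailing index file and fixes both problems in a single redirect.

```php
<?php
// Hypothetical helper (not from the forum thread): returns the URL to
// redirect to, or null when the request is already canonical.
function canonical_redirect_target(
    string $host,
    string $requestUri,
    string $canonicalHost = 'www.yourdomain.com'
): ?string {
    // Separate the path from the query string so "?a=b" does not hide
    // a trailing index file from the pattern below.
    $parts = explode('?', $requestUri, 2);
    $path  = $parts[0];
    $query = isset($parts[1]) ? '?' . $parts[1] : '';

    // Replace a trailing /index.html, /index.htm, or /index.php with "/".
    $cleanPath = preg_replace('#/index\.(html?|php)$#i', '/', $path);

    // Host names are case-insensitive, so compare them that way.
    $hostIsCanonical = strcasecmp($host, $canonicalHost) === 0;

    if ($hostIsCanonical && $cleanPath === $path) {
        return null; // nothing to correct
    }

    // Assumption: the site is served over plain http, as in the snippets above.
    return 'http://' . $canonicalHost . $cleanPath . $query;
}
```

In a real script you would call it with $_SERVER['HTTP_HOST'] and $_SERVER['REQUEST_URI'] before any output and, if the result is not null, send it with header('Location: '.$target, true, 301) followed by exit(). Keeping the header call outside the function is a design choice: the URL logic can then be checked on its own, without a web server.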


