Canonicalization, a computer science term, is used, especially by Google, to refer to the process of picking the best URL when several URLs lead to the same content. From an SEO perspective, it is important to avoid creating duplicate content, because it can hurt search engine rankings.
Specific code snippets to resolve the problem in different situations have been widely discussed, but there is no single place that lists the different methods. User wige at WebProWorld created a forum thread, Canonicalization Prevention Guide, to list ways of eliminating the two main types of canonicalization problem: www vs non-www duplication, and / vs /index.html duplication.
I reprint the code snippets for using .htaccess and PHP below, but recommend you visit the forum thread for more detailed information or if you run into any problems.
Using .htaccess
Requirements: Apache Server with mod_rewrite enabled
www vs non-www
Add the following code after the RewriteEngine on directive:
RewriteCond %{HTTP_HOST} !^www\.yourdomain\.com$ [NC]
RewriteRule (.*) http://www.yourdomain.com/$1 [R=301,L]
Or, alternatively:
RewriteCond %{HTTP_HOST} ^yoursite\.com$ [NC]
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [L,R=301]
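Remove /index.html or /index.php from requests for the root of a folder
Add the following code after the RewriteEngine on directive (in .htaccess the pattern is matched without a leading slash, and the flag list must not contain spaces):
RewriteRule ^index\.(php|html)$ http://www.yourdomain.com/ [R=301,L]
RewriteRule ^(.*)/index\.(php|html)$ http://www.yourdomain.com/$1/ [R=301,L]
Or, alternatively, match against the original request line so that Apache's internal DirectoryIndex lookup does not trigger the redirect again:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html?\ HTTP/
RewriteRule ^(.*)index\.html?$ http://www.yoursite.com/$1 [R=301,L]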
Using PHP
You can accomplish the same canonicalization goals shown above for .htaccess in PHP instead by adding a code snippet to the beginning of your scripts. The following code must be the first thing in the script, before any output is sent to the browser. Note that this code should work even if there is an internal mod_rewrite or other URL mapping or aliasing in place.
<?php
// First correct the domain issue
if ($_SERVER['HTTP_HOST'] != 'www.yourdomain.com') {
    // header() takes the status code as its third argument
    header('Location: http://www.yourdomain.com' . $_SERVER['REQUEST_URI'], true, 301);
    exit();
}
// Then correct the directory root issue
// (preg_match/preg_replace stand in for eregi/eregi_replace, which were removed in PHP 7)
if (preg_match('#/index\.(html|htm|php)$#i', $_SERVER['REQUEST_URI'])) {
    $redirect = 'http://www.yourdomain.com' . preg_replace('#/index\.(html|htm|php)$#i', '/', $_SERVER['REQUEST_URI']);
    header('Location: ' . $redirect, true, 301);
    exit();
}
?>
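The two checks above run one after the other, so a request for yourdomain.com/index.html is redirected twice, and a URL with a query string after index.php is not matched at all. As a rough sketch only, the same logic can be folded into one function that computes the final target in a single step. The function name canonical_redirect_target and its parameters are hypothetical, not part of the forum thread's code, and www.yourdomain.com is the same placeholder host used throughout this post. It assumes HTTP_HOST carries no port number.

```php
<?php
// Hypothetical helper (not from the original thread): returns the URL to
// redirect to, or null when the request is already canonical.
function canonical_redirect_target(
    string $host,
    string $requestUri,
    string $canonicalHost = 'www.yourdomain.com' // placeholder host
): ?string {
    // Split off the query string so it is carried over unchanged.
    $parts = explode('?', $requestUri, 2);
    $path  = $parts[0];
    $query = isset($parts[1]) ? '?' . $parts[1] : '';

    // Collapse a trailing /index.html, /index.htm or /index.php to "/".
    $cleanPath = preg_replace('#/index\.(html?|php)$#i', '/', $path);

    // Host names are case-insensitive, so compare them that way.
    $hostOk = strcasecmp($host, $canonicalHost) === 0;
    if ($hostOk && $cleanPath === $path) {
        return null; // nothing to fix
    }
    return 'http://' . $canonicalHost . $cleanPath . $query;
}

// Usage: must run before any output is sent. Skipped when the server
// variables are missing (for example on the command line).
if (isset($_SERVER['HTTP_HOST'], $_SERVER['REQUEST_URI'])) {
    $target = canonical_redirect_target($_SERVER['HTTP_HOST'], $_SERVER['REQUEST_URI']);
    if ($target !== null) {
        header('Location: ' . $target, true, 301);
        exit();
    }
}
```

Keeping the URL logic in a function with no side effects makes it possible to try it against sample hosts and paths before putting it in front of live traffic.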