.htaccess Tutorial

Sometimes you may wish to make a change to your server configuration without changing the configuration for the whole server. To do so you can add a small file called .htaccess (which stands for hypertext access). The dot is important in this file name and the name is ALWAYS the same regardless of the change you make.

Before deciding to make a .htaccess file, first check your server to see if one already exists. Since the .htaccess file is often considered a “hidden” file, make sure that whatever program you use to check your server (e.g., an FTP program) allows the display of hidden files. The important thing to realize is that your webhost or other server software may already be using a .htaccess file, and if you overwrite it you could cause problems.

So, why would you ever need this method of server modification? The two most common situations are (1) when you want a change to be effective for only one directory and (2) when you don’t have access to make server configuration changes directly.

Common Uses for .htaccess Files

One very common use of the .htaccess file is to limit access to a directory, that is, to password-protect it. The control panel that your webhost offers will usually have a feature to set up password protection for directories, so you probably will never need to do so manually. Typically, after creating a protected directory the following lines will appear in that directory’s .htaccess file:
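
A minimal sketch of such a block is shown here. The password-file path is an assumption (control panels choose their own location), and the realm text is borrowed from the example prompt described below.

```apache
# Ask the browser for a username and password (HTTP Basic authentication).
AuthType Basic
# Text shown to the visitor in the password prompt.
AuthName "Enter the username for Jon Doe's files"
# Hypothetical path to the password file; keep it outside the web root if you can.
AuthUserFile /home/username/.htpasswd
# Accept any user listed in the password file.
Require valid-user
```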

The AuthName is what the user will see when they’re prompted for a password – something to the effect of “Enter the username for Jon Doe’s files”.

Note that the .htaccess file works in conjunction with a .htpasswd file, which contains the actual password (stored in hashed form) and is specified by the AuthUserFile line. Again, if you use a control panel, this file will be created for you. If not, you will need an htpasswd program.

Normally .htaccess applies to an entire directory. With the <Files> directive you can restrict it to specific files:
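
As an illustration, the sketch below wraps password protection in a <Files> section so that it applies to one file only. The file name private.html and the password-file path are hypothetical.

```apache
# Protect a single file rather than the whole directory.
# "private.html" and the AuthUserFile path are made-up examples.
<Files "private.html">
    AuthType Basic
    AuthName "Restricted file"
    AuthUserFile /home/username/.htpasswd
    Require valid-user
</Files>
```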

Other common uses for the .htaccess file include:

  • Custom Error Documents
    Some sites establish site-wide 404 error pages, and many control panels allow you to change these to suit your needs. In fact, I recommend you do that, or at least purposely request a nonexistent page to verify that the default 404 page meets your needs (rather than just advertising for your webhost). 404 handlers can be created for any directory. The format is:

    ErrorDocument 404 /path/404.html

    You can also have error documents created by CGI:

    ErrorDocument 404 /path/cgi-bin/error/error?404

  • URL Redirecting
    There are two very common situations where you might want to redirect pages via the .htaccess file.

    First, you may want to make sure that all traffic goes to the www version of your site name (e.g., http://www.webmastersherpa.com rather than http://webmastersherpa.com). This matters because many search engines treat the two as different sites and index them separately, which can lead to duplicate content issues and possibly lower rankings in the results. The following example will force a permanent redirect from the non-www version of your website to the www version:
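
    A sketch of such a rule set, assuming mod_rewrite is available and the .htaccess file sits in the site's document root:

    ```apache
    # Turn on the rewrite engine for this directory.
    RewriteEngine On
    # Apply the rule only when the requested host is the non-www name.
    RewriteCond %{HTTP_HOST} ^webmastersherpa\.com$ [NC]
    # Send a permanent (301) redirect to the same path on the www host.
    RewriteRule ^(.*)$ http://www.webmastersherpa.com/$1 [R=301,L]
    ```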

    There are a few special characters in the line above that merit explanation as they are used regularly:

    • The caret (^) signifies the start of the string to be matched, which is the URL path relative to the current directory. This directory is whatever directory the .htaccess file is in. You’ll start almost all matches with a caret.
    • The dollar sign ($) signifies the end of the string to be matched. You would use this so your rules won’t match part of a longer URL.
    • The backslash (\) tells Apache to treat what comes after it as a normal character. It is especially useful before characters that have special meaning in regular expressions. In the example above, the dot or period (.) in .com is a special character, so we need to “escape” it using the backslash.
    • The (.*) and the $1 are also part of the regular expression syntax. Basically, .* says match anything. By putting it in parentheses, we can then use $1 to tell the server to reproduce whatever was matched inside those parentheses (anything entered, in this case). If you use multiple groups in parentheses, you would use $2 for the second, $3 for the third, etc.
    • You will notice that sometimes there are bracketed characters at the end of a statement. Most often it is an [L] or [R] after rewrite statements. The [L] stands for last rule and instructs the server to stop rewriting after the preceding directive is processed. The [R], which is used above, stands for redirect. It is most useful when you want your visitor to know a redirect has occurred. Because it forces a new HTTP request, the browser will load the new page as if it were the page originally requested, and the location bar will change to show the URL of the new page. In the example above, we have specified the type of redirect as a “301”, which is also known as a permanent redirect. We could have written it as [R=permanent,L] with no difference. See below for the types of redirects. The [NC] above stands for “No Case” and makes the associated pattern case-insensitive.

    The second common situation is when you update your website and rename pages. In such a case, you should redirect the old pages to the new ones so that visitors who follow old links from search engines and other sites are not met with an error (this also helps immensely with your search engine ranking). Note that several different types of redirection are possible:

    • permanent – the resource has moved permanently (also called “301”)
    • temp – it has temporarily moved elsewhere (also called “302”)
    • seeother – the resource has been replaced (also called “303”)
    • gone – it has been permanently removed (also called “410”)

    Example usage:
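
    The lines below sketch one Redirect directive for each type listed above. All of the page names and target URLs are hypothetical.

    ```apache
    # 301: the page has a new permanent home.
    Redirect permanent /old-page.html http://www.webmastersherpa.com/new-page.html
    # 302: the page is temporarily served from elsewhere.
    Redirect temp /sale.html http://www.webmastersherpa.com/holiday-sale.html
    # 303: the resource has been replaced by another one.
    Redirect seeother /form-handler http://www.webmastersherpa.com/thanks.html
    # 410: the page is gone for good, so no target URL is given.
    Redirect gone /discontinued.html
    ```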

    Finally, note that PHP can perform redirects as well, though they must be sent before any text is output to the browser. A simple permanently moved redirect in PHP calls header() with a Location value and a 301 response code, then exits.

  • Rewriting the URL
    Unlike a redirect, a URL rewrite happens entirely on the server, so the client is unaware of it. A common use of URL rewriting is when you have a database-driven website and would like to use readable URLs rather than complex query string URLs. For example, to display

    http://www.webmastersherpa.com/tools/cleanup/

    rather than

    http://www.webmastersherpa.com/index.php?page=tools&code=cleanup

    the following would work:
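
    A minimal sketch, assuming mod_rewrite is enabled and the .htaccess file is in the document root next to index.php:

    ```apache
    RewriteEngine On
    # Serve /tools/cleanup/ from index.php without changing the URL the visitor sees.
    # The trailing slash is optional because of the "/?" at the end of the pattern.
    RewriteRule ^tools/cleanup/?$ index.php?page=tools&code=cleanup [L]
    ```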

    A useful thing to note about rewrites is that you can use regular expressions to do complex matching rather than having to manually rewrite every individual URL. For example, let’s assume your content pages naturally live at URLs that look like:

    http://www.webmastersherpa.com/index.php?page=content&id=1

    To serve those pages under URLs that look like:

    http://www.webmastersherpa.com/content/1/

    You could use the following:
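
    One way to write this, assuming the id is always numeric and the .htaccess file is in the document root:

    ```apache
    RewriteEngine On
    # Capture the numeric id from /content/<id>/ and pass it to index.php as $1.
    RewriteRule ^content/([0-9]+)/?$ index.php?page=content&id=$1 [L]
    ```

    With this single rule, /content/1/, /content/2/ and so on are all handled without listing each page.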

    Finally, note that rewrites can be conditional. For example, you could do a rewrite only if the file could not be found:
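
    A sketch of that idea follows. Handing unmatched requests to index.php with a page parameter is an assumption for illustration, as is the extra check for directories.

    ```apache
    RewriteEngine On
    # Continue only if the request does not match an existing file...
    RewriteCond %{REQUEST_FILENAME} !-f
    # ...and does not match an existing directory (an addition to the "file not found" test).
    RewriteCond %{REQUEST_FILENAME} !-d
    # Hypothetical fallback: pass the requested path to index.php, keeping any query string.
    RewriteRule ^(.*)$ index.php?page=$1 [L,QSA]
    ```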

    RewriteCond is very powerful. You can test on environment variable values:
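
    For instance, the sketch below tests the HTTP_USER_AGENT server variable. The /mobile/ target and the choice of user agent are assumptions. Variables set elsewhere in the configuration can be tested the same way using the %{ENV:name} form.

    ```apache
    RewriteEngine On
    # Match requests whose User-Agent header contains "iPhone" (case-insensitive).
    RewriteCond %{HTTP_USER_AGENT} iPhone [NC]
    # Hypothetical: temporarily redirect the directory's index to a mobile section.
    RewriteRule ^$ /mobile/ [R=302,L]
    ```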

  • Enabling server-side includes
    Server-side includes are a type of HTML comment that directs the Web server to dynamically generate data for the Web page whenever it is requested. This isn’t that important when you use a scripting language like PHP which can accomplish the same task, but just the same here is the format to use:
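
    A commonly used form is sketched below. It assumes mod_include is loaded, that your host allows these directives in .htaccess, and that included pages use the .shtml extension.

    ```apache
    # Allow server-side includes in this directory.
    Options +Includes
    # Serve .shtml files as HTML...
    AddType text/html .shtml
    # ...and run them through the include filter before sending them.
    AddOutputFilter INCLUDES .shtml
    ```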

  • Restricting documents
    The .htaccess file provides a number of different ways to restrict documents, including by host address, by browser type, and by HTTP Basic credentials. For example, to restrict access to only those visitors coming from a certain website, you could use the following format:
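
    One possible approach uses mod_rewrite to check the Referer header. Here the allowed site is assumed to be webmastersherpa.com. Keep in mind that the header is supplied by the browser, so it can be missing or faked.

    ```apache
    RewriteEngine On
    # Continue only when the referring page is NOT on the allowed site.
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?webmastersherpa\.com/ [NC]
    # Refuse the request with 403 Forbidden; "-" means no substitution.
    RewriteRule .* - [F]
    ```

    Visitors who arrive with no Referer at all (typed URLs, bookmarks) are also refused by this rule.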

  • Others
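    These are not used frequently, so I won’t discuss them in detail:

    • Modifying the environment variables
    • Adding new MIME types
    • Enabling/disabling AllowOverride (note that this directive is set in the main server configuration, not in the .htaccess file itself)
    • Setting the server timezone
    • Enabling file caching
    • Forcing media downloads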
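
    For orientation only, here is a rough sketch of what a few of these can look like. Every name, value, and file extension is an assumption, and each directive depends on the named module being enabled on your server.

    ```apache
    # Modify environment variables (mod_env); the name and value are made up.
    SetEnv APP_MODE production

    # Add a MIME type for a file extension (mod_mime).
    AddType image/webp .webp

    # Set the timezone seen by scripts through the TZ environment variable.
    SetEnv TZ America/New_York

    # Enable browser caching for PNG images (mod_expires).
    <IfModule mod_expires.c>
        ExpiresActive On
        ExpiresByType image/png "access plus 1 month"
    </IfModule>

    # Force media files to download instead of opening in the browser (mod_headers).
    <IfModule mod_headers.c>
        <FilesMatch "\.(mp3|mov)$">
            Header set Content-Disposition attachment
        </FilesMatch>
    </IfModule>
    ```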