I will be using some terms in the rest of this book that you may or may not already be familiar with. To be sure, read through this section before continuing.
- Spider / robot (bot) – A spider or robot (usually just called a bot) is a program that visits websites to gather information (though some bots are written to perform actions on the site as well). Some of these are not very nice, as they are used by spammers and hackers to gather personal information and check for site vulnerabilities. However, many bots are very welcome, as they are run by the search engines and are used to index your site in the search engine databases. Later on, we’ll talk about specific ways you can control these bots' access to your site.
- Open Source (sometimes open-source) – Open source typically refers to software whose source code has been made available to the public, usually without a fee, under a license that lets anyone use and modify it. Open source software is often the product of dispersed volunteer programmers (e.g., the Linux operating system), but that is not always the case. There are a large number of VERY useful open source projects that make a webmaster’s life easier, and I will point out many of them when appropriate.
- Script – A script is just a synonym for programming code. Usually it refers to a complete program that accomplishes some function(s).
When I am discussing URLs, the use of mydomain (e.g., http://www.mydomain.com, mydomain.com) should be interpreted as “replace mydomain with whatever domain name you have registered and are using”. Likewise, if I use something like dirname (e.g., http://www.mydomain.com/dirname/), please read that as “replace dirname with whatever name you are using for the relevant directory where your files are found”.
When I am discussing making changes to code or file names and locations, I will display the code lines in a separate font that looks like:
or for larger blocks of code, like this:
code line 1
code line 2
Programming languages and scripts (like PHP) make use of functions. Some are built into the language (like a print statement, which outputs text) and some are user-created. Invoking either kind of function is referred to as calling the function. So, for example, if you have a function called fetchit, I would write that a script/page calls fetchit.
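To make the idea of calling a function concrete, here is a minimal sketch. It is written in Python rather than PHP purely for illustration, and the body of fetchit is an assumption of mine (the text above only gives the name); the point is simply what a call looks like for a built-in function versus a user-created one.

```python
# A user-created function. The name fetchit comes from the text above;
# what it does here (returning a labelled string) is purely hypothetical.
def fetchit(filename):
    return "Fetched: " + filename


# Calling the user-created function and keeping what it returns.
result = fetchit("index.html")

# Calling a built-in function: print outputs text.
print(result)
```

In the terms used in this book, this script calls fetchit and then calls print.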
Before you put too much effort into configuring your server, setting up scripts, etc., you will need to know where certain key files reside on your server. Generally speaking, there are two ways to describe a file location – its path and its URL. The URL is what you are already familiar with: http://www.mydomain.com/dirname/filename. The path is a more server-specific way of describing the location, written with forward slashes starting from the root of the server's file system. It would look something like /usr/local/apache/htdocs/dirname/filename.
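The relationship between a URL and a path can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: that every URL maps directly onto a file beneath a single web root directory (here /usr/local/apache/htdocs, taken from the example path above). Real servers may use aliases or rewrite rules that break this simple mapping, and the helper name url_to_path is my own invention.

```python
import posixpath
from urllib.parse import urlparse

# Assumed web root, taken from the example path in the text.
# Your server's web root may well be somewhere else.
DOCUMENT_ROOT = "/usr/local/apache/htdocs"


def url_to_path(url, document_root=DOCUMENT_ROOT):
    """Hypothetical helper: map a URL to a server path under document_root."""
    # Keep only the part of the URL after the domain, e.g. "/dirname/filename".
    url_path = urlparse(url).path
    # Drop the leading slash so the join stays beneath the web root.
    return posixpath.join(document_root, url_path.lstrip("/"))


print(url_to_path("http://www.mydomain.com/dirname/filename"))
# prints /usr/local/apache/htdocs/dirname/filename
```

Notice that the domain name disappears entirely from the path: the path only describes where the file lives on the server's own disk.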
I recommend you know the locations (typically paths) for the following files (I have included the locations on my server since many times these locations are *standard* across many systems).
Raw access logs: