Apache Htaccess

What is the purpose of the .htaccess file?

.htaccess files (or "distributed configuration files") provide a way to make configuration changes on a per-directory basis. A file, containing one or more configuration directives, is placed in a particular document directory, and the directives apply to that directory and all subdirectories thereof.

.htaccess files should be used in a case where the content providers need to make configuration changes to the server on a per-directory basis, but do not have root access on the server system. In the event that the server administrator is not willing to make frequent configuration changes, it might be desirable to permit individual users to make these changes in .htaccess files for themselves. This is particularly true, for example, in cases where ISPs are hosting multiple user sites on a single machine, and want their users to be able to alter their configuration.

However, in general, use of .htaccess files should be avoided when possible. Any configuration that you would consider putting in a .htaccess file can just as effectively be made in a <Directory> section in your main server configuration file.

Why should we avoid using the .htaccess file if we have access to the main configuration file?

You should avoid using .htaccess files completely if you have access to the httpd main server configuration file. Using .htaccess files slows down your Apache HTTP Server. Any directive that you can include in a .htaccess file is better set in a <Directory> block, as it will have the same effect with better performance.

There are two main reasons to avoid the use of .htaccess files. The first of these is performance. When AllowOverride is set to allow the use of .htaccess files, httpd will look in every directory for .htaccess files. Thus, permitting .htaccess files causes a performance hit, whether or not you actually even use them! Also, the .htaccess file is loaded every time a document is requested.

Further note that httpd must look for .htaccess files in all higher-level directories, in order to have a full complement of directives that it must apply. (See section on how directives are applied.) Thus, if a file is requested out of a directory /www/htdocs/example, httpd must look for the following files:

/.htaccess
/www/.htaccess
/www/htdocs/.htaccess
/www/htdocs/example/.htaccess

And so, for each file access out of that directory, there are 4 additional file-system accesses, even if none of those files are present. (Note that this would only be the case if .htaccess files were enabled for /, which is not usually the case.)
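
The number of lookups depends on where overrides are enabled. As a minimal sketch, assuming the /www/htdocs layout from the example above, the main server configuration could restrict .htaccess processing to the document tree like this:

```apache
# Main server configuration (not a .htaccess file).
# No .htaccess lookups are made for / or /www.
<Directory "/">
    AllowOverride None
</Directory>

# .htaccess files are honored from /www/htdocs downward, so a request
# for a file in /www/htdocs/example checks only
#   /www/htdocs/.htaccess
#   /www/htdocs/example/.htaccess
<Directory "/www/htdocs">
    AllowOverride All
</Directory>
```

AllowOverride All is used here only for illustration; a narrower category list is usually preferable.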

In the case of RewriteRule directives, in .htaccess context these regular expressions must be re-compiled with every request to the directory, whereas in main server configuration context they are compiled once and cached. Additionally, the rules themselves are more complicated, as one must work around the restrictions that come with per-directory context and mod_rewrite. Consult the Rewrite Guide for more detail on this subject.

The second consideration is one of security. You are permitting users to modify server configuration, which may result in changes over which you have no control. Carefully consider whether you want to give your users this privilege. Note also that giving users fewer privileges than they need will lead to additional technical support requests. Make sure you clearly tell your users what level of privileges you have given them.

Note that it is completely equivalent to put a .htaccess file in a directory /www/htdocs/example containing a directive, and to put that same directive in a Directory section <Directory "/www/htdocs/example"> in your main server configuration.

However, putting this configuration in your server configuration file will result in less of a performance hit, as the configuration is loaded once when httpd starts, rather than every time a file is requested.
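
The equivalence can be sketched as follows. The directive and the text/example MIME type are purely illustrative; any directive allowed in .htaccess context would do:

```apache
# Option 1: contents of /www/htdocs/example/.htaccess
# (AddType needs AllowOverride FileInfo for that directory)
AddType text/example ".exm"

# Option 2: the same directive in the main server configuration,
# read once when httpd starts
<Directory "/www/htdocs/example">
    AddType text/example ".exm"
</Directory>
```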

How can we disable the use of .htaccess file?

The use of .htaccess files can be disabled completely by setting the AllowOverride directive to None in the main server configuration:

AllowOverride None
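
AllowOverride is only valid inside a <Directory> section of the server configuration, not inside a .htaccess file itself. A minimal sketch, assuming the document tree lives under /www/htdocs:

```apache
# Main server configuration: ignore .htaccess files in this tree
<Directory "/www/htdocs">
    AllowOverride None
</Directory>
```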

How can we rename the .htaccess file?

If you want to call your .htaccess file something else, you can change the name of the file using the AccessFileName directive. For example, if you would rather call the file .config then you can put the following in your server configuration file:

AccessFileName ".config"
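
A stock httpd configuration typically blocks web access only to files whose names start with .ht, so a renamed file may need its own protection. The following is a sketch under that assumption, using Apache 2.4 syntax; the <Files> block is an addition for illustration:

```apache
# Main server configuration
AccessFileName ".config"

# Assumption: the default rule for ".ht*" files no longer covers the
# renamed file, so deny direct requests for it explicitly.
<Files ".config">
    Require all denied
</Files>
```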

How can we protect the content of a folder (prevent all access to it)?

Put into the .htaccess file of that folder:

deny from all

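How can I prevent access to the content of a folder for specific IP addresses?

# Block a single address (replace XXX.XXX.XXX.XXX with the address to block)
order allow,deny
deny from XXX.XXX.XXX.XXX
allow from all

# Block one full address and every address that begins with 124.15
order allow,deny
allow from all
deny from 145.186.14.122
deny from 124.15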
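
The Order, Allow and Deny directives are Apache 2.2 syntax; in Apache 2.4 they are provided only by mod_access_compat. As a hedged sketch, the same rules could be written with the 2.4 Require directive. The two parts are alternatives and are not meant to be combined in one file:

```apache
# Alternative A: deny everyone (2.4 equivalent of "deny from all")
Require all denied

# Alternative B: allow everyone except the listed addresses.
# "Require not" must be nested in a container such as <RequireAll>.
<RequireAll>
    Require all granted
    Require not ip 145.186.14.122
    Require not ip 124.15
</RequireAll>
```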

What is the proper permission for the .htaccess file?

644 (read and write for the owner, read-only for group and others)

Which directories does the .htaccess file apply to?

An .htaccess file will affect the directory it is placed in and all sub-directories.

How can I specify the index file?

DirectoryIndex welcome.html welcome.php

How can we add comments to the .htaccess file?

Use the # character at the beginning of the line.
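
A short illustration (the DirectoryIndex line is only there to have something to comment on):

```apache
# This whole line is a comment and is ignored by httpd.
DirectoryIndex welcome.html

# A comment cannot share a line with a directive. In
#   DirectoryIndex welcome.html   # my index file
# the trailing text would be read as further arguments.
```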

How can I redirect based on the user agent string?

RewriteEngine On
# Skip requests that are already inside the target folder, otherwise the redirect loops
RewriteCond %{REQUEST_URI} !^/folderfortablets
# Match user agents containing iPad or Android
RewriteCond %{HTTP_USER_AGENT} (iPad|Android)
RewriteRule ^ http://yourdomain.com/folderfortablets [R=301,L]

How can I prevent hot linking?

Concerned about hotlinking or simply want to reduce your bandwidth usage? Try experimenting with:

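Options +FollowSymLinks
RewriteEngine On
# Allow requests with an empty Referer (direct visits, some proxies)
RewriteCond %{HTTP_REFERER} !^$
# Allow requests that come from our own pages
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?domainname\.com/ [NC]
# Never rewrite the replacement image itself, or hotlinkers get a redirect loop
RewriteCond %{REQUEST_URI} !/img/hotlink_f_o\.png$
RewriteRule \.(gif|jpg|png)$ http://domainname.com/img/hotlink_f_o.png [NC,R,L]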

How can we override the default mime type (force users to download files rather than view them in the browser)?

AddType application/octet-stream .csv
AddType application/octet-stream .xls
AddType application/octet-stream .doc
AddType application/octet-stream .avi
AddType application/octet-stream .mpg
AddType application/octet-stream .mov
AddType application/octet-stream .pdf

Or you can simplify this by listing several extensions on a single line:

AddType application/octet-stream .avi .mpg .mov .pdf .xls .mp4

How can we support nice URLs?

RewriteEngine on
RewriteRule ^content-([0-9]+)\.html$ content.php?id=$1 [L]

If the user requests /content-someNumbers.html, the above rule internally rewrites it to content.php?id=…

How can we redirect to HTTPS / SSL?

RewriteEngine On
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

The condition can equally be written with off instead of !on:

RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

How can we activate SSI?

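# SSI only runs where the Includes option is enabled
Options +Includes
AddType text/html .html
AddType text/html .shtml
# Parse these extensions for SSI directives
AddHandler server-parsed .html
AddHandler server-parsed .shtml
AddHandler server-parsed .htm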

How can we disable or enable Directory browsing?

# disable directory browsing
# (options with and without a +/- prefix cannot be mixed on one line in Apache 2.4)
Options -Indexes
# enable directory browsing
Options +Indexes

How can we change the charset and language headers?

AddDefaultCharset UTF-8
DefaultLanguage en-GB

How can we block unwanted users?

If you want to block unwanted visitors from a particular website or range of websites you could use:

<IfModule mod_rewrite.c>
 RewriteEngine on
 RewriteCond %{HTTP_REFERER} website1\.com [NC,OR]
 # No OR flag on the last condition, otherwise the rule would match every request
 RewriteCond %{HTTP_REFERER} website2\.com [NC]
 RewriteRule .* - [F]
</IfModule>

It may seem odd to want to block users, but we may have to do this to comply with a law or to fend off an attack.

How can we block unwanted user agents?

<IfModule mod_setenvif.c>
# No trailing | inside the group: an empty alternative would match every user agent
SetEnvIfNoCase ^User-Agent$ .*(bot1|bot2|bot3|bot4|bot5|bot6) HTTP_SAFE_BADBOT
Deny from env=HTTP_SAFE_BADBOT
</IfModule>

How can we block access to a range of files?

If you want to protect particular files, or even block access to the .htaccess file, try customising the following code:

<Files privatefile.jpg>
 order allow,deny
 deny from all
</Files>

<FilesMatch "\.(htaccess|htpasswd|ini|phps|fla|psd|log|sh)$">
 Order Allow,Deny
 Deny from all
</FilesMatch>

How can I display a custom 404 page?

ErrorDocument 404 /error.html

And you can extend this like so:

ErrorDocument 400 /400.html
ErrorDocument 401 /401.html
ErrorDocument 403 /403.html
ErrorDocument 404 /404.html
ErrorDocument 500 /500.html
ErrorDocument 502 /502.html
ErrorDocument 504 /504.html

How can I remove the need for www in the URL?

RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^www\.yourdomain\.com [NC]
RewriteRule ^(.*)$ http://yourdomain.com/$1 [L,R=301]

With the above code, a user who requests a page on the www host name is redirected to the same page without the www.

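Example of RedirectMatch:

<IfModule mod_alias.c>
    # Match only sitemap requests below the root; if your-site.com is this same
    # server, matching /sitemap.xml itself would redirect in a loop
    RedirectMatch 301 ^/.+/sitemap\.xml$ http://your-site.com/sitemap.xml
    RedirectMatch 301 ^/.+/sitemap\.xml\.gz$ http://your-site.com/sitemap.xml.gz
</IfModule>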

How can we redirect invalid or non-existent pages to a valid page?

<IfModule mod_alias.c>
RedirectMatch 301 ^/search/$ http://your-site.com/
RedirectMatch 301 ^/tag/$ http://your-site.com/
RedirectMatch 301 ^/category/$ http://your-site.com/
</IfModule>

Example of URL rewriting:

<IfModule mod_rewrite.c>
RewriteEngine On
# Redirect feed requests to FeedBurner, except requests made by FeedBurner or FeedValidator themselves
RewriteCond %{REQUEST_URI} ^/feed/ [NC]
RewriteCond %{HTTP_USER_AGENT} !(FeedBurner|FeedValidator) [NC]
RewriteRule .* http://feeds.feedburner.com/Your-Site-Userame [L,R=301]
</IfModule>

How can we prevent the endless crawl for robots.txt?

Many website owners and administrators aren’t aware that some badly behaved or malicious “spider” applications crawl web servers endlessly, requesting a robots.txt file in every single directory. This slows the site down and can be used to probe it for security weaknesses, and it is a common problem for unsuspecting WordPress website operators. This “perpetual crawl” can be cut short in the .htaccess file by redirecting every request for a robots.txt file outside the site root to the one real robots.txt file. Spiders are then sent to one place, get what they need, and get out. Here’s what it looks like:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Leave the real /robots.txt alone
RewriteCond %{REQUEST_URI} !^/robots\.txt$ [NC]
# Redirect any other URL that contains robots.txt
RewriteCond %{REQUEST_URI} robots\.txt [NC]
RewriteRule .* http://your-site.com/robots.txt [R=301,L]
</IfModule>

Remember that a robots.txt file should always be placed in the root directory of a website. It tells crawlers which search engines may crawl the site and which directories they should skip. Compliance is voluntary, so it is not an access control mechanism. It is a useful way of keeping crawlers of the main domain out of subdomain folders, add-on domain folders, and other content that should be crawled and indexed separately.

How can we eliminate the endless crawl for favicon?

The same spiders that crawl through every one of a server’s directories looking for a robots.txt file are also known to do the “perpetual crawl” when looking for so-called favicon images.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Leave the real /favicon.ico alone, otherwise the redirect loops
RewriteCond %{REQUEST_URI} !^/favicon\.ico$ [NC]
RewriteCond %{REQUEST_URI} /favicon?\.?(gif|ico|jpe?g?|png)?$ [NC]
RewriteRule (.*) http://your-site.com/favicon.ico [R=301,L]
</IfModule>

How can we set HTTP headers in .htaccess?

<FilesMatch "\.(html|htm|js|css)$">
FileETag None
<IfModule mod_headers.c>
Header unset ETag
Header set Cache-Control "max-age=0, no-cache, no-store, must-revalidate"
Header set Pragma "no-cache"
Header set Expires "Wed, 11 Jan 1984 05:00:00 GMT"
</IfModule>
</FilesMatch>

<FilesMatch "\.(html|htm)$">
<IfModule mod_headers.c>
Header set imagetoolbar "no"
</IfModule>
</FilesMatch>

How does the AllowOverride directive influence the .htaccess file?

This directive specifies, in categories, what directives will be honored if they are found in a .htaccess file. If a directive is permitted in a .htaccess file, the documentation for that directive will contain an Override section, specifying what value must be in AllowOverride in order for that directive to be permitted.

For example, if you look at the documentation for the AddDefaultCharset directive, you will find that it is permitted in .htaccess files. (See the Context line in the directive summary.) The Override line reads FileInfo. Thus, you must have at least AllowOverride FileInfo in order for this directive to be honored in .htaccess files.
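
As a sketch of how the two files interact, assuming the /www/htdocs/example directory used earlier on this page:

```apache
# Main server configuration: allow only FileInfo directives in .htaccess
<Directory "/www/htdocs/example">
    AllowOverride FileInfo
</Directory>

# /www/htdocs/example/.htaccess: honored, because AddDefaultCharset
# belongs to the FileInfo override category
AddDefaultCharset UTF-8

# A directive from another category, for example "Options -Indexes"
# (category Options), is not allowed with this setting and would make
# requests to the directory fail with a 500 error.
```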

How can I check to see if a particular directive is allowed inside an .htaccess file?

If you are unsure whether a particular directive is permitted in a .htaccess file, look at the documentation for that directive and check the Context line for ".htaccess". Then compare its Override line with your AllowOverride setting, as described above.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License