Create redirect and rewrite rules into .htaccess on Apache webserver

When using the Apache web server, .htaccess files (also called “distributed configuration files”) are used to specify configuration on a per-directory basis or, more generally, to modify the behavior of the server without having to edit the virtual host files directly (which is usually impossible on shared hosts, for example). In this tutorial we see how to set up URL redirections and rewriting rules inside .htaccess files.

In this tutorial you will learn:

  • How .htaccess files work
  • How to set up URL rewriting rules in .htaccess files using the RewriteRule directive
  • How to set up URL redirection rules in .htaccess files using the Redirect and RedirectMatch directives

Software requirements and conventions used

Software Requirements and Linux Command Line Conventions

Category      Requirements, Conventions or Software Version Used
System        Distribution independent
Software      Apache web server
Other         No other requirements needed
Conventions   # – requires given linux-commands to be executed with root privileges either directly as a root user or by use of sudo command
              $ – requires given linux-commands to be executed as a regular non-privileged user

Should you use .htaccess files?

The use of .htaccess files is not recommended if you can edit the virtual host configuration files directly, since it slows down the Apache web server: when the AllowOverride directive allows the use of .htaccess files, the server looks for them, on every request, in each directory along the path to the requested resource. In some situations, however, such as the shared hosts mentioned above, .htaccess files are the only solution.

The set of directives that can be used in .htaccess files is established in the main site configuration via the AllowOverride directive, inside a <Directory> stanza; for example, to allow the use of all possible directives we would write something like:

<Directory /path/to/directory>
   AllowOverride All
</Directory>

The instructions will be applied to .htaccess files found in the specified directory and all its subdirectories.
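
As a more restrictive sketch, and assuming we only need the redirect and rewrite directives covered in this tutorial, the FileInfo class should be enough, since it is the override class that covers both the mod_alias and the mod_rewrite directives. The path below is the same placeholder used above:

<Directory /path/to/directory>
   # FileInfo permits Redirect, RedirectMatch, RewriteEngine, RewriteCond and RewriteRule
   AllowOverride FileInfo
</Directory>

Keep in mind that mod_rewrite rules in .htaccess files also require the FollowSymLinks (or SymLinksIfOwnerMatch) option to be enabled for the directory.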

For the directives we will use in this tutorial to work, the mod_alias and mod_rewrite Apache modules must be enabled.
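
If we are not sure that a module is loaded, one possible safeguard is to wrap the related directives in an <IfModule> block. This is only a sketch: the block is skipped silently when the module is missing, so the rules would simply have no effect instead of causing a server error:

<IfModule mod_rewrite.c>
   # Processed only when mod_rewrite is loaded
   RewriteEngine on
</IfModule>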

Redirections (mod_alias)

As specified before, in our .htaccess files we may want to specify some redirection rules, so that when a URL is requested, the client is redirected to another one.

We have basically two ways to perform the operation: using the Redirect or the RedirectMatch directives. What is the difference between the two? The former lets us establish a redirection based on plain and simple URL matches; the latter does basically the same thing but is more powerful, since with it we can use regular expressions.

The “Redirect” directive

Let’s see some examples of the use of the Redirect directive. Suppose we want to redirect our entire site:

Redirect 301 / https://url/to/redirect/to


The one above is quite an “extreme” example. Let’s analyze the syntax. The first thing we specified is the directive: Redirect.

The second thing we provided is the HTTP code to be used for the redirection: this can be provided either as a numerical status or in the form of a string.
A few examples:

HTTP CODE KEYWORD
301 permanent
302 temp
303 seeother
410 gone

In the previous example we configured a permanent redirection since we used the 301 HTTP code. An equivalent of that would be:

Redirect permanent / https://url/to/redirect/to

The type of redirection can be omitted altogether: when this is the case, the 302 code (temporary redirection) is used by default.

The third argument we provided in the rule is the absolute path of the “original” resource that should be matched. In this case we used / which is the root of the site, since we want to redirect it completely. Here the scheme and host part of the URL must be omitted.

The fourth argument is the “new” URL the user should be redirected to. As we did in the example above, we can use a complete URL, including scheme and host, or omit them and use just a path: in the latter case, it is considered part of the same original site. This argument is mandatory if the redirection status specified is between 300 and 399, but it must be omitted if the status provided is not in that range. This makes sense: imagine we use a 410 status to signal that the resource is gone: it would make no sense to specify a redirection URL. In that case we would simply write:

Redirect 410 /path/of/resource
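
Redirect matches whole path segments from the start of the URL path, and whatever follows the matched part is appended to the target. A small sketch, where /docs and new.example.com are hypothetical names chosen only for illustration:

# /docs/intro.html is redirected to https://new.example.com/manual/intro.html
# /docs-old/intro.html is not redirected, since /docs does not match a whole segment
Redirect 301 /docs https://new.example.com/manual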


The “RedirectMatch” directive

With the “Redirect” directive we can specify the path of the URL to be redirected, but the path is matched literally, as it is written. What if we want to perform something more complex, for example to redirect requests for all files with the .html extension? In those cases, we can use the RedirectMatch directive with a regular expression. Let’s see an example:

RedirectMatch 301 (.*)\.html$ $1.php

In the example above we redirected all the requests for .html files on our site to files with the same name and path, but with the .php extension. Let’s analyze the rule.

As always the first thing we provided is the directive, in this case RedirectMatch. After that, as we did before, we provided the HTTP code to be used for the redirection; then, and this is the interesting thing, we used the (.*)\.html$ regular expression.

To those of you already familiar with regex this should be immediately clear, but let’s see how it works. The . (dot) in the regular expression matches any single character; it is followed by the *, which establishes that the previous expression should be matched 0 or more times. The expression is enclosed in parentheses, so it is grouped, and the part of the URL that matches it can be referenced later via the $1 variable (multiple groups can be used: they are numbered progressively, so to reference the second group we can use $2). After the part of the expression enclosed in parentheses we specified that the path should end in .html: you can see we escaped the . with a backslash for it to be matched literally. Finally we used $ to match the end of the path.

As the argument for the redirection URL we used $1.php. As we already explained, $1 is used to reference the part of the URL which matched the regular expression between parentheses (which is the complete path minus the .html extension), so what we are doing here is basically using the same path but with the .php extension.
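
As a sketch of a rule with more than one group, suppose (hypothetically) that articles used to live under paths like /articles/2019/07 and have been moved to /archive/2019-07. Both numbers can be captured and reused in the target:

# $1 holds the first group (the year), $2 the second (the month)
RedirectMatch 301 ^/articles/([0-9]+)/([0-9]+)$ /archive/$1-$2

Note that RedirectMatch is tested against the whole URL path, so the pattern starts with a leading slash.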

URL rewriting (mod_rewrite)

URL rewriting rules can be either transparent or visible to the user. In the first case the user requests a page, and the server internally translates the request on the basis of the provided rule in order to serve the resource: the user doesn’t notice what’s happening, since the URL in their browser doesn’t change. In the second case, instead, we practically achieve a complete redirection visible to the user.

Let’s start with the first case. If we want to use URL rewriting, the first thing we have to do (in this case in our .htaccess file) is to write the following directive:

RewriteEngine on

The RewriteEngine directive, as the name suggests, is needed to modify the state of the Apache rewrite engine. In the example above we enabled it; to disable it instead, we would write:

RewriteEngine off


Just as an example, suppose we have a resource called page.html on our server, which used to be reached by the plain and simple URL: http://localhost/page.html. Now imagine that for some reason we renamed the html file to newpage.html, but we want our clients to still be able to reach the resource with the old URL (perhaps they have stored it in their browser bookmarks). What we could do is write the following, very simple rule:

RewriteEngine on
RewriteRule ^page\.html /newpage.html

The syntax of the rule is very similar to the one we used for the RedirectMatch directive: first we have the directive itself, RewriteRule, then we have the pattern used for the URL matching: it must be a regex. After it, we have the substitution string, which is used to replace the original URL.
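
One detail worth noting: inside an .htaccess file the pattern is compared with the path relative to the directory that contains the file, so there is no leading slash to match. The pattern above is also not anchored at the end, so it would match names such as page.html.bak too. A stricter variant, given here only as a sketch, adds the $ anchor:

RewriteEngine on
# Matches exactly "page.html" relative to the directory of the .htaccess file
RewriteRule ^page\.html$ /newpage.html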

There is a fourth element which can be used in the definition of a RewriteRule: the flags, which are used to modify the behavior of the web server when a certain rule is matched.

Let’s see an example: with the rule we set above, as we already said, no redirection happens: the URL in the browser address bar doesn’t change. If we want a redirection to happen we have to add the R flag to the expression:

RewriteEngine on
RewriteRule ^page\.html /newpage.html [R]

Flags are provided between brackets: in this specific case the R flag causes the rule to be interpreted as a redirect. It’s even possible to specify the type of redirection that should take place, by specifying the related HTTP code, for example:

RewriteRule ^page\.html /newpage.html [R=301]
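
When no status is given, the R flag produces a 302 (temporary) redirect. External redirects are commonly combined with the L flag, which stops the processing of any further rules in the current pass. A minimal sketch, reusing the same file names:

RewriteEngine on
# R=301: permanent redirect; L: do not evaluate the rules that follow
RewriteRule ^page\.html$ /newpage.html [R=301,L]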

Another common use of URL rewriting is to “beautify” URLs for SEO purposes. Let’s say, for example, we have a PHP script which retrieves a certain product from a database by its id, provided as a query parameter in the URL, for example:

http://localhost/products.php?id=1

To make the resource available at the http://localhost/products/1 URL, we could write the following rule:

RewriteEngine on
RewriteRule ^products/([0-9]+)$ /products.php?id=$1

With the [0-9] regex we match any digit, and with the + we say that the previous expression must match 1 or more times for the rule to be executed. The matched expression is enclosed in parentheses, so we can reference the matched part of the URL in the “destination” string by using the $1 variable. This way, the id of the product we provide in the “beautified” URL becomes the value of the id variable in the query string.
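
Since the substitution contains its own query string, any query string sent with the original request is discarded by default. If it has to be preserved, the QSA (“query string append”) flag can be added. In this sketch the color parameter is just a hypothetical example:

RewriteEngine on
# /products/1?color=red is served by /products.php?id=1&color=red
RewriteRule ^products/([0-9]+)$ /products.php?id=$1 [QSA,L]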

Rewrite conditions

We just saw how, for a rewrite rule to be applied, the regular expression must match the URL provided by the user. In the last example we saw how the http://localhost/products/1 URL can be rewritten internally to http://localhost/products.php?id=1. But what if the path specified by the requested URL references a “real” file existing on the server? What if, for example, /products/1 is a regular file, and we want it to be served as it is? In cases like this we can use the RewriteCond directive.

With the RewriteCond directive, we specify a condition that must be satisfied for the URL rewriting to take place. In this case, for example, we may want to establish that if the products/1 file exists on the server, the rewrite should not take place. We would write:

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^products/([0-9]+)$ /products.php?id=$1

We place the RewriteCond directive before the RewriteRule it applies to. The first thing we pass to the directive is the test string to be checked. In this context we can use a series of predefined server variables, like %{REQUEST_FILENAME}, which references the full local filesystem path to the file or script matching the request.

Here we cannot provide a complete list of all the available variables, which you can find by visiting the Apache mod_rewrite documentation.

After the “test string” we specify the condition that should be matched: in this case we used !-f to specify that, for the rewrite rule to be applied, the file or script matching the request should not be a regular file existing on the server (-f matches a regular file, and ! inverts the result).

The one above is a really simple example of a RewriteCond directive. More than one can be provided before the RewriteRule directive: all of them must match for the latter to be applied.
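
As a sketch of a rule with two conditions, we could rewrite the URL only when the request matches neither an existing regular file nor an existing directory (-d tests for a directory):

RewriteEngine on
# Both conditions must hold: they are combined with a logical AND by default
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^products/([0-9]+)$ /products.php?id=$1 [L]

The conditions affect only the RewriteRule that immediately follows them; to combine two conditions with a logical OR instead, the [OR] flag can be added to the first one.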

Conclusions

In this article we saw how we can specify URL redirections and URL rewriting rules in .htaccess files when using the Apache web server. We saw some very easy examples of the use of the Redirect, RedirectMatch and RewriteRule directives and how we can use them to achieve specific behaviors. This was meant just as an introduction to these subjects, so please take a look at the official documentation pages for the mod_alias and the mod_rewrite modules for more in-depth knowledge.


