Anyway, regex locations like that are more often than not unnecessary and should be avoided whenever possible. Each additional regex location has an extra performance cost, especially on a heavily loaded system. Because regex locations take priority over prefix ones, every request that is eventually processed by your root location is first checked against every regex location pattern. A large number of regex locations can significantly degrade overall server performance.
nginx request processing workflow
TL;DR First of all, I want to explain the nginx request processing mechanism. The description of the request processing phases in the development guide can be very helpful for anyone who wants to figure it out. Take a look at it, since I'm going to refer to those phases during the explanation. I'll try to briefly explain the most important parts.
Every request can traverse several location blocks during its processing. It can be passed from one location to another by a rewrite ... last directive triggered at the NGX_HTTP_REWRITE_PHASE (first explicit way), by the last argument of the try_files directive when all the checks fail at the NGX_HTTP_PRECONTENT_PHASE (second explicit way), or via the index directive at the beginning of the NGX_HTTP_CONTENT_PHASE (implicit way). Actually, there is at least one more way for a request to change location, via an error handler declared with the error_page directive; however, I'm not going to dig into that one since it is out of scope for this answer.
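The three ways of changing location described above can be sketched in a single hypothetical server block. The document root, the /old/, /new/ and /docs/ prefixes and the /fallback.html file are made up for illustration only:

```nginx
server {
    listen 80;
    root /var/www/example;          # hypothetical document root

    location /old/ {
        # NGX_HTTP_REWRITE_PHASE: "last" restarts the location search with
        # the rewritten URI, so /old/a.html ends up in "location /new/"
        rewrite ^/old/(.*)$ /new/$1 last;
    }

    location /new/ {
        # NGX_HTTP_PRECONTENT_PHASE: if no file matches $uri, the last
        # argument causes an internal redirect to /fallback.html
        try_files $uri /fallback.html;
    }

    location = /fallback.html {
        # the redirected request is finally served here
    }

    location /docs/ {
        # NGX_HTTP_CONTENT_PHASE: a request for /docs/ is internally
        # redirected to /docs/index.html when that file exists
        index index.html;
    }
}
```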
Nevertheless, every request, unless terminated earlier by the rewrite/return directives or by access restrictions, finally reaches the NGX_HTTP_CONTENT_PHASE in some location and uses the settings of that exact location, in particular its content handler. It does not inherit any settings from the locations it has already passed through. The content handler can be specified explicitly (for example, http_proxy_module using the proxy_pass directive, http_fastcgi_module using the fastcgi_pass directive, etc.), or it will be attached implicitly by the http_static_module (I'm going to call it the static content handler).
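A minimal sketch of explicit versus implicit content handlers follows. The upstream address, the socket path and the URI prefixes are assumptions chosen only for illustration:

```nginx
location /api/ {
    # explicit content handler: proxy module
    proxy_pass http://127.0.0.1:8080;            # assumed backend address
}

location ~ \.php$ {
    # explicit content handler: FastCGI module
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/run/php/php-fpm.sock;     # assumed socket path
}

location /downloads/ {
    # no handler directive here, so the static content handler is used
}
```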
What do I mean by "implicit internal redirect via the index directive"? Well, every location that uses the static content handler has an NGX_HTTP_PRECONTENT_PHASE handler equal to the try_files $uri $uri/ =404 directive (unless another one is specified explicitly using a try_files directive with some other parameters). Every location has an index directive, either declared explicitly, inherited from the previous configuration level, or having the default value of index index.html. That is, assuming we didn't have any explicitly defined index directive at the server or http levels, the following location blocks are equal:
location / {
index index.html;
try_files $uri $uri/ =404;
}
location / {
try_files $uri $uri/ =404;
}
location / {}
The $uri/ parameter of the try_files directive allows further request processing inside the current location if $uri is an existing physical directory under the location root (the actual file path checked against the local filesystem is the concatenation of the $document_root and $uri strings). Later, during the NGX_HTTP_CONTENT_PHASE, the index directive can cause an internal redirect if an index file is present inside this directory. A quote from the index directive documentation:
It should be noted that using an index file causes an internal redirect, and the request can be processed in a different location. For example, with the following configuration:
location = / {
index index.html;
}
location / {
...
}
a / request will actually be processed in the second location as /index.html.
Summing all this up, if you have a configuration like
location / {
index index.php;
try_files $uri $uri/ /index.php$is_args$args;
add_header X-Content 'static';
}
location ~ \.php$ {
...
fastcgi_pass ... # fastcgi_module handler used here for content generation
}
- every request for a PHP file will go straight to the PHP handler due to the \.php$ regex pattern match, since regex locations have greater priority than prefix ones (unless the prefix location is declared using the ^~ modifier);
- every request for a non-existing file or directory will go to the PHP handler due to the internal redirect to /index.php issued by the try_files directive;
- every request for an existing directory containing an index.php file will go to the PHP handler due to the internal redirect to $uri/index.php issued by the index directive.
That is, only an existing non-PHP static file will ever have a chance to get that X-Content: static header in the HTTP response.
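For completeness, here is a sketch of the ^~ modifier mentioned in the first item. The /static/ prefix and the socket path are hypothetical; with this modifier nginx skips the regex location check for matching URIs, so those requests never reach the PHP handler:

```nginx
location ^~ /static/ {
    # regex locations are not checked when this prefix is the longest match
    add_header X-Content 'static';
}

location / {
    index index.php;
    try_files $uri $uri/ /index.php$is_args$args;
}

location ~ \.php$ {
    # PHP handler; not consulted for URIs starting with /static/
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/run/php/php-fpm.sock;     # assumed socket path
}
```

Keep in mind that a .php file placed under /static/ would then be sent as a plain file rather than executed, so this only makes sense for directories that hold no scripts.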
Configuration examples for web applications
PHP-driven applications
To add an expiration date for every static file, you can use the following configuration:
location / {
index index.php;
expires 30d;
try_files $uri $uri/ /index.php$is_args$args;
}
location ~ \.php$ {
... php-fpm configuration
}
As already said, this won't affect any PHP file or "virtual" application route in any way.
Applications driven by JavaScript frameworks (Angular/React/Vue/etc.)
Web applications of that kind are usually served using a configuration similar to the following one:
location / {
try_files $uri /index.html;
}
It is the index.html file that takes the role of a "route controller" here. To exclude it, as well as any "virtual" application route, from the caching policy, you can split that location in two:
location / {
expires 30d;
try_files $uri /index.html;
}
location = /index.html {
try_files $uri =404;
}
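If you would rather tell browsers explicitly to revalidate index.html (instead of simply sending no caching headers for it), one possible variation, which is not part of the configuration above, is a negative expires value. It makes nginx send an Expires date in the past together with Cache-Control: no-cache:

```nginx
location / {
    expires 30d;
    try_files $uri /index.html;
}

location = /index.html {
    # negative time: "Expires" in the past and "Cache-Control: no-cache"
    expires -1;
    try_files $uri =404;
}
```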
Additionally, you can define custom cache policies for different assets directories, for example:
location / {
expires 30d;
try_files $uri /index.html;
}
location /assets/ {
expires 90d;
try_files $uri =404;
}
location = /index.html {
try_files $uri =404;
}
However, when it comes to JavaScript-driven web applications, you may want to exclude JavaScript files from caching too, which brings us to the next and last part of the answer.
Different cache policies for the different file types
OK, you say, that's all very interesting, but I need different cache policies for different file types, and there are static files that I don't want to be cached at all, e.g. JavaScript ones. That means I still have to use regex locations for those file types, right?
No, most probably not.
It is very sad that there are so many examples all over the Internet, including those from respected sources, like deployment recommendations from official web applications documentation, that use this approach. There is another, much more efficient way to do the same.
You can derive your caching policy (as well as many other location settings) from the MIME type value that nginx will send as the Content-Type HTTP response header. That value is taken from the mime.types file included in your main nginx.conf configuration file (usually located in the /etc/nginx directory) and is available to you via the $sent_http_content_type nginx internal variable. Let's take the nginx deployment configuration recommended by the Joomla documentation and make it somewhat more performant. Instead of
server {
...
location / {
try_files $uri $uri/ /index.php?$args;
}
...
location ~* \.(ico|pdf|flv)$ {
expires 1y;
}
location ~* \.(js|css|png|jpg|jpeg|gif|swf|xml|txt)$ {
expires 14d;
}
}
you can use a map block to evaluate the expires directive value (the MIME types used here are taken from the nginx 1.21.4 default mime.types file and may change in the future; check your own mime.types file for the actual values):
map $sent_http_content_type $expires {
image/x-icon 1y;
application/pdf 1y;
video/x-flv 1y;
application/javascript 14d;
text/css 14d;
image/png 14d;
image/jpeg 14d;
image/gif 14d;
application/x-shockwave-flash 14d;
text/xml 14d;
text/plain 14d;
default off;
}
Then instead of those three locations shown above you can use this single non-regex location:
server {
...
location / {
expires $expires;
try_files $uri $uri/ /index.php?$args;
}
...
}
One can ask here, what's the point of removing 6 configuration lines and adding 15? Isn't our config only getting bigger? Or even more interesting: hey, I can shrink that map block by using a regex pattern to cover every image type, the following way:
map $sent_http_content_type $expires {
image/x-icon 1y;
~^image/ 14d;
...
}
Yes, you can. Don't do it. This isn't about the configuration size but about performance, which should be your primary goal (at least I think so). When the map table contains only strings, it becomes a hash table internally, with O(1) lookup time complexity. When you add regular expressions to it, it gets split into two parts: a hash table of fixed values and a list of regex patterns. If there is no exact match in the hash table part, the source value is matched against the regex patterns from the list one by one, until the first match is found or the whole list is exhausted. That is, if performance is really your primary goal, you won't do it (although in this particular case there would still be some performance benefit from replacing two regex matching operations with a single one).
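Coming back to the question this part started with (some file types cached, JavaScript not cached), an exact-match-only map could look like the sketch below. It assumes the MIME types from the mime.types file mentioned above, a hypothetical document root, and uses the epoch value of the expires directive, which makes nginx send Cache-Control: no-cache for JavaScript files. Note that the map block has to be placed at the http level:

```nginx
http {
    include mime.types;

    # only exact strings here, so the lookup stays a hash table
    map $sent_http_content_type $expires {
        application/javascript  epoch;   # ask browsers not to reuse JavaScript files
        text/css                14d;
        image/png               14d;
        image/jpeg              14d;
        image/gif               14d;
        default                 off;     # no caching headers for anything else
    }

    server {
        listen 80;
        root /var/www/example;           # hypothetical document root

        location / {
            expires $expires;
            try_files $uri $uri/ /index.php?$args;
        }

        # PHP handler location omitted for brevity
    }
}
```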