I'm currently hosting a couple small static websites using Nginx 1.10.2 on Fedora 25 Server Edition x86_64.

I've configured Nginx to assume the .html for requests with no file extension (try_files), and to redirect (permanent rewrite) to the .html-less version of a URL if .*\.html is requested.

I also have custom error pages. As far as I can tell so far, the error_page directive doesn't cooperate well with rewrites, because I'm getting stuck with redirect loops for pages that should return a normal error message... I believe this is all the pertinent configuration:

server {
    [...]

    try_files  $uri  $uri.html  $uri/index.html  =404;

    error_page  400  /errors/400.html;
    error_page  401  /errors/401.html;
    error_page  403  /errors/403.html;
    error_page  404  /errors/404.html;
    error_page  408  /errors/408.html;
    error_page  500  /errors/500.html;
    error_page  503  /errors/503.html;

    # Remove "index" from URL if requesting the front page.
    rewrite  "^/index(\.html)?$"    "/"    permanent;

    # Strip .html file extension, trailing slash, or "index.html".
    rewrite  "/(.+)(\.html|/|/index.html)$"  "/$1"  permanent;

    [...]
}

Here's what I think Nginx is doing:

  1. client requests /fake-page
  2. look up /fake-page, /fake-page.html, or /fake-page/index.html.
  3. when none of those match, internally redirect to show Error 404 page.
  4. but then /errors/404.html gets stripped of .html with permanent flag, causing a 301 user redirection.

I've tried several variations on that last rewrite line, even putting it in a location ^~ /errors/ {} block (which I think should mean that the rewrite only applies to URLs that are not under the /errors/ directory). But everything I've done has resulted in an Error 404 permanently redirecting to the 404 page, which then does not return the actual 404 status --- or it ends up stuck in a redirect loop.
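
For reference, the four steps listed above can be reproduced with a stripped-down server block. This is only a sketch: the port and root path are placeholders, and the comments describe the expected behaviour, given that server-level rewrite directives are evaluated again after an internal redirect.

```nginx
server {
    listen  8080;            # placeholder port (assumption)
    root    /path/to/root;   # placeholder document root (assumption)

    # Steps 1-2: /fake-page matches none of these, so the request ends in 404.
    try_files  $uri  $uri.html  $uri/index.html  =404;

    # Step 3: the 404 triggers an internal redirect to /errors/404.html.
    error_page  404  /errors/404.html;

    # Step 4: this server-level rule is applied to the internally redirected
    # URI as well, so /errors/404.html becomes a 301 to /errors/404 and the
    # client never sees the 404 status.
    rewrite  "/(.+)(\.html|/|/index.html)$"  "/$1"  permanent;
}
```

Requesting a non-existent page against such a block and inspecting the response headers should show the 301 rather than a 404, which matches the symptom described above.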

3 Answers

I would suggest that you wrap the rewrites inside location blocks otherwise it is difficult to control their global reach.

This example seems to work for the snippet you published in your question:

server {
    root /path/to/root;

    location / {
        # Remove "index" from URL if requesting the front page.
        rewrite  "^/index(\.html)?$"    "/"    permanent;

        # Strip .html file extension, trailing slash, or "index.html".
        rewrite  "/(.+)(\.html|/|/index.html)$"  "/$1"  permanent;

        try_files  $uri  $uri.html  $uri/index.html  =404;
    }

    error_page  400  /errors/400.html;
    error_page  401  /errors/401.html;
    error_page  403  /errors/403.html;
    error_page  404  /errors/404.html;
    error_page  408  /errors/408.html;
    error_page  500  /errors/500.html;
    error_page  503  /errors/503.html;

    location /errors/ {
    }
}
Richard Smith
  • Okay, I wrapped all `rewrite`s in a `location / {}` block as you suggested. But I'm still having the issue where navigating to `/fake-page` does a 301 redirect to the Error 404 page, rather than simply returning the error 404 status. –  Feb 12 '17 at 21:45

This was a tough nut but finally this works as requested:

server {

    [...]

    root /path/to/root;
    set $docroot "/path/to/root";
    error_page  400  /errors/400.html;
    error_page  401  /errors/401.html;
    error_page  403  /errors/403.html;
    error_page  404  /errors/404.html;
    error_page  408  /errors/408.html;
    error_page  500  /errors/500.html;
    error_page  503  /errors/503.html;

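    # Serve the 404 page only through internal redirects (error_page),
    # never in response to a direct client request.
    location = /errors/404.html {
        root $docroot;
        internal;
    }

    # Redirect /index and /index.html to the front page.
    location ~ ^/index(\.html)?$
    {
        return 301 "/";
    }

    # Front page.
    location ~ ^/$
    {
        try_files  $uri  $uri.html  $uri/index.html  =404;
    }

    # URLs ending in .html, a slash, or /index.html: redirect to the
    # extension-less URL only if a matching file exists, otherwise 404.
    location ~ ^/(.+)(\.html|/|/index\.html)$
    {
        if (-f $docroot/$1) {
            return 301  "/$1";
        }

        if (-f $docroot/$1.html) {
            return 301  "/$1";
        }

        if (-f $docroot/$1/index.html) {
            return 301  "/$1";
        }

        # No matching file: "missing_file" is a name that is not expected
        # to exist, so this falls through to =404.
        try_files missing_file  =404;
    }

    # Everything else. The lookahead is evaluated at the end of the string,
    # so in practice this matches any remaining non-empty path.
    location ~ ^/(.+)(?!(\.html|/|/index.html))$
    {
        try_files  $uri  $uri.html  $uri/index.html  =404;
    }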

    [...]
}

I'll try to expand this with comments a bit later ;)

Anubioz
  • Haven't had a chance to test this yet, but those `if` lines make me nervous. The Nginx wiki is [pretty emphatic](https://www.nginx.com/resources/wiki/start/topics/depth/ifisevil/) about avoiding `if` whenever possible... –  Feb 13 '17 at 08:05
  • @Niccolo there is no other way to accomplish what you want, since you'd have to check if file exists before making decision whether to do the redirect or to show an error page... – Anubioz Feb 13 '17 at 13:09
  • nginx: [emerg] unknown "docroot" variable –  Feb 14 '17 at 20:54
  • Oh, never mind, I missed the part where you added `set $docroot...` to the configuration. It's kind of ugly, but maybe *what I want Nginx to do* is ugly and complicated... Your solution does exactly what I asked for. So, thanks for your help! –  Feb 14 '17 at 21:12

I've found a simpler solution, inspired by @Anubioz's answer.

server {
    [...]

    # This part's pretty common.
    # Assume ".html" or ".../index.html" if the original request doesn't
    # match any real files. Return error 404 if still can't find any
    # matching files.
    try_files  $uri  $uri.html  $uri/index.html  =404;

    # Match any resulting HTTP status codes with the custom error pages
    # that I designed.
    error_page  400  /errors/400.html;
    error_page  401  /errors/401.html;
    error_page  403  /errors/403.html;
    error_page  404  /errors/404.html;
    error_page  408  /errors/408.html;
    error_page  500  /errors/500.html;
    error_page  503  /errors/503.html;

    # Make sure the paths to the custom error pages are not transformed
    # before sending the actual status codes to a request.
    # These error pages are for *internal* use only.
    location = "/errors/400.html" {
        internal;
    }
    location = "/errors/401.html" {
        internal;
    }
    location = "/errors/403.html" {
        internal;
    }
    location = "/errors/404.html" {
        internal;
    }
    location = "/errors/408.html" {
        internal;
    }
    location = "/errors/500.html" {
        internal;
    }
    location = "/errors/503.html" {
        internal;
    }

    # Remove "index" from URL if requesting the front page.
    location ~ "^/index(\.html)?$" {
        return  301  "/";
    }

    # Strip .html file extension, trailing slash, or index,
    # forcing the "pretty" URL, if user requests a ".html" URL.
    location ~ "^/(.+)(\.html|/|/index|/index\.html)$" {
        return  301  "/$1";
    }

    [...]
}

The location blocks containing the internal directive keep the error pages from being touched by the usual URL processing. I did try matching just the /errors/ directory so I wouldn't have to repeat the internal directive over and over, but Nginx picks a location by precedence rather than by its order in the file: exact (=) matches are checked first, and a matching regex location overrides a plain prefix location. In my testing, only exact matches to each custom error page took priority over the regex locations, so that they could grab the error pages before the URL-stripping rules got to them.

(My use of Nginx jargon might be a little off, but this is what seems to me to be happening. And this configuration does exactly what the code comments say, which is what I'd wanted from the beginning.)
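
One possible way to avoid repeating the exact-match blocks is the `^~` prefix modifier, which tells Nginx to skip regex locations when that prefix is the longest match. The sketch below is untested against the configuration above and assumes all error pages live under /errors/ in the document root; the root path is a placeholder.

```nginx
server {
    root  /path/to/root;   # placeholder document root (assumption)

    try_files  $uri  $uri.html  $uri/index.html  =404;

    error_page  404  /errors/404.html;
    # ... remaining error_page lines as in the answer above ...

    # "^~" makes this prefix match final: when it is the longest matching
    # prefix, the regex locations below are not checked at all.
    location ^~ /errors/ {
        internal;
    }

    # Remove "index" from URL if requesting the front page.
    location ~ "^/index(\.html)?$" {
        return  301  "/";
    }

    # Strip .html file extension, trailing slash, or index.
    location ~ "^/(.+)(\.html|/|/index|/index\.html)$" {
        return  301  "/$1";
    }
}
```

Note that `internal` here applies to everything under /errors/, so any other file in that directory would also stop being reachable by direct requests.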