43

I'm currently writing a web application, and my client asked whether it would be possible to suggest a valid URL to users when they make a typo in the URL bar. For example:

  • Bob navigates to 'https://www.example.com/product'
  • The web server is unable to find the route '/product', but knows that the route '/products' does exist
  • The web server suggests that Bob navigate to '/products' instead
  • Bob navigates to '/products' and continues browsing the website

This would give Bob a better user experience.

However, it led me to wonder if this is considered bad practice, as the server might expose URLs the admin of the website might not want to show publicly.

Paradoxis
  • Isn't it a self-answered question? – techraf May 19 '16 at 09:19
  • @techraf I know it could be seen as security through obscurity, but I wanted to know if it could be considered bad practice – Paradoxis May 19 '16 at 09:21
  • 1
    It's not security through obscurity, it just depends on requirements. There are cases when such a hint would be beneficial, there are cases when it should be avoided. – techraf May 19 '16 at 10:20
  • 2
    Why not offer search results instead? – Arminius May 19 '16 at 10:57
  • 39
    Keep a blacklist of routes that should never be suggested (such as anything starting with `admin`). Alternatively, keep a whitelist of URLs that are OK to expose. – Anders May 19 '16 at 11:57
  • 1
    Note that https://httpd.apache.org/docs/2.4/mod/mod_speling.html is a thing though I think it's more targeted for files versus websites with more dynamic routes – Foon May 19 '16 at 16:16
  • I've personally had times where I typed in `http://www.example.com/help/` (notice trailing slash), when the web admin had hoped I typed `http://www.example.com/help` with no trailing slash, and I was greeted by a directory listing of `/var/www/html/help` for example. – Mark Stewart May 19 '16 at 16:27
  • Is this actually an issue for you though? How often is bob navigating to `/product` as opposed to clicking a bookmark, or clicking a link from somewhere else? Forget about security, this just seems like a lot of work for a very small benefit. – David says Reinstate Monica May 20 '16 at 17:52
  • One potential pitfall: Unless you can think of a smarter way to implement this, it seems like it would be exponential in the number of substitutions or deletions you search, i.e. O(L^n) where L is the length of the string and n is the number of subs/dels. Be careful to set the limit low enough to avoid making DoS easy. – Tyler May 20 '16 at 01:07
  • 2
    I support Anders' answer, though I would like to add something. If simply exposing a URL would result in a security problem, then you **already have** a security problem right now. Security through obscurity is **not** an advisable practice. That being said, it does of course look unprofessional when inaccessible admin URLs are suggested to a stranger visiting the site. – Potaito May 21 '16 at 10:00

7 Answers

63

If Bob is trying to type products and mistypes product, he already knows there's a URL in the website for products and so you're not telling him anything he doesn't know. If you don't suggest URLs that shouldn't be public, you won't have any issues.
Why use a 404 message though, and not do an immediate redirect?

David Glickman
  • I like the idea of the redirect. And I just want to add that WordPress by default redirects the user to the correct page: example.com/pag is redirected to example.com/page – Lukas May 19 '16 at 09:49
  • 54
    +1. There's only one detail: "Why not use an immediate redirect?". There could be multiple matches. Just think of the auto-correct function of your smartphone - even small typos may produce entirely unintended results. Same applies for the problem with the immediate redirect – Paul May 19 '16 at 09:49
  • Exactly what @Paul says, there could be multiple URLs with similar names – Paradoxis May 19 '16 at 10:02
  • 8
    Bob and human beings "who already knew URL" might not be the only clients accessing the web server. – techraf May 19 '16 at 10:29
  • 27
    @Paul mod_speling handles this case - it redirects if there is one match, and offers 300 Multiple Choices if there are more than one. – Niet the Dark Absol May 19 '16 at 11:13
  • @NiettheDarkAbsol of course there's the option of distinguishing between finding one possible match and finding several. But OP didn't make that distinction. – Paul May 19 '16 at 11:15
  • 8
    "Why use a 404 message though, and not do an immediate redirect?" - because the suggestion might be wrong. Maybe there is more than one path similar to what the user typed? – Tomáš Zato - Reinstate Monica May 19 '16 at 14:45
  • 3
    @DavidGlickman [This is why.](http://www.catb.org/jargon/html/D/DWIM.html) – Mason Wheeler May 19 '16 at 14:55
  • 2
    `[why] not do an immediate redirect?` More technical reasons (which have been mentioned) aside, this is a *terrible* UX choice, even if the redirect is correct and there was only one possible suggestion. Unexpected redirects aren't something most people want to be subjected to unless absolutely necessary. – Jules May 19 '16 at 16:34
  • Wikipedia used to do a delayed redirect, but not anymore: http://en.wikipedia.org/Example. – Kevin May 19 '16 at 16:41
  • It could hurt SEO to do an immediate redirect. Google flagged this as an issue we needed to fix. (Multiple URLs pointing to the same content.) I am unsure if we are penalized for it but if Google sees that as a problem worth mentioning you should take that into consideration. – Bacon Brad May 19 '16 at 16:53
  • @NiettheDarkAbsol I love the ironic spelling of that module but at the same time it really irritates my inner OCD. – Cave Johnson May 19 '16 at 21:13
  • @Paul "Products (disambiguation)" – Michael May 19 '16 at 21:44
  • @baconface: You can [work around that](https://support.google.com/webmasters/answer/139066?hl=en). In particular, an HTTP 301 redirect would be appropriate for a situation where "URL A is a non-canonical equivalent of URL B." – Kevin May 20 '16 at 03:44
  • 1
    @Paul An HTTP 300 Multiple Choices could be of use maybe – Thomas May 20 '16 at 06:15
  • @Thomas It doesn't look to me like that really applies here. That would apply more to, if there were an HTML5 version of the page as well as a legacy version for a browser that doesn't support HTML5. – sig_seg_v May 20 '16 at 08:34
  • @Andrew At least that one has a reason for being, while the Referer header has no excuse :D – Niet the Dark Absol May 20 '16 at 10:24
  • 2
    Another advantage of a “did you mean X?” page over an auto-redirect: it gives the user the information that their url was not correct (which they may not notice, if you auto-redirect). This allows them to correct their url, in case they’re sharing it, linking it on their own site, etc. – PLL May 20 '16 at 15:41
13

However, it led me to wonder if this is considered bad practice, as the server might expose URLs the admin of the website might not want to show publicly.

  1. This suggests that the feature is implemented by checking a list of all possible valid URLs (a list the server may not even have or be easily able to get), to include non-public ones, and comparing the requested URL to them.
  2. This suggests that there are URLs which are a secret. While there may be some valid use cases for this, in general, pages that you don't want people to access should not become accessible merely because someone guesses or knows the URL. However, from a user experience perspective, it might be annoying for users to be offered a URL that they can't actually access.

The feature could easily be implemented by having an explicit list of URLs of public pages that it compares the requested URL to instead. It might be reasonable, if there is only one match, to directly redirect the user (using a 302 Found HTTP code) to the proper URL, or to a search page. If there are multiple matches, it might be reasonable to present them in a list with a 300 Multiple Choices HTTP code.
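
A minimal sketch of that idea in Python, not a definitive implementation. The `PUBLIC_ROUTES` list, the `suggest_response` helper, and the 0.8 similarity cutoff are all assumptions made for illustration; `difflib.get_close_matches` from the standard library stands in for whatever fuzzy matcher you prefer.

```python
from difflib import get_close_matches

# Hypothetical allow-list: only routes that are safe to reveal to any visitor.
PUBLIC_ROUTES = ["/products", "/about", "/contact", "/help"]


def suggest_response(path, routes=PUBLIC_ROUTES, cutoff=0.8):
    """Return (status_code, candidate_routes) for a path that was not found."""
    # n=5 caps the number of suggestions; cutoff is an arbitrary threshold.
    matches = get_close_matches(path, routes, n=5, cutoff=cutoff)
    if len(matches) == 1:
        return 302, matches  # single candidate: redirect via a Location header
    if matches:
        return 300, matches  # several candidates: let the user choose
    return 404, []           # nothing similar: plain "not found"
```

Because only the allow-list is ever searched, a mistyped `/admi` cannot surface an admin route: that route is simply not among the candidates.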

Random832
  • 2
    I agree with using status codes like 302, not 200. If you make a page saying "here are some more-valid URLs I found...", and that page is 200, then you risk web spiders (like Googlebot) learning about such a page, and indexing it. Suddenly, /products/ may generate tons of new "mirror" URLs. – TOOGAM May 20 '16 at 03:40
  • 3
    @TOOGAM Yes, always use the correct HTTP status code. Add helpful page content to the response if you want to, but never return a "file not found" response as a HTTP 200 response. Doing so will break all kinds of things! – user May 20 '16 at 08:11
12

I would say that keeping a URL secret is not really the best security practise. You may have some links, whether hidden or generated by JavaScript, that reveal the admin URL to anyone who takes a look. I think this is even more true for single-page applications (SPAs).

I don't think there is any point in hiding navigation URLs. If you're sure you have done your job of protecting those URLs, you're fine.

I would also say that developing this functionality would make you more aware of the security of those URLs.

Walfrat
3

I think the best answer here is to handle the correction with a 301 (permanent) redirect.

If you can anticipate misspellings and common issues and catch those in your webserver configuration (whether Apache, Nginx, or IIS), the entire activity should be completely transparent to the user.

In your web application you could add some additional handling to alert the user that they have been redirected if you want. I've seen this done with a kind of unobtrusive alert overlay which disappears after a few seconds. I can't recall where I saw it though.

1

In whatever script you use to determine the URLs to offer as suggestions, filter out any admin URLs.
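
As a rough illustration, such a filter could look like the Python sketch below. The `DENIED_PREFIXES` tuple and the `filter_suggestions` helper are hypothetical names, and the prefixes themselves are only examples.

```python
# Hypothetical prefixes that must never appear in a suggestion.
DENIED_PREFIXES = ("/admin", "/internal")


def filter_suggestions(candidates, denied=DENIED_PREFIXES):
    """Drop every candidate route that starts with a denied prefix."""
    # str.startswith accepts a tuple and checks each prefix in turn.
    return [route for route in candidates if not route.startswith(denied)]
```

Note that a deny list like this fails open: a new private route that nobody adds to the list would still be suggested, so an explicit list of public routes may be the safer default.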

Micheal Johnson
1

This isn't a security issue. A 404 status is intended to inform your visitor that they have requested a resource that the server does not know about. It's very reasonable to include some help in the response. For example, many servers offer search functionality on their 404 page. If you can offer useful suggestions, you are only helping. (Of course, offering 'admin' URLs probably isn't helpful.)
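
One way to picture a helpful 404 is the Python sketch below. The `not_found_response` helper and its return shape are assumptions for illustration, not any particular framework's API. The status stays 404, and the requested path is escaped because it is user-controlled input that gets echoed back into the page.

```python
from html import escape


def not_found_response(requested, suggestions):
    """Build (status_line, headers, body) for a 404 page with optional hints."""
    lines = [
        "<h1>Page not found</h1>",
        "<p>No page exists at {}.</p>".format(escape(requested)),
    ]
    if suggestions:
        lines.append("<p>Did you mean:</p><ul>")
        for route in suggestions:
            href = escape(route, quote=True)
            lines.append('<li><a href="{0}">{0}</a></li>'.format(href))
        lines.append("</ul>")
    body = "\n".join(lines).encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # The status remains 404 even though the body tries to be helpful.
    return "404 Not Found", headers, body
```

The `suggestions` argument is expected to contain only routes that are already safe to show publicly.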

If your server is insecurely configured, then you should address that problem, but you need to do that regardless of whether you try to provide a more helpful 404 response.

0

Suggesting an "actual" URL would only be a security risk if it (and the source HTML - "view source") identified the backend technology (CMS etc.) and whether the system/CMS had any security flaws.

For example, if I ran a WordPress site that used the traditional URL format of http://example.com/2016/05/20/my-article, and the source code contained the wp-content directories as resource URLs (CSS, JS, pictures etc.), then it wouldn't matter how complicated my URLs were - the backend has been revealed and it would only be a matter of time before a hacker found my admin URLs.

On the other side of that, if my URLs only contained simple friendly URLs and my source code had not-very-easily-identified directories or markup then it would be very difficult to find the system/CMS in the background and it would be much harder to attack.

Obviously, as many have said, hiding admin URLs would "help" and you could implement such mechanisms as .htaccess to prevent some access by IP (for your admin pages etc.) but there'll never be 100% protection.

Essentially, the less your site gives away, the more secure (IMO) it has become; and suggesting real URLs (that aren't your admin ones) has no effect on your site security.

Kinnectus