When using nginx maps, it is possible to rewrite multiple URLs with a map file. Things get problematic when the URL contains special characters. I have been racking my brain trying to get this right, and hope this question and its solution might save others from going gray.
Let's set the scenario.
A Linux server (Debian/Ubuntu) running standard nginx. DNS points to this server and resolves to a server config. A map file that contains no duplicate entries, with incoming and outgoing URLs (both resolvable).
The map setup would contain the following:
map $host$request_uri $rewrite_uri {
    include /<path to file filename>;
}
The map file itself contains one entry per line, terminated with a semicolon:
example.com/Böhme https://anotherexample.org/SomeWeirdPath/Böhme;
The server config for this mapping to work:
server {
    listen 443 ssl http2;
    ssl_certificate /<absolute path to crt file>;
    ssl_certificate_key /<absolute path to key file>;
    server_name example.com;
    proxy_set_header X-Forwarded-For $remote_addr;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_dhparam <absolute path to Diffie Hellman key>;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains";
    server_tokens off;
    if ($rewrite_uri) {
        rewrite ^ $rewrite_uri redirect;
    }
    rewrite ^ <default URL> redirect;
}
I have simplified this server config so we can concentrate on the map settings. The config assumes that the domain uses SSL and that the certificate is valid. The if statement only executes if $host$request_uri is in the list with a $rewrite_uri; otherwise the last rewrite is executed.
The Question
How do I transform the $request_uri so that nginx understands it correctly? The map file contains the value in UTF-8, but it seems that nginx wants the $request_uri URL-encoded and in hexadecimal.
$request_uri as in the map file:
example.com/Böhme
$request_uri URL-encoded, as sent by the browser:
example.com/B%C3%B6hme
$request_uri as I think nginx wants it:
example.com/B\xC3\xB6hme
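To make the three forms above concrete, here is a minimal bash sketch that percent-encodes the key of a map entry. It rests on an assumption not confirmed in this post: that nginx compares map keys byte for byte against the raw, still percent-encoded $request_uri, so the key would need to look like what the browser sends. The function name urlencode_path is made up for illustration, and the target URL is deliberately left untouched because whether it also needs encoding is not settled here.

```shell
#!/usr/bin/env bash
# urlencode_path (hypothetical helper): percent-encode every byte of $1
# except the unreserved characters A-Z a-z 0-9 - . _ ~ and the "/" separator.
# It works on the hex dump from od, so it does not depend on the locale.
urlencode_path() {
    local hex out=""
    for hex in $(printf '%s' "$1" | od -An -v -tx1); do
        case "$hex" in
            # 2d "-", 2e ".", 2f "/", 5f "_", 7e "~", 30-39 digits,
            # 41-5a upper-case letters, 61-7a lower-case letters
            2d|2e|2f|5f|7e|3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a])
                out+=$(printf "\\x$hex") ;;   # keep the character as-is
            *)
                out+="%${hex^^}" ;;           # emit %XX with upper-case hex
        esac
    done
    printf '%s\n' "$out"
}

# Demo: encode only the key (first field) of a map line.
while read -r key target; do
    printf '%s %s\n' "$(urlencode_path "$key")" "$target"
done <<'EOF'
example.com/Böhme https://anotherexample.org/SomeWeirdPath/Böhme;
EOF
# prints: example.com/B%C3%B6hme https://anotherexample.org/SomeWeirdPath/Böhme;
```

If that assumption holds, the same loop could be pointed at the real map file instead of the here-document to produce an encoded copy for the include.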
I can't seem to find a system package that has this feature, and I think I am starting to reinvent the wheel here.
I would need to:
create a function that URL-decodes the entries in the list, as per "How to decode URL-encoded string in shell?"
function urldecode() { local i="${*//+/ }"; echo -e "${i//%/\\x}"; }
and then use octal dump (od) as per "Convert string to hexadecimal on command line", so the map bucket is created in memory with the correct values for the if statement test.
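The two steps above can be sketched in bash as follows. This is only an illustration of what decoding plus a hex dump produces, not a confirmed fix; hexbytes is a hypothetical helper name. It suggests that the \xC3\xB6 form is just another notation for the same UTF-8 bytes the map file already contains.

```shell
#!/usr/bin/env bash
# Decode %XX escapes into raw bytes, same idea as the one-liner above.
# Note: turning "+" into a space is a form-encoding convention; in a URL
# path "+" is a literal plus sign, so this may be unwanted here.
urldecode() {
    local s="${1//+/ }"
    printf '%b\n' "${s//%/\\x}"   # %b expands \xHH escapes, like echo -e
}

# hexbytes (hypothetical helper): dump a string as space-separated hex bytes.
# The unquoted $(...) is intentional: it collapses od's padding whitespace.
hexbytes() {
    echo $(printf '%s' "$1" | od -An -v -tx1)
}

hexbytes "$(urldecode 'B%C3%B6hme')"   # 42 c3 b6 68 6d 65
hexbytes 'Böhme'                       # 42 c3 b6 68 6d 65
```

Both calls print the same byte sequence, so the browser's B%C3%B6hme and the map file's Böhme differ only in whether the bytes are percent-encoded.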
It's starting to feel like rocket science, and I can't believe that nobody else has solved this problem before; I just can't seem to find a solution.