
Take the following blog page as a case:

http://www.roney.com.br/2010/06/20/estados-do-brasil-um-pais-que-precisa-se-unir/

Careful: it has lots of embedded YouTube videos, so it loads slowly. It is a Brazilian web page, written in Portuguese but hosted (according to the blog's owner) on a US web host.

Of interest are the "Pronúncia" links, which point to file names containing non-ASCII characters. Look at the second one (for Pará): as I write this, the link is www.roney.com.br/wp-content/uploads/2010/06/par%E1.mp3 (unless he changes it out from under me in the future).

As you can see, the link is percent-encoded, but what you can't tell is what he actually named the file on his file system or how the server is configured.

If I click it in my Firefox browser I get their 404 page. He claims those links are working for Brazilian visitors. I thought this was a 100% server thing, i.e. either the server will serve it or it won't. Just for laughs I set the preferred language to Portuguese in my Firefox but as I suspected, it didn't make any difference.

Would anyone care to offer any insight into how this might work in Brazil but not in the USA, or what I could tweak on my own workstation so that the files would be served for me too?

1 Answer

The problem lies in the URI encoding. Here the file name is encoded as ISO-8859-1 (Latin-1) and then percent-encoded, so "á" becomes %E1. RFC 3986 recommends encoding the characters as UTF-8 first and then percent-encoding them, which would give %C3%A1 instead.

Source:

More info about percent-encoding on Wikipedia.

The actual RFC 3986.
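To make the difference between the two encodings concrete, here is a minimal Python sketch. It assumes the file is named "pará.mp3", which is only inferred from the link in the question and is not confirmed by the site owner. It uses the standard library's `urllib.parse.quote` to percent-encode that name from its Latin-1 bytes and from its UTF-8 bytes.

```python
from urllib.parse import quote

# Assumed file name, inferred from the link in the question.
name = "pará.mp3"

# Percent-encode the ISO-8859-1 (Latin-1) bytes: "á" is the single byte 0xE1.
latin1_link = quote(name, encoding="latin-1")

# Percent-encode the UTF-8 bytes: "á" is the two bytes 0xC3 0xA1.
utf8_link = quote(name, encoding="utf-8")

print(latin1_link)
print(utf8_link)
```

The first form is what the page currently links to; the second is the form RFC 3986 recommends.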

Solution:

To give you an idea of how to solve this, you can do something like this in PHP:

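<?php
// Decode the percent-encoded Latin-1 name, convert it to UTF-8, then percent-encode it again.
// mb_convert_encoding() (mbstring extension) stands in for utf8_encode(), deprecated as of PHP 8.2.
$name = mb_convert_encoding(rawurldecode('par%E1.mp3'), 'UTF-8', 'ISO-8859-1');
echo rawurlencode($name); // par%C3%A1.mp3
?>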

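Note that if you pass in the whole URI, the slashes (/) and the colon will be percent-encoded as well, making the URI invalid. Apply the conversion to the file name (or to each path segment) only.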
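One way around the slash problem is to split the URI first and re-encode each path segment separately. The following Python sketch illustrates the idea. The helper name `reencode_path` and its `source_encoding` parameter are hypothetical, and it assumes the whole path was originally percent-encoded from Latin-1 bytes.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def reencode_path(url, source_encoding="latin-1"):
    """Re-encode every path segment of url as percent-encoded UTF-8.

    Hypothetical helper: assumes the path was percent-encoded from
    source_encoding bytes. Scheme, host, query and fragment are untouched.
    """
    parts = urlsplit(url)
    segments = [
        # Decode the segment with the assumed source encoding, then
        # percent-encode its UTF-8 bytes; safe="" so nothing is skipped.
        quote(unquote(segment, encoding=source_encoding), safe="")
        for segment in parts.path.split("/")
    ]
    # Joining with "/" keeps the separators literal instead of encoding them.
    return urlunsplit(parts._replace(path="/".join(segments)))


fixed = reencode_path(
    "http://www.roney.com.br/wp-content/uploads/2010/06/par%E1.mp3"
)
print(fixed)
```

Because the path is split on "/" before encoding, the separators never pass through `quote`, so the resulting URI keeps its structure.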

Weboide
  • This does solve the issue at hand, i.e. if you apply that trick to the file name and then substitute it into the URI, you can make that website serve the file. But my real question is what software is configured differently between Brazil and the USA, and what I would have to do to serve files with non-ASCII names so that they work everywhere. – Colleen Kitchen Jun 23 '10 at 03:56