The problem: I want to serve CRLF from files encoded as LF

I have an Apache httpd (apache2) web server on Linux, configured to serve large log files (*.log) generated by a Linux-based simulator.

These log files have Unix-style LF line endings rather than Windows-style CRLF. CRLF is also the canonical line ending for text media types in MIME, which HTTP uses to label content.
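A quick way to confirm which line endings a log really contains is to count them. This is a minimal Python sketch, not part of the original setup; the function name `count_line_endings` is made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs and bare LF bytes in a byte string."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains one LF, so subtract those to get bare LFs.
    bare_lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": bare_lf}
```

Reading the first few kilobytes of a log in binary mode and passing them to this function should report only `lf` hits for a file written on Unix.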

When I open them from a browser on Windows, they are loaded into Notepad.exe with all the text (incorrectly) on a single line, unless I rename the logs on the server from *.log to *.txt.

It seems that Windows handles *.txt specially and converts the line endings as the file arrives.

Given these clues, how can I change the setup so that client users see the files correctly, regardless of their platform or browser?


More problem detail: why I can't do the obvious thing

Analysing the logs, I find that .txt is served as mime-type text/plain and .log as text/x-log, but switching .log to text/plain using SetType didn't solve the problem.

In a production system, I will not easily be able to change the files to end in .txt.

The logs are too numerous and too large for me to want to convert them (e.g. with unix2dos) and save another copy. That would also force me either to manage an additional cache of converted files, which would need to be invalidated and cleaned up, or to change the original files, which may break other systems that consume them.

  1. Is there an Apache httpd configuration parameter that tells Windows/Internet Explorer (and other OS/browser combinations) to expand LF to CRLF as the file arrives?
  2. Or can I somehow tell Apache to replace LF with CRLF on the fly as it serves the file?

What I have tried

I have looked at the bundled Apache mod_mime module and its directives AddType and AddCharset but these don't fix the problem, or even claim to.

The Apache documentation is quiet on the line-ending issue.

The MIME documentation on the text type says the content must be in CRLF format.

It also appears that line-ending is not considered by the charset encoding standards.

Alex Brown

2 Answers

It seems like you're going to a lot of trouble to work around a (severe IMHO) limitation in Notepad. Would it be possible to install a smarter text editor on the system, e.g. Notepad++?

Gerald Combs
  • Agree. While it's interesting that `text/plain` is supposed to be served with CRLF endings, I don't really think Apache should be getting involved in converting them; I'd rather it serve what it's given to serve and leave RFC conformity to whatever's generating the files. Get a better editor; I'll throw [Scite](http://www.scintilla.org/SciTE.html) on the recommendation pile. – Shane Madden Sep 15 '11 at 19:35
  • `text/plain` *must* be served with `CRLF` line endings, according to the spec. I can't demand my users use a different text editor. The generating process is generating the correct line-endings for the host machine - unix. – Alex Brown Sep 15 '11 at 23:06

The (imperfect) solution I have settled on is to use Apache's mod_ext_filter:

ExtFilterDefine logwin mode=output cmd=/usr/bin/unix2dos intype=text/x-log
AddOutputFilter logwin .log
# Note that apache2 defines .log as having mime-type text/x-log by default.

Essentially, this says that any file ending in .log should be passed through a line-ending converter before delivery to the client.
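To make the conversion step concrete, here is a minimal Python sketch of what a streaming line-ending converter has to do. It is an illustration only, not the code of unix2dos or of any Apache module, and the name `lf_to_crlf` is hypothetical. It assumes that CRLF pairs already present should be left alone, including the case where the CR and the LF arrive in different chunks.

```python
from typing import Iterable, Iterator


def lf_to_crlf(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield the input chunks with every bare LF expanded to CRLF."""
    prev_was_cr = False  # True if the last byte seen so far was a CR
    for chunk in chunks:
        if not chunk:
            continue  # nothing to convert in an empty chunk
        out = bytearray()
        for byte in chunk:
            if byte == 0x0A and not prev_was_cr:
                out += b"\r\n"  # bare LF: insert the missing CR
            else:
                out.append(byte)  # any other byte, or the LF of a CRLF
            prev_was_cr = byte == 0x0D
        yield bytes(out)
```

The only state carried between chunks is whether the previous byte was a CR, so the file never needs to be held in memory. The byte-by-byte loop keeps the logic easy to follow; it is not meant to be fast.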

This is not a great solution for heavily loaded machines, as forking unix2dos for each request is slower than having Apache do the conversion internally. The conversion is also repeated on every read of the file, which is inefficient.

Sadly, the Apache Software Foundation has not provided a built-in filter module for this scenario, and I don't have time to write and maintain one.

However, I don't expect high load on this machine, so measured by engineering effort, this is a good solution.

  • Conforms to the MIME specs for text/plain by putting CRLF on the wire
  • Requires no special action at the client end. While I'm happy to use a better text editor, I can't expect all users to change over
  • Reconfiguration is easy if I need to move web servers
  • Doesn't require the log generator to use a non-native (non-Unix) line-ending format
Alex Brown