
Problem

We regularly break our files' line endings, and things stop working without us noticing.

Bash complains about "invalid option" or ": command not found" as described here: http://thinkinginsoftware.blogspot.ca/2012/11/linux-server-cries-for-linux-desktop.html

I'm concerned this could break other text files as well (conf, crons...)

How we break it (I suppose)

We are a group of people using Windows, Mac or Linux to edit Linux files on one server. We edit these files manually (ssh + vi/nano, or locally + ftp). Sometimes we copy/paste, and I think this is what's causing the issue. Yes, sometimes we don't test our changes for not-so-good reasons: the same script works on the replicate server, the change is just indenting some lines, etc. I agree this should be addressed.

Using Chef/Puppet-like solutions is not planned.

Update

TL;DR: copy-paste is not the issue, FTP is.

I did some testing by copy/pasting text with Windows line endings (CRLF) using Windows + Notepad++ + PuTTY + nano and vi. It looks like the CR (^M) character is filtered out: only the LF gets pasted into the files. Thanks ewwhite for making me doubt the copy/paste theory!

I transferred a file with CRLF line endings via FTP using FileZilla, with the "Send mode" option set to automatic. The CRLFs are preserved. I wonder if FileZilla could convert them to LFs.

Mitigation

We can neither ban non-Linux OSes nor forbid copy-paste.

I thought of these solutions:

  1. Build a cron.minutely that runs dos2unix or sed on all scripts. Cons: we need to maintain a list of "modifiable text files", as I don't want it to run on /
  2. Use a text editor that would support additional commands after file change. Cons: could break files that legitimately use non-Linux line endings, doesn't work when we ftp scripts.
  3. Use a trigger system like http://inotify.aiken.cz/?section=incron&page=about&lang=en. Cons: ?

Pros of #2 and #3: we could also use these to add a final blank line for programs that need it.
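Option 1 could be sketched roughly as below. This is only an illustration, not a tested production script: the function name `fix_crlf` is made up, it assumes a `grep` that supports `-I` (GNU and BSD grep do) and a `mktemp` command, and it does not handle file names containing newlines. It takes the directories to scan as arguments, so there is no need to run it on /.

```shell
#!/bin/sh
# fix_crlf DIR...  (hypothetical helper, not part of any package)
# Convert CRLF line endings to LF in every text file below the given directories.
fix_crlf() {
    cr=$(printf '\r')
    # -type f skips directories, symlinks, devices, pipes and sockets.
    # grep -I ignores binary files; -l lists files having a CR at the end of a line.
    find "$@" -type f -exec grep -Il "${cr}\$" {} + |
    while IFS= read -r f; do
        tmp=$(mktemp) || break
        # Strip only a CR that sits right before the line end, then write the
        # result back with cat so the original inode, owner and mode are kept.
        sed "s/${cr}\$//" "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
        printf 'converted: %s\n' "$f"
    done
}
```

Saved as a script that calls `fix_crlf /var/www` (or whichever directories apply), it could be run from cron. As with option 2, it would also convert files that legitimately use CRLF, so the directory list still needs some thought.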

Using bash, version 4.2.37(1)-release

Related questions on ^M (CRLF)

Edit: I got one downvote, could you please explain why?

pyb
  • "We can't ban non-Linux OSes" - that's a problem. You have to force some sort of standardization (IMHO) or you'll never solve this problem. Doesn't have to be OS standardization, it can be "you don't edit these files in Windows, you only edit them in Linux", but something. This isn't a technical problem, this is a political / social problem, and trying to come up with a technical solution is doomed to failure. – John Jun 12 '14 at 17:31
  • Copy-paste should not cause this issue... I'd really try to isolate the specific behaviors that cause the problem. – ewwhite Jun 12 '14 at 17:33
  • Ban text editors which can't deal with this intelligently. And ensure that the text editors being used are appropriately configured. – Michael Hampton Jun 12 '14 at 17:36
  • Thanks John, we don't edit those files that often, so we should be able to use a Linux VM when necessary. – pyb Jun 12 '14 at 18:54
  • Can you convince people to start using git or another DVCS for publishing? Then set up hooks that verify the files have the correct line endings and so on? – Zoredache Jun 12 '14 at 20:24
  • Actually those files are versioned on git, but I believe we set `core.autocrlf` to `true` so we could be free to edit them with any text editor. That's a good point! We could enforce LF on git and ask people to use editors that support LF. – pyb Jun 13 '14 at 15:21
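
The hook idea from the comments could look something like the sketch below. It is a simplified illustration under a few assumptions: `check_no_crlf` is a made-up name, `grep -I` (GNU or BSD grep) is assumed, and the function inspects the files as they sit in the working tree rather than the staged content.

```shell
#!/bin/sh
# check_no_crlf FILE...  (hypothetical helper for a hook)
# Report every regular text file that has CRLF line endings.
# Returns 1 if at least one offender was found, 0 otherwise.
check_no_crlf() {
    cr=$(printf '\r')
    status=0
    for f in "$@"; do
        # Skip anything that is not a regular file (deleted paths, directories).
        [ -f "$f" ] || continue
        # -I ignores binary files; the pattern is a CR right before the line end.
        if grep -Iq "${cr}\$" "$f"; then
            printf 'CRLF line endings: %s\n' "$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Possible use at the end of a .git/hooks/pre-commit script
# (breaks on file names containing whitespace):
#   check_no_crlf $(git diff --cached --name-only --diff-filter=ACM) || exit 1
```

A non-zero exit status from a pre-commit hook makes git abort the commit, so the offending file has to be fixed before it can be published.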

1 Answer


I have to deal with this on occasion with some legacy systems. Sometimes the files retained in the organization's source control (Borland Starteam) were set to the wrong linefeed configuration.

But in my experience working in a number of cross-platform environments, copy/paste alone should not cause this issue. Try to identify the trends based on the output from the following and deal with the worst offenders appropriately.

Periodically search for files with DOS linefeeds.

find /var/www -not -type d -exec file "{}" ";" | grep CRLF

Example:

# find /ppro/bin -not -type d -exec file "{}" ";" | grep CRLF
/ppro/bin/compile/save/srcfix.c: ASCII C program text, with CRLF line terminators
/ppro/bin/compile/bldtag.c: ASCII Pascal program text, with CRLF line terminators
/ppro/bin/compile/bldtag.c.sav: ASCII Pascal program text, with CRLF line terminators
/ppro/bin/compile/dbcsum2.c: ASCII Pascal program text, with CRLF line terminators
/ppro/bin/hphw/print_sv.c: ASCII text, with CRLF line terminators
/ppro/bin/linuxhw/dhcpd.conf: ASCII text, with CRLF line terminators
/ppro/bin/linuxhw/dhcpd.conf.mult_subnet: ASCII text, with CRLF line terminators

Then BURN them!!

Remember that dos2unix on some systems will modify permissions...
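
To spot the trends, one option is to count the offending files per directory. The sketch below is just one way to do it: `crlf_by_dir` is a made-up name, it uses `grep -I` (GNU or BSD grep) instead of `file` to detect the CRLFs, and it assumes file names without newlines.

```shell
#!/bin/sh
# crlf_by_dir DIR...  (hypothetical helper)
# Print "count directory" lines, worst offenders first, for every directory
# that holds text files with CRLF line endings.
crlf_by_dir() {
    cr=$(printf '\r')
    # List regular, non-binary files that have a CR right before a line end.
    find "$@" -type f -exec grep -Il "${cr}\$" {} + |
    while IFS= read -r f; do
        dirname "$f"
    done |
    # Count files per directory and sort by count, highest first.
    sort | uniq -c | sort -rn
}
```

Running `crlf_by_dir /var/www` from time to time and comparing the output should show which directories, and therefore probably which people or tools, keep reintroducing the problem.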

ewwhite
  • Thanks for the command, I did not think of using `file` to discriminate CRLFs. Did you use `-not -type d` so it would match files and symlinks as well? – pyb Jun 12 '14 at 18:59
  • Why `-not -type d` rather than `-type f`? Usually when doing such searches, you don't want to follow symlinks. If you do want to follow symlinks, why not use `-type f -o -type l`? I guess you still don't want to touch device inodes or named pipes or sockets, if any of those would be showing up. – kasperd Jun 15 '14 at 20:06
  • This is maddening. You can't fix it with a modified shebang! The issue is the shebang has the fraudulent \r in it. This was helpful! – TamusJRoyce Feb 28 '19 at 17:11