
We are changing the way we handle servers: from pets to cattle.

In this particular case I have about 100 servers, each with a file called:

/etc/rsyslog.d/local.conf

I have no clue which version is the correct one. I did some testing: most are identical, but not all.

I would like to go the democratic way: The most common version of all 100 config files gets elected to be the canonical version.

The next step is to look at the files that differ.

I have some shell scripting knowledge and could help myself without asking, but I think my solution would be dirty.

How would you find the canonical version and then manage the differing config versions?

guettli
    I can see the temptation, but honestly, I would probably spend some time on trying to review the configs and work out what makes the most sense given the new approach. If you really did want to proceed with your approach, then I guess you could collect all 100 files into one place, and then score config lines, rather than files, democratically - a line present in less than 50% of configs would be out, etc. A few lines of python/perl/whatever would probably do the job. – iwaseatenbyagrue Mar 02 '17 at 12:11
    I vote to migrate to StackOverflow, since the question is mostly about comparing files. – mzhaase Mar 02 '17 at 12:14

2 Answers


Look at your old question, "Compare 20 files with diff, not 2". Combining your solution with parts of mine will easily show you the number of files for each unique version:

md5sum tmp/crontab-* | cut -d ' ' -f 1 |  sort | uniq -c 

will show how many times each hash appears.
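The same count can be taken one step further by also reporting which files belong to the winning hash and which do not. The following is only a minimal sketch: the function names are made up for illustration, and it swaps MD5 for SHA-256 (any content hash works for grouping identical files).

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_hash(paths):
    """Map each content digest to the list of files that share it."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    return groups


def elect_canonical(paths):
    """Return (majority_files, outlier_files) based on identical content."""
    groups = group_by_hash(paths)
    if not groups:
        return [], []
    # The largest group wins; a tie is broken by whichever group max() sees first.
    winner = max(groups.values(), key=len)
    outliers = [p for files in groups.values() if files is not winner for p in files]
    return winner, outliers
```

Assuming the collected copies sit in one directory, something like `elect_canonical(sorted(Path("collected").glob("*.conf")))` would return the files sharing the most common content plus the list of files that still need a manual look. The directory name `collected` is hypothetical.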

After that, you have to look manually into the other files to decide which of the differences you need to incorporate into your config management. After all, even when herding cattle, each animal is an individual up to a point.
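For the manual review, a unified diff of each outlier against the elected version keeps the reading effort small. This is a rough sketch using Python's standard `difflib`; the function name and the file paths passed to it are assumptions, not part of the original setup.

```python
import difflib
from pathlib import Path


def diff_against_canonical(canonical_path, other_path):
    """Return a unified diff (as one string) from the canonical file to another file."""
    canonical = Path(canonical_path).read_text().splitlines(keepends=True)
    other = Path(other_path).read_text().splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            canonical,
            other,
            fromfile=str(canonical_path),
            tofile=str(other_path),
        )
    )
```

An empty string means the two files are identical; anything else shows exactly the lines a given server added or dropped relative to the majority version.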

(Edit: Inserted the cut to make the lines unique for counting)

Sven

If you are interested in scoring individual lines rather than complete files, the following snippet should help, provided you have a Python interpreter available:

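import glob
from collections import Counter

# Counts in how many config files each distinct line appears.
line_counter = Counter()

# Assumes the collected configs sit in the current directory as *.conf.
for cfg in glob.iglob('*.conf'):
    with open(cfg) as cf:
        # set() makes a line repeated within one file count only once.
        line_counter.update(set(cf.read().splitlines()))

# Most common lines first, each prefixed with the number of files containing it.
for line, score in line_counter.most_common():
    print("#{} {}".format(score, line))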
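The majority rule suggested in the comments (drop any line present in less than 50% of the configs) could be layered on top of the same counting idea. The sketch below is an illustration only: `majority_lines` and its `threshold` parameter are hypothetical names, and the input is assumed to be a dict mapping file names to file contents.

```python
from collections import Counter


def majority_lines(configs, threshold=0.5):
    """Return the lines that appear in at least `threshold` of the given configs.

    `configs` maps a file name to that file's text.
    """
    counter = Counter()
    for text in configs.values():
        # set() so a line repeated inside one file is counted once per file.
        counter.update(set(text.splitlines()))
    cutoff = threshold * len(configs)
    # most_common() orders by frequency, so the original line order is lost.
    return [line for line, count in counter.most_common() if count >= cutoff]
```

Because the result is ordered by frequency rather than by position in the file, it is better treated as a review aid than as a ready-to-deploy config.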
iwaseatenbyagrue