File sharing between a cluster of Linux machines

0

I have a cluster of 6 machines. The other day we had a power cut and the fully functioning cluster shut down. Now that we have turned it back on, it is no longer working, and the person who knew what he was doing has left. We are using the OpenLava job scheduler on a Linux machine (or set of machines), and it seems I have got this to work again. The problem is that the folder that was shared among our users is no longer visible from any of the computers except the host. Before the power cut I could log on to any of the computers and see my files. How can I restore these settings?

Thanks

------------Edit----------------

So I have found some more information. It appears the folder is an NFS-mounted folder. The computer that hosts the folder, let's call it 2, is not the machine that seems to be hosting the rest of the cluster, let's call this one 1. I can get access to 1 and I can see the files I want to share on 1, but it doesn't look like I have access to 2 since it is password-protected.

I also haven't found any 'ini' files, but I can access all of the machines (except 2), so I don't think the DNS is down.

CiaranWelsh

Posted 2015-07-26T10:32:34.900

Reputation: 525

Answers

0

This is funny, because we recently had the same scenario at my work, except we still had the guys who knew what they were doing... I will walk you through a general idea of what you might need to do.

It sounds like your shared directories are not mounted properly. Sometimes the mount scripts are located in "/etc/init.d", so look there for auto-mount scripts. If there are none, you might need to mount the directories yourself using the mount command.
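As a rough way to see which shared directories failed to come back, here is a minimal Python sketch. It assumes the NFS shares are listed in /etc/fstab, which may not be true for this cluster (an automounter or an init script could be doing the mounting instead). The function names are made up for illustration. It compares the NFS entries in fstab-style text against /proc/mounts-style text and returns the ones that are not currently mounted.

```python
NFS_TYPES = ("nfs", "nfs4")


def parse_mount_table(text):
    """Parse fstab- or /proc/mounts-style text into (source, mount_point, fs_type) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        entries.append((fields[0], fields[1], fields[2]))
    return entries


def missing_nfs_mounts(fstab_text, mounts_text):
    """Return (source, mount_point) for NFS entries configured but not mounted."""
    mounted = {
        mount_point
        for _, mount_point, fs_type in parse_mount_table(mounts_text)
        if fs_type in NFS_TYPES
    }
    return [
        (source, mount_point)
        for source, mount_point, fs_type in parse_mount_table(fstab_text)
        if fs_type in NFS_TYPES and mount_point not in mounted
    ]


def report(fstab_path="/etc/fstab", mounts_path="/proc/mounts"):
    """Print the NFS shares that are configured but not mounted on this machine."""
    with open(fstab_path) as f:
        fstab_text = f.read()
    with open(mounts_path) as f:
        mounts_text = f.read()
    for source, mount_point in missing_nfs_mounts(fstab_text, mounts_text):
        print(f"not mounted: {source} -> {mount_point}")
```

Calling report() on one of the machines where the folder is empty should list the missing shares, if they are in fstab at all. For an entry that is listed there, running the mount command as root with just the mount point as its argument should attempt the mount again.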

If you are no longer able to log in to some of the computers, you may have a DNS server down. Try restarting the DNS server if this is the case.
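Before restarting anything, it may be worth confirming whether name resolution is actually failing. The sketch below is only an illustration: check_hosts is a made-up helper, and the hostnames you pass in are whatever your cluster machines are called. It tries to resolve each name and records None for the ones that fail.

```python
import socket


def check_hosts(hostnames):
    """Map each hostname to its resolved IPv4 address, or None if lookup fails."""
    results = {}
    for name in hostnames:
        try:
            results[name] = socket.gethostbyname(name)
        except OSError:
            # socket.gaierror and socket.herror are both subclasses of OSError.
            results[name] = None
    return results
```

If every machine name resolves, DNS is probably not the problem, and a failed login is more likely an account or password issue on that machine.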

Hope that helps!

Josh Jobin

Posted 2015-07-26T10:32:34.900

Reputation: 229

Hi Josh, thanks for responding. I've made some edits in the post above. I can't find any init.d files – CiaranWelsh – 2015-07-26T12:07:33.730

Okay, I have a couple of questions: can you see the top-level directory you used to have access to, but not access its contents? For example, if you had 2 mounted on all computers under "/shared", can you see "/shared" but get an error when trying to access it? Also, are you able to log into computer 2 at all? – Josh Jobin – 2015-07-26T13:35:58.907

So I can access the folder on all machines, but the folder is empty (except on computer 1, which has the files). And no, I cannot log on to computer 2. It says access denied when I try. I have a sudo password, but I suspect that it's not at the highest level or something. – CiaranWelsh – 2015-07-26T13:41:19.887

0

The usual cause in this situation is that a filesystem was marked as dirty and refuses to mount, so some of the resources that are supposed to be shared are unavailable. Use fsck to check the status of the filesystem(s) and repair them.

Alex

Posted 2015-07-26T10:32:34.900

Reputation: 5 606

fsck tells me that the filesystem is mounted and that proceeding will damage the system (so I obviously did not proceed). – CiaranWelsh – 2015-07-26T13:42:57.947

To repair a filesystem you need to unmount it first. Do it either from single-user mode, or simply unmount, fix, and mount again while the system is running normally. A power loss usually leaves the filesystem broken (not really broken, but with unfinished transactions, which the OS treats as dirty and in need of cleaning). AFAIK, that is the only way to fix your problem. – Alex – 2015-07-26T14:20:57.893
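The unmount, fix, mount order described above can be sketched in Python as follows. This is only an illustration under a few assumptions: repair_plan and is_mounted are hypothetical helper names, and the device has an entry in /etc/fstab so that it can be mounted again by name. The code does not run anything. It only returns the commands in the order they would be run as root, and it skips the unmount and remount steps when the device is not mounted.

```python
def is_mounted(device, mounts_text):
    """Return True if `device` appears as a source in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == device:
            return True
    return False


def repair_plan(device, mounts_text):
    """Return the commands, in order, for checking `device` without fsck-ing a mounted filesystem."""
    plan = []
    was_mounted = is_mounted(device, mounts_text)
    if was_mounted:
        # fsck must not run on a mounted filesystem, so unmount first.
        plan.append(["umount", device])
    plan.append(["fsck", device])
    if was_mounted:
        # Assumes the device is listed in /etc/fstab so mount can find its mount point.
        plan.append(["mount", device])
    return plan
```

Note that this applies to a local disk on the machine that physically holds the files. A filesystem that is in use as the root filesystem cannot simply be unmounted on a running system, which is where single-user mode comes in.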