Questions tagged [grid]

27 questions
8
votes
4 answers

Is it dangerous to have several parallel jobs create the same directory at the same time?

Is it dangerous to have several parallel jobs create the same directory using mkdir -p? (This is under Linux.) In my case, I send many jobs to a SUN grid to process them in parallel, and some of these jobs start by creating a certain directory foo.…
user9474
5
votes
1 answer

Overlapping queues on Sun Grid Engine?

We would like to have an SGE-based compute cluster with a queue that gives access to all nodes for the computational staff, and a second cluster queue that gives access to, say, half the nodes for occasional (but heavy) use by other staff. We want…
Alex Reynolds
4
votes
2 answers

What is the difference between h_rss and h_vmem in Sun Grid Engine (SGE)?

As far as I understand, mem_free can be specified to submit a job to a host that has free memory = mem_free, whereas h_vmem is the hard limit on the memory the job can consume, and if the job reaches h_vmem, it crashes? I…
GP92
4
votes
2 answers

Are there abstraction layers for cluster resource schedulers?

I'm writing an application that could potentially be run on any cluster resource scheduler (SGE, LSF or SLURM, to name a few), using very basic functionality. I'm wondering whether a framework/abstraction layer exists for interacting with…
nicoulaj
4
votes
2 answers

GNU Queue - alternatives

I'm trying to build a grid cluster based on CentOS. All the machines will have a somewhat similar structure (some with more processors than others) and I will just need to push jobs to a queue and have them run on the available nodes. One job per CPU…
Frankie
4
votes
2 answers

How to manage processes-to-CPU cores affinities?

I use a distributed user-space filesystem (GlusterFS) and I would like to be sure GlusterFS processes will always have the computing power they need. Each execution node of my grid has 2 CPUs, with 4 cores per CPU and 2 threads per core (16…
Philippe
3
votes
5 answers

Productive uses for an idle web server?

I have been renting a Windows 2008-based Virtual Server from 1&1 for the past few years. It is located in a well-connected server farm, has 10 gigs of space, and unlimited traffic. I used to host my Subversion repos on it, but have started doing…
Pekka
2
votes
1 answer

Access Denied on NVIDIA GRID 7.2 Driver

I am trying to set up an NVIDIA Tesla T4 GPU and use its RTX functionality in a raytracing application (Bakery for Unity3D). But every time I launch the app, Bakery tells me it could not find the OptiX library. I believe I have tracked it down to…
2
votes
0 answers

Specify a GPU to use at launch

I am currently working with an Azure GPU VM (NV6 using an M60 Nvidia graphics card). I'm running my benchmark on this VM without any issue for the moment. Now I'm running the same benchmark on an NV12, which has 2 GPUs (or at least Windows Server sees it as 2…
Turgal
2
votes
1 answer

Alternative to GlusterFS for hardware

From my question title you might think I am asking for an alternative to GlusterFS for storage, but what I am actually looking for is software that can do the same as Gluster but for…
jakarta512
1
vote
0 answers

Registering Oracle 10g DB with 12c Grid

Hi, I have Oracle Grid Infrastructure 12c installed for a standalone server and I'm trying to register an Oracle Database 10g with CRS, but I suspect there was a syntax change, because srvctl fails when called from the 10g home, outputting a usage hint. Full…
1
vote
1 answer

Best practices for backup on a massively parallel grid system

I work in the research group of a large company. We do a lot of work on a grid processing system with many nodes (more than 200; I'm not sure exactly how many) and several hard drives, with more than 1000 TB of data. Most of this data can be reproduced,…
Brian Postow
1
vote
2 answers

How to build a cluster where users have power to remove/add?

I work in a company where each engineer/scientist has a pretty high end desktop machine. 80% of the time, they are not pumping it to full capacity... This makes me sad. I want to be able to install some software on each of our machines which should…
engineerchuan
1
vote
1 answer

Centralized cron for grid cluster

I have a grid cluster. (It's running SGE, but I don't think that's relevant.) All the machines are intended to be able to drop out and come back at any time without any significant issue. However, my users need the ability to run cron jobs. Right…
wfaulk
0
votes
1 answer

How to create a SGE queue which can only be assigned jobs manually?

I have 5 nodes in an SGE cluster. I'd like to make it so one of those nodes can only be used by a specific queue, "test.q". I can remove that node from hostlist on all of my other queues, and set hostlist to just that host in test.q. However, when I…
Daniel