Questions tagged [parallel-computing]
18 questions
8
votes
2 answers
IBM GPFS : very slow to remove files recursively
To delete files recursively in our IBM GPFS cluster, we use a simple Unix command like:
rm /my/directories -fr
However, deletions take a very long time.
The problem is that our distributed (Spark-based) apps took about one hour to finish. But then, it…
Klun
- 93
- 5
6
votes
1 answer
How can I tell the maximum threads my server can run?
Here is the machine spec:
CPU(s): 20
Thread(s) per core: 1
Core(s) per socket: 10
Socket(s): 2
Based on what I've read so far, these numbers mean that I can run 20 parallel jobs because I have 20 CPUs.
However, how…
tera_789
- 163
- 1
- 1
- 5
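For the lscpu figures quoted in the question above, the reported CPU count is the product of sockets, cores per socket, and threads per core. A minimal sketch of that arithmetic, with the values copied from the excerpt (the variable names are only illustrative):

```shell
#!/bin/sh
# Values taken from the lscpu output quoted in the question.
sockets=2
cores_per_socket=10
threads_per_core=1

# Logical CPUs = sockets x cores per socket x threads per core.
logical_cpus=$((sockets * cores_per_socket * threads_per_core))
echo "$logical_cpus"
```

This counts hardware execution contexts only. The operating system can still schedule many more software threads than that; they simply share the 20 CPUs.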
4
votes
2 answers
Scan the full filesystem in parallel with clamscan
I run a ClamAV scan weekly on my servers. There is one server with a raid6 cluster of 30TB of disk space where the scan takes more than 24h to run.
So I wonder how I can run clamscan on the whole filesystem, taking advantage of the several cores the…
azmeuk
- 165
- 1
- 14
3
votes
0 answers
Transparently farm processes out to cluster
Situation:
At work, we have an in-house tool for data crunching. When a job is triggered, it starts multiple copies of itself in separate processes and communicates with them in order to crunch in parallel. It is currently set up to use 4 parallel…
Oliver Hawker
- 31
- 1
3
votes
3 answers
Execute Shell loop in parallel, but only N workers
We have more than 100 git repos, and sometimes I want to grep over all of them.
To update the repos I use this:
for repo in *; do (cd "$repo" && git checkout master && git pull); done
This is quite slow.
How can I speed it up?
Running all updates at once would…
guettli
- 3,113
- 14
- 59
- 110
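One common way to run a loop like the one in the question above with a bounded number of workers is `xargs -P`. The sketch below is only an illustration, not the accepted answer: `in_each_dir` is a hypothetical helper name, and it assumes an xargs that supports `-0` and `-P` (GNU and BSD xargs do; POSIX does not require them).

```shell
#!/bin/sh
# Hypothetical helper: run a command inside every subdirectory of the
# current directory, with at most N jobs running at the same time.
# Usage: in_each_dir N command [args...]
in_each_dir() {
    workers=$1          # maximum number of concurrent jobs
    shift
    for dir in */; do
        # NUL-terminate names so spaces in directory names are safe.
        printf '%s\0' "$dir"
    done | xargs -0 -P "$workers" -I{} \
        sh -c 'cd "$1" && shift && "$@"' sh {} "$@"
}

# Example (assumes every subdirectory is a git repo with a master branch):
#   in_each_dir 8 sh -c 'git checkout master && git pull'
```

Each directory name is passed to a small `sh -c` wrapper that changes into it and then runs the given command, so a failed `cd` skips that directory instead of running the command in the wrong place.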
2
votes
1 answer
R scripts that only run sequentially: any way to parallelize their execution across multiple cores?
I have some R scripts that can only run sequentially; they cannot be broken into chunks, and no parallel library for R or any other language can be used.
Is there any way I can distribute the sequential execution of the code across multiple cores or may…
Farhan
- 4,210
- 9
- 47
- 76
2
votes
1 answer
Multiple core overload above 100% on CentOS 7 Supermicro server
I am running CentOS 7 (3.10.0-514.26.2.el7.x86_64) on a Supermicro H8QG6 board with 4 AMD 6276 CPUs (16 cores each), for a total of 64 cores. I use it for scientific computing, and usually everything runs smoothly, as in the first htop image.
Then,…
ehyG
- 51
- 5
2
votes
3 answers
GNU parallel doesn't fully utilize my CPUs
I'm running a command like this on my 36-core server (EC2 c4.8xlarge/Amazon Linux).
find . -type f | parallel -j 36 mycommand
The number of files to process is ~1,000,000, and it takes dozens of minutes. It should run 36 processes simultaneously.…
aosho235
- 63
- 4
2
votes
1 answer
How to improve the efficiency of GNU parallel when reading from a compressed stream?
This is a follow-up to my previous question [1].
I have a compressed file and stream it into a Python program, e.g.
bzcat data.bz2 | parallel --no-notice -j16 --pipe python parse.py > result.txt
parse.py can read from stdin…
Ryan
- 5,341
- 21
- 71
- 87
1
vote
1 answer
Execute script with different arguments on AWS instances
I have a script which takes multiple arguments, and I need to run this script on multiple instances in parallel on AWS. For example, for the sake of simplicity, if I have three instances in AWS, I would like to run the following:
On instance-a: script.sh…
Technext
- 147
- 2
- 7
1
vote
2 answers
Execute command then parallelize other commands after completion
I'm looking for a single inline command that executes /bin/first and then, when it completes, runs /bin/p1, /bin/p2, and /bin/p3 in parallel.
Justin
- 5,008
- 19
- 58
- 82
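The pattern asked about in the question above (one command first, then several in parallel) can be sketched in plain sh with background jobs and `wait`. The function name `run_after` is hypothetical and only wraps the idea so it can be reused; it assumes the parallel commands should run only if the first one succeeds.

```shell
#!/bin/sh
# Hypothetical helper: run the first command to completion, then start all
# remaining commands in the background and wait for every one of them.
# Usage: run_after first_cmd cmd1 cmd2 ...
run_after() {
    first_cmd=$1
    shift
    # Stop here if the first command fails (an assumption about the intent).
    "$first_cmd" || return 1
    for cmd in "$@"; do
        "$cmd" &
    done
    # Block until all background jobs started above have exited.
    wait
}

# Inline equivalent for the paths named in the question:
#   /bin/first && { /bin/p1 & /bin/p2 & /bin/p3 & wait; }
```

Replacing `&&` with `;` in the inline form would start the parallel commands even when /bin/first fails.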
1
vote
1 answer
Application caps at 20% utilization per core no matter how many threads are launched
I'm trying to use a Dell PowerEdge R900 I got secondhand as a compute pool (4x quad-cores gives me 16 cores to run simulations with). It's running Windows Server 2008 R2 Enterprise at the moment.
I'm running custom .NET code, and can specify the…
PhysicsNinja
- 11
- 1
1
vote
1 answer
GNU parallel multiple sshlogin nodes behind NAT
Is it possible to have multiple remote nodes behind a NAT with GNU parallel?
Suppose some of a GNU parallel cluster exists behind a NAT (which may or may not be accessible only via a single IPv4 address through an ISP operating only IPv4) relative…
Mr Purple
- 135
- 7
0
votes
1 answer
How to launch a PBS job with hybrid MPI/OpenMP
I would like to understand how a GROMACS job launched on my SGI cluster with PBS/Torque, using hybrid MPI/OpenMP parallelization, works.
The cluster is hyper-threading enabled and each node has 16 physical cores (32 logical).
What I expect: use…
Gabriel Cretin
- 101
- 2
0
votes
1 answer
How to execute a bash script with args in parallel on all remote servers when the servers come in pairs
Please read the problem carefully first and then give a solution.
We have 3 files in the directory:
myscript.sh
(the myscript.sh file contains many functions and starts/stops all pairs or a single pair from the serverlist.txt file, the…
ramesh
- 1
- 2