Questions tagged [parallel-computing]

18 questions
8
votes
2 answers

IBM GPFS: very slow to remove files recursively

To delete files recursively in our IBM GPFS cluster, we use a simple Unix command such as: rm /my/directories -fr. However, deletions take a very long time. The problem is that our distributed (Spark-based) apps took about an hour to finish. But then, it…
Klun
  • 93
  • 5
6
votes
1 answer

How can I tell the maximum threads my server can run?

Here is the machine spec: CPU(s): 20; Thread(s) per core: 1; Core(s) per socket: 10; Socket(s): 2. Based on what I've read so far, these numbers mean that I can run 20 parallel jobs because I have 20 CPUs. However, how…
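The arithmetic behind the "20 CPUs" figure in this excerpt can be sketched as follows. This is a minimal illustration, not taken from the question or its answers; the function name `logical_cpus` is hypothetical. The product is the number of threads that can execute at the same instant, not a limit on how many threads the operating system will let a process create.

```python
import os


def logical_cpus(sockets, cores_per_socket, threads_per_core):
    """Logical CPU count implied by the three lscpu topology fields."""
    return sockets * cores_per_socket * threads_per_core


# Values quoted in the excerpt: 2 sockets, 10 cores per socket, 1 thread per core.
print(logical_cpus(sockets=2, cores_per_socket=10, threads_per_core=1))

# For comparison, the count the OS reports on the machine running this script
# (may be None on platforms where it cannot be determined).
print(os.cpu_count())
```

With hyper-threading enabled (2 threads per core), the same topology would report 40 logical CPUs.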
4
votes
2 answers

Scan the full filesystem in parallel with clamscan

I run a ClamAV scan weekly on my servers. One server has a RAID 6 cluster with 30 TB of disk space, where the scan takes more than 24 hours to run. So I wonder how I can run clamscan on the whole filesystem, taking advantage of the several cores the…
azmeuk
  • 165
  • 1
  • 14
3
votes
0 answers

Transparently farm processes out to cluster

Situation: At work, we have an in-house tool for data crunching. When a job is triggered, it starts multiple copies of itself in separate processes and communicates with them in order to crunch in parallel. It is currently set up to use 4 parallel…
3
votes
3 answers

Execute Shell loop in parallel, but only N workers

We have more than 100 git repos, and sometimes I want to grep over all of them. To update the repos I use this: for repo in *; do (cd $repo; git checkout master; git pull); done This is quite slow. How can I speed it up? Running all updates at once would…
guettli
  • 3,113
  • 14
  • 59
  • 110
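One way to cap the number of concurrent workers for a loop like the one in this excerpt is a fixed-size thread pool. The sketch below is an assumption-laden illustration, not the approach from the question's answers: the names `run_steps` and `run_bounded` are hypothetical, and each task is a `(directory, list of commands)` pair whose commands run in order inside that directory.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_steps(task):
    """Run one task's commands in order; stop at the first failure."""
    cwd, commands = task
    for argv in commands:
        returncode = subprocess.run(argv, cwd=cwd).returncode
        if returncode != 0:
            return returncode
    return 0


def run_bounded(tasks, max_workers=4):
    """Run tasks with at most max_workers in flight; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_steps, tasks))
```

For the git case, the task list could be built as `[(repo, [["git", "checkout", "master"], ["git", "pull"]]) for repo in repo_dirs]`, where `repo_dirs` is a hypothetical list of repository paths. Threads are sufficient here because each worker spends its time waiting on a child process.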
2
votes
1 answer

R scripts that only run sequentially: any way to parallelize their execution across multiple cores?

I have some R scripts that can only run sequentially; they cannot be broken into chunks, and no parallel library for R or any other language can be used. Is there any way I can distribute the sequential execution of the code to multiple cores or may…
Farhan
  • 4,210
  • 9
  • 47
  • 76
2
votes
1 answer

Multiple core overload above 100% on CentOS 7 Supermicro server

I am running CentOS 7 (3.10.0-514.26.2.el7.x86_64) on a Supermicro H8QG6 board with 4 AMD 6276 CPUs (16 cores each), for a total of 64 cores. I use it for scientific computing, and usually everything runs smoothly, as in the first htop image. Then,…
ehyG
  • 51
  • 5
2
votes
3 answers

GNU parallel doesn't fully utilize my CPUs

I'm running a command like this on my 36 core server (EC2 c4.8xlarge/Amazon Linux). find . -type f | parallel -j 36 mycommand The number of files to process is ~1,000,000, and it takes dozens of minutes. It should run 36 processes simultaneously.…
2
votes
1 answer

How to improve the efficiency of GNU parallel when reading from a compressed stream?

This question extends the previous one [1]. I have a compressed file and stream it into a Python program, e.g. bzcat data.bz2 | parallel --no-notice -j16 --pipe python parse.py > result.txt The parse.py can read from stdin…
Ryan
  • 5,341
  • 21
  • 71
  • 87
1
vote
1 answer

Execute script with different argument on AWS instances

I have a script that takes multiple arguments, and I need to run this script on multiple AWS instances in parallel. For example, for the sake of simplicity, if I have three instances in AWS, I would like to run the following: On instance-a: script.sh…
1
vote
2 answers

Execute command then parallelize other commands after completion

I'm looking for a single inline command that executes /bin/first and then, when it completes, executes /bin/p1, /bin/p2, and /bin/p3 in parallel.
Justin
  • 5,008
  • 19
  • 58
  • 82
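The ordering asked for here (one command first, then several at once) can be sketched as below. This is an illustration only and not the inline shell command the question asks for; the name `run_then_parallel` is hypothetical, and the sketch assumes the later commands should start whether or not the first one succeeded, since the excerpt only says "when it completes".

```python
import subprocess


def run_then_parallel(first, rest):
    """Run `first` to completion, then start every command in `rest` together.

    Returns (exit code of first, list of exit codes for rest, in input order).
    """
    # subprocess.run blocks until the first command has exited.
    first_code = subprocess.run(first).returncode

    # Popen returns immediately, so all remaining commands run concurrently.
    processes = [subprocess.Popen(argv) for argv in rest]

    # Wait for each one and collect its exit status.
    return first_code, [process.wait() for process in processes]
```

With the paths from the excerpt, the call would be `run_then_parallel(["/bin/first"], [["/bin/p1"], ["/bin/p2"], ["/bin/p3"]])`.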
1
vote
1 answer

Application caps at 20% utilization per core no matter how many threads are launched

I'm trying to use a Dell PowerEdge R900 I got secondhand as a compute pool (4x quad-cores gives me 16 cores to run simulations with). It's running Windows Server 2008 R2 Enterprise at the moment. I'm running custom .NET code, and can specify the…
1
vote
1 answer

GNU parallel multiple sshlogin nodes behind NAT

Is it possible to have multiple remote nodes behind a NAT with GNU parallel? Suppose some of a GNU parallel cluster exists behind a NAT (which may or may not be accessible only via a single IPv4 address through an ISP operating only IPv4) relative…
Mr Purple
  • 135
  • 7
0
votes
1 answer

How to launch a PBS job with hybrid MPI/OpenMP

I would like to understand how a GROMACS job launched on my SGI cluster with PBS/Torque, using hybrid MPI/OpenMP parallelization, works. The cluster is hyper-threading enabled and each node has 16 physical cores (32 logical). What I expect: use…
0
votes
1 answer

How to execute a bash script with args in parallel on all remote servers when the servers are listed in pairs

Please read the problem carefully first and then give a solution. We have 3 files in a directory: myscript.sh (the myscript.sh file contains many functions and starts/stops all pairs or a single pair from the serverlist.txt file, the…