How to stop a Linux process for later execution, swapping out its memory


I want to stop a long-running process so that it does not consume any CPU or physical memory resources, with the intention of resuming the same process in the future.

I know the CPU part is achievable using the SIGSTOP and SIGCONT signals, but is it possible to immediately page out the private RSS memory of a (stopped) process (swapping it out, in the case of the process's dirty pages)?
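For reference, the SIGSTOP/SIGCONT part can be sketched as follows. This is a minimal illustration that assumes a Linux system with /proc mounted; the background sleep is only a stand-in for the real long-running job, and the state helper is a name made up for this example.

```shell
#!/bin/sh
# Stand-in for the long-running job (assumption: any background process will do).
sleep 30 &
pid=$!

# Hypothetical helper: print the one-letter scheduler state from /proc/PID/status.
state() { awk '/^State:/ {print $2}' "/proc/$1/status"; }

# Stop the process; it stays in memory but is no longer scheduled.
kill -STOP "$pid"
sleep 1
stopped_state=$(state "$pid")
echo "after SIGSTOP: $stopped_state"

# Resume the very same process (same PID).
kill -CONT "$pid"
sleep 1
resumed_state=$(state "$pid")
echo "after SIGCONT: $resumed_state"

# Clean up the stand-in job.
kill "$pid"
wait "$pid" 2>/dev/null || true
```

A stopped process reports state T; after SIGCONT it returns to a normal sleeping or running state. Nothing in this sequence touches the process's resident memory.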

idelvall

Posted 2016-11-14T14:55:42.747

Reputation: 253

What is the intention behind this? Do you want to ensure that the process resumes more quickly? Or do you want to prevent sensitive data from being written to disk? Or something else? If we know the intention, we might be able to give better answers. – oliver – 2016-11-14T18:13:41.633

The OS will do this automatically. There's really no reason to do anything specific. – David Schwartz – 2016-11-14T19:39:27.093

@oliver I am creating a batch scheduler (https://github.com/brutusin/wava). The current implementation offers non-preemptive scheduling, but I want to move to a preemptive one (able to stop running jobs) to gracefully avoid some deadlock situations in which all running jobs depend on queued jobs. I need exactly the behavior I asked about: continuing stopped processes, not creating new ones from a checkpoint.

– idelvall – 2016-11-14T22:43:33.540

@DavidSchwartz that is a risky assertion – idelvall – 2016-11-14T22:44:54.610

@idelvall Then it sounds like you don't want to do anything special to memory. – David Schwartz – 2016-11-14T22:48:39.933

I didn't feel it was necessary to give more details on my actual problem, and I didn't want the question to sound like I was promoting my project. Given your comments, though, I can tell you that my scheduler's main concern is to guarantee that the memory of running jobs never exceeds physical memory limits, in order to avoid paging and swapping. Stopping a running process without freeing its RSS memory would not make room for other queued processes to start. – idelvall – 2016-11-14T22:58:44.067
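To check how much physical memory and swap a given process occupies (for example, before and after sending it SIGSTOP), the VmRSS and VmSwap fields of /proc/PID/status can be read. This is a small sketch assuming a Linux /proc filesystem; the shell's own PID is used purely as a placeholder for the job's PID.

```shell
#!/bin/sh
# Placeholder: substitute the PID of the stopped job for the shell's own PID.
pid=$$

# VmRSS is the resident set size; VmSwap is the amount currently swapped out.
mem_report=$(grep -E '^(VmRSS|VmSwap):' "/proc/$pid/status")
echo "$mem_report"
```

Comparing the two values over time shows whether the kernel has actually moved any of the stopped process's pages to swap.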

@DavidSchwartz I forgot to mention you in my last comment. This is my real issue: https://github.com/brutusin/wava/issues/13

– idelvall – 2016-11-15T09:07:36.737

@idelvall Your question still makes no sense. You are asking how to force the paging out of memory while claiming that your objective is to avoid paging! Do you want to make sure to page everything or do you want to avoid paging things? – David Schwartz – 2016-11-15T10:13:46.647

Yes, I expressed myself badly. What I want is to avoid paging performed by the OS, especially for processes not managed by the scheduler. – idelvall – 2016-11-15T10:46:14.803

Does posix_fadvise(POSIX_FADV_DONTNEED) work on /proc/<PID>/mem? vmtouch might have a command-line option to do that, so you might not need to write code yourself to experiment. (I'd suggest using strace to make sure vmtouch is making the system calls you expect.) Does madvise(MADV_DONTNEED) work to trigger page-out of anonymous pages (easiest to try from within a running process, since madvise works on memory addresses, not file positions)?

– Peter Cordes – 2016-11-15T21:27:22.307

Near-duplicate of http://superuser.com/questions/128385/how-to-tell-linux-to-explicitly-swap-out-main-memory-of-a-suspended-process, but that doesn't have much of an answer. It just suggests raising /proc/sys/vm/swappiness to encourage the kernel to page out more aggressively.

– Peter Cordes – 2016-11-15T21:37:16.537
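The swappiness setting mentioned in the comment above can be inspected as shown here. This is only a sketch: it reads the current value, and the line that changes it is left commented out because it needs root and affects the whole system, not a single process.

```shell
#!/bin/sh
# Read the current system-wide swappiness value.
swappiness=$(cat /proc/sys/vm/swappiness)
echo "current vm.swappiness: $swappiness"

# To make the kernel swap more aggressively (as root; system-wide effect):
# sysctl -w vm.swappiness=100
```

Raising the value only biases the kernel's reclaim decisions; it does not force any particular process's pages out immediately.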

I would look into the top answer on here: http://unix.stackexchange.com/questions/87908/how-do-you-empty-the-buffers-and-cache-on-a-linux-system

– pycvalade – 2016-11-17T18:12:53.267

Answers


You might look into a technique called checkpoint/restore. This allows you to take a running process, save its state to a set of files, and restore it at a later time.
To use it, start by installing the criu program (yum install criu or apt install criu).

To checkpoint a running process, create an empty directory to hold its files and cd into that directory.

mkdir /var/tmp/checkpoint
cd /var/tmp/checkpoint

Now checkpoint the running process. In this case I'm using the --shell-job option, since my process is running in a shell with an associated tty.

criu dump -t 404 --shell-job

404 is the pid of the process I want to checkpoint. When I do this I see my running process get killed and my /var/tmp/checkpoint directory get populated with a set of files needed to restore it.

To restore the process, I make sure I'm in the directory with the checkpoint files and do a restore.

cd /var/tmp/checkpoint
criu restore --shell-job

The process will pick up where it left off, in the terminal where the restore was run. If I kill this running process and run criu restore --shell-job again, the process will revert to the checkpoint and start up again.
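The dump and restore steps above could also be wrapped in two small shell functions, so there is no need to cd into the checkpoint directory. This is a sketch only: the function names are made up for illustration, it relies on criu's --images-dir option to point at the directory, and criu generally has to be run as root.

```shell
#!/bin/sh
# Hypothetical helper: checkpoint process $1 into directory $2.
checkpoint_job() {
    pid=$1
    dir=$2
    mkdir -p "$dir" && criu dump -t "$pid" --images-dir "$dir" --shell-job
}

# Hypothetical helper: restore the process saved in directory $1.
restore_job() {
    dir=$1
    criu restore --images-dir "$dir" --shell-job
}

# Example usage (same PID and directory as in the steps above):
#   checkpoint_job 404 /var/tmp/checkpoint
#   restore_job /var/tmp/checkpoint
```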

Hope this helps.

virtex

Posted 2016-11-14T14:55:42.747

Reputation: 1 129

This doesn't do what the OP claims they want to do. Try it -- there will be no reduction in memory used. It will just switch from process private memory to disk cache (due to writing out the set of files). It just makes an extra save step and an extra restore step, and the same memory is used (and ejectable) the same way. In fact, it may make things worse as some memory gets duplicated due to generating everything new to write out. – David Schwartz – 2016-11-14T19:40:52.857

heh, good point @David, especially if /tmp is tmpfs (backed by memory/swap space). If you checkpoint to a normal disk-backed filesystem, you can then use vmtouch -e to evict the pages from the pagecache, but it still uses extra RAM temporarily. (Unless criu has an option to do direct i/o (with O_DIRECT)...)

– Peter Cordes – 2016-11-14T22:05:04.233

It's hard to know if this is what the OP wants or not because the OP asks for a specific solution rather than explaining what problem he's trying to solve. This might be the perfect answer or it might be useless to him, we can't tell. – David Schwartz – 2016-11-14T22:08:43.853

I have not looked at it in detail yet, but it seems that the restored process is a new process (different PID), and this is not exactly what I need... – idelvall – 2016-11-14T23:04:46.330

@Peter, that's a really good point about /tmp using tmpfs on some installations. I updated the solution to use /var/tmp instead. – virtex – 2016-11-15T16:17:52.287

@idelvall: That's how most flavours of checkpoint/restore work. One major use-case is to save progress in a calculation across reboots. – Peter Cordes – 2016-11-15T19:02:37.557

@PeterCordes Yeah, it makes sense for that scenario, but it's not exactly what I am looking for in mine. – idelvall – 2016-11-15T21:04:11.823