How are PID's generated?

43

8

On *nix, PIDs are unique identifiers for running processes. How are PID's generated? Is it just an integer which gets incremented or a more complex structure such as a list? How do they get recycled? By recycling I mean that, when a process terminates, it's PID will eventually be reused by another process.

Giovanni Funchal

Posted 2010-04-26T14:08:59.810

Reputation: 533

http://lwn.net/Articles/10238/ – None – 2010-04-26T14:48:51.320

Answers

39

As wikipedia says,

Under Unix, process IDs are usually allocated on a sequential basis, beginning at 0 and rising to a maximum value which varies from system to system. Once this limit is reached, allocation restarts at zero and again increases. However, for this and subsequent passes any PIDs still assigned to processes are skipped.

so it's really a very simple policy for "generation", just increment a counter, and "recycling", just wrap the number around at a max value and keep incrementing until you find a number that was assigned to a process that has finished and has been removed from the process table.

Some Unix implementations such as AIX use a policy that's less simple, see e.g. this FAQ.

Alex Martelli

Posted 2010-04-26T14:08:59.810

Reputation: 1 298

Thanks for the answer. By the way, what is exactly this AIX policy "that's less simple"? – None – 2010-04-26T14:38:11.313

1@Helltone, I don't think AIX documents exactly what policy it uses (so it could change at any release), but you could think of it as a random number generation in the appropriate range (which gets repeated until a PID is generated that's not currently in use). – Alex Martelli – 2010-04-26T14:49:37.943

This algorithm seems a bit problematic to me. How do you ensure that you do not run into deadlock? And isn't there a performance problem? – None – 2010-04-26T14:56:04.750

1The kernel is in control and need not lock anything, so how could it deadlock? Yes, there is a small performance price to pay (a small extra overhead at fork time -- say a couple dozen machine instructions for a congruential PRNG or /dev/urandom read, vs many fewer for a counter-increment), but that's always the case for measures intended to improve security (check the CPU overhead of HTTPS communication vs plain HTTP for example;-). – Alex Martelli – 2010-04-26T15:26:32.923

I meant livelock (while(true);), sorry, I was quick answering ;-) – None – 2010-04-26T16:34:04.113

I usually see PID numbers very small (in the range of 2000 - 5000) even if my machine is ON for 30-40 days. Does that mean that there were only 5000 processes forked in last 40 days? that looks small number to me. – Jack – 2010-04-26T17:51:38.947

@Jack, I guess it's just an effect of the recycling -- your kernel might be built with a small max-number of processes (on my workstation at work, I'm using a standard ubuntu hardy heron kernel and I'm regularly seeing PIDs up to 30,000 or so even though I turn the machine off every weekend). – Alex Martelli – 2010-04-26T21:54:19.917

11

It varies.

Most systems simply keep a count of the last PID generated, add one (wrapping at a maximum number such as 65535 or a bit smaller - often the wrap occurs at 65000 or even 60000), and check that the number is not currently in use (repeating if the PID is still in use - so PID 1, the kernel, is still there and doesn't get 'reissued').

Other security minded systems generate a number at random and check that it is not in use.

At any given time, it is guaranteed that all PID numbers are unique.

Jonathan Leffler

Posted 2010-04-26T14:08:59.810

Reputation: 4 526

9

As to the recycling part of the question, one thing to keep in mind is that a pid does not become available as soon as the process with that pid terminates. The pid does not become available until the parent of that process collects the termination status of its child via some form of the wait() system call. A child that is terminated but whose parent has not issued a wait is called a zombie and will usually show up in a ps as defunct. It is possible for an ill-behaved parent to starve the system of pids if it launches children and does not wait() for them.

If the parent of a process dies before it collects the status of a child, that is ok. The child is inherited by init who will make sure a wait() is issued and the pid is recycled.

frankc

Posted 2010-04-26T14:08:59.810

Reputation: 261

This is a very important detail. Without it even a simple myprog & followed by wait $! would be UB. – Andreas – 2019-09-11T13:31:11.787

3

They're sequence numbers and wrap round (at an OS-specific value) if the system is up for long enough. Numbers are never reused unless they're free at the point of fork().

Donal Fellows

Posted 2010-04-26T14:08:59.810

Reputation: 225