It was mentioned that JPEG should not be used between image creation and redaction of sensitive contents, because compression artifacts around the redacted area may leak information. Given how this lossy format works, this makes sense. Is there any public research on this subject?
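The core of the issue is that, in a lossy format like JPEG, the pixels are not stored independently of each other. The image is split into 8x8 pixel blocks, each block is transformed with a discrete cosine transform (DCT), and the resulting coefficients are quantized, which is the lossy step. Every DCT coefficient depends on all 64 pixels of its block (the DC coefficient is just the block average, the AC coefficients carry the detail), so after decompression the value of each pixel is influenced by the other pixels in the same block. The 8x8 blocks themselves are mostly independent of each other.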
lcamtuf (Nitter link) describes this phenomenon:
PSA: If you're redacting text in an existing JPEG image file (e.g., a scan or a photograph of a document), you should probably maintain a margin of 8 or more pixels between the black bar and the underlying text.
The reason is that JPEG compression is a lossy algorithm that operates on 8x8 pixel blocks, and that barely perceptible content-dependent compression artifacts are present as a halo extending up to 8 pixels past the boundary of the text in the image you are trying to redact.
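To make the halo concrete, here is a minimal sketch of what happens to a single 8x8 grayscale block. It is not a real JPEG codec: it only performs the DCT, quantization and inverse DCT steps, and it assumes the example luminance quantization table from Annex K of the JPEG specification. The helper names (`make_block`, `jpeg_roundtrip`) and the pixel values are made up for illustration. Both test blocks are identical in their right half (plain background) and differ only in where a dark stroke sits in the left half.

```python
import math

N = 8
# Example luminance quantization table (JPEG spec, Annex K); assumed here,
# real encoders scale or replace it depending on the quality setting.
QTABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def _scale(k):
    # Normalization factor of the orthonormal DCT used by JPEG.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def _cos(pos, freq):
    return math.cos((2 * pos + 1) * freq * math.pi / (2 * N))

def dct2(block):
    """Forward 2D DCT: each coefficient depends on all 64 pixels."""
    return [[_scale(u) * _scale(v) * sum(block[y][x] * _cos(x, u) * _cos(y, v)
                                         for y in range(N) for x in range(N))
             for u in range(N)] for v in range(N)]

def idct2(coeffs):
    """Inverse 2D DCT: each pixel depends on all 64 coefficients."""
    return [[sum(_scale(u) * _scale(v) * coeffs[v][u] * _cos(x, u) * _cos(y, v)
                 for v in range(N) for u in range(N))
             for x in range(N)] for y in range(N)]

def jpeg_roundtrip(block):
    """Level shift, DCT, quantize (the lossy step), dequantize, inverse DCT."""
    shifted = [[p - 128 for p in row] for row in block]
    coeffs = dct2(shifted)
    dequantized = [[round(coeffs[v][u] / QTABLE[v][u]) * QTABLE[v][u]
                    for u in range(N)] for v in range(N)]
    pixels = idct2(dequantized)
    return [[min(255, max(0, round(p + 128))) for p in row] for row in pixels]

def make_block(stroke_col, background=200, ink=40):
    """Hypothetical 'text': one dark vertical stroke on a light background."""
    return [[ink if x == stroke_col else background for x in range(N)]
            for _ in range(N)]

block_a = make_block(1)  # stroke in column 1
block_b = make_block(2)  # stroke in column 2
decoded_a = jpeg_roundtrip(block_a)
decoded_b = jpeg_roundtrip(block_b)

# Columns 4..7 were pure background in BOTH originals. Pretend columns 0..3
# get covered by a black bar afterwards and look at what stays visible.
print("original  right half:", block_a[0][4:])
print("decoded A right half:", decoded_a[0][4:])
print("decoded B right half:", decoded_b[0][4:])
```

In this sketch the visible right half no longer matches the original background, and it differs depending on where the stroke was. That is the content-dependent halo from the quote: anyone who can guess candidate contents and re-run the same compression could, at least in principle, check which candidate reproduces the artifacts left outside the black bar. Real files add color conversion, chroma subsampling and encoder-specific tables, so treat this only as an illustration of the mechanism.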