143

As a guy working in security/pentest, I regularly take screenshots of exposed passwords/sensitive information. Whenever I report these, I mask part or all of the information, as in the sample given below

test_image

I often wonder, is it possible for someone to 'reverse engineer' these pics and recover the original information? If so, what is the correct way of masking this kind of info?

I am using Shutter for taking screenshots and its accompanying edit tool to add the black stroke.

EDIT:

As pointed out by some of you, my question is different from this since:

  • I am not asking about MS paint/black strokes. The image is just an example to better explain the question
  • I have clearly asked for the correct/most secure way of producing photographic evidence.
Rory Alsop
xandfury
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/76574/discussion-on-question-by-spunkpike-secure-way-of-masking-out-sensitive-informat). – Rory Alsop Apr 25 '18 at 21:58
  • 1
    I have reverted to the version prior, as the answers have been posted based on the use of shutter, and removing that will invalidate them to a large degree. – Rory Alsop Apr 25 '18 at 21:58

8 Answers

192

Yes, in your image it can be recovered.

In general, though, a black box is enough, as long as Shutter does not use layers (it almost certainly does not) and as long as the black is really all black (it must not be transparent).

The picture that you provided uses some amount of transparency, see here:

enter image description here

All I had to do was use the Fill tool in MS Paint. With an algorithm that takes the JPEG compression into account, I could probably get better results.

Solution:

Use an editor that does not make the block transparent. Make sure layers are not used. Make sure change history is not stored (to allow undo) in the file. I believe MS Paint + bitmap format satisfies all the requirements. Most editors combined with bitmap (BMP) format without compression should satisfy these requirements, but I cannot confirm this.

Remove the data. You can do so in many editors by selecting it and pressing delete or Ctrl + X. Then apply redaction graphics, whether black box, or anything else.

DO NOT use JPEG (jpg) or other lossy formats anywhere from capture until redaction. JPEG may leave artifacts that may convey information about the deleted pixels. This may also apply to other lossy formats, use lossless formats if possible. Using any format after the image is redacted is fine.

As lossless formats may also retain some information, if they are not completely re-encoded after the edit, it is recommended to either only use pure bitmap format with no compression before redacting, or to change the format after redacting.

Double check:

You can double check no compression is used in BMP format by checking the file size. The size should be larger than color_depth / 8 * width * height (resolution in pixels, color depth usually 24). Note that this check will not reveal transparency and artifacts caused by lossy compression, so make extra sure you did not use lossy format at any point.
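The size check above can be sketched in a few lines of Python. This is only a rough illustration, not a validator: it assumes the file starts with the common 14-byte file header followed by a BITMAPINFOHEADER (40 bytes or larger), and the function names are made up for this example. It also accounts for BMP rows being padded to a multiple of 4 bytes, which the simple formula ignores.

```python
import struct


def bmp_expected_pixel_bytes(width, height, bits_per_pixel=24):
    """Size of the raw pixel array for an uncompressed BMP.

    Each row is padded to a multiple of 4 bytes.
    """
    row_bytes = ((bits_per_pixel * width + 31) // 32) * 4
    return row_bytes * abs(height)


def inspect_bmp(data):
    """Read the header fields needed for the size check (hypothetical helper).

    Assumes a BITMAPINFOHEADER-style DIB header (40 bytes or more).
    """
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    (pixel_offset,) = struct.unpack_from("<I", data, 10)
    (dib_size,) = struct.unpack_from("<I", data, 14)
    if dib_size < 40:
        raise ValueError("older BMP header layout, not handled in this sketch")
    width, height = struct.unpack_from("<ii", data, 18)
    (bits_per_pixel,) = struct.unpack_from("<H", data, 28)
    (compression,) = struct.unpack_from("<I", data, 30)
    expected = bmp_expected_pixel_bytes(width, height, bits_per_pixel)
    return {
        "width": width,
        "height": height,
        "bits_per_pixel": bits_per_pixel,
        "compression": compression,  # 0 means BI_RGB, i.e. no compression
        "looks_uncompressed": compression == 0
        and len(data) >= pixel_offset + expected,
    }
```

You would call it as `inspect_bmp(open("shot.bmp", "rb").read())`, where `shot.bmp` is a placeholder file name. As noted above, passing this check says nothing about transparency or earlier lossy steps.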

It may also be useful to post a specific question about your proposed setup here, so you can see additional opinions and recommendations. It is hard to give a definitive answer that would work in general for all platforms, formats and editors, as they all have their specific caveats.

Peter Harmann
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/76393/discussion-on-answer-by-peter-harmann-secure-way-of-producing-photographic-evide). – Rory Alsop Apr 22 '18 at 09:15
  • 22
    It doesn't have to be black. It can be any color as long as it's fully opaque. – Pikamander2 Apr 23 '18 at 03:20
  • 3
    your first paragraph makes it sound like it can be recovered only if it's not transparent – Aequitas Apr 24 '18 at 02:57
  • Does PNG have such artifacts? Just curious, because mspaint saves in PNG format by default these days, as does the Windows built in screenshot tool – user13267 Apr 24 '18 at 03:04
  • 3
    @user13267 PNG is lossless compression and therefore, by definition, _has no artifacts_, because an artifact is the result of a lossy compression algorithm throwing away some data and degrading the image quality for a smaller image file. JPG, on the other hand, _does_ have artifacts, which can be used to determine redacted pixels because the original artifacts are still there and depended on the pixels you blocked out, so you could theoretically extract some information. – Nic Apr 24 '18 at 05:06
  • 1
    @NicHartley Being lossless doesn't *necessarily* mean it doesn't give information that you want to keep secret. As a proof of concept, suppose we took the original image matrix, reshaped it into an array of byte values, and then determined a Huffman coding for that. Then we black out the secret bits without recalculating the Huffman trees. This is entirely lossless, but the compression table itself gives us information about the frequency of values in the original image. – Ray Apr 24 '18 at 17:28
  • @Ray and the plot thickens. By the way, you guys criticized me when I wrote bitmap format :D Horrible this is :D So what formats are safe then? I guess it will be better to just white-list a few that are OK. – Peter Harmann Apr 24 '18 at 17:33
  • @PeterHarmann It isn't the format itself that's necessarily unsafe. It's the program. In my example, if saving the image recomputed the Huffman trees, then there wouldn't be any information leakage. The same goes for png. Let M be the original image matrix, R(x) be a function that redacts the image without changing it to another format, and E be the encoder. If E(R(M)) is *not* bitwise identical to R(E(M)), then you're leaking information somewhere. If they are equal, then you're probably okay, but I'm not sure how to prove it. – Ray Apr 24 '18 at 18:26
  • @Ray that is not what I meant. Can we get a list of common formats that are safe under all circumstances? So they must not support layers, they must not support change history (undo) and they must not use compression that may be implemented "badly". Hope I am not forgetting anything. Sort of better sure than sorry policy. – Peter Harmann Apr 24 '18 at 18:27
  • @PeterHarmann Any format with a compression table that isn't completely fixed (and thus independent of the image) could potentially be implemented in a way that allows for this sort of attack. So...probably bitmap? I'm not familiar enough with image formats to say. Or any format that only stores a single flat image, so long as you make all the changes to the image *before* doing any compression. – Ray Apr 24 '18 at 18:36
  • Using an image format with too few colors (like gif) may also lead to artifacts depending upon the technique used to do the reduction. – Guy Schalnat Apr 24 '18 at 20:22
  • @Ray You're right, of course. I didn't realize at the time I was implying something about PNG _not_ having 'hidden' information about the blacked-out pixels, but re-reading my comment, I was. I _meant_ to just write about compression artifacts, and how they can be used; I wasn't trying to imply that PNG or other formats couldn't have weaknesses based on the same idea. With that said -- RAW does no compression, and is therefore probably safe, and I'd be surprised if most editors didn't recompute the table per save (though not _that_ surprised) especially at higher compression levels. – Nic Apr 24 '18 at 20:27
  • 2
    @PeterHarmann It seems like the safest thing is to 1) redact, 2) convert the image from one format to another, and 3) then to delete the original in a secure way, such as with the `shred` command on Linux. – jpaugh Apr 24 '18 at 21:15
  • 2
    @jpaugh I'd add one small thing to that: Use _lossless_ formats. While they can have artifacts (See Ray's comments above) the artifacts are less likely to persist when converted into another file format. – Nic Apr 24 '18 at 22:45
  • @NicHartley where exactly? I said to use lossless formats and then re-format, if other than BMP without compression is used. Was I unclear somewhere? BMP was just my recommendation, as it also does not support layers, so less space to make a mistake and I know the format. – Peter Harmann Apr 24 '18 at 22:49
  • @PeterHarmann I wasn't responding to your answer. Please notice the ping I used at the _very beginning_ of my comment. – Nic Apr 25 '18 at 00:47
  • @NicHartley sorry, missed it. – Peter Harmann Apr 25 '18 at 01:18
  • 2
    @PeterHarmann No worries, happens to everyone. I was trying to reinforce that point in your answer because it's that important :) – Nic Apr 25 '18 at 01:22
  • @NicHartley The trouble is, even lossless formats leak [some information](https://security.stackexchange.com/users/104065/ray), sometimes. At the point that you're re-encoding it, the artifacts in the original might not matter, but IDK. – jpaugh Apr 25 '18 at 22:23
80

You don't even need to use an image editor in this case to recover the "redacted" text. Simply zooming in on the image is enough to read it.

Visible redacted text

So I would say that yes, it most certainly is possible to recover the original text.

ke4ukz
  • 3
    This would be easier to see if it was more closely cropped. – Michael Hampton Apr 20 '18 at 05:45
  • @michael. Click the picture and zoom in. Shows right up on my android phone and Windows 10 computers. – IT_User Apr 20 '18 at 07:04
  • 69
    Simply leaning close to the monitor works here – Chris H Apr 20 '18 at 08:15
  • 2
    I can see it when it's on the bottom half of my monitor but not the top half. – Captain Man Apr 20 '18 at 15:01
  • @CaptainMan That means your monitor is having a case of light bleed or something similar – Ismael Miguel Apr 20 '18 at 15:16
  • 9
    This doesn't answer the question: "what should be the correct way of masking such kind of info?" – Ghoti and Chips Apr 20 '18 at 15:35
  • 2
    @IsmaelMiguel I replicated CaptainMan's observation; in my case it's to do with viewing angle. If it's on the top half and I stand up, I can make it out again. – user253751 Apr 21 '18 at 01:48
  • 1
    This depends on your monitor calibration. If your monitor crushes all nearly-black shades into indistinguishable (for your eyes), you won't be able to see the difference even with magnification. You would need to do something to boost the contrast between nearby colors, like digitally brighten the image (like @pabouk's answer) – Peter Cordes Apr 24 '18 at 03:06
71

In this case the image can be recovered very well

As others already pointed out, the dark patch is not completely black. It has a transparent effect and only darkens the original image. The original image can be recovered almost completely:

recovered screenshot

In this case the recovery was pretty straightforward. I needed to check the range of grey levels of the patch and re-adjust the range to the original values. I used Gimp for that. The unmodified text uses only 7 visibly distinctive levels of grey. The darkened part has retained about 6 levels (when ignoring the anti-aliased border and JPEG artifacts) so we can get almost exactly the original image.

adjusting levels in Gimp

The level and curve adjustment in Gimp and similar image editors can be used to check almost invisible information in the image.
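To illustrate why the level adjustment works, here is a minimal sketch in plain Python on a single row of grey values. It is not what Gimp does internally; the 90% opacity, the sample values and the function names are assumptions chosen for the example.

```python
def darken(pixels, opacity_percent):
    """Blend a black patch of the given opacity (0-100) over grey values."""
    return [p * (100 - opacity_percent) // 100 for p in pixels]


def stretch_levels(pixels, out_max=255):
    """Linearly rescale so the darkest value maps to 0 and the brightest to out_max."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A perfectly flat patch carries no information to stretch.
        return [0] * len(pixels)
    return [round((p - lo) * out_max / (hi - lo)) for p in pixels]


original = [255, 255, 0, 85, 170, 255, 0, 255]  # made-up grey levels of some text

semi_transparent = darken(original, 90)  # what a 90% opaque patch leaves behind
fully_opaque = darken(original, 100)     # what a truly black patch leaves behind

recovered = stretch_levels(semi_transparent)
nothing = stretch_levels(fully_opaque)
```

With the semi-transparent patch the distinct grey levels survive, only squeezed into a narrow dark range, so stretching brings back values close to the original. With the fully opaque patch every pixel is identical and there is nothing left to stretch.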

To summarize the recommendations:

  • Use an image editor which covers the area completely (non-transparently).
  • If compression artifacts (JPEG, dithered GIF etc.) surrounding the area could reveal some information, hide them too.
  • impressive! Can you unblur as well? blur radius 12, data starts with DEADBEEF: https://i.stack.imgur.com/Q6Ry9.png – NH. Apr 24 '18 at 20:02
  • @NH. Unblurring is incomparably more difficult. Here is one example of what can be done: http://www.mathworks.com/help/images/examples/deblurring-images-using-the-blind-deconvolution-algorithm.html ... In your example I think the blur radius is too large compared to the image details. Also the information outside of the blur box which was cut off will be missing for reasonable restoration. --- Here are other examples of how you can make some information in an image more visible: https://ampedsoftware.com/five-samples – pabouk - Ukraine stay strong Apr 25 '18 at 06:01
29

Yes, the text can be unmasked, either by simply zooming in or by using techniques such as (but not restricted to) those pointed out in pabouk's and Peter's answers.


I have clearly asked for the correct/most secure way of producing photographic evidence.

Completely remove any sensitive data from print-screens.


Steps

  1. Press the PRT SCR button on your keyboard (lossless capture, no artifacts);

  2. Open gimp/photoshop/paint and select new file, the default image size should be the same as your print-screen, hit CTRL+V to paste the ps into the newly created file and export it immediately as your original;

  3. Select the sensitive data using the appropriate tool on your software and hit CTRL+X to cut it;

  4. Cover the hole with a black rectangle, as you normally do (purely visual, nothing under);

  5. Export your dummy_copy as a new file (jpeg, gif, png);

  6. Keep the original untouched and share the dummy_copy.
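What steps 3 and 4 amount to at the pixel level can be sketched like this. It is a toy model, assuming the image is a list of rows of RGB tuples; the function names are invented for the example and no real editor API is involved. The point is that the original values inside the box are replaced, not blended.

```python
def redact(pixels, left, top, right, bottom, fill=(0, 0, 0)):
    """Return a new image where every pixel inside the box is replaced by fill.

    The box covers columns left..right-1 and rows top..bottom-1.
    The input image is left untouched, like the "original" in step 6.
    """
    return [
        [
            fill if (left <= x < right and top <= y < bottom) else px
            for x, px in enumerate(row)
        ]
        for y, row in enumerate(pixels)
    ]


def is_fully_redacted(pixels, left, top, right, bottom):
    """Check that the box holds one single colour, i.e. no residual variation."""
    values = {
        pixels[y][x]
        for y in range(top, bottom)
        for x in range(left, right)
    }
    return len(values) <= 1
```

Running `is_fully_redacted` on the exported copy is a cheap sanity check: if it returns `False`, the patch was not uniform and something of the original may still be showing through.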


Note

Even if any of the safe masking techniques presented on this page work at the moment, you've absolutely no guarantee that your "secret" data will remain like that tomorrow. The only way to be 100% sure is to cut/remove/destroy/erase/nuke the sensitive data from the original file and export it as new file.


Bottom Line:

enter image description here

Pedro Lobito
  • 3
    There are still considerations to be made about JPEG artifacts leaking some amount of information if the area redacted is not large enough. – Peter Harmann Apr 20 '18 at 09:05
  • 1
    No, there are no jpeg artifacts in this case. The print-screen has to be edited in ps or gimp and exported as a new file, which doesn't contain the actual data. There's nothing below the black rectangle. – Pedro Lobito Apr 20 '18 at 11:35
  • 17
    @PeterHarmann: If done **exactly** as described here, straight from `Print Screen` to image editor, there is no JPEG compression step in between. The image editor gets a pixel-perfect image to work on. – MSalters Apr 20 '18 at 12:43
  • 10
    `put a black patch on top of it`; it doesn't matter which color it is, hell it could even be white. Or yellow. Or pink. As long as it's (100%) **opaque**. @spunkpike used a (slightly) transparent patch. – RobIII Apr 20 '18 at 15:11
  • 7
    @PedroLobito None of the "cut/remove/destroy/erase/nuke" provide clear instructions on *how*. If someone paints a (slightly transparent) patch over a word that could be considered "removed" or "erased" or, from their perspective, even "nuked". So you need to be clear that the "drawing over"-method _only_ works with opaque patches; telling people to "nuke" it has no actual meaning (unless you happen to have a nuke lying around, in which case we got bigger problems). – RobIII Apr 21 '18 at 01:12
  • 1
    Also there's no need to 'export as new' for filetypes like PNG or JPEG which don't keep track of history. Most imageformats actually don't. The only thing "export as new" prevents is that stuff like cached thumbnails in thumbs.db are sidestepped but as long as only the actual image is sent there's no extra risk involved. – RobIII Apr 21 '18 at 01:15
  • And finally the step 3 in your 'tutorial' is useless if you cover it up immediately in step 4. Then why do step 3 in the first place? (Again: *do* make sure you cover up with an *opaque* patch, but that's all). You're saying it yourself: you can't recover something that doesn't exist. Overwriting pixels with other pixels makes them unrecoverable. Transparent patches don't _overwrite_ but they change/add/subtract the original pixels. – RobIII Apr 21 '18 at 01:18
  • 6
    ***"None of the "cut/remove/destroy/erase/nuke" provide clear instructions on _how_"*** This is addressed in STEP 3 via `CTRL+X`, so, when you say, "***Transparent patches don't overwrite_ but they change/add/subtract the original pixels***.", it means that you either don't know or don't remember what `CTRL+X` actually does, which is cutting, aka removing, destroying, erasing. Nuke was a figure of speech for "*just get rid of it*". – Pedro Lobito Apr 21 '18 at 01:57
  • 2
    Your last screenshot seems to be fake, unless you used 2 spaces before you added the "hidden word", which not many people do. Using this side channel, I guess nothing is below that masking – Ferrybig Apr 23 '18 at 17:12
  • @Ferrybig The screenshot was built using the steps in my answer. – Pedro Lobito Apr 23 '18 at 19:01
  • This still leaves one big problem: the length of the black bar still provides some information. In this case, it would be best to cover the whole search form so that the length of the input isn't revealed. Furthermore, you should always use black boxes that are significantly bigger than the text, otherwise you inadvertently reveal, for example, that your string did not include a letter like `g` (which goes below other letters). – Polygnome Apr 24 '18 at 07:19
  • 3
    I love the fact that you improved the search engine chosen, as well as the masking method. – NH. Apr 24 '18 at 19:50
  • @Ferrybig, Pedro himself said in a comment that there is nothing below the black rectangle. – NH. Apr 24 '18 at 20:50
11

Examples

As shown above, your example was breakable: the blacks of the redaction had variation that showed the text.

Real life example

New York Times Suffers Redaction Failure, Exposes Name Of NSA Agent And Targeted Network In Uploaded PDF

This was an example of a PDF that appeared redacted, but the data could be recovered.

Ways data can leak

  • Document titles
    • Quite a simple one, but can be dangerous
  • Colour variations like the one you had
  • Blurring
    • If the data is not varied enough it can be attacked source
  • Unedited thumbnail
    • You may redact the details in the image itself, but not in the thumbnail
  • Metadata
    • A surprising amount of data can be stored in some image formats' metadata

Defences against the above attacks

  • Titles
    • Check the filename doesn't contain anything sensitive
  • Colour variations
    • Make sure the blocked out area is all one consistent colour, so that it can't be read
  • Blurring
    • Always block out, rather than blurring
  • Thumbnail and metadata
    • Re-export the image in a format that has less metadata, or use a tool to strip the metadata. Make sure the thumbnail is remade.
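
As a rough idea of what "strip the metadata" means for a PNG, the sketch below walks the chunk list and drops the text, EXIF and timestamp chunks while copying everything else through unchanged. The function name and the exact drop list are choices made for this example; a dedicated metadata tool will handle far more formats and edge cases.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Ancillary chunk types that can carry comments, EXIF data or timestamps.
DROPPED_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"eXIf", b"tIME"}


def strip_png_text_chunks(data):
    """Return the PNG bytes without the chunk types listed in DROPPED_CHUNKS.

    Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
    Kept chunks are copied byte for byte, so their CRCs stay valid.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    kept = [PNG_SIGNATURE]
    pos = len(PNG_SIGNATURE)
    while pos < len(data):
        length, chunk_type = struct.unpack_from(">I4s", data, pos)
        end = pos + 12 + length
        if end > len(data):
            raise ValueError("truncated chunk")
        if chunk_type not in DROPPED_CHUNKS:
            kept.append(data[pos:end])
        pos = end
    return b"".join(kept)
```

This only removes metadata. It does nothing about what is in the pixel data itself, so it complements blocking out the sensitive area rather than replacing it.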
jrtapsell
  • 2
    None of this answers the question of OP, which was "the correct/most secure way of producing photographic evidence." – Tom K. Apr 23 '18 at 07:46
  • "Thumbnail and metadata" - Stored "thumbnails" in the EXIF metadata can be surprisingly large and some editors neglect to regenerate these when resaving the image. (Particularly relevant if editing a previously saved JPG screenshot, although I believe it could also apply to later versions of the PNG standard.) – MrWhite Apr 23 '18 at 21:04
  • "**Always block out, rather than blurring**" Always cut . – Pedro Lobito Apr 24 '18 at 12:29
  • 1
    A point on thumbnails - often times thumbnails aren't modified when the image is cropped - back in the days of yore, an actress from a popular technology news channel accidentally leaked a nude photo of herself when she cropped it to use as a headshot. The thumbnail retained all of the original information, despite the crop that was performed. – Monica Apologists Get Out Apr 24 '18 at 16:08
  • Why even remake the thumbnail as opposed to not storing one in the EXIF data in the first place? If you're worried about accidentally leaking data, stripping EXIF data 100% of the time seems like a no brainer. As an aside, I took a quick glance at a few different images I had sitting around on my computer and not a single one had thumbnail EXIF data. Makes me think that's more the norm and certainly doesn't give me a good reason to ever keep any EXIF data. – Kat Apr 24 '18 at 19:07
  • I would like to mention that "can be reliable" is the exact definition of unreliable. ;-) (If something never worked, it would always be reliable, in a negative way.) – jpaugh Apr 24 '18 at 21:25
8

I think you may be misusing what's meant to be a highlight tool (I'm not at any of my Ubuntu machines ATM and don't have Shutter installed anyway to test). I can't quite believe that a tool meant for redacting would have such an obvious flaw, as working with transparency takes more effort than not.

In the GIMP you can select an area and fill that area with solid colour (the "fill whole selection" option). Then save in a format that doesn't support layers (perhaps flattening manually first).

Here's a sample image: enter image description here

and here's an indication of the tools in the GIMP (red freehand shows rectangle select, bucket fill, and fill whole selection):

enter image description here

You can equally do this in MS paint but the GIMP is FOSS and cross-platform. Note that you should export rather than saving in GIMP's own .xcf file format, as that supports a few features that could reveal this (like layers, which actually aren't created in this approach) and is also not widely supported.

Note that I didn't save the image at any point in any format until after masking the password. This doesn't mean it's not saved locally in an undo buffer but I assume your machine is sufficiently secure for the purpose, at least as far as this question goes.

Chris H
  • 1
    This still leaks the length of the password, and if you cut it so close with the black box you might inadvertently reveal that there was no `g` or `q` in the string (which go way below the normal line). – Polygnome Apr 24 '18 at 07:21
  • @Polygnome, it leaks the *maximum* length, that's true. My example used an absurdly short password so a longer box would be better in reality. Also my example had no characters with descenders so I can't be absolutely sure, but my intention was to go low enough. Of course the "password" here bears no resemblance to any of my real passwords, in fact it's chosen as a challenge – Chris H Apr 24 '18 at 07:29
  • 1
    If you cover the *whole input field*, it yields the *maximum* length. But you did not. You only covered a very small portion of the input field, which puts much more constraints on the length of your actual password. Most people don't realize just how much info they give away. – Polygnome Apr 24 '18 at 07:36
  • 1
    @Polygnome It reveals an upper limit on the length of my password, i.e. *the maximum length of my password* (about 10 chars if they're not too narrow; I used 8, which is unrealistically short). Masking the entire input field is undoubtedly better as it reveals nothing (not even the maximum permissible length, as the box can scroll sideways). I was basically sticking to the OP's sample but blacking it out properly. I've recreated it and it does mask descenders. – Chris H Apr 24 '18 at 08:11
8

Don't provide screenshots. Provide descriptions OR sample screenshots taken at times when sensitive data is not on the screen, with a description that "X appeared here".

"It's no evidence!" you may say. BUT a screenshot that has been redacted is also not evidence. It's just a product of your artistic skills. Your testimony of the situation usually carries more weight.

Many answers here focus on making sure that the masking rectangle is actually uniform. But that's not all. The masked text can be recovered with statistical analysis of the size of the masked area. Especially when there is unmasked text before and after, its layout says quite a lot about what was redacted. Proportional fonts are particularly prone to this technique, as characters have more or less unique width. Monospace fonts disclose less information, but the length is known with 100% accuracy.
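
A toy sketch of the width side channel: if an attacker can measure the width of the masked box and knows the font, they can throw away every candidate that would not fit. The glyph widths and the word list below are invented for the example, not taken from any real font.

```python
# Hypothetical glyph widths in pixels for a proportional font.
GLYPH_WIDTH = {"i": 3, "l": 3, "t": 4, "m": 11, "w": 10}
DEFAULT_WIDTH = 7  # assumed width of every other character


def text_width(text):
    """Rendered width of text in the made-up proportional font."""
    return sum(GLYPH_WIDTH.get(ch, DEFAULT_WIDTH) for ch in text)


def mono_width(text, cell=7):
    """Rendered width of text in a monospace font with the given cell width."""
    return cell * len(text)


def candidates(words, box_width, tolerance=1):
    """Words whose proportional width matches the masked box within tolerance."""
    return [w for w in words if abs(text_width(w) - box_width) <= tolerance]


words = ["admin", "hunter2", "letmein", "password", "illicit"]
matching = candidates(words, box_width=42)
```

In this made-up font only one of the five words fits a 42-pixel box, while in the monospace case all three seven-character words have the same width and the box only reveals the length. Padding the box well beyond the text, or masking the whole field, takes this information away.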

Agent_L
  • 1
    All other answers completely miss this, which is a very important point. Length of the text is not the only thing; line height matters as well (make sure all black boxes are tall enough to cover low chars like g or q, so as not to disclose whether or not the masked text actually contained them). – Polygnome Apr 24 '18 at 07:24
  • As an IT person, I find that the primary purpose of screenshots is not to convey evidence, but exactly to replace words for communication. A written description of the screen would be tantamount to a (very lossy) image compression format. – jpaugh Apr 24 '18 at 21:29
  • @jpaugh That's why I wrote "OR sample screenshots taken at times when sensitive data is not on the screen" – Agent_L Apr 26 '18 at 08:37
0

Most consumer and professional products just put a layer over the top. In most applications, being able to recover the image under the redaction is a good thing.

Wiping out the image underneath is often referred to as burning the redaction.

If the image has been OCRed and you are producing text, then you need to OCR again after burning.

You come across this in litigation, where one party produces documents. Also search on eDiscovery.
Guidance on Redacting Personal Data Identifiers in Electronically Filed Documents.

As for Shutter, I'm not sure, but unless it specifically gives you an option to burn, it is probably just a layer.

paparazzo