1

I'm managing a client that distributes images and files to its users through a single source. My background is heavily in development rather than security. I'm familiar with a few techniques from the development side (MD5, etc.), but nothing serious or in line with good practice. Hence my question:

What is the best method of tagging images for a group of users so that if one user leaks the information we can tell who leaked it originally (a unique user tag of some sort generated when someone downloads)?

Manual distribution is not an option. Further, our client's service is based on the honor system, so a secure viewer is not an option either. We want to find leaks retroactively; preventing them entirely is impossible without inconveniencing users in this case anyway.

Thanks in advance!

AndrolGenhald
sterbenz
  • This is exactly what media watermarking is for. – forest Nov 10 '18 at 03:15
  • @forest All the images already contain image watermarking, but we can't track down the exact leaker by a unique tag through the image. Some unique tag has to be both associated to a user statically and dynamically applied to any image they download as themselves. – sterbenz Nov 10 '18 at 03:23
  • Are they logged in with anything identifying, like a cookie? You could dynamically add a watermark that can be tied to the user's cookie (and thus identity) when it's downloaded. – forest Nov 10 '18 at 03:26
  • There are libraries available in various languages that would allow you to insert information into the metadata of certain image formats. That may be all you need - write an identifier to the metadata of the images for each user dynamically at request time. – Johnny Nov 10 '18 at 19:38

3 Answers

1

What you are looking for is called "steganography": the process of hiding secret content in plain sight.

Exactly which technique or library is most useful to you depends on what you are trying to protect and who you are trying to protect it from. It could be something as simple as writing an identifier into the EXIF metadata, or something that modifies the image content itself, like Digimarc. With a bit of research you can probably find appropriate open source tools to run server-side so that the tagging happens dynamically.
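To make the metadata option concrete, here is a minimal sketch that writes a per-download tag into a PNG text chunk using only the Python standard library. It uses a PNG `tEXt` chunk rather than EXIF purely because that needs no third-party library; the keyword `DownloadTag`, the tag value, and all function names are hypothetical. Treat it as an illustration of the idea, not a hardened implementation.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk in a PNG byte string."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        yield chunk_type, png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def tag_png(png: bytes, tag: str, keyword: str = "DownloadTag") -> bytes:
    """Return a copy of the PNG with a tEXt chunk inserted just before IEND."""
    text = _chunk(b"tEXt", keyword.encode("latin-1") + b"\x00" + tag.encode("latin-1"))
    out = [PNG_SIGNATURE]
    for chunk_type, data in iter_chunks(png):
        if chunk_type == b"IEND":
            out.append(text)
        out.append(_chunk(chunk_type, data))
    return b"".join(out)


def read_tag(png: bytes, keyword: str = "DownloadTag"):
    """Return the tag stored under the keyword, or None if it is absent."""
    for chunk_type, data in iter_chunks(png):
        if chunk_type == b"tEXt":
            key, _, value = data.partition(b"\x00")
            if key == keyword.encode("latin-1"):
                return value.decode("latin-1")
    return None


def tiny_png() -> bytes:
    """Build a 1x1 grayscale PNG so the example needs no input file."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one pixel
    return PNG_SIGNATURE + _chunk(b"IHDR", ihdr) + _chunk(b"IDAT", idat) + _chunk(b"IEND", b"")


tagged = tag_png(tiny_png(), "u-4821")
print(read_tag(tagged))
```

A tag stored this way is trivial to strip, so it only catches leakers who pass the file along untouched.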

James Snell
1

For different kinds of media, different techniques exist. A short search on Google Scholar reveals many techniques for, for example, audio watermarking. Many claim to be inaudible as well as resistant to re-encoding and to reasonable changes of pitch or tempo (e.g. playing music at 1.1x speed is not very annoying if you're not familiar with the original, and the watermark would survive that too).

As James Snell mentioned in their answer, steganography is the technique you are looking for. A common form encodes data in the least significant bit of each pixel value, but this is fairly easy to remove if someone makes the effort, and it is lost if the image is re-saved in a lossy format like JPEG. Slightly easier is to embed the identifying information in metadata, but many photo editors (usable by novices) can modify or remove this.
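As a rough illustration of the least-significant-bit approach, the sketch below hides a short payload in the lowest bit of each byte of a raw pixel buffer. It assumes you already have uncompressed pixel bytes (decoding and re-encoding the image file is left out), and the function names and the 2-byte length prefix are choices made for this example only.

```python
def embed_lsb(pixels: bytes, payload: bytes) -> bytes:
    """Hide payload in the lowest bit of each pixel byte, after a 2-byte length prefix."""
    message = len(payload).to_bytes(2, "big") + payload
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for payload")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the lowest bit, then set it to the payload bit
    return bytes(out)


def extract_lsb(pixels: bytes) -> bytes:
    """Recover a payload written by embed_lsb."""
    def read_bytes(start_bit: int, count: int) -> bytes:
        value = bytearray()
        for b in range(count):
            byte = 0
            for i in range(8):
                byte = (byte << 1) | (pixels[start_bit + b * 8 + i] & 1)
            value.append(byte)
        return bytes(value)

    length = int.from_bytes(read_bytes(0, 2), "big")
    if 16 + length * 8 > len(pixels):
        raise ValueError("no valid payload found")
    return read_bytes(16, length)


pixels = bytes(range(256)) * 2           # stand-in for raw image data
marked = embed_lsb(pixels, b"u-4821")    # hypothetical per-user tag
print(extract_lsb(marked))

# Coarse quantization, similar in effect to lossy re-encoding, wipes the low bits.
degraded = bytes(p & 0xF8 for p in marked)
print(extract_lsb(degraded))
```

Each byte changes by at most one, which is why the mark is invisible, and also why any lossy re-save destroys it.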

When you do settle on a technique to use, note that you will want to pseudonymize the data. Instead of embedding a name or even just a user ID, which someone might trace back to an actual person, you could embed the encrypted form of the name or user ID. It's not as good as being anonymous, since a pseudonym still counts as PII, but it is good practice as a reasonable effort to protect user data.
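One way to sketch the pseudonymization step with only the Python standard library is a keyed hash. Note that this swaps in HMAC-SHA256 for the encryption suggested above: the tag cannot be decrypted, so the server identifies a leaker by recomputing tags for its known users. The key, the user IDs, and the function names are all hypothetical.

```python
import hashlib
import hmac


def pseudonym(user_id: str, secret_key: bytes, length: int = 16) -> str:
    """Derive a tag from the user ID that is meaningless without the server-side key."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def identify(tag: str, user_ids, secret_key: bytes):
    """Find which known user a leaked tag belongs to, or None if there is no match."""
    for user_id in user_ids:
        candidate = pseudonym(user_id, secret_key, len(tag))
        if hmac.compare_digest(candidate, tag):
            return user_id
    return None


key = b"server-side-secret"             # assumption: kept only on the server
leaked_tag = pseudonym("alice", key)    # the value embedded in alice's download
print(identify(leaked_tag, ["bob", "alice"], key))
```

Truncating the digest keeps the embedded tag short; a longer tag lowers the chance of two users colliding.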

Luc
0

You're going to have to dynamically embed a watermark that identifies the user who requested the image. That is, whenever a user requests or downloads an image, you take the original image and embed a watermark identifying that user into it before serving it.

mroman