I'm coding an app that lets any registered user upload images to my Node.js backend (processed with sharp).

I strip the EXIF section from each image for security. However, I have realised that the GPS section of EXIF is actually extremely neat, so now I'm thinking of keeping all of it.

My concern is that someone might upload text such as

{
...
GPSSpeed: pornhub.com
GPSLatitudeRef: SOME EXTREME length text... 500000k+
...
}

Well, you get my fear.

Is my fear legitimate?

My first intuition is to loop over all the keys and match them against some schema (length, type, maybe even content), but that would require me to create such a schema.

forest
Cisum Inas
  • Why not just use a regex to check that the GPS value fits the format you expect. – Daisetsu Dec 27 '18 at 00:15
  • So what is your worry, that people will be uploading a lot of junk data in EXIF? – forest Dec 27 '18 at 02:52
  • @forest , Junk and massive chunks of text – Cisum Inas Dec 27 '18 at 18:02
  • @daisetsu there is *a lot* of GPS data; some seems to be in arrays, others as integers, etc. – Cisum Inas Dec 27 '18 at 18:03
  • @Cisuminas So why not just limit the maximum file upload size? – forest Dec 28 '18 at 01:57
  • @forest , I do limit file upload size :) . Feel like this is slightly different.. – Cisum Inas Dec 28 '18 at 12:44
  • @Cisuminas I don't see why. If someone wants to use up their file size limit by putting in junk EXIF data, that's their problem, not yours. Note also that you can easily hide extra information e.g. by simply appending random data to the end of a file. I used to do that all the time on image boards to smuggle non-image data into image files. Removing EXIF has no effect on that. – forest Dec 29 '18 at 04:46
  • :/ Thx for the input @forest. The reason I fear EXIF is that I will automatically parse the content and show it as, e.g., focal length. I had no idea you could add content to the end of an image; do you know of any state-of-the-art checkers for this? Thank you in advance – Cisum Inas Dec 29 '18 at 22:29
  • @Cisuminas Nope, no checkers for this. Even if there were, there are a thousand other places you could hide data. You could even hide data in LSBs, making the image larger but being virtually indistinguishable from a normal image. There's no way to limit this other than limiting the maximum file size. – forest Dec 30 '18 at 01:27
  • @forest I have a limit of 25 MB. I assume they could still upload an image of 4 MB plus 21 MB of fake data? – Cisum Inas Dec 30 '18 at 14:20
  • @Cisuminas But what's wrong with that? They could also just upload a 25mb image. – forest Dec 31 '18 at 03:19
  • @forest I see your point. Anyway, I ended up implementing a validation function per GPS field; that way it will at least be harder to fake, but of course they can still just enter a bad title or whatever. – Cisum Inas Jan 04 '19 at 02:59

1 Answer

EXIF fields are not free-form data; each one has a data type. The GPS position values are stored as rationals, each exactly 8 bytes long (a 32-bit numerator followed by a 32-bit denominator). This means that you cannot put arbitrary string information into them, only numbers.
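
To illustrate why a rational field cannot carry text, here is a minimal Node.js sketch that decodes one 8-byte EXIF RATIONAL from a buffer. The function name `readRational` and its parameters are made up for this example; a real EXIF library handles this for you, and the byte order has to come from the TIFF header of the file being read.

```javascript
// Decode one EXIF RATIONAL: an unsigned 32-bit numerator followed by an
// unsigned 32-bit denominator (8 bytes in total).
// `littleEndian` is an assumption passed in by the caller; in a real file it
// comes from the TIFF header ("II" = little-endian, "MM" = big-endian).
function readRational(buf, offset, littleEndian) {
  if (offset < 0 || offset + 8 > buf.length) {
    throw new RangeError("RATIONAL extends past the end of the buffer");
  }
  const read = littleEndian
    ? (o) => buf.readUInt32LE(o)
    : (o) => buf.readUInt32BE(o);
  const numerator = read(offset);
  const denominator = read(offset + 4);
  // A zero denominator is not a usable number; report it as null.
  return denominator === 0 ? null : numerator / denominator;
}

// Example: the value 51/1 stored big-endian.
const sample = Buffer.from([0, 0, 0, 51, 0, 0, 0, 1]);
console.log(readRational(sample, 0, false)); // 51
```

Whatever bytes an uploader puts there, the result is a number (or null), never a string.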

It might make sense to restrict these values further to sane ranges. Given that you only want to keep the GPS part of the EXIF metadata, it makes sense to extract those values, remove the original EXIF block, and write a new one that contains only the GPS information, provided the values are within a sane range. There are libraries in a variety of programming languages to do this.
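
The extract-and-validate step could look roughly like the sketch below. It assumes the EXIF parser hands back a plain object in which `GPSLatitude` and `GPSLongitude` are arrays of three numbers (degrees, minutes, seconds) and the `Ref` fields are single letters; that shape is an assumption, and real parsers differ. The helper names `dmsToDecimal` and `sanitizeGps` are made up for this example. Writing the cleaned values back into a new EXIF block is left to whichever library you use.

```javascript
// Convert [degrees, minutes, seconds] to decimal degrees.
// Returns null if the input is malformed or out of range.
function dmsToDecimal(dms, maxDegrees) {
  if (!Array.isArray(dms) || dms.length !== 3) return null;
  const allValid = dms.every(
    (n) => typeof n === "number" && Number.isFinite(n) && n >= 0
  );
  if (!allValid) return null;
  const [deg, min, sec] = dms;
  if (min >= 60 || sec >= 60) return null;
  const value = deg + min / 60 + sec / 3600;
  return value <= maxDegrees ? value : null;
}

// Build a fresh object holding only validated coordinates.
// Every other key in the input (GPSSpeed, oversized strings, ...) is dropped.
// The input shape is an assumption about what the parser returns.
function sanitizeGps(gps) {
  if (gps === null || typeof gps !== "object") return null;
  const lat = dmsToDecimal(gps.GPSLatitude, 90);
  const lon = dmsToDecimal(gps.GPSLongitude, 180);
  if (lat === null || lon === null) return null;
  if (gps.GPSLatitudeRef !== "N" && gps.GPSLatitudeRef !== "S") return null;
  if (gps.GPSLongitudeRef !== "E" && gps.GPSLongitudeRef !== "W") return null;
  return {
    latitude: gps.GPSLatitudeRef === "S" ? -lat : lat,
    longitude: gps.GPSLongitudeRef === "W" ? -lon : lon,
  };
}

// A junk reference value like the one in the question is rejected outright.
console.log(
  sanitizeGps({
    GPSLatitude: [51, 30, 0],
    GPSLatitudeRef: "SOME EXTREME length text",
    GPSLongitude: [0, 7, 30],
    GPSLongitudeRef: "W",
  })
); // null
```

Because the function copies validated values into a new object instead of deleting bad keys from the old one, anything it does not explicitly recognise never reaches the output.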

Steffen Ullrich
  • Maybe he could make use of a well-known library for handling EXIF data. Here's one written in Perl that also has a standalone executable capable of extracting specific GPS data from EXIF: http://owl.phy.queensu.ca/~phil/exiftool/TagNames/GPS.html – Daisetsu Dec 27 '18 at 18:34