Image file to spreadsheet

I want to take an image file (e.g. TIFF) and convert it to a spreadsheet (CSV or similar) such that each pixel becomes a cell holding the numeric value of that pixel.

I have googled it, and everyone is trying to sell me OCR software, which is certainly not what I want.

I will be editing the image beforehand, so I can convert it to any format required and crop it down to a manageable size.

It's mainly greyscale I'm interested in, and I could handle the little bit of colour I will need as 3 separate files (one per colour channel).

I will be doing statistical analysis on the spreadsheet later. I could likely knock something together in Java to do it, especially given that this is a proof of concept for a (much) later programming project, but that really feels like far too much hassle for what should be a simple task.

Thingomy

Posted 2011-06-04T10:36:52.157

Reputation: 99

If you can write Java, why not? It's not too much of an overhead, although other programming languages could probably do it in fewer lines of code. I don't think you'll get a ready-made solution for this. You can take a look at ImageMagick's interfaces for different programming languages.

– slhck – 2011-06-04T10:41:35.947

I'll just put this as a comment unless you ask for it as an answer, but using Python and the PIL library you can get to a pixel value in 3 lines of code: 1) import the library, 2) open the file, 3) getpixel. Everything else is just bookkeeping. – Dennis – 2011-06-04T11:10:09.240

Having no idea what Python code even looks like, let alone how to write it, I personally will give that one a miss, but I'm sure it may work well for others. I am currently in the middle of finding a Java editor so I can see if I can remember how to use it. – Thingomy – 2011-06-05T14:32:21.247

Answers

It's not too complicated. Here's an approach in Java. You'd just need to write the values out as CSV, which is simple file writing.

for (int x = 0; x < image.getWidth(); ++x) {
  for (int y = 0; y < image.getHeight(); ++y) {
    // getRGB packs the pixel as 0xAARRGGBB; shift and mask to get each 8-bit channel
    int pixel = image.getRGB(x, y);
    int r = (pixel >> 16) & 0xff;
    int g = (pixel >> 8) & 0xff;
    int b = pixel & 0xff;
  }
}

Here image is a java.awt.image.BufferedImage. You can load one with ImageIO.read(File input), a static method of javax.imageio.ImageIO.
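To show how the loop and the file writing fit together, here is a minimal sketch of a complete program. It makes several assumptions: the class name PixelCsv and the method toCsv are made up for illustration, the greyscale value is taken as the plain mean of the three channels, and rows are written top to bottom so the CSV has the same layout as the image. ImageIO.read returns null when no installed reader understands the file, which may happen with TIFF on older Java versions.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.imageio.ImageIO;

class PixelCsv {
    /** Returns one CSV line per image row; each cell is the mean of R, G and B (0-255). */
    static String toCsv(BufferedImage image) {
        StringBuilder csv = new StringBuilder();
        // Rows (y) form the outer loop so that each CSV line is one row of pixels.
        for (int y = 0; y < image.getHeight(); ++y) {
            for (int x = 0; x < image.getWidth(); ++x) {
                int pixel = image.getRGB(x, y);
                int r = (pixel >> 16) & 0xff;
                int g = (pixel >> 8) & 0xff;
                int b = pixel & 0xff;
                if (x > 0) {
                    csv.append(',');
                }
                csv.append((r + g + b) / 3); // assumption: simple average as the grey value
            }
            csv.append('\n');
        }
        return csv.toString();
    }

    /** Usage: java PixelCsv input-image output.csv */
    public static void main(String[] args) throws IOException {
        BufferedImage image = ImageIO.read(new File(args[0]));
        if (image == null) {
            throw new IOException("No ImageIO reader could decode " + args[0]);
        }
        Files.write(Paths.get(args[1]), toCsv(image).getBytes(StandardCharsets.UTF_8));
    }
}
```

For the per-channel files mentioned in the question, the same loop could append r, g or b alone instead of the average.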

slhck

Posted 2011-06-04T10:36:52.157

Reputation: 182 472

This would certainly do the trick, though it's been a long time since I've done any Java... – Thingomy – 2011-06-04T11:09:31.730

I had a similar problem. I get a lot of image files containing graphs. It's a pet peeve of mine, actually: people don't send me the actual data points, which are what I need for any sort of calculation or analysis; instead they send me a pretty picture, which is just about worthless.

If the data set is small and the graph is computer output (as most modern ones are) rather than hand drawn, MS Paint can be rather accurate. Open the file and zoom in on the x axis until you can move the mouse cursor to individual pixels. A pair of numbers in the lower right-hand corner gives the x and y pixel coordinates. Find the x and y of each numbered tick mark on the x axis at a repeatable point, such as just where it crosses the horizontal line. If the image is a printout or was saved from a computer program, the ticks should all have the same y coordinate. But if it's a hand-drawn graph (like those in old journal articles), you need the y coordinate too, because you're going to have to correct for the entire image being rotated or not being drawn perfectly. Repeat this with the y axis.

If the tick marks are multiple pixels wide and you can't find a pixel right in the middle, you can estimate the center: for example, if the tick mark is 4 pixels wide, starting at pixel 51 and ending at pixel 54, then the center is 52.5. (This may be overkill if the graph isn't drawn to that accuracy, which is common with hand-drawn sources.)

Next, read values off the graph the same way, by zooming in and reading the x and y pixel coordinates. Depending on the graph, you may be able to see the data points as little symbols, or it might just be a line. You might also be able to simplify the process and just take the points where the curve crosses a grid line or some such. Nothing's written in stone.
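Turning those pixel coordinates into data values is then a linear interpolation between two tick marks whose values are known. A minimal sketch in Java, assuming linear (not logarithmic) axes and an image that is not rotated; the class and method names, and the second tick at pixels 151 to 154 labelled 10, are made up for illustration:

```java
class AxisCalibration {
    /** Centre of a tick mark that spans firstPixel..lastPixel inclusive. */
    static double tickCenter(int firstPixel, int lastPixel) {
        return (firstPixel + lastPixel) / 2.0;
    }

    /** Linearly maps a pixel coordinate to a data value, given two ticks with known values. */
    static double pixelToValue(double pixel,
                               double pixelA, double valueA,
                               double pixelB, double valueB) {
        return valueA + (pixel - pixelA) * (valueB - valueA) / (pixelB - pixelA);
    }

    public static void main(String[] args) {
        double tick0 = tickCenter(51, 54);    // 52.5, the tick from the example above, taken to mean 0
        double tick10 = tickCenter(151, 154); // hypothetical second tick, taken to mean 10
        // A point read at pixel 102.5 lies halfway between the two ticks.
        System.out.println(pixelToValue(102.5, tick0, 0.0, tick10, 10.0)); // prints 5.0
    }
}
```

The same function works for the y axis; because pixel y coordinates grow downward, the two y ticks simply end up with pixelA greater than pixelB.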

Now that you have the data, put it all in Excel. Depending on what you're doing and how the graph is made, there are various tricks that can help. For example, if they used ridiculously thick lines, it can sometimes help to fit a line to a neighborhood around the point of interest, then take a line normal to that fit and average the values that fall along it. Other times a simple average of all x points that fall at a particular y works better. Sometimes you can simply take all the edge points, fit them to a curve and be done.

The tool linked below may make the job far easier, but I just found it yesterday and have not used it yet. It's supposed to take all the points in an image and turn them into a comma-delimited list of pixel values. Then you could, for example, pick out the pixels by color. From that you should be able to process them in Excel to recover the original data, more or less. If the image was exported from an Excel graph, for example, the result should be quite accurate.

https://itg.beckman.illinois.edu/technology_development/software_development/get_rgb/
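Picking out the curve by color and averaging across a thick line, as described above, might look roughly like this in Java. This is only a sketch and has nothing to do with how the linked tool works: it assumes the curve is drawn in one exact RGB color (anti-aliased or JPEG-compressed images would need a tolerance instead), and the class and method names are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.util.Map;
import java.util.TreeMap;

class CurvePixels {
    /**
     * For each pixel row y that contains pixels of the given RGB color,
     * returns the mean x coordinate of those pixels (the middle of a thick line).
     */
    static Map<Integer, Double> meanXPerRow(BufferedImage image, int targetRgb) {
        Map<Integer, Double> result = new TreeMap<>();
        for (int y = 0; y < image.getHeight(); ++y) {
            long sumX = 0;
            int count = 0;
            for (int x = 0; x < image.getWidth(); ++x) {
                // Mask off the alpha byte so only red, green and blue are compared.
                if ((image.getRGB(x, y) & 0xffffff) == (targetRgb & 0xffffff)) {
                    sumX += x;
                    ++count;
                }
            }
            if (count > 0) {
                result.put(y, (double) sumX / count);
            }
        }
        return result;
    }
}
```

Rows without any matching pixel are left out of the map, so gaps in the curve show up as missing keys rather than as zeros.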

mallen

Posted 2011-06-04T10:36:52.157

Reputation: 1

I would try the free, open-source SAGA GIS. Load your image with Module/File/Grid/Import/Image and you will get a SAGA grid, which you can export as an XYZ file. That is in fact a text file containing the X and Y values and the Z grid value of each pixel, which should be what you need. SAGA is free. :-) http://www.saga-gis.org/en/index.html

Juhele

Posted 2011-06-04T10:36:52.157

Reputation: 2 297