Mathematica 1170 1270 1096 1059 650 528 570 551 525 498 bytes
The latest version saves 27 bytes by not requiring that the plate be "trimmed" before it is parsed. The penultimate version saved 26 bytes by using only 10 of the original 24 sample points.
z=Partition;h@i_:=i~PixelValue~#/.{_,_,_,z_}:>⌈z⌉&/@z[{45,99,27,81,63,81,9,63,45,63,9,45,45,45,63,45,45,27,45,9},2];f@p_:=h/@SortBy[Select[p~ColorReplace~Yellow~ComponentMeasurements~{"Image","Centroid"},100<Last@ImageDimensions@#[[2,1]]<120&],#[[2,2,1]]&][[All,2,1]]/.Thread[IntegerDigits[#,2,10]&/@(z[IntegerDigits[Subscript["ekqeuiv5pa5rsebjlic4i5886qsmvy34z5vu4e7nlg9qqe3g0p8hcioom6qrrkzv4k7c9fdc3shsm1cij7jrluo", "36"]],4]/.{a__Integer}:> FromDigits[{a}])-> Characters@"BD54TARP89Q0723Z6EFGCSWMNVYXHUJKL1"]
122 bytes saved through LegionMammal978's idea of packing the long list of base 10 numbers as a single, base 36 number. He pared another 20 bytes off the final code.
The jump from 528 to 570 bytes was due to additional code to ensure that the order of the letters returned corresponded to the order of the letters on the license plate. The centroid for each letter contains the x-coordinate, which reveals the relative positions of the letters along x.
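The ordering step can be tried without any image. The following is a minimal sketch: the `components` list is a hypothetical stand-in for the `label -> {image, centroid}` rules that ComponentMeasurements returns, with strings in place of the real component images.

```mathematica
(* Hypothetical stand-in for ComponentMeasurements output:
   label -> {image, centroid}. Strings replace the real images. *)
components = {
   1 -> {"imgC", {410., 80.}},
   2 -> {"imgA", {120., 82.}},
   3 -> {"imgB", {265., 79.}}};

(* #[[2, 2, 1]] is the x-coordinate of each centroid;
   [[All, 2, 1]] then keeps only the image of each component. *)
ordered = SortBy[components, #[[2, 2, 1]] &][[All, 2, 1]]
(* {"imgA", "imgB", "imgC"} *)
```

Sorting on the centroid's x-coordinate is what turns the arbitrary component labels into left-to-right reading order.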
Ungolfed Code
coordinates=Flatten[Table[{x,y},{y,99,0,-18},{x,9,72,18}],1];
h[img_] :=ArrayReshape[PixelValue[img, #] /. {_, _, _, z_} :> ⌈z⌉ & /@ coordinates, {6, 4}];
plateCrop[img_]:=ColorReplace[ImageTrim[img,{{100,53},{830,160}}],Yellow];
codes={{{15,13,15,13,13,15},"B"},{{15,8,8,8,9,15},"C"},{{15,13,13,13,13,15},"D"},{{15,8,14,8,8,15},"E"},{{15,8,14,8,8,8},"F"},{{15,8,8,11,9,15},"G"},{{6,6,6,6,15,9},"A"},{{9,9,15,15,9,9},"H"},{{8,8,8,8,8,15},"L"},{{9,15,15,15,13,9},"M"},{{15,9,9,9,9,15},"0"},{{9,10,12,14,10,9},"K"},{{9,13,13,11,11,9},"N"},{{8,8,8,8,8,8},"1"},{{1,1,1,1,9,15},"J"},{{15,9,15,14,8,8},"P"},{{15,9,9,9,15,15},"Q"},{{15,9,15,14,10,11},"R"},{{15,8,12,3,1,15},"S"},{{9,15,6,6,6,6},"V"},{{15,6,6,6,6,6},"T"},{{9,15,15,15,15,15},"W"},{{9,9,9,9,9,15},"U"},{{9,14,6,6,14,9},"X"},{{9,14,6,6,6,6},"Y"},{{15,3,2,4,12,15},"Z"},{{15,9,9,9,9,15},"0"},{{8,8,8,8,8,8},"1"},{{15,1,3,6,12,15},"2"},{{15,1,3,1,9,15},"3"},{{2,6,6,15,2,2},"4"},{{7,12,14,1,1,15},"5"},{{15,8,14,9,9,15},"6"},{{15,1,2,2,6,4},"7"},{{15,9,15,9,9,15},"8"},{{15,9,15,1,9,15},"9"}};
decryptRules=Rule@@@codes;
isolateLetters[img_]:=SortBy[Select[ComponentMeasurements[plateCrop[img],{"Image","Centroid"}],ImageDimensions[#[[2,1]]][[2]]>100&],#[[2,2,1]]&][[All,2,1]]
f[plate_]:=FromDigits[#,2]&/@#&/@h/@isolateLetters[plate]/.decryptRules
Overview
The basic idea is to check whether a systematic sampling of pixels from the input image matches pixels from the same location on the bona fide images. Much of the code consists of the bit signatures for each character.
The diagram shows the pixels that are sampled from the letters "J", "P","Q", and "R".
The pixel values can be represented as matrices. The dark, bold 1's correspond to black cells. The 0's correspond to white cells.
These are the decryption replacement rules for J P Q R.
{1, 1, 1, 1, 9, 15} -> "J",
{15, 9, 15, 14, 8, 8} -> "P",
{15, 9, 9, 9, 15, 15} -> "Q",
{15, 9, 15, 14, 10, 11} -> "R"
It should be possible to understand why the rule for "0" is:
{15, 9, 9, 9, 9, 15} -> "0"
and thus distinguishable from the letter "Q".
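To see where "0" and "Q" differ, the row values can be expanded back into 4-bit rows. This is a small sketch; the helper name `bits` is introduced here for illustration and is not part of the answer's code.

```mathematica
(* Expand each decimal row value into its 4 sampled bits (1 = black). *)
bits[sig_] := IntegerDigits[#, 2, 4] & /@ sig;

zero = bits[{15, 9, 9, 9, 9, 15}];
q = bits[{15, 9, 9, 9, 15, 15}];

(* Positions {row, column} where the two signatures disagree. *)
diff = Position[Unitize[zero - q], 1]
(* {{5, 2}, {5, 3}} *)
```

The only disagreement is in the fifth row, where the two middle samples are black for "Q" (row value 15) and white for "0" (row value 9).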
The following shows the 10 points used in the final version. These points are sufficient for identifying all of the characters.
What the functions do
plateCrop[img] removes the frame and left edge from the plate and makes the background white. I was able to eliminate this function from the final version by selecting only those image components (candidate letters) that were between 100 and 120 pixels high.
isolateLetters[img] extracts the individual letters from the cropped image.
We can display how it works by showing where the cropped image, output from plateCrop, goes as input for isolateLetters. The output is a list of individual characters.
coordinates are 24 evenly distributed positions for checking the pixel color. The coordinates correspond to those in the first figure.
coordinates=Flatten[Table[{x,y},{y,99,0,-18},{x,9,72,18}],1];
{{9, 99}, {27, 99}, {45, 99}, {63, 99}, {9, 81}, {27, 81}, {45,
81}, {63, 81}, {9, 63}, {27, 63}, {45, 63}, {63, 63}, {9, 45}, {27,
45}, {45, 45}, {63, 45}, {9, 27}, {27, 27}, {45, 27}, {63, 27}, {9,
9}, {27, 9}, {45, 9}, {63, 9}}
h converts the pixels to binary.
h[img_] :=ArrayReshape[PixelValue[img, #] /. {_, _, _, z_} :> ⌈z⌉ & /@ coordinates, {6, 4}];
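The replacement rule inside h can be exercised on its own. This sketch assumes that PixelValue returns four channel values per position (red, green, blue, alpha) once ColorReplace has made the yellow background transparent; the pixel values and the name `toBit` are made up for illustration.

```mathematica
(* Hypothetical RGBA samples: opaque black, fully transparent,
   and a partly transparent edge pixel. *)
pixels = {{0., 0., 0., 1.}, {1., 1., 0., 0.}, {0.2, 0.2, 0.2, 0.6}};

(* Keep only the fourth (alpha) channel and round it up to 0 or 1. *)
toBit[px_] := px /. {_, _, _, z_} :> Ceiling[z];

sampled = toBit /@ pixels
(* {1, 0, 1} *)
```

Rounding up means any pixel that is not fully transparent counts as part of a letter.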
codes are the signatures for each character. The decimal values are abbreviations of the binary code for black (1) and white (0) cells. In the golfed version, base 36 is used.
codes={{{15, 9, 9, 9, 9, 15}, "0"}, {{8, 8, 8, 8, 8, 8}, "1"}, {{15, 1, 3,6,12, 15}, "2"}, {{15, 1, 3, 1, 9, 15}, "3"}, {{2, 6, 6, 15, 2, 2}, "4"}, {{7, 12, 14, 1, 1, 15},"5"}, {{15, 8, 14, 9, 9, 15}, "6"}, {{15, 1, 2, 2, 6, 4},"7"}, {{15, 9, 15, 9, 9, 15}, "8"}, {{15, 9, 15, 1, 9, 15},"9"}, {{6, 6, 6, 6, 15, 9}, "A"}, {{15, 13, 15, 13, 13, 15}, "B"}, {{15, 8, 8, 8, 9, 15}, "C"}, {{15, 13, 13, 13, 13, 15}, "D"}, {{15, 8, 14, 8, 8, 15}, "E"}, {{15, 8, 14, 8, 8, 8},"F"}, {{15, 8, 8, 11, 9, 15}, "G"}, {{9, 9, 15, 15, 9, 9}, "H"}, {{1, 1, 1, 1, 9, 15}, "J"}, {{9, 10, 12, 14, 10, 9}, "K"}, {{8, 8, 8, 8, 8, 15}, "L"}, {{9, 15, 15, 15, 13, 9}, "M"}, {{9, 13, 13, 11, 11, 9}, "N"}, {{15, 9, 15, 14, 8, 8}, "P"}, {{15, 9, 9, 9, 15, 15}, "Q"}, {{15, 9, 15, 14, 10, 11}, "R"}, {{15, 8, 12, 3, 1, 15}, "S"}, {{15, 6, 6, 6, 6, 6}, "T"}, {{9, 9, 9, 9, 9, 15}, "U"}, {{9, 15, 6, 6, 6, 6}, "V"}, {{9, 15, 15, 15, 15, 15}, "W"}, {{9, 14, 6, 6, 14, 9}, "X"}, {{9, 14, 6, 6, 6, 6}, "Y"}, {{15, 3, 2, 4, 12, 15}, "Z"}};
(* decryptRules are for replacing signatures with their respective characters *)
decryptRules=Rule@@@codes;
f is the function that takes an image of a license plate and returns the list of characters on it.
f[plate_]:=FromDigits[#,2]&/@#&/@h/@isolateLetters[plate]/.decryptRules;
{"A", "B", "C", "D", "E", "F", "G"}
{"H", "1", "J", "K", "L", "M", "N", "0"}
{"P", "Q", "R", "S", "T", "U", "V", "W"}
{"X", "Y", "Z", "0", "1", "2", "3", "4"}
{"5", "6", "7", "8", "9"}
Golfed
The code is shortened by using a single decimal number to represent all 24 bits (white or black) for each character. For example, the letter "J" uses the following replacement rule: 1118623 -> "J".
1118623 corresponds to
IntegerDigits[1118623 , 2, 24]
{0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1,
1}
which can be repackaged as
ArrayReshape[{0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1}, {6, 4}]
{{0, 0, 0, 1}, {0, 0, 0, 1}, {0, 0, 0, 1}, {0, 0, 0, 1}, {1, 0, 0,
1}, {1, 1, 1, 1}}
which is simply the matrix for "J" that we saw above.
%//MatrixForm
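The conversion also works in the other direction. As a quick check, flattening the 6×4 matrix for "J" and reading the 24 bits as one binary number gives back the same integer:

```mathematica
(* The 6x4 bit matrix for "J" (1 = black, 0 = white). *)
j = {{0, 0, 0, 1}, {0, 0, 0, 1}, {0, 0, 0, 1},
     {0, 0, 0, 1}, {1, 0, 0, 1}, {1, 1, 1, 1}};

(* Read all 24 bits, row by row, as a single binary number. *)
code = FromDigits[Flatten[j], 2]
(* 1118623 *)

(* Decoding restores the original matrix. *)
roundTrip = ArrayReshape[IntegerDigits[code, 2, 24], {6, 4}] === j
(* True *)
```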
Another saving comes from representing the alphabet as the string "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ" rather than as a list of letters.
Finally, all of the functions from the long version, except h, were integrated into the function f rather than defined separately.
h@i_:=ArrayReshape[i~PixelValue~#/.{_,_,_,z_}:>⌈z⌉&/@Join@@Table[{x,y},{y,99,0,-18},{x,9,72,18}],{6,4}];f@p_:=#~FromDigits~2&/@(Join@@@h/@SortBy[Select[p~ImageTrim~{{100,53},{830,160}}~ColorReplace~Yellow~ComponentMeasurements~{"Image","Centroid"},Last@ImageDimensions@#[[2,1]]>100&],#[[2,2,1]]&][[;;,2,1]])/.Thread[IntegerDigits[36^^1c01agxiuxom9ds3c3cskcp0esglxf68g235g1d27jethy2e1lbttwk1xj6yf590oin0ny1r45wc1i6yu68zxnm2jnb8vkkjc5yu06t05l0xnqhw9oi2lwvzd5f6lsvsb4izs1kse3xvx694zwxz007pnj8f6n,8^8]->Characters@"J4A51LUHKNYXVMW732ZTCGSFE60Q98PRDB"]
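The `IntegerDigits[36^^…, 8^8]` expression in the code above treats one large integer as a sequence of base-2^24 digits, each digit being one 24-bit signature (8^8 equals 2^24). The following is a sketch of that packing on a short list; only 1118623 (the value for "J") comes from the text, the other two values are arbitrary.

```mathematica
(* Three 24-bit values; the first must be nonzero so that
   no leading digit is lost when unpacking. *)
sigs = {1118623, 9999, 16777215};

(* Pack: interpret the list as digits in base 2^24. *)
packed = FromDigits[sigs, 2^24];

(* Write the packed integer in base 36, as a string. *)
str = IntegerString[packed, 36];

(* Unpack: parse the base-36 string, then split into base-8^8 digits. *)
recovered = IntegerDigits[FromDigits[str, 36], 8^8];

recovered === sigs
(* True *)
```

In the golfed code the base-36 number is typed directly with the `36^^` input syntax, which avoids the string and the explicit FromDigits call.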
identical, byte to byte? – YOU – 2016-08-17T10:13:25.160
@YOU I don't understand the question... – Beta Decay – 2016-08-17T10:13:55.620
Remind me to drive through your speed trap. (My number plate contains a letter O.) – Neil – 2016-08-17T12:02:33.733
I asked with the hope of without having to parse images to understand characters, like pattern matching some location of image binaries, nevermind. I actually don't have any clue how to match them. – YOU – 2016-08-17T13:02:20.280
yes, beta, your 0 and O are exactly the same..... – None – 2016-08-17T19:22:26.027
yes, @tuskiomi, they are..... – Beta Decay – 2016-08-17T19:30:33.977
Can we expect any scaling, rotation, skewing, shading, or light letters on dark? – None – 2016-08-17T19:45:31.857
@YiminRong No, all number plates will be exactly the same as those in the images in the question – Beta Decay – 2016-08-17T19:46:15.873
are all of the images the exact same size? – Daniel – 2016-08-17T21:54:04.420
You should add "OCR" or something like that to the title of the challenge so people know what it's about. – Robert Fraser – 2016-08-17T21:55:53.707
Yes, this question's title is pretty inaccurate. How about "OCR a British license plate"? – Lynn – 2016-08-17T22:01:08.513
@Neil My UK number plate has both an O and a 0 and they look identical. There are of course rules to determine which is the correct interpretation, but that would be a whole other challenge. – Level River St – 2016-08-18T22:02:03.583
@LevelRiverSt Indeed, I was thinking of posting it to the Sandbox myself. – Neil – 2016-08-19T00:01:54.177
It's too bad the characters aren't a fixed width. That could make for some very short code possibilities. – GuitarPicker – 2016-08-19T11:44:19.750
I'd like to have http://i.imgur.com/i8jkCJu.png added to the test cases, if that's possible. ;-) – YetiCGN – 2016-08-20T11:51:23.527
@YetiCGN Your wish is my command ;) – Beta Decay – 2016-08-20T12:32:18.497
Thanks! Now I know why they are called "vanity plates". :-D – YetiCGN – 2016-08-20T12:38:07.913
Do we have to include spaces in our output or is returning just the letters (in order) enough? – Dave – 2016-08-21T10:02:35.333
@Dave No, spaces are not needed – Beta Decay – 2016-08-21T10:06:21.097