Patch the Image

115

37

In a popular image editing software there is a feature that patches (the term used in image processing is inpainting, as @mınxomaτ pointed out) a selected area of an image based on the information outside of that patch. And it does quite a good job, considering it is just a program. As a human, you can sometimes see that something is wrong, but if you squint your eyes or just take a short glance, the patch seems to fill in the gap quite well.

example by popular image editing software

Challenge

Given an image and a mask that specifies a rectangular area of the image that should be patched (also as an image, or in any other preferred format), your program should try to fill the specified area with a patch that blends in with the rest of the image. The program cannot use the information of the original image that was within the specified area.

You can assume that the patch is always at least its own width away from the left and right sides, and at least its own height away from the top and the bottom of the image. That means the maximum area of a patch is 1/9 of the whole image (for example, in a 300×300 image a patch can be at most 100×100 pixels, and a maximum-size patch can only sit exactly in the centre).

Please add a brief description on how your algorithm works.

Voting

Voters are asked to judge how well the algorithms perform and vote accordingly.

Some suggestions on how to judge (again, thanks @mınxomaτ for some more criteria):

  • Does the picture look fine if you squint your eyes?
  • Can you tell exactly where the patch is?
  • How well are structures and textures from the image background and surrounding area continued?
  • How many stray false-color pixels does the edited area contain?
  • Are there any uniformly colored blobs/blocks in the area that don't seem to belong there?
  • Does the edited area have any drastic color, contrast, or brightness shifts compared to the rest of the image?

Validity criterion

In order for a submission to be valid, the output image must exactly match the input image outside of the specified area.
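
This can be checked mechanically. A minimal validity-check sketch (assuming numpy and PIL, and a white-on-black mask image; file handling and names are illustrative, not part of the challenge spec):

import numpy as np
from PIL import Image

def output_is_valid(input_path, output_path, mask_path):
    original = np.array(Image.open(input_path).convert("RGB"))
    result = np.array(Image.open(output_path).convert("RGB"))
    inside = np.array(Image.open(mask_path).convert("L")) > 127  # True inside the patch area
    # every pixel outside the specified area must be unchanged
    return np.array_equal(original[~inside], result[~inside])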

Test case

On the left the source image, on the right the corresponding mask:

flawr

Posted 2016-01-29T22:06:46.637

Reputation: 40 560

1Can we accept the mask input as text arguments (eg. inpaint.exe left top width height img.jpg)? – mınxomaτ – 2016-01-29T22:21:07.983

1Sure, the input/output format is not really that important, as it is a popularity contest where first of all the performance of your algorithm is important. – flawr – 2016-01-29T22:38:46.460

24This is a very practical challenge. It is possible the results will be better than existing algorithms used in GIMP and other open source image editing software. Fortune, Fame, and Glory could be yours! – Sparr – 2016-01-29T23:06:02.713

6@Sparr and finally ugly watermarks can be removed from downloaded media;) – Andras Deak – 2016-01-29T23:07:49.740

Can we have some clarification on whether inpainting builtins are permitted? – trichoplax – 2016-01-30T23:48:08.533

2Builtins are perfectly ok, I do doubt they're going to be very popular though. – flawr – 2016-01-30T23:49:09.967

Are masks always guaranteed to be rectangular? If not, can we have some test cases with non-rectangular masks? – SuperJedi224 – 2016-01-31T03:54:25.940

No, the mask is always rectangular as specified. – flawr – 2016-01-31T10:35:09.920

Answers

143

AutoIt, VB

Introduction

This is an implementation of the Object Removal by Exemplar-Based Inpainting algorithm developed by A. Criminisi, P. Perez (Cambridge Microsoft Research Ltd.) and K. Toyama (Microsoft) [X]. This algorithm is targeted at high-information images (and video frames) and aims to strike a balance between structural reconstruction and organic reconstruction. Paragraphs of this answer contain full-text quotes from the original paper (since it is no longer officially available) to make this answer more self-contained.

The Algorithm

Goal: Replace a selected (masked) area (preferably a visually separated foreground object) with visually plausible backgrounds.

In previous work, several researchers have considered texture synthesis as a way to fill large image regions with "pure" textures - repetitive two-dimensional textural patterns with moderate stochasticity. This is based on a large body of texture-synthesis research, which seeks to replicate texture ad infinitum, given a small source sample of pure texture [1] [8] [9] [10] [11] [12] [14] [15] [16] [19] [22].

As effective as these techniques are in replicating consistent texture, they have difficulty filling holes in photographs of real-world scenes, which often consist of linear structures and composite textures – multiple textures interacting spatially [23]. The main problem is that boundaries between image regions are a complex product of mutual influences between different textures. In contrast to the two-dimensional nature of pure textures, these boundaries form what might be considered more one-dimensional, or linear, image structures.

Image inpainting techniques fill holes in images by propagating linear structures (called isophotes in the inpainting literature) into the target region via diffusion. They are inspired by the partial differential equations of physical heat flow, and work convincingly as restoration algorithms. Their drawback is that the diffusion process introduces some blur, which is noticeable.

fig. 2

The region to be filled, i.e., the target region is indicated by Ω, and its contour is denoted δΩ. The contour evolves inward as the algorithm progresses, and so we also refer to it as the “fill front”. The source region, Φ, which remains fixed throughout the algorithm, provides samples used in the filling process. We now focus on a single iteration of the algorithm to show how structure and texture are adequately handled by exemplar-based synthesis. Suppose that the square template Ψp ∈ Ω centred at the point p (fig. 2b), is to be filled. The best-match sample from the source region comes from the patch Ψqˆ ∈ Φ, which is most similar to those parts that are already filled in Ψp. In the example in fig. 2b, we see that if Ψp lies on the continuation of an image edge, the most likely best matches will lie along the same (or a similarly coloured) edge (e.g., Ψq' and Ψq'' in fig. 2c). All that is required to propagate the isophote inwards is a simple transfer of the pattern from the best-match source patch (fig. 2d). Notice that isophote orientation is automatically preserved. In the figure, despite the fact that the original edge is not orthogonal to the target contour δΩ, the propagated structure has maintained the same orientation as in the source region.
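
To make the single iteration described above concrete, here is a rough numpy sketch of the exemplar-matching step (this is only an illustration of the idea, not the author's implementation, which follows below): find the fully known source patch with the smallest SSD over the already-filled pixels of Ψp and copy it into the unfilled pixels.

import numpy as np

def fill_one_patch(img, known, py, px, w):
    # img: float image (H, W, 3); known: bool mask of already-filled pixels
    # (py, px): centre of the target patch on the fill front; w: half window size
    H, W, _ = img.shape
    tgt = img[py-w:py+w+1, px-w:px+w+1]        # target patch Psi_p
    tmask = known[py-w:py+w+1, px-w:px+w+1]    # its already-filled pixels
    best, best_ssd = None, np.inf
    for y in range(w, H - w):                  # brute-force scan of candidate source patches
        for x in range(w, W - w):
            if not known[y-w:y+w+1, x-w:x+w+1].all():
                continue                       # candidate must lie entirely in the source region
            src = img[y-w:y+w+1, x-w:x+w+1]
            ssd = np.sum(((src - tgt) ** 2)[tmask])
            if ssd < best_ssd:
                best, best_ssd = src, ssd
    hole = ~tmask
    img[py-w:py+w+1, px-w:px+w+1][hole] = best[hole]   # copy only the unknown pixels
    known[py-w:py+w+1, px-w:px+w+1] = True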

Implementation and Algorithm Details

The functionality of this implementation is encapsulated in an ActiveX COM DLL which is dropped from the host program as a binary and then invoked on the fly by calling the inpainter by IID. In this specific case, the API is written in VisualBasic and can be called from any COM-enabled language. The following section of the code drops the binary:

Func deflate($e=DllStructCreate,$f=@ScriptDir&"\inpaint.dll")
    If FileExists($f) Then Return
    !! BINARY CODE OMITTED FOR SIZE REASONS !!
    $a=$e("byte a[13015]")
    DllCall("Crypt32.dll","bool","CryptStringToBinaryA","str",$_,"int",0,"int",1,"struct*",$a,"int*",13015,"ptr",0,"ptr",0)
    $_=$a.a
    $b=$e('byte a[13015]')
    $b.a=$_
    $c=$e("byte a[14848]")
    DllCall("ntdll.dll","int","RtlDecompressBuffer","int",2,"struct*",$c,"int",14848,"struct*",$b,"int",13015,"int*",0)
    $d=FileOpen(@ScriptDir&"\inpaint.dll",18)
    FileWrite($d,Binary($c.a))
    FileClose($d)
EndFunc

The library is later instantiated using the CLSID and IID:

Local $hInpaintLib = DllOpen("inpaint.dll")
Local $oInpaintLib = ObjCreate("{3D0C8F8D-D246-41D6-BC18-3CF18F283429}", "{2B0D9752-15E8-4B52-9569-F64A0B12FFC5}", $hInpaintLib)

The library accepts a GDIOBJECT handle, specifically a DIBSection of any GDI/+ bitmap (files, streams etc.). The specified image file is loaded, and drawn onto an empty bitmap constructed from Scan0 from the input image dimensions.

The input file for this implementation is any GDI/+ compatible file format containing masked image data. The mask consists of one or more uniformly colored regions in the input image. The user supplies an RGB color value for the mask; only pixels with exactly that color value are matched. The default masking color is green (0, 255, 0). All masked regions together represent the target region, Ω, to be removed and filled. The source region, Φ, is defined as the entire image minus the target region (Φ = I−Ω).
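
As a minimal illustration of this masking rule (a numpy sketch, not part of the AutoIt/VB implementation; the file name is a placeholder):

import numpy as np
from PIL import Image

img = np.array(Image.open("masked_input.png").convert("RGB"))
mask_color = (0, 255, 0)                    # default masking colour (green)
omega = np.all(img == mask_color, axis=-1)  # target region: pixels matching the mask colour exactly
phi = ~omega                                # source region Phi = I - Omega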

Next, as with all exemplar-based texture synthesis [10], the size of the template window Ψ (aka "scan radius") must be specified. This implementation provides a default window size of 6² pixels, but in practice requires the user to set it slightly larger than the largest distinguishable texture element, or “texel”, in the source region. An additional modification to the original algorithm is the user-definable "block size", which determines the area of pixels to be replaced with a new uniform color. This increases speed and decreases quality. Block sizes greater than 1px are meant to be used with extremely uniform areas (water, sand, fur etc.); however, Ψ should be kept at most 0.5x the block size (which can be impossible depending on the mask).

To avoid stalling the algorithm on 1-bit images, the window size is amplified by 10x whenever an image with fewer than 5 colors is received.

Once these parameters are determined, the remainder of the region-filling process is completely automatic. In our algorithm, each pixel maintains a colour value (or “empty”, if the pixel is unfilled) and a confidence value, which reflects our confidence in the pixel value, and which is frozen once a pixel has been filled. During the course of the algorithm, patches along the fill front are also given a temporary priority value, which determines the order in which they are filled. Then, our algorithm iterates the following three steps until all pixels have been filled.

Step 1: Computing patch priorities

Filling order is crucial to non-parametric texture synthesis [1] [6] [10] [13]. Thus far, the default favorite has been the “onion peel” method, where the target region is synthesized from the outside inward, in concentric layers. Our algorithm performs this task through a best-first filling algorithm that depends entirely on the priority values that are assigned to each patch on the fill front. The priority computation is biased toward those patches which are on the continuation of strong edges and which are surrounded by high-confidence pixels; these pixels form the boundary, marked by the value -2. The following code recalculates the priorities:

For j = m_top To m_bottom: Y = j * m_width: For i = m_left To m_right
    If m_mark(Y + i) = -2 Then m_pri(Y + i) = ComputeConfidence(i, j) * ComputeData(i, j)
Next i: Next j

Given a patch Ψp centered at the point p for some p ∈ δΩ (see fig. 3), its priority P(p) is defined as the product of the calculated confidence (ComputeConfidence, or C(p)) and the data term (ComputeData, or D(p)):

P(p) = C(p) · D(p), where

C(p) = ( Σ_{q ∈ Ψp ∩ (I−Ω)} C(q) ) / |Ψp|   and   D(p) = |∇Ip⊥ · np| / α

Here |Ψp| is the area of Ψp, α is a normalization factor (e.g., α = 255 for a typical grey-level image), ∇Ip⊥ is the isophote vector at the point p, and np is a unit vector orthogonal to the front δΩ in the point p. The priority is computed for every border patch, with distinct patches for each pixel on the boundary of the target region.

Implemented as

Private Function ComputeConfidence(ByVal i As Long, ByVal j As Long) As Double
    Dim confidence As Double
    Dim X, Y As Long

    For Y = IIf(j - Winsize > 0, j - Winsize, 0) To IIf(j + Winsize < m_height - 1, j + Winsize, m_height - 1): For X = IIf(i - Winsize > 0, i - Winsize, 0) To IIf(i + Winsize < m_width - 1, i + Winsize, m_width - 1)
        confidence = confidence + m_confid(Y * m_width + X)
    Next X: Next Y

    ComputeConfidence = confidence / ((Winsize * 2 + 1) * (Winsize * 2 + 1))
End Function

Private Function ComputeData(ByVal i As Long, ByVal j As Long) As Double
    Dim grad As CPOINT
    Dim temp As CPOINT
    Dim grad_T As CPOINT
    Dim result As Double
    Dim magnitude As Double
    Dim max As Double
    Dim X As Long
    Dim Y As Long
    Dim nn As CPOINT
    Dim Found As Boolean
    Dim Count, num As Long
    Dim neighbor_x(8) As Long
    Dim neighbor_y(8) As Long
    Dim record(8) As Long
    Dim n_x As Long
    Dim n_y As Long
    Dim tempL As Long
    Dim square As Double

    For Y = IIf(j - Winsize > 0, j - Winsize, 0) To IIf(j + Winsize < m_height - 1, j + Winsize, m_height - 1): For X = IIf(i - Winsize > 0, i - Winsize, 0) To IIf(i + Winsize < m_width - 1, i + Winsize, m_width - 1)
        If m_mark(Y * m_width + X) >= 0 Then
            Found = False
            Found = m_mark(Y * m_width + X + 1) < 0 Or m_mark(Y * m_width + X - 1) < 0 Or m_mark((Y + 1) * m_width + X) < 0 Or m_mark((Y - 1) * m_width + X) < 0
            If Found = False Then
                temp.X = IIf(X = 0, m_gray(Y * m_width + X + 1) - m_gray(Y * m_width + X), IIf(X = m_width - 1, m_gray(Y * m_width + X) - m_gray(Y * m_width + X - 1), (m_gray(Y * m_width + X + 1) - m_gray(Y * m_width + X - 1)) / 2#))
                temp.Y = IIf(Y = 0, m_gray((Y + 1) * m_width + X) - m_gray(Y * m_width + X), IIf(Y = m_height - 1, m_gray(Y * m_width + X) - m_gray((Y - 1) * m_width + X), (m_gray((Y + 1) * m_width + X) - m_gray((Y - 1) * m_width + X)) / 2#))
                magnitude = temp.X ^ 2 + temp.Y ^ 2
                If magnitude > max Then
                    grad.X = temp.X
                    grad.Y = temp.Y
                    max = magnitude
                End If
            End If
        End If
    Next X: Next Y

    grad_T.X = grad.Y
    grad_T.Y = -grad.X

    For Y = IIf(j - 1 > 0, j - 1, 0) To IIf(j + 1 < m_height - 1, j + 1, m_height - 1): For X = IIf(i - 1 > 0, i - 1, 0) To IIf(i + 1 < m_width - 1, i + 1, m_width - 1): Count = Count + 1
        If X <> i Or Y <> j Then
            If m_mark(Y * m_width + X) = -2 Then
                num = num + 1
                neighbor_x(num) = X
                neighbor_y(num) = Y
                record(num) = Count
            End If
        End If
    Next X: Next Y

    If num = 0 Or num = 1 Then
        ComputeData = Abs((0.6 * grad_T.X + 0.8 * grad_T.Y) / 255)
    Else
        n_x = neighbor_y(2) - neighbor_y(1)
        n_y = neighbor_x(2) - neighbor_x(1)
        square = CDbl(n_x ^ 2 + n_y ^ 2) ^ 0.5
        ComputeData = Abs((IIf(n_x = 0, 0, n_x / square) * grad_T.X + IIf(n_y = 0, 0, n_y / square) * grad_T.Y) / 255)
    End If
End Function

The confidence term C(p) may be thought of as a measure of the amount of reliable information surrounding the pixel p. The intention is to fill first those patches which have more of their pixels already filled, with additional preference given to pixels that were filled early on (or that were never part of the target region).

This automatically incorporates preference towards certain shapes along the fill front. For example, patches that include corners and thin tendrils of the target region will tend to be filled first, as they are surrounded by more pixels from the original image. These patches provide more reliable information against which to match. Conversely, patches at the tip of “peninsulas” of filled pixels jutting into the target region will tend to be set aside until more of the surrounding pixels are filled in. At a coarse level, the term C(p) of (1) approximately enforces the desirable concentric fill order.

As filling proceeds, pixels in the outer layers of the target region will tend to be characterized by greater confidence values, and therefore be filled earlier; pixels in the center of the target region will have lesser confidence values. The data term D(p) is a function of the strength of isophotes hitting the front δΩ at each iteration. This term boosts the priority of a patch that an isophote "flows" into. This factor is of fundamental importance in our algorithm because it encourages linear structures to be synthesized first, and, therefore propagated securely into the target region. Broken lines tend to connect, thus realizing the "Connectivity Principle" of vision psychology [7] [17].

The fill order is dependent on image properties, resulting in an organic synthesis process that eliminates the risk of “broken-structure” artifacts and also reduces blocky artifacts without an expensive patch-cutting step [9] or a blur-inducing blending step [19].

Step 2: Propagating texture and structure information

Once all priorities on the fill front (boundary) have been computed, the patch Ψpˆ with highest priority is found. We then fill it with data extracted from the source region Φ. We propagate image texture by direct sampling of the source region. Similar to [10], we search in the source region for that patch which is most similar to Ψpˆ. Formally,

Ψqˆ = arg min_{Ψq ∈ Φ} d(Ψpˆ, Ψq), where

the distance d(Ψa, Ψb) between two generic patches Ψa and Ψb is simply defined as the sum of squared differences (SSD) of the already filled pixels in the two patches. No further analysis or manipulation (especially no blurring) is done in this step. This calculation runs in the main cycle loop and is implemented as follows:

Getting the maximum priority:

For j = m_top To m_bottom: Jidx = j * m_width: For i = m_left To m_right
    If m_mark(Jidx + i) = -2 And m_pri(Jidx + i) > max_pri Then
        pri_x = i
        pri_y = j
        max_pri = m_pri(Jidx + i)
    End If
Next i: Next j

Finding the most similar patch:

min = 99999999

For j = PatchT To PatchB: Jidx = j * m_width: For i = PatchL To PatchR
    If m_source(Jidx + i) Then
        sum = 0
        For iter_y = -Winsize To Winsize: target_y = pri_y + iter_y
            If target_y > 0 And target_y < m_height Then
                target_y = target_y * m_width: For iter_x = -Winsize To Winsize: target_x = pri_x + iter_x
                    If target_x > 0 And target_x < m_width Then
                        Tidx = target_y + target_x
                        If m_mark(Tidx) >= 0 Then
                            source_x = i + iter_x
                            source_y = j + iter_y
                            Sidx = source_y * m_width + source_x
                            temp_r = m_r(Tidx) - m_r(Sidx)
                            temp_g = m_g(Tidx) - m_g(Sidx)
                            temp_b = m_b(Tidx) - m_b(Sidx)
                            sum = sum + temp_r * temp_r + temp_g * temp_g + temp_b * temp_b
                        End If
                    End If
                Next iter_x
            End If
        Next iter_y

        If sum < min Then: min = sum: patch_x = i: patch_y = j
    End If
Next i: Next j

Step 3: Updating confidence values

After the patch Ψpˆ has been filled with new pixel values, the confidence C(p) is updated in the area delimited by Ψpˆ as follows:

C(q) = C(pˆ) for all q ∈ Ψpˆ ∩ Ω

This simple update rule allows us to measure the relative confidence of patches on the fill front, without image-specific parameters. As filling proceeds, confidence values decay, indicating that we are less sure of the color values of pixels near the center of the target region. Implemented here (along with all other necessary updates):

x0 = -Winsize
For iter_y = -Winsize To Winsize: For iter_x = -Winsize To Winsize
    x0 = patch_x + iter_x
    y0 = patch_y + iter_y
    x1 = pri_x + iter_x
    y1 = pri_y + iter_y
    X1idx = y1 * m_width + x1
    If m_mark(X1idx) < 0 Then
        X0idx = y0 * m_width + x0
        PicAr1(x1, y1) = m_color(X0idx)
        m_color(X1idx) = m_color(X0idx)
        m_r(X1idx) = m_r(X0idx)
        m_g(X1idx) = m_g(X0idx)
        m_b(X1idx) = m_b(X0idx)
        m_gray(X1idx) = CDbl((m_r(X0idx) * 3735 + m_g(X0idx) * 19267 + m_b(X0idx) * 9765) / 32767)
        m_confid(X1idx) = ComputeConfidence(pri_x, pri_y)
    End If
Next iter_x: Next iter_y

For Y = IIf(pri_y - Winsize - 2 > 0, pri_y - Winsize - 2, 0) To IIf(pri_y + Winsize + 2 < m_height - 1, pri_y + Winsize + 2, m_height - 1): Yidx = Y * m_width: For X = IIf(pri_x - Winsize - 2 > 0, pri_x - Winsize - 2, 0) To IIf(pri_x + Winsize + 2 < m_width - 1, pri_x + Winsize + 2, m_width - 1)
    m_mark(Yidx + X) = IIf(PicAr1(X, Y).rgbRed = MaskRed And PicAr1(X, Y).rgbgreen = MaskGreen And PicAr1(X, Y).rgbBlue = MaskBlue, -1, Source)
Next X: Next Y

For Y = IIf(pri_y - Winsize - 2 > 0, pri_y - Winsize - 2, 0) To IIf(pri_y + Winsize + 2 < m_height - 1, pri_y + Winsize + 2, m_height - 1): Yidx = Y * m_width: For X = IIf(pri_x - Winsize - 2 > 0, pri_x - Winsize - 2, 0) To IIf(pri_x + Winsize + 2 < m_width - 1, pri_x + Winsize + 2, m_width - 1)
    If m_mark(Yidx + X) = -1 Then
        Found = (Y = m_height - 1 Or Y = 0 Or X = 0 Or X = m_width - 1) Or m_mark(Yidx + X - 1) = Source Or m_mark(Yidx + X + 1) = Source Or m_mark((Y - 1) * m_width + X) = Source Or m_mark((Y + 1) * m_width + X) = Source
        If Found Then: Found = False: m_mark(Yidx + X) = -2
    End If
Next X: Next Y

For i = IIf(pri_y - Winsize - 3 > 0, pri_y - Winsize - 3, 0) To IIf(pri_y + Winsize + 3 < m_height - 1, pri_y + Winsize + 3, m_height - 1): Yidx = i * m_width: For j = IIf(pri_x - Winsize - 3 > 0, pri_x - Winsize - 3, 0) To IIf(pri_x + Winsize + 3 < m_width - 1, pri_x + Winsize + 3, m_width - 1)
    If m_mark(Yidx + j) = -2 Then m_pri(Yidx + j) = ComputeConfidence(j, i) * ComputeData(j, i)
Next j: Next i

Complete Code

Here's the runnable code, complete with the libraries' source code as comments.

The code is invoked by

inpaint(infile, outfile, blocksize, windowsize, r, g, b)

Examples are included in the form of

;~ inpaint("gothic_in.png", "gothic_out.png")
;~ inpaint("starry_in.png", "starry_out.png")
;~ inpaint("scream_in.png", "scream_out.png")
;~ inpaint("mona_in.png", "mona_out.png")
;~ inpaint("maze_in.png", "maze_out.png")
;~ inpaint("checker_in.png", "checker_out.png")

just uncomment the example you want to run using CTRL+Q.

Official Test Files

This algorithm is made to be adjusted for each image. Therefore, the default values (and also the default masks) are completely suboptimal. The default values are chosen so that every sample can be processed in a reasonable amount of time. I highly recommend playing with irregularly shaped masks and better window sizes. Click the images to enlarge!

Checkerboard

American Gothic

Maze

Mona Lisa

(terrible mask)

Scream

Starry

Real-World Examples

These all use custom hand-drawn masks.

If you have other interesting images you'd like to see included, leave a comment.

EBII Improvements

There are multiple variants of EBII out there, created by various researchers. AnkurKumar Patel caught my attention with his collection of papers [24] on various EBII improvements.

Specifically, the paper "Improved Robust Algorithm For Exemplar Based Image Inpainting" [25] mentions two improvements on the weighting of the priority values.

The improvement

The effective modification is in Step 1 (see above) of the algorithm; it extends the effect that C(p) and D(p) have on the priority rating of a pixel.

In the formula for C and D given above, α, ∇Ip⊥ and np are respectively the normalization factor (e.g., α = 255), the isophote vector, and the unit vector orthogonal to the front in the point p.

Further,

The priority function is defined as the weighted sum of the regularized confidence term Rp(p) and the new data term D(p):

P(p) = α · Rp(p) + β · D(p)

Where α and β are respectively the component weights of the confidence and the data terms. Note that α+β=1.
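
A toy illustration of this weighted-sum priority (a sketch based only on the description above; the regularized confidence term Rp(p) is taken as given, and the function name is illustrative):

def improved_priority(rp, dp, alpha=0.5, beta=0.5):
    # weighted sum of the regularized confidence term Rp(p) and the data term D(p),
    # replacing the product C(p) * D(p) used by the original EBII priority
    assert abs(alpha + beta - 1.0) < 1e-12   # component weights must sum to 1
    return alpha * rp + beta * dp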

Objective scoring

What is really interesting though is that this paper contains a proposed (and simple!) method for scoring the performance of EBII algorithms. Take this with a grain of salt though, as this is a method chosen by the paper authors themselves to verify the effectiveness of the proposed approach and its improvement on several images.

The result evaluation is performed by comparing the PSNR (the Peak Signal-to-Noise Ratio [26]) between the restored image and the original image. Generally, the higher the PSNR value, the greater the similarity of the repaired image to the original. The equation to calculate the PSNR is as follows:

PSNR = 10 · log10( 255² / MSE )

where MSE is the mean squared error between the original and the repaired image.

These are the staggering 2 (two!) real-world test images they used:

The conclusion is as disappointing as the quality of the paper itself. It shows very little improvement. The main thing here is a possible objective scoring method for this kind of challenge (and other image-repair challenges):

+-------+---------------+----------+
| Image | EBII Original | Improved |
+-------+---------------+----------+
|     1 |       52.9556 |  53.7890 |
|     2 |       53.9098 |  53.8989 |
+-------+---------------+----------+

Meh.
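
For completeness, a minimal sketch of the PSNR computation described above (assuming 8-bit images given as numpy arrays; this is not code from the paper):

import numpy as np

def psnr(original, restored, peak=255.0):
    # Peak Signal-to-Noise Ratio between the original and the repaired image
    mse = np.mean((original.astype(np.float64) - restored.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)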

Research to be done

(Specific to EBII)

a) Pre-Processing

Everything comes down to the "Magic Erase" principle that the algorithm should "just work" for everything. My naive solution for this is a color-based amplification (see above), but there are better ways. I'm thinking of recognizing the geometric mean of all traceable texels to auto-adjust the window size and making the stamp size (also my improvement) dependent on texel- and whole-image resolution. Research has to be done here.

b) Post-Processing

The original authors already did a great job of debunking all post-processing filters that come to mind. Today, I tried something else, inspired by the always uncanny Mona Lisa (thanks undergroundmonorail). If you take just the inpainted region, apply a new mask to all strange blocks of color, and feed that into a despeckling algorithm, you're left with an almost perfect result. I may explore this some time in the future.


[X] — Object Removal by Exemplar-Based Inpainting by A. Criminisi, P. Perez, K. Toyama
[1] — M. Ashikhmin. Synthesizing natural textures. In Proc. ACM Symp. on Interactive 3D Graphics, pp. 217–226, Research Triangle Park, NC, Mar 2001.
[5] — M. Bertalmio, L. Vese, G. Sapiro, and S. Osher. Simultaneous structure and texture image inpainting. to appear, 2002
[6] — R. Bornard, E. Lecan, L. Laborelli, and J-H. Chenot. Missing data correction in still images and image sequences. In ACM Multimedia, France, Dec 2002.
[7] — T. F. Chan and J. Shen. Non-texture inpainting by curvature-driven diffusions (CDD). J. Visual Comm. Image Rep., 4(12), 2001.
[8] — J.S. de Bonet. Multiresolution sampling procedure for analysis and synthesis of texture images. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), volume 31, pp. 361–368, 1997.
[9] — A. Efros and W.T. Freeman. Image quilting for texture synthesis and transfer. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), pp. 341–346, Eugene Fiume, Aug 2001.
[10] — A. Efros and T. Leung. Texture synthesis by non-parametric sampling. In Proc. ICCV, pp. 1033–1038, Kerkyra, Greece, Sep 1999.
[11] — W.T. Freeman, E.C. Pasztor, and O.T. Carmichael. Learning low level vision. Int. J. Computer Vision, 40(1):25–47, 2000.
[12] — D. Garber. Computational Models for Texture Analysis and Texture Synthesis. PhD thesis, Univ. of Southern California, USA, 1981.
[13] — P. Harrison. A non-hierarchical procedure for re-synthesis of complex texture. In Proc. Int. Conf. Central Europe Comp. Graphics, Visua. and Comp. Vision, Plzen, Czech Republic, Feb 2001.
[14] — D.J. Heeger and J.R. Bergen. Pyramid-based texture analysis/synthesis. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), volume 29, pp. 229–233, Los Angeles, CA, 1995.
[15] — A. Hertzmann, C. Jacobs, N. Oliver, B. Curless, and D. Salesin. Image analogies. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), Eugene Fiume, Aug 2001.
[16] — H. Igehy and L. Pereira. Image replacement through texture synthesis. In Proc. Int. Conf. Image Processing, pp. III:186–190, 1997.
[17] — G. Kanizsa. Organization in Vision. Praeger, New York, 1979.
[19] — L. Liang, C. Liu, Y.-Q. Xu, B. Guo, and H.-Y. Shum. Real-time texture synthesis by patch-based sampling. In ACM Transactions on Graphics, 2001.
[22] — L.-Y. Wei and M. Levoy. Fast texture synthesis using tree-structured vector quantization. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), 2000.
[23] — A. Zalesny, V. Ferrari, G. Caenen, and L. van Gool. Parallel composite texture synthesis. In Texture 2002 workshop - (in conjunction with ECCV02), Copenhagen, Denmark, Jun 2002.
[24] — AnkurKumar Patel, Gujarat Technological University, Computer Science and Engineering
[25] — Improved Robust Algorithm For Exemplar Based Image Inpainting
[26] — Wikipedia, Peak-Signal-to-Noise-Ratio

mınxomaτ

Posted 2016-01-29T22:06:46.637

Reputation: 7 398

30This is awesome. The Starry Night is so good. Still, that Mona Lisa... – Hannes Karppila – 2016-01-31T07:12:30.207

8First let me say "oh my god this is incredible". Second: I've already commented "That Mona Lisa is some SCP shit" on another question here, but that owl actually looks like something that could appear on the SCP wiki. – undergroundmonorail – 2016-01-31T19:58:01.873

Can you rehost the images? It seems like all links are broken... – MKII – 2016-02-01T09:51:08.707

@mınxomaτ Uhm. Might be my network or something. NVM then. – MKII – 2016-02-01T10:19:58.173

3Could the quoted paragraphs you mention be made quote blocks? – trichoplax – 2016-02-01T12:20:40.880

1@trichoplax There are minor modifications in almost every sentence, they're not exact quotes. Consider the algorithm description the same as the org. paper except when it says modification or code. I don't want to clutter the formatting any more :) – mınxomaτ – 2016-02-01T12:37:16.730

2When I tried to look at something very carefully in my dreams, sometimes things turn to be exactly like this. – jimmy23013 – 2016-02-02T11:43:33.160

How long does it run? I already waited a few minutes and I'm not sure I did it right. – jimmy23013 – 2016-02-06T12:25:22.977

@jimmy23013 I made you a ZIP for convenience. Open the wef.au3 in SciTe and run it with F5. This will take a while and produce the output files in the same directory. (If in doubt, compile the script with F7 and start wef.exe using an admin console). Either way, while it is running, the console output should look like this. As you can see it takes a while :). – mınxomaτ – 2016-02-06T12:59:44.700

It worked for the standard test cases. And it turns out that the area to be patched was too big in my images, in most cases. But it also errors for images such as a completely black one, and runs very slowly for this: http://i.stack.imgur.com/HYwXJ.png – jimmy23013 – 2016-02-06T14:08:23.597

@jimmy23013 Well, there's almost no information in that image. This algo is made for megapixel real-word photos. :). It's also very fast with the right parameters and very slow with the wrong ones. You'd probably want to write a GUI interface for trial&error. – mınxomaτ – 2016-02-06T14:20:02.787

The hand-drawn-real-world examples blew me away – Stan Strum – 2018-04-20T17:01:55.007

45

Matlab

This is a simple interpolation approach. The idea is to first mirror what is on each side of the patch. Then those mirror-image pixels are interpolated according to how close they are to the corresponding edge:

The tricky part was finding nice interpolation weights. After some playing around I came up with a rational function that is zero on all edges except the one we mirrored at. This is then transformed by a third-degree polynomial for some smoothing:

This simple approach does surprisingly well on "natural" images, but as soon as you are confronted with sharp edges, the game is over. In the American Gothic example the spikes of the hay fork line up nicely with the pixel grid, which makes it look quite nice, but it would have been much worse otherwise.

So here are the results:

And finally, the code:

imgfile= 'filename.png';
maskfile = [imgfile(1:end-4),'_mask.png'];
img = double(imread(imgfile));
mask = rgb2gray(imread(maskfile));
%% read mask
xmin = find(sum(mask,1),1,'first');
xmax = find(sum(mask,1),1,'last');
ymin = find(sum(mask,2),1,'first');
ymax = find(sum(mask,2),1,'last');
%% weight transformation functions
third = @(x)-2* x.^3 + 3* x.^2;
f=@(x)third(x);
w=@(x,y)y.*(x-1).*(y-1)./( (x+y).*(x+1-y));

for x=xmin:xmax
    for y=ymin:ymax
        %Left Right Up Down;
        P = [img(y,xmin-(x-xmin)-1,:);img(y,xmax+(xmax-x)+1,:);img(ymin-(y-ymin)-1,x,:);img(ymax+(ymax-y)+1,x,:)];
        % normalize coordinates
        rx = (x-xmin)/(xmax-xmin); 
        ry = (y-ymin)/(ymax-ymin);
        % calculate the weights
        W = [w(rx,ry),w(1-rx,ry),w(ry,rx),w(1-ry,rx)]';
        W = f(W);
        W(isnan(W))=1;
        img(y,x,:) = sum(bsxfun(@times,P,W),1)/sum(W); 
    end
end
imshow(img/255);
imwrite(img/255,[imgfile(1:end-4),'_out.png']);

flawr

Posted 2016-01-29T22:06:46.637

Reputation: 40 560

10Mona Lisa creeps me out. – Andras Deak – 2016-01-30T00:17:11.640

46Ḿ̳̜͇͓͠o̢̎̓̀ǹ̰͎̣͙a̤̩̖̞̝ͧ̈ͤͤ ̣̖̠̮̘̹̠̾̇ͣL͉̻̭͌i̛̥͕̱͋͌ş̠͔̏̋̀ạ̫͕͎ͨͮͪ̐͡ͅ – mınxomaτ – 2016-01-30T00:20:41.187

8That Mona Lisa is some SCP shit – undergroundmonorail – 2016-01-30T14:17:06.020

1The checkered image looks really cool IMHO. – ETHproductions – 2016-01-30T16:45:08.540

1I wouldn't be surprised if you won your own challenge with this. This is a really nice solution. – Alex A. – 2016-01-30T18:19:42.097

King Diamond meets Mona Lisa – Display Name – 2016-07-16T15:50:14.047

25

Mathematica

This uses Mathematica's Inpaint function. Because Mathematica itself does all the heavy lifting, this is a community wiki.

inPaint (below) is a simple adaptation of Inpaint. For colored paintings/photos, it uses the default "TextureSynthesis" setting. If it detects that the picture is black and white (because the image data of the picture is the same as the image data of its binarized form), it binarizes the image and applies the "TotalVariation" method. The If clause either applies Binarize or Identity to the picture. (The Identity function returns its argument unchanged.)

inPaint[picture_, mask_] :=  
 If[bw = ImageData@Rasterize[Binarize[picture]] == ImageData[picture], Binarize, Identity]@
  Inpaint[picture, mask, Method -> If[bw, "TotalVariation", "TextureSynthesis"]]

The image and mask are entered as arguments to inPaint. Partition and Grid are merely for formatting purposes.

input

The outputs have been patched. There was no retouching of the images after inPaint.

output

DavidC

Posted 2016-01-29T22:06:46.637

Reputation: 24 524

4It might be a coincidence, but I'm amazed about the performance of the labyrinth! – flawr – 2016-01-30T23:52:15.253

1@flawr I'd throw in something like this just to mess with this solution;) (Who knows? Those black-and-whites really are baffling.) – Andras Deak – 2016-01-31T01:51:33.183

17I don't think this should be a community wiki. – Dennis – 2016-01-31T03:16:07.897

@flawr, Yes, Inpaint seems to look for symmetries across the whole black and white image. – DavidC – 2016-01-31T13:21:58.140

Are you sure the black-and-white algorithm doesn't involve sacrificing goats anywhere? How ---on Earth--- the hell does it guess the central structure of the image, if all of it is masked?? – Andras Deak – 2016-02-05T18:13:58.623

I agree with Dennis. This is worth the rep you'd gain... – mbomb007 – 2016-02-05T19:02:29.357

@mbomb007, I would consider having it treated as regular submission but I don't believe it is possible to reclassify it. – DavidC – 2016-02-05T19:29:15.257

Andras, I'm also puzzled by how it fills in unique regions of the black and white images. I'll experiment with that more systematically to see what I can find out. – DavidC – 2016-02-05T19:31:55.370

18

Python 2 and PIL

This program blends copies of regions North, South, East, and West to create replacement pixels that use colours, textures, and shading from the local image region.

The example outputs:

The code first finds the bounding box for the patch. Then for each pixel to be generated, it computes the colour of each channel (RGB) based on the weighted sum of the 4 surrounding regions.

import sys
from PIL import Image

infile, maskfile, outfile = sys.argv[1:4]
imageobj = Image.open(infile)
maskobj = Image.open(maskfile)
image = imageobj.load()
mask = maskobj.load()

assert imageobj.size == maskobj.size
W, H = imageobj.size
pixels = [(x,y) for x in range(W) for y in range(H)]
whitepart = [xy for xy in pixels if sum(mask[xy]) > 230*3]
xmin = min(x for x,y in whitepart)
xmax = max(x for x,y in whitepart)
ymin = min(y for x,y in whitepart)
ymax = max(y for x,y in whitepart)
xspan = xmax - xmin + 1
yspan = ymax - ymin + 1

def mkcolor(channel):
    value = image[(xmin-dx, y)][channel] * 0.5*(xspan - dx)/xspan
    value += image[(xmax+1 + xspan - dx, y)][channel] * 0.5*dx/xspan
    value += image[(x, ymin-dy)][channel] * 0.5*(yspan - dy)/yspan
    value += image[(x, ymax+1 + yspan - dy)][channel] * 0.5*dy/yspan
    return int(value)

for dx in range(xspan):
    for dy in range(yspan):
        x = xmin + dx
        y = ymin + dy
        image[(x, y)] = (mkcolor(0), mkcolor(1), mkcolor(2))

imageobj.save(outfile)

Logic Knight

Posted 2016-01-29T22:06:46.637

Reputation: 6 622

3This Mona Lisa is also terrifying! Are all Mona Lisas in this challenge doomed to be scary?? – undergroundmonorail – 2016-01-30T14:19:02.163

@undergroundmonorail I guess computer-generated accidental faces come right from the depths of the uncanny valley. – Andras Deak – 2016-01-30T21:02:03.410

Where do you get PIL from? – Elliot A. – 2016-01-31T16:06:55.640

@ElliotA. My understanding is that PIL proper is dead, but it was open source and so it lives on under the name "Pillow". If you google "python pillow" you should find it. – undergroundmonorail – 2016-01-31T19:58:57.673

13

Python 3, PIL

This program uses the Sobel operator and draws lines onto the image based on that.

The Sobel operator finds the angle of each edge, so any edges extending into the unknown area should continue.

from PIL import Image, ImageFilter, ImageDraw
import time
im=Image.open('2.png')
im1=Image.open('2 map.png')
a=list(im.getdata())
b=list(im1.getdata())
size=list(im.size)
'''
def dist(a,b):
    d=0
    for x in range(0,3):
        d+=(a[x]-b[x])**2
    return(d**0.5)
#'''
C=[]
d=[]
y=[]
for x in range(0,len(a)):
    if(b[x][0]==255):
        C.append((0,0,0))
    else:
        y=(a[x][0],a[x][1],a[x][2])
        C.append(y)
im1.putdata(C)
k=(-1,0,1,-2,0,2,-1,0,1)
k1=(-1,-2,-1,0,0,0,1,2,1)
ix=im.filter(ImageFilter.Kernel((3,3),k,1,128))
iy=im.filter(ImageFilter.Kernel((3,3),k1,1,128))
ix1=list(ix.getdata())
iy1=list(iy.getdata())
d=[]
im2=Image.new('RGB',size)
draw=ImageDraw.Draw(im2)
c=list(C)
Length=0
for L in range(100,0,-10):
    for x in range(0,size[0]):
        for y in range(0,size[1]):
            n=x+(size[0]*y)
            if(c[n]!=(0,0,0)):
                w=(((iy1[n][0]+iy1[n][1]+iy1[n][2])//3)-128)
                z=(((ix1[n][0]+ix1[n][1]+ix1[n][2])//3)-128)
                Length=(w**2+z**2)**0.5
                if Length==0:
                    w+=1
                    z+=1
                Length=(w**2+z**2)**0.5
                w/=(Length/L)
                z/=(Length/L)
                w=int(w)
                z=int(z)
                draw.line(((x,y,w+x,z+y)),c[n])

d=list(im2.getdata())
S=[]
d1=[]
A=d[0]
for x in range(0,size[0]):
    for y in range(0,size[1]):
        n=y+(size[1]*x)
        nx=y+(size[1]*x)-1
        ny=y+(size[1]*x)-size[0]
        if d[n]==(0,0,0):
            S=[0,0,0]
            for z in range(0,3):
                S[z]=(d[nx][z]+d[ny][z])//2
            #print(S)
            d1.append(tuple(S))
        else:
            d1.append(tuple(d[n]))
d=list(d1)
im2.putdata(d)
#im2=im2.filter(ImageFilter.GaussianBlur(radius=0.5))
d=im2.getdata()
f=[]
#'''
for v in range(0,len(a)):
    if(b[v][0]*b[v][1]*b[v][2]!=0):
        f.append(d[v])
    else:
        f.append(C[v])
#'''
im1.putdata(f)
im1.save('pic.png')

Meanwhile, here are the sample images.

(image)

(image)

(image)

Mona Lisa Mona Lisa Ḿ͠oǹ̰͎̣a ̾̇Lisa Ḿ͠o̢̎̓̀ǹ̰͎̣aͧ̈ͤ ̣̖̠̮̘̹̠̾̇ͣLisa Ḿ̳̜͇͓͠o̢̎̓̀ǹ̰͎̣͙a̤̩̖̞̝ͧ̈ͤͤ ̣̖̠̮̘̹̠̾̇ͣL͉̻̭͌i̛̥͕̱͋͌ş̠͔̏̋̀ạ̫͕͎ͨͮͪ̐͡ͅ

(image) The area in the above image is as smooth as a cactus

It is not very good with constant color.

(image)

(image)

Magenta

Posted 2016-01-29T22:06:46.637

Reputation: 1 322

"...'cause baby you're a fiiiirewooork..." I like that totally new approach. The patches of the first two really just fit into the image! The only unsovled thing are perhaps the sharp edges of the patch. And sometimes I cannot really see where the colour of such and edge is coming from. It also seems the right side is always painted last, perhaps you can alternate which side is worked one. Another suggestion: Insted of stright lines, you could fit polynomials/bezier curves to those lines and exctend them in a not so straight way! – flawr – 2016-02-02T09:39:52.703

1Oh, and could you please add the B/W test cases? – flawr – 2016-02-02T09:43:26.103

2Looks really good on the Starry Night one. – SuperJedi224 – 2016-02-02T15:19:21.297

1Wow, this looks incredible now! The patches are still noticeable but a great new idea! My favourite so far=) – flawr – 2016-02-05T17:15:32.277

8+1 for " Mona Lisa Mona Lisa Ḿ͠oǹ̰͎̣a ̾̇Lisa Ḿ͠o̢̎̓̀ǹ̰͎̣aͧ̈ͤ ̣̖̠̮̘̹̠̾̇ͣLisa Ḿ̳̜͇͓͠o̢̎̓̀ǹ̰͎̣͙a̤̩̖̞̝ͧ̈ͤͤ ̣̖̠̮̘̹̠̾̇ͣL͉̻̭͌i̛̥͕̱͋͌ş̠͔̏̋̀ạ̫͕͎ͨͮͪ̐͡ͅ " – mbomb007 – 2016-02-05T18:53:33.340

2You win the "scariest Mona Lisa" contest, IMO. 0_o – DLosc – 2017-01-20T22:59:02.390

8

Python 2

A simple Python script that creates the patch using values from pixels just outside the gap. It takes color values from the ends of each pixel's row and column and calculates a weighted average using the distance from those pixels.

Output is not so pretty, but it's art.

img1 img2 img3 img4 img5 img6

And then, code:

IMGN = "6"

IMGFILE = "images/img%s.png" % (IMGN,)
MASKFILE = "images/img%s_mask.png" % (IMGN,)

BLUR = 5


def getp(img,pos):
    return img.get_at(pos)[:3]
def setp(img,pos,color):
    img.set_at(pos, map(int, color))

def pixelavg(L):
    return map(int, [sum([i[q] for i in L])/float(len(L)) for q in [0,1,2]])
def pixelavg_weighted(L, WL):   # note: "inverse" weights. More weight => less weight
    # colors [sum, max]
    color_data = [[0, 0], [0, 0], [0, 0]]
    for color,weight in zip(L, WL):
        for i in [0, 1, 2]: # r,g,b
            color_data[i][0] += inv_w_approx(weight) * color[i]
            color_data[i][1] += inv_w_approx(weight) * 255
    return [255*(float(s)/m) for s,m in color_data]
def inv_w_approx(x):
    return (1.0/(x+1e-10))

import pygame
image = pygame.image.load(IMGFILE)
mask = pygame.image.load(MASKFILE)

size = image.get_size()
assert(size == mask.get_size())

# get square from mask
min_pos = None
max_pos = [0, 0]
for x in range(size[0]):
    for y in range(size[1]):
        if getp(mask, [x, y]) == (255, 255, 255):
            if min_pos == None:
                min_pos = [x, y]
            max_pos = [x, y]
if not min_pos:
    exit("Error: no mask found.")
# patch area info
patch_position = min_pos[:]
patch_size = [max_pos[0]-min_pos[0], max_pos[1]-min_pos[1]]

# remove pixels from original image (fill black)
for dx in range(patch_size[0]):
    for dy in range(patch_size[1]):
        setp(image, [patch_position[0]+dx, patch_position[1]+dy], [0, 0, 0])

# create patch
patch = pygame.Surface(patch_size)

# take pixels around the patch
top = [getp(image, [patch_position[0]+dx, patch_position[1]-1]) for dx in range(patch_size[0])]
bottom = [getp(image, [patch_position[0]+dx, patch_position[1]+patch_size[1]+1]) for dx in range(patch_size[0])]
left = [getp(image, [patch_position[0]-1, patch_position[1]+dy]) for dy in range(patch_size[1])]
right = [getp(image, [patch_position[0]+patch_size[0]+1, patch_position[1]+dy]) for dy in range(patch_size[1])]

cpixels = top+left+right+bottom

# set area to average color around it
average = [sum([q[i] for q in cpixels])/float(len(cpixels)) for i in [0, 1, 2]]

for dx in range(patch_size[0]):
    for dy in range(patch_size[1]):
        setp(patch, [dx, dy], average)

# create new pixels
for dx in range(patch_size[0]):
    for dy in range(patch_size[1]):
        setp(patch, [dx, dy], pixelavg_weighted([top[dx], bottom[dx], left[dy], right[dy]], [dy, patch_size[1]-dy, dx, patch_size[0]-dx]))

# apply patch
for dx in range(patch_size[0]):
    for dy in range(patch_size[1]):
        setp(image, [patch_position[0]+dx, patch_position[1]+dy], getp(patch, [dx, dy]))

# blur patch?
for r in range(BLUR):
    for dx in range(patch_size[0]):
        for dy in range(patch_size[1]):
            around = []
            for ddx in [-1,0,1]:
                for ddy in [-1,0,1]:
                    around.append(getp(image, [patch_position[0]+dx+ddx, patch_position[1]+dy+ddy]))
            setp(patch, [dx, dy], pixelavg(around))

    # apply blurred patch
    for dx in range(patch_size[0]):
        for dy in range(patch_size[1]):
            setp(image, [patch_position[0]+dx, patch_position[1]+dy], getp(patch, [dx, dy]))

# save result
pygame.image.save(image, "result.png")

Hannes Karppila

Posted 2016-01-29T22:06:46.637

Reputation: 3 090

At the moment you see those horizontal/vertical stripes, perhaps you can improve it by including other directions! – flawr – 2016-01-30T23:56:30.790

I actually tried it, but I was unable to get good results, so I just decided just to blur image :D – Hannes Karppila – 2016-01-30T23:57:41.027

19Finally, a Mona Lisa that doesn't scare me to death, but looks like an arrested murderer instead. – Andras Deak – 2016-01-31T01:49:47.180

6

Mathematica

Inpaint

It just so happens that Mathematica has a built-in function that performs exactly this task, and I mean exactly:

Inpaint[image, region]

  • retouches parts of image that correspond to nonzero elements in region.

By default it uses a "best-fit texture synthesis method using random sampling" which produces good results on the paintings, but poor results for the maze and checkerboard:

(result images)

Playing around with the settings didn't net me an increase in quality across all of the images, so I just used the defaults (to save bytes—this is codegolf.se after all!).

2012rcampion

Posted 2016-01-29T22:06:46.637

Reputation: 1 319

23"It just so happens that Mathematica has a built-in function"... surprise, surprise;) – Andras Deak – 2016-01-30T02:04:57.800

For the maze and checker board it's better to use the "TotalVariation" method along with Binarize (to eliminate gray smudges). Try this: methods = {"TextureSynthesis", "Diffusion", "FastMarching", "NavierStokes", "TotalVariation"};g[pic_, mask_] := Join[{Labeled[Framed@pic, "Original"]}, Labeled[ Binarize@Inpaint[pic, mask, Method -> #], #] & /@ methods] – DavidC – 2016-01-30T02:07:41.000

@DavidC I tried the other methods, but only TextureSynthesis looks good on the paintings; and I don't think we're allowed to tune our settings for each individual test case. (If we could, then we could trivially supply the missing portion as a 'setting'.) – 2012rcampion – 2016-01-30T02:31:35.960

The maze and checkerboard results are really puzzling to me. Why is Mathematica's reconstruction of the missing region so irregular and asymmetric? – David Zhang – 2016-01-30T04:16:31.153

This will automatically detect whether an image is black and white and make the appropriate adjustments (binaries and "TotalVariation" method). inPaint[picture_, mask_] := If[bw = ImageData@Rasterize[Binarize[picture]] == ImageData[picture], Binarize, Identity]@ Inpaint[picture, mask, Method -> If[bw, "TotalVariation", "TextureSynthesis"]] – DavidC – 2016-01-30T05:01:09.687

@DavidC You should post that as your own answer! – 2012rcampion – 2016-01-30T06:31:59.283

In this case I think it's better to have just one answer. Maybe make this a community wiki? (By the way: This is not code-golf; it's a popularity-contest. So the size of the code is not at issue. ) – DavidC – 2016-01-30T13:49:53.627

5

Python3

This answer implements the idea in the paper "Deep Image Prior" by Ulyanov et al. (CVPR 2018). In this paper they explored the idea that the way well-performing neural nets for image processing are designed closely reflects our idea of what a natural image should look like (the "prior" distribution).

They proposed a method that can be used for inpainting as well as noise and artifact removal, and that just uses the given image with no training on any other data. The actual concept is quite simple: the net is trained to output the desired image (for some fixed random noise as input) by penalizing only the errors outside some given mask. If you want to remove noise, you don't mask anything at all and simply stop the training early.

For inpainting you mask the part you want to inpaint and train until convergence. It is certainly not state-of-the-art, but I still wanted to try it and post it here due to the simplicity of the idea and the still remarkable performance. In my experiments the inpainting of larger patches didn't turn out that well, but for smaller segments the results can be a lot more convincing.

I implemented this using the popular U-Net architecture from jaxony on github. The code for training and processing the images can be found below.

Training

This is a visualization of the training process. Every frame is the state after a certain number of iterations:

Examples

Code

Note that on a CPU this can take hours to run for just a single image, while a good CUDA-enabled GPU might take a lot less time.

import torch
import numpy as np
unet = __import__('unet-pytorch')
import PIL.ImageOps
#specify device (cpu/cuda)
device = "cpu"
#specify file and size
file = 'mona'
size = 512 #pad to this size (no smaller than original image), must be divisible by 2^5
img_pil = PIL.Image.open(file +'.png').convert('RGB')
mask_pil = PIL.Image.open(file +'-mask.png').convert('RGB')

net = unet.UNet(num_classes=3, in_channels=32, depth=6, start_filts=64).to(device)
h,w = img_pil.size
pad = (0, 0, size - h, size - w)
img = PIL.ImageOps.expand(img_pil, border=pad)
img = torch.Tensor(np.array(img).transpose([2, 0, 1])[None, :, :, :].astype(np.double)).to(device)
mask = PIL.ImageOps.expand(mask_pil, border=pad)
mask = torch.Tensor((np.array(mask)==0).transpose([2, 0, 1])[None, 0:3, :, :].astype(np.double)).to(device)
mean = img.mean()
std = img.std()
img = (img - mean)/std
optimizer = torch.optim.Adam(net.parameters(), lr=0.0001)
criterion = torch.nn.MSELoss()
input = torch.rand((1, 32, size, size)).to(device)
for it in range(5000):
    if it == 1000:
        optimizer.param_groups[0]['lr'] = 0.00003
    out = net(input)
    loss = criterion(out * mask, img * mask)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
out = out.detach().cpu().numpy()[0].transpose([1,2,0])*std.item() + mean.item()
out = np.clip(out, 0, 255).astype(np.uint8)[0:w, 0:h, :]
mask_np = (np.array(mask_pil) > 0).astype(np.uint8)
img_np = np.array(img_pil)
inpaint = (img_np * (1-mask_np) + mask_np * out).astype(np.uint8)
PIL.Image.fromarray(inpaint).save('./{}_inpainted.png'.format(file))

flawr

Posted 2016-01-29T22:06:46.637

Reputation: 40 560

does Deep Image Prior use something different from U-Nets? seems like they'd get better results – ASCII-only – 2019-05-31T12:25:57.953

Also, have you tried with Deep Image Prior's code – ASCII-only – 2019-05-31T12:26:46.167

@ASCII-only They state in the paper that they do indeed mainly use the U-Net, but I cannot find the exact parameters they used. They might have used a net with a greater capacity. I only had a computer with a very limited amount of power. So I had to choose parameters that still fit into the memory and that didn't take too long to train. I'm not sure how long exactly it took but on the computer I used (only with a CPU) these images take multiple days. (If you have a spare cuda enabled GPU let me know :) – flawr – 2019-05-31T12:52:03.820

I also suspect that due to the design of the network having rectangular masks is also not ideal (and smaller masks would probably also look better), if you compare for example the first few images to the last two (that do not use rectangular masks). – flawr – 2019-05-31T12:54:50.597

4

Python with OpenCV

OpenCV has a function called inpaint. It offers two inpainting methods; I will use the Fast Marching Method. According to the documentation, the algorithm works like this:

Consider a region in the image to be inpainted. The algorithm starts from the boundary of this region and goes inside the region, gradually filling everything in the boundary first. It takes a small neighbourhood around the pixel on the neighbourhood to be inpainted. This pixel is replaced by a normalized weighted sum of all the known pixels in the neighbourhood. Selection of the weights is an important matter. More weightage is given to those pixels lying near to the point, near to the normal of the boundary and those lying on the boundary contours. Once a pixel is inpainted, it moves to the next nearest pixel using the Fast Marching Method. FMM ensures those pixels near the known pixels are inpainted first, so that it just works like a manual heuristic operation.

Here is the code*:

import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('gothic.jpg')
b,g,r = cv2.split(img)
img2 = cv2.merge([r,g,b])
mask = cv2.imread('mask.jpg',0)
dst = cv2.inpaint(img2,mask,3,cv2.INPAINT_TELEA)
(h, w) = dst.shape[:2]
center = (w / 2, h / 2)
# rotate the image by 180 degrees
M = cv2.getRotationMatrix2D(center, 180, 1.0)
rotated = cv2.warpAffine(dst, M, (w, h))
plt.imshow(rotated)

Note how I convert the BGR to RGB for plotting reasons. Also, I rotate it. Here are the results:

Gothic

starry night scream another creepy mona lisa!

Mona Lisa returns!

line 1

checker

As you can see, it is not the best with the two-color ones.

TanMath

Posted 2016-01-29T22:06:46.637

Reputation: 1 431

Mona Lisa got a facelift – Conor O'Brien – 2016-02-12T02:04:10.093

3

Java

A color averaging approach. Can probably be improved.

import java.awt.Color;
import java.awt.image.BufferedImage;
import java.io.File;
import java.util.Scanner;

import javax.imageio.ImageIO;


public class ImagePatcher{
    public static void main(String[]args) throws Exception{
        Scanner in=new Scanner(System.in);
        int white=Color.WHITE.getRGB();
        int black=Color.BLACK.getRGB();
        BufferedImage image=ImageIO.read(new File(in.nextLine())),mask=ImageIO.read(new File(in.nextLine()));
        assert(image.getWidth()==mask.getWidth()&&image.getHeight()==mask.getHeight());
        boolean bool=true;
        while(bool){
            bool=false;
        for(int x=0;x<image.getWidth();x+=2){
            for(int y=0;y<image.getHeight();y+=2){
                if(mask.getRGB(x,y)!=white)continue;
                int r=0,g=0,b=0,n=0;
                for(int dx=-1;dx<=1;dx++){
                    if(x+dx<0)continue;
                    if(x+dx>=image.getWidth())continue;
                    for(int dy=-1;dy<=1;dy++){
                        if(y+dy<0)continue;
                        if(y+dy>=image.getHeight())continue;
                        if(mask.getRGB(x+dx,y+dy)==white)continue;
                        Color c=new Color(image.getRGB(x+dx,y+dy));
                        r+=c.getRed();
                        g+=c.getGreen();
                        b+=c.getBlue();
                        n++;
                    }
                }
                if(n==0){bool=true;continue;}
                Color c=n>0?new Color(r/n,g/n,b/n):new Color(100,100,100);
                image.setRGB(x,y,c.getRGB());
                mask.setRGB(x, y, black);
            }           
        }
        for(int x=0;x<image.getWidth();x+=2){
            for(int y=1;y<image.getHeight();y+=2){
                if(mask.getRGB(x,y)!=white)continue;
                int r=0,g=0,b=0,n=0;
                for(int dx=-1;dx<=1;dx++){
                    if(x+dx<0)continue;
                    if(x+dx>=image.getWidth())continue;
                    for(int dy=-1;dy<=1;dy++){
                        if(y+dy<0)continue;
                        if(y+dy>=image.getHeight())continue;
                        if(mask.getRGB(x+dx,y+dy)==white)continue;
                        Color c=new Color(image.getRGB(x+dx,y+dy));
                        r+=c.getRed();
                        g+=c.getGreen();
                        b+=c.getBlue();
                        n++;
                    }
                }
                if(n==0){bool=true;continue;}
                Color c=n>0?new Color(r/n,g/n,b/n):new Color(100,100,100);
                image.setRGB(x,y,c.getRGB());
                mask.setRGB(x, y, black);
            }
        }
        for(int x=1;x<image.getWidth();x+=2){
            for(int y=0;y<image.getHeight();y+=2){
                if(mask.getRGB(x,y)!=white)continue;
                int r=0,g=0,b=0,n=0;
                for(int dx=-1;dx<=1;dx++){
                    if(x+dx<0)continue;
                    if(x+dx>=image.getWidth())continue;
                    for(int dy=-1;dy<=1;dy++){
                        if(y+dy<0)continue;
                        if(y+dy>=image.getHeight())continue;
                        if(mask.getRGB(x+dx,y+dy)==white)continue;
                        Color c=new Color(image.getRGB(x+dx,y+dy));
                        r+=c.getRed();
                        g+=c.getGreen();
                        b+=c.getBlue();
                        n++;
                    }
                }
                if(n==0){bool=true;continue;}
                Color c=n>0?new Color(r/n,g/n,b/n):new Color(100,100,100);
                image.setRGB(x,y,c.getRGB());
                mask.setRGB(x, y, black);
            }           
        }
        for(int x=1;x<image.getWidth();x+=2){
            for(int y=1;y<image.getHeight();y+=2){
                if(mask.getRGB(x,y)!=white)continue;
                int r=0,g=0,b=0,n=0;
                for(int dx=-1;dx<=1;dx++){
                    if(x+dx<0)continue;
                    if(x+dx>=image.getWidth())continue;
                    for(int dy=-1;dy<=1;dy++){
                        if(y+dy<0)continue;
                        if(y+dy>=image.getHeight())continue;
                        if(mask.getRGB(x+dx,y+dy)==white)continue;
                        Color c=new Color(image.getRGB(x+dx,y+dy));
                        r+=c.getRed();
                        g+=c.getGreen();
                        b+=c.getBlue();
                        n++;
                    }
                }
                if(n==0){bool=true;continue;}
                Color c=n>0?new Color(r/n,g/n,b/n):new Color(100,100,100);
                image.setRGB(x,y,c.getRGB());
                mask.setRGB(x, y, black);
            }
        }
        };
        ImageIO.write(image, "png", new File("output.png"));
    }
}

Results:

(result images)

SuperJedi224

Posted 2016-01-29T22:06:46.637

Reputation: 11 342

2Why do you always get lines in this particular angle? The top left corners seem to match relatively well, while the bottom right part does not match at all. – flawr – 2016-02-01T09:50:04.047

I think it has to do with the manner in which I iterate through the region. I'll probably change that eventually. – SuperJedi224 – 2016-02-01T12:53:09.867

It always looks like there are trapezoids. – ericw31415 – 2016-04-24T02:16:13.290

@ericw31415 It's an artifact of the iteration order. – SuperJedi224 – 2016-04-24T20:36:13.597