Ducttape the Ducttape

11

Your boss has found out that quite a few of his employees (including you) like to steal code from others. He has ordered you to write a program that he can use to find the people who do so.

Task:

Write a program/function that detects if somebody copied (part of) his code from somewhere else.

The program will get all the existing programs and the program that it should test from two separate inputs. Edit: Since there are no answers so far, you may use regular expressions!

Output

  • The program should then output all the stolen pieces of code, separated by either a space or a newline. (It may have a space or a newline at the end.)
  • A piece of code is considered to be stolen/copied if it consists of 10 or more successive bytes. (Sorry, Java fans!)
  • You have to output as much as possible, but if stolen pieces overlap, you may ignore one of them or output both of them.
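The output rules above can be read as a search for runs of the tested program that also occur in the existing programs. The following Python 3 sketch shows one such reading. It ignores the copy-paste twist, assumes the inputs are plain text rather than raw bytes, and the name `stolen_pieces` is hypothetical, not something the challenge defines.

```python
from typing import List


def stolen_pieces(existing: str, candidate: str, min_len: int = 10) -> List[str]:
    """Return runs of `candidate` (at least `min_len` long) that occur in `existing`."""
    pieces = []
    i = 0
    while i < len(candidate):
        # Grow the longest run starting at i that appears somewhere in `existing`.
        j = i
        while j < len(candidate) and candidate[i:j + 1] in existing:
            j += 1
        if j - i >= min_len:
            pieces.append(candidate[i:j])
            # Jump past the piece; the rules allow ignoring overlapping matches.
            i = j
        else:
            i += 1
    return pieces


# Fourth example from the challenge.
print(stolen_pieces("for(var i=0; i<x.length; i++){}",
                    "for(var i=0; i<oops.length; i++){break;}"))
# ['for(var i=0; i<', '.length; i++){']
```

This brute-force scan is quadratic or worse in the input length, which is fine for short snippets but would likely need a suffix-based structure for large code bases.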

Twist:

As mentioned before, you also love duct tape coding. That means everything in your code must be copy-pasted from a Stack Exchange site! You can copy whatever you want (not limited to code in code blocks) and as much as you want, but each copied piece must be at least 10 bytes long (the same rule as above about what is considered to be stolen/copied). Please note that answers posted after this question may not be used. Please provide links to where you got your code from.

Examples:

Input:
x = document.getElementById("ninja'd"); (The first input)
y = document.getElementById("id"); (The second input)

Output:

 = document.getElementById("

Input:
foo (The first input)
foo+bar (The second input)

Output:
Nothing.

Input:
public static void main(String[] args) (The first input)
public static void main(String[] args) (The second input)

Output:

 main(String[] args)

Input:
for(var i=0; i<x.length; i++){} (The first input)
for(var i=0; i<oops.length; i++){break;} (The second input)

Output:

for(var i=0; i<
.length; i++){

or

for(var i=0; i< .length; i++){

Stefnotch

Posted 2015-11-23T18:23:32.137

Reputation: 607

It's missing rules for which strings can be copied and how (for the code). – feersum – 2015-11-23T18:27:01.263

Do the copied strings have to come from code blocks, or any parts of an SE answer? If it comes from a code block does it need to use the entire block, or can a substring be used? Can the strings come from either the formatted text or the Markdown source? Can code blocks newer than this question be used? Can old revisions of a question be used? – feersum – 2015-11-23T18:35:06.920

@Vɪʜᴀɴ You are free to assume whatever makes most sense for your language! (e.g. If you happen to be answering in Brainfuck, you may assume that the input will be in the ASCII range..) – Stefnotch – 2015-11-23T18:40:32.197

@Stefnotch My questions are about what code we use to write our program, not the inputs. – feersum – 2015-11-23T18:40:33.163

@Vɪʜᴀɴ Oh! I updated my question! Better? (Also, http://meta.codegolf.stackexchange.com/questions/5431/should-i-delete-my-comment-after-op-fixes-golfs-ops-code ) – Stefnotch – 2015-11-23T18:42:59.727

You say the substrings have to be of length 10 or more. Can I place a substring inside another substring? (Do the substrings have to be continuous?) – Blue – 2015-11-23T20:12:47.910

@muddyfish What do you mean? – Stefnotch – 2015-11-24T17:09:21.617

@Stefnotch Say I had a substring "1234567890" and another one "qwertyuiop". Could "12qwertyuiop34567890" be in my code? – Blue – 2015-11-24T17:15:17.537

@muddyfish So, the existing programs are qwertyuiop and 1234567890. Your program is 12qwertyuiop34567890, right? Well, the overlapping part is qwertyuiop (That is the output.) – Stefnotch – 2015-11-24T17:17:34.637

No, I meant can I use 12qwertyuiop34567890 as part of my source? – Blue – 2015-11-24T17:39:04.497

@muddyfish Oh, that's a good question! My intent was that that is impossible... (Editing the question) – Stefnotch – 2015-11-24T17:42:17.257

@Doorknob I decided to allow regular expressions. – Stefnotch – 2015-11-24T17:44:24.310

@sysreq I decided to allow regular expressions! – Stefnotch – 2015-11-24T17:45:15.120

Easy answer: Use Unary – lirtosiast – 2015-11-24T17:51:09.453

Answers

9

Python 2, 224 bytes

from difflib import SequenceMatcher
def similar(a, b):
    return SequenceMatcher(None, a, b).get_matching_blocks()
a=raw_input()
b=raw_input()
for start, _, size in similar(a, b):
 if(size > 9):
  print a[start:start+size]

Copied from this answer:

from difflib import SequenceMatcher
def similar(a, b):
    return SequenceMatcher(None, a, b).

get_matching_blocks() is copied from this answer

a=raw_input()
b=raw_input()

is copied from this question

for start, _, size in is copied from this answer and the second occurrence of similar(a, b) is copied from the same place as the first.

if(size > 9) is copied from this question.

:
    print

is copied from this question

a[start: is copied from this answer.

and finally, start+size] is copied from this question

Finally answered after one and a half years ...
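For readers on Python 3, the approach in this answer might look like the sketch below. It is not the golfed, copy-pasted submission: the function name `copied_blocks` and the `min_size` parameter are made up for illustration, and `autojunk=False` is an added assumption so that difflib's junk heuristic does not interfere on long inputs. Because `get_matching_blocks()` reports blocks in increasing order of position in both inputs, a copied piece that appears in a different order than in the original may be missed.

```python
from difflib import SequenceMatcher
from typing import List


def copied_blocks(a: str, b: str, min_size: int = 10) -> List[str]:
    """Return matching blocks of at least `min_size` characters shared by a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    # Each Match is a named tuple (a, b, size); the final entry is a dummy of size 0.
    return [a[m.a:m.a + m.size]
            for m in matcher.get_matching_blocks()
            if m.size >= min_size]


# Fourth example from the challenge.
print(copied_blocks("for(var i=0; i<x.length; i++){}",
                    "for(var i=0; i<oops.length; i++){break;}"))
# ['for(var i=0; i<', '.length; i++){']
```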

pppery

Posted 2015-11-23T18:23:32.137

Reputation: 3 987

+1 Though, https://stackoverflow.com/questions/37386311/fuzzy-module-not-working-for-me was posted after this challenge was posted. I decided to remove that restriction, so your answer is fine. :) – Stefnotch – 2017-07-13T15:36:19.597

@Stefnotch You don't need to; an earlier answer contains the same phrase – pppery – 2017-07-13T16:14:33.027

One thing I learned about [tag:duct-tape-coding] challenges: It is tricky to keep track of where you got all of your code. – pppery – 2017-07-13T16:19:24.337

Wow, the score of this post has been rising very quickly ... – pppery – 2017-07-14T22:49:54.387