19
2
Input:
Two strings without newlines or whitespaces.
Output:
Both input strings on separate lines, with spaces where necessary† for one of the two strings. And a third line with the characters A
, R
, M
and , representing added, removed, modified, and unchanged.
† We add spaces to either the top or bottom input string (if we have to). The goal of this challenge is to output with the least amount of changes (ARM
) possible, also known as the Levenshtein distance.
Example:
Let's say the input strings are ABCDEF
and AFBECD
, then the output would be this:
A B CDEF
AFBECD
A A RR
Here are some other possible invalid outputs as example (and there are a lot more):
ABCDEF
AFBECD
MMMMM
A BCDEF
AFBECD
A MMMR
AB CDEF
AFBECD
MAMMMR
ABC DEF
AFBECD
MMAMMR
ABC DEF
AFBECD
MMAA RR
ABCDEF
AFB ECD
MMR MA
AB CDEF // This doesn't make much sense,
AFBECD // but it's to show leading spaces are also allowed
AM A RR
None of these have only four changes however, so only A B CDEF\nAFBECD \n A A RR
is a valid output for this challenge.
Challenge rules:
- You can assume the input strings won't contain any new-lines or spaces.
- The two input strings can be of different lengths.
- One of the two input strings should remain as is, except for optional leading/trailing spaces.
- If your languages doesn't support anything besides ASCII, you can assume the input will only contain printable ASCII characters.
- The input and output format are flexible. You can have three separate Strings, a String array, a single String with new-lines, 2D character array, etc.
- You are allowed to use something else instead of
ARM
, but state what you've used (i.e.123
, orabc.
, etc.) - If more than one valid output is possible with the same amount of changes (
ARM
), you can choose whether to output one of the possible outputs or all of them. Leading and trailing spaces are optional:
A B CDEF AFBECD A A RR
or
"A B CDEF\nAFBECD\n A A RR" ^ Note there are no spaces here
General rules:
- This is code-golf, so shortest answer in bytes wins.
Don't let code-golf languages discourage you from posting answers with non-codegolfing languages. Try to come up with an as short as possible answer for 'any' programming language. - Standard rules apply for your answer, so you are allowed to use STDIN/STDOUT, functions/method with the proper parameters, full programs. Your call.
- Default Loopholes are forbidden.
- If possible, please add a link with a test for your code.
- Also, please add an explanation if necessary.
Test cases:
In: "ABCDEF" & "AFBECD"
Output (4 changes):
A B CDEF
AFBECD
A A RR
In: "This_is_an_example_text" & "This_is_a_test_as_example"
Possible output (13 changes):
This_is_an _example_text
This_is_a_test_as_example
MAAAAAAA RRRRR
In: "AaAaABBbBBcCcCc" & "abcABCabcABC"
Possible output (10 changes):
AaAaABBbBBcCcCc
abcABCab cABC
R MM MMMR MM R
In: "intf(){longr=java.util.concurrent.ThreadLocalRandom.current().nextLong(10000000000L);returnr>0?r%2:2;}" & "intf(){intr=(int)(Math.random()*10);returnr>0?r%2:2;}"
Possible output (60 changes):
intf(){longr=java.util.concurrent.ThreadLocalRandom.current().nextLong(10000000000L);returnr>0?r%2:2;}
intf(){i ntr=( i n t)(M ath.r andom ()* 10 );returnr>0?r%2:2;}
MR M MRRRRRR RRRR RRRRRR MMMRR MMMMRRR RRRRRRRR MRRRRRRRRR RRRRRRRRRR
In: "ABCDEF" & "XABCDF"
Output (2 changes):
ABCDEF
XABCD F
A R
In: "abC" & "ABC"
Output (2 changes):
abC
ABC
MM
Related – Kevin Cruijssen – 2017-10-05T12:19:44.083
If there are multiple arrangements that are the same distance, is it OK to only output one of them? – AdmBorkBork – 2017-10-05T12:28:47.883
@AdmBorkBork Yes, just one of the possible outputs is indeed the intended output (although outputting all available options is also fine). I'll clarify this in the challenge rules. – Kevin Cruijssen – 2017-10-05T12:30:59.413
@Arnauld I've removed the rule about leading spaces, so leading and trailing spaces are both optional and valid on the unmodified line. (Which means the last test case in your answer is completely valid.) – Kevin Cruijssen – 2017-10-05T14:32:18.967
Can we use lowercase instead of uppercase? – Stan Strum – 2017-10-05T18:17:13.090
Do we need to compare byte, characters, code points, or graphene clusters? (when dealing with a language that supports unicode) – Ferrybig – 2017-10-06T06:34:09.563
@StanStrum "You are allowed to use something else instead of ARM, but state what you've used (i.e. 123, or abc., etc.)" – Kevin Cruijssen – 2017-10-06T06:43:43.170
@Laikoni "The input and output format are flexible. You can have three separate Strings, a String array, a single String with new-lines, 2D character array, etc." – Kevin Cruijssen – 2017-10-06T06:43:56.703
@Ferrybig Hmm.. characters I guess. Aren't these all unique when you use a single encoding? Could you give an example of where a character could be different in character, but still might be equal in code points, graphene clusters, and/or bytes in the same encoding? – Kevin Cruijssen – 2017-10-06T06:46:22.697
@KevinCruijssen
– Ferrybig – 2017-10-06T06:53:03.173स्ते
this string is 1 graphene cluster, consisting of 4 code points and 4 characters, and is 12 bytes when represented on disk using UTF-8, or 10 bytes on disk using UTF-16 . Unicode support is harder than most people realise, and while you probably meant to support graphene clusters for this challenge, most answers probably won't support it. You might be interesting to read the exact difference here: http://utf8everywhere.org/#characters (no affiliation)1@Ferrybig Ah ok, thanks for the explanation. But as for this challenge, just supporting printable ASCII is actually already enough. If you want to support more, be my guest. But as long as it works for the test cases given I'm ok with undefined behavior for graphene clusters consisting of more than 1 character like that. :) – Kevin Cruijssen – 2017-10-06T07:09:43.350