Watson-Crick palindromes

31

5

Problem

Create a function that can determine whether or not an arbitrary DNA string is a Watson-Crick palindrome. The function will take a DNA string and output a true value if the string is a Watson-Crick palindrome and a false value if it is not. (True and False can also be represented as 1 and 0, respectively.)

The DNA string may either be in all uppercase or all lowercase depending on your preference.

Also, the DNA string will not be empty.

Explanation

A DNA string is a Watson-Crick palindrome when the complement of its reverse is equal to itself.

Given a DNA string, first reverse it, and then complement each character according to the DNA bases (A ↔ T and C ↔ G). If the original string is equal to the complemented-reverse string, it is a Watson-Crick palindrome.
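
For reference, the definition above can be written out directly. This is an ungolfed Python sketch rather than a competitive entry; the names COMPLEMENT and is_wc_palindrome are made up for illustration, and uppercase input is assumed.

```python
# Complement table for uppercase DNA bases: A <-> T, C <-> G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_wc_palindrome(dna: str) -> bool:
    """Return True if dna equals the complement of its own reverse."""
    return dna == dna[::-1].translate(COMPLEMENT)


if __name__ == "__main__":
    # Print each test case in the "<input> = <output>" format used below.
    cases = ["ATCGCGAT", "AGT", "GTGACGTCAC", "GCAGTGA",
             "GCGC", "AACTGCGTTTAC", "ACTG"]
    for case in cases:
        print(case, "=", str(is_wc_palindrome(case)).lower())
```

Reversing first and complementing second gives the same result as the other order, since the complement acts on each character independently.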

For more, see this question. It is a different challenge where you must find the longest substring of a DNA string where that substring is a Watson-Crick palindrome.

Goal

This is code-golf and the shortest code wins.

Test Cases

The format is <input> = <output>.

ATCGCGAT = true
AGT = false
GTGACGTCAC = true
GCAGTGA = false
GCGC = true
AACTGCGTTTAC = false
ACTG = false

miles

Posted 2016-04-24T23:22:55.183

Reputation: 15 654

Related. – Martin Ender – 2016-04-25T06:39:00.273

3

Someone should write a program in DNA# that is also a Watson-Crick palindrome. :D (might not be possible)

– mbomb007 – 2016-04-25T16:49:43.930

Or, if you like, "a word is a Watson–Crick palindrome if it has order 2 in the free group on 2 generators" (or on n generators!). – wchargin – 2016-04-26T03:19:02.433

(I guess technically that's "order at most 2.") – wchargin – 2016-04-26T03:19:17.400

Do the true and false values need to be "consistent"? I.e., only one true output and only one false output. (As opposed to e.g. returning the input string for true, and nil for false.) – Mitch Schwartz – 2016-04-26T11:10:40.083

Extending it to truthy and falsey values might lead to answers that return [1, 2, 3] as true and [] as false but would lead to many different possible outputs. It would be better to be consistent, always returning true when true and nil when false, or some other arrangement. – miles – 2016-04-26T11:17:59.537

Is the name "Watson--Crick palindrome" a thing? It should really be "Watson--Crick--Franklin palindrome" :P

– Andras Deak – 2016-04-27T13:20:21.520

Possible Handy Hint: Odd length strings cannot be W-C Palindromes. The middle character will always transpose to something different. – Obsidian Phoenix – 2016-04-27T15:13:03.203

1@AndrasDeak According to Watson's book, Franklin was apparently mostly a thorn in their side. She repeatedly refused to hand over x-rays showing the helix (as I recall), because she refused to believe it. It's worth a read if you are interested in the discovery at any rate. – Obsidian Phoenix – 2016-04-27T15:19:00.507

@ObsidianPhoenix funny, I've even read the book (to be fair, at the age of 12), and didn't remember this aspect, only that without the diffraction data they might not have figured out the structure. I guess I do have to re-read it, thanks:) – Andras Deak – 2016-04-27T15:30:18.600

@AndrasDeak I was the same, at school and not since. My recollection could be skewed, but I seem to recall that was the case. I enjoyed it, should probably read it again some day. – Obsidian Phoenix – 2016-04-27T15:32:04.220

@mbomb007 On it. – Khuldraeseth na'Barya – 2017-12-12T19:10:49.103

Answers

27

05AB1E, 10 7 bytes

Code:

Â'š×Â‡Q

Explanation:

To check whether a string is a Watson-Crick palindrome, we compare the input with its reverse after a and t have been swapped and c and g have been swapped. So that is what we are going to do. We push the input and the reversed input using Â (bifurcate). Now comes the tricky part: 'š× is the compressed form of the word creating. If we reverse it, you can see why it's in the code:

CreATinG
|  ||  |
GniTAerC

This will be used to transliterate the reversed input. Transliteration is done with ‡. After that, we just check whether the input and the transliterated input are eQual and print that value. This is what the stack looks like for input actg:

Â          # ["actg", "gtca"]
 'š×       # ["actg", "gtca", "creating"]
    Â      # ["actg", "gtca", "creating", "gnitaerc"]
     ‡     # ["actg", "cagt"]
      Q    # [0]

Which can also be seen with the debug flag (Try it here).

Uses CP-1252 encoding. Try it online!
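
For readers who don't know 05AB1E, the same transliteration trick can be sketched in Python. This is only an illustration of the idea, not a translation of the 05AB1E built-ins; WORD, TABLE and check are hypothetical names, and lowercase input is assumed, as in the stack trace above.

```python
# "creating" mapped onto its own reverse pairs c<->g and a<->t.
# The other letters (r, e, i, n) are also swapped but never occur in DNA.
WORD = "creating"
TABLE = str.maketrans(WORD, WORD[::-1])


def check(dna: str) -> bool:
    # Reverse the input, transliterate it, and compare with the original.
    return dna == dna[::-1].translate(TABLE)
```

For the input actg this transliterates gtca to cagt, which differs from the input, so the check fails, matching the trace.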

Adnan

Posted 2016-04-24T23:22:55.183

Reputation: 41 965

4Very, er, creative... – Toby Speight – 2016-04-26T07:53:15.627

2This language has some very neat features – miles – 2016-04-26T11:11:28.253

18

Jelly, 9 bytes

O%8µ+U5ḍP

Try it online! or verify all test cases.

How it works

O%8µ+U5ḍP  Main link. Argument: S (string)

O          Compute the code points of all characters.
 %8        Compute the residues of division by 8.
           This maps 'ACGT' to [1, 3, 7, 4].
   µ       Begin a new, monadic link. Argument: A (array of residues)
    +U     Add A and A reversed.
      5ḍ   Test the sums for divisibility by 5.
           Of the sums of all pairs of integers in [1, 3, 7, 4], only 1 + 4 = 5
           and 3 + 7 = 10 are divisible by 5, thus identifying the proper pairings.
        P  Take the product of the resulting Booleans.
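
The arithmetic above carries over to other languages. A minimal Python sketch of the same steps, assuming uppercase ASCII input (the function name is made up for illustration):

```python
def is_wc_palindrome_mod(dna: str) -> bool:
    # Code points modulo 8: 'A' -> 1, 'C' -> 3, 'G' -> 7, 'T' -> 4.
    residues = [ord(ch) % 8 for ch in dna]
    # Add the list to its own reverse, element by element.
    sums = [a + b for a, b in zip(residues, reversed(residues))]
    # Only A+T (1 + 4 = 5) and C+G (3 + 7 = 10) are divisible by 5.
    return all(total % 5 == 0 for total in sums)
```

An odd-length input always fails here, because the middle character is added to itself and none of 2, 6, 14 or 8 is divisible by 5.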

Dennis

Posted 2016-04-24T23:22:55.183

Reputation: 196 637

4I think Python is pretty close to competing with this answer! Compare the first nine bytes of my answer: lambda s:. That's almost the full solution! – orlp – 2016-04-24T23:58:15.370

Wait, the "How it works" part does not really explain how it works... Why residues of 8 and sums of 5?? Where are the letters complemented? – ZeroOne – 2016-04-27T06:35:52.353

@ZeroOne I've clarified that part. – Dennis – 2016-04-27T16:58:06.863

Oh, wow! That's darn clever. :) Thanks! – ZeroOne – 2016-04-27T17:03:21.123

12

Python 2, 56 45 44 bytes

lambda s:s==s[::-1].translate("_T_GA__C"*32)

orlp

Posted 2016-04-24T23:22:55.183

Reputation: 37 067

lambda s:s==s[::-1].translate("TCG_A"*99) works in Python 3 – Alex Varga – 2017-12-12T21:21:10.103

8

Perl, 27 bytes

Includes +2 for -lp

Give input on STDIN, prints 1 or nothing:

dnapalin.pl <<< ATCGCGAT

dnapalin.pl:

#!/usr/bin/perl -lp
$_=y/ATCG/TAGC/r=~reverse

Replace $_= by $_+= to get 0 instead of empty for the false case

Ton Hospel

Posted 2016-04-24T23:22:55.183

Reputation: 14 114

7

Pyth - 10 bytes

qQ_XQ"ACGT

Try it online here.

This would be 9 bytes after the bug fix which makes it non-competing: Try it online here.

Maltysen

Posted 2016-04-24T23:22:55.183

Reputation: 25 023

7

Retina, 34 33 bytes

$
;$_
T`ACGT`Ro`;.+
+`(.);\1
;
^;

Try it online! (Slightly modified to run all test cases at once.)

Explanation

$
;$_

Duplicate the input by matching the end of the string and inserting a ; followed by the entire input.

T`ACGT`Ro`;.+

Match only the second half of the input with ;.+ and perform the substitution of pairs with a transliteration. As for the target set Ro: o references the other set, that is o is replaced with ACGT. But R reverses this set, so the two sets are actually:

ACGT
TGCA

If the input is a DNA palindrome, we will now have the input followed by its reverse (separated by ;).

+`(.);\1
;

Repeatedly (+) remove a pair of identical characters around the ;. This will either continue until only the ; is left or until the two characters around the ; are no longer identical, which would mean that the strings aren't the reverse of each other.

^;

Check whether the first character is ; and print 0 or 1 accordingly.
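
The four stages can be imitated in Python to see what the intermediate strings look like. This is a rough simulation under the assumption of uppercase input, not how Retina itself is implemented; retina_style_check is a hypothetical name.

```python
import re


def retina_style_check(dna: str) -> bool:
    # Stage 1: append ';' followed by a copy of the input.
    work = dna + ";" + dna
    # Stage 2: complement only the part after the ';'.
    head, tail = work.split(";")
    work = head + ";" + tail.translate(str.maketrans("ACGT", "TGCA"))
    # Stage 3: repeatedly delete a pair of identical characters around ';'.
    while True:
        reduced = re.sub(r"(.);\1", ";", work)
        if reduced == work:
            break
        work = reduced
    # Stage 4: the input passes if nothing is left in front of the ';'.
    return work.startswith(";")
```

For GCGC the working string goes GCGC;CGCG, GCG;GCG, GC;CG, G;G and finally ; on its own.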

Martin Ender

Posted 2016-04-24T23:22:55.183

Reputation: 184 808

6

JavaScript (ES6), 59 bytes

f=s=>!s||/^(A.*T|C.*G|G.*C|T.*A)$/.test(s)&f(s.slice(1,-1))

Best I could do without using Regexp was 62 bytes:

f=s=>!s||parseInt(s[0]+s.slice(-1),33)%32%7<1&f(s.slice(1,-1))

Neil

Posted 2016-04-24T23:22:55.183

Reputation: 95 035

5

Ruby, 35

I tried other ways, but the obvious way was the shortest:

->s{s.tr('ACGT','TGCA').reverse==s}

in test program

f=->s{s.tr('ACGT','TGCA').reverse==s}

puts f['ATCGCGAT']
puts f['AGT']
puts f['GTGACGTCAC']
puts f['GCAGTGA']
puts f['GCGC']
puts f['AACTGCGTTTAC'] 

Level River St

Posted 2016-04-24T23:22:55.183

Reputation: 22 049

2->s{s.==s.reverse.tr'ACGT','TGCA'} is a byte shorter – Mitch Schwartz – 2016-04-25T04:41:11.147

@MitchSchwartz wow, that works, but I have no idea what that first . is for. The code looks more right to me without it, but it is required to make it run. Is it documented anywhere? – Level River St – 2016-04-25T20:59:29.487

Are you sure you don't want to figure it out on your own? – Mitch Schwartz – 2016-04-26T10:59:09.357

@MitchSchwartz hahaha I already tried. I find ruby's requirements for whitespace very idiosyncratic. Strange requirements for periods is a whole other issue. I have several theories but all of them may be wrong. I suspect it may have something to do with treating == as a method rather than an operator, but searching by symbols is impossible. – Level River St – 2016-04-26T11:13:19.000

You suspected correctly. :) It's just a plain old method call. – Mitch Schwartz – 2016-04-26T11:18:11.483

@MitchSchwartz thanks, but that doesn't really explain why it can parse the code with the period but not without. If I get random syntax errors in future, I will try putting a period in front of operators and see what happens :D – Level River St – 2016-04-26T11:31:49.823

Parentheses after a method call (here we are looking at tr) can only be omitted under certain circumstances. Maybe it would help you to consider why 1.==1 && 2==2 is valid syntax (and evaluates to false) while 1==1 && 2.==2 is a syntax error. I don't claim to have deep knowledge of Ruby parsing, but I also don't think that it is very surprising or strange in this case. Informally, I'd say you can omit the parentheses when (1) the method is being called without arguments, or (2) the method is not "in the middle" of an expression that could have things to the right of it. – Mitch Schwartz – 2016-04-26T13:36:07.837

Or maybe this plays better to the intuition: Would you expect 1 + 5.div 3 to be valid syntax? It is comparable to s == s.reverse.tr 'ACGT','TGCA' (notice where I added spaces). – Mitch Schwartz – 2016-04-26T13:57:36.710

5

Haskell, 48 45 bytes

(==)=<<reverse.map((cycle"TCG_A"!!).fromEnum)

Usage example: (==)=<<reverse.map((cycle"TCG_A"!!).fromEnum) $ "ATCGCGAT" -> True.

A non-pointfree version is

f x = reverse (map h x) == x           -- map h to x, reverse and compare to x
h c = cycle "TCG_A" !! fromEnum c      -- take the ascii-value of c and take the
                                       -- char at this position of string
                                       -- "TCG_ATCG_ATCG_ATCG_A..."
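
Indexing into the cycled string is the same as indexing a five-character table with the character code modulo 5. A rough Python equivalent, assuming uppercase ASCII input (TABLE, complement and is_wc_palindrome_table are names invented for this sketch):

```python
# Position i holds the complement of the base whose code is i modulo 5:
# 'A' = 65 -> 0, 'G' = 71 -> 1, 'C' = 67 -> 2, 'T' = 84 -> 4. Slot 3 is unused.
TABLE = "TCG_A"


def complement(base: str) -> str:
    return TABLE[ord(base) % len(TABLE)]


def is_wc_palindrome_table(dna: str) -> bool:
    # Complement every base, reverse, and compare with the input.
    return [complement(b) for b in reversed(dna)] == list(dna)
```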

Edit: @Mathias Dolidon saved 3 bytes. Thanks!

nimi

Posted 2016-04-24T23:22:55.183

Reputation: 34 639

Works with cycle "TCG_A" too. :) – Mathias Dolidon – 2016-04-26T13:50:05.430

4

Retina, 52 bytes

^G(.*)C$
$1
^A(.*)T$
$1
^T(.*)A$
$1
}`^C(.*)G$
$1
^$

CalculatorFeline

Posted 2016-04-24T23:22:55.183

Reputation: 2 608

4

Julia, 47 38 bytes

s->((x=map(Int,s)%8)+reverse(x))%5⊆0

This is an anonymous function that accepts a Char array and returns a boolean. To call it, assign it to a variable.

This uses Dennis' algorithm, which is shorter than the naïve solution. We get the remainder of each code point divided by 8, add that to itself reversed, get the remainders from division by 5, and check whether all are 0. The last step is accomplished using ⊆, the infix version of issubset, which casts both arguments to Set before checking. This means that [0,0,0] is declared a subset of 0, since Set([0,0,0]) == Set(0). This is shorter than an explicit check against 0.

Try it online!

Saved 9 bytes thanks to Dennis!

Alex A.

Posted 2016-04-24T23:22:55.183

Reputation: 23 761

4

Jolf, 15 Bytes

Try it!

=~A_iγ"AGCT"_γi

Explanation:

   _i            Reverse the input
 ~A_iγ"AGCT"_γ   DNA swap the reversed input
=~A_iγ"AGCT"_γi  Check if the new string is the same as the original input

swells

Posted 2016-04-24T23:22:55.183

Reputation: 221

3

Jolf, 16 bytes

Try it here!

pe+i~Aiγ"GATC"_γ

Explanation

pe+i~Aiγ"GATC"_γ
    ~Aiγ"GATC"_γ  perform DNA transformation
  +i              i + (^)
pe                is a palindrome
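
The reason this works: the input followed by its complement reads the same backwards exactly when the complement equals the reversed input. A small Python sketch of that idea, assuming uppercase input (the function name is hypothetical):

```python
def is_wc_palindrome_concat(dna: str) -> bool:
    # Append the complemented input to the input itself.
    joined = dna + dna.translate(str.maketrans("GATC", "CTAG"))
    # An ordinary palindrome test on the concatenation.
    return joined == joined[::-1]
```

For GCGC the concatenation is GCGCCGCG, which is an ordinary palindrome.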

Conor O'Brien

Posted 2016-04-24T23:22:55.183

Reputation: 36 228

3

Factor, 72 bytes

Unfortunately regex can't help me here.

[ dup reverse [ { { 67 71 } { 65 84 } { 71 67 } { 84 65 } } at ] map = ]

Reverse, lookup table, compare equal.

cat

Posted 2016-04-24T23:22:55.183

Reputation: 4 989

Wow, that's a lot of whitespace!!! Is it all necessary? Also, a link to the language homepage would be useful. – Level River St – 2016-04-25T21:57:30.900

@LevelRiverSt Unfortunately, every bit of it is necessary. I'll add a link to the header. – cat – 2016-04-25T22:05:22.703

3

Actually, 19 bytes

O`8@%`M;RZ`5@Σ%Y`Mπ

This uses Dennis's algorithm.

Try it online!

Explanation:

O`8@%`M;RZ`5@Σ%Y`Mπ
O                    push an array containing the Unicode code points of the input
 `8@%`M              modulo each code point by 8
       ;RZ           zip with reverse
          `5@Σ%Y`M   test sum for divisibility by 5
                  π  product

Mego

Posted 2016-04-24T23:22:55.183

Reputation: 32 998

3

C,71

r,e;f(char*s){for(r=0,e=strlen(s)+1;*s;s++)r|=*s*s[e-=2]%5^2;return!r;}

2 bytes saved by Dennis. Additional 2 bytes saved by adapting for lowercase input: constants 37 and 21 are revised to 5 and 2.

C,75

i,j;f(char*s){for(i=j=0;s[i];i++)j|=s[i]*s[strlen(s)-i-1]%37!=21;return!j;}

Saved one byte: eliminated the parentheses by taking the product of the two ASCII codes mod 37. The valid pairs evaluate to 21. Assumes uppercase input.

C,76

i,j;f(char*s){for(i=j=0;s[i];i++)j|=(s[i]+s[strlen(s)-i-1])%11!=6;return!j;}

Uses the fact that the ASCII codes of the valid pairs sum to 138 or 149. These are the only pairs whose sums leave a remainder of 6 when taken mod 11. Assumes uppercase input.
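
The magic constants in the three versions above can be checked by brute force over all 16 ordered pairs of bases. The following Python sketch does that; VALID and passing_pairs are names made up for this check and are not part of the C answer.

```python
from itertools import product

# The four ordered pairs that count as a match.
VALID = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def passing_pairs(alphabet, test):
    """Return every ordered pair of letters whose ASCII codes satisfy test."""
    return {(x, y) for x, y in product(alphabet, repeat=2)
            if test(ord(x), ord(y))}


# 76-byte version: sum of the codes mod 11 equals 6 (uppercase).
sum_mod_11 = passing_pairs("ACGT", lambda a, b: (a + b) % 11 == 6)
# 75-byte version: product of the codes mod 37 equals 21 (uppercase).
prod_mod_37 = passing_pairs("ACGT", lambda a, b: a * b % 37 == 21)
# 71-byte version: product of the codes mod 5 equals 2 (lowercase).
prod_mod_5 = passing_pairs("acgt", lambda a, b: a * b % 5 == 2)

print(sum_mod_11 == VALID)
print(prod_mod_37 == VALID)
print(prod_mod_5 == {(x.lower(), y.lower()) for x, y in VALID})
```

Each comparison should come out true if the constants accept exactly the four complementary pairs and nothing else.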

ungolfed in test program

i,j;

f(char *s){
   for(i=j=0;s[i];i++)                  //initialize i and j to 0; iterate i through the string
     j|=(s[i]+s[strlen(s)-i-1])%11!=6;  //add characters at i from each end of string, take result mod 11. If not 6, set j to 1
return!j;}                              //return not j (true if mismatch NOT detected.)

main(){
  printf("%d\n", f("ATCGCGAT"));
  printf("%d\n", f("AGT"));
  printf("%d\n", f("GTGACGTCAC"));
  printf("%d\n", f("GCAGTGA"));
  printf("%d\n", f("GCGC"));
  printf("%d\n", f("AACTGCGTTTAC"));
} 

Level River St

Posted 2016-04-24T23:22:55.183

Reputation: 22 049

1r,e;f(char*s){for(r=0,e=strlen(s)+1;*s;s++)r|=*s*s[e-=2]%37^21;return!r;} saves a couple of bytes. – Dennis – 2016-04-25T15:59:29.477

@Dennis thanks, I really wasn't in the mood for modifying pointers, but it squeezed a byte out! I should have seen != > ^ myself. I reduced another 2 by changing to lowercase input: both magic numbers are now single digit. – Level River St – 2016-04-25T21:54:32.853

3

Oracle SQL 11.2, 68 bytes

SELECT DECODE(TRANSLATE(REVERSE(:1),'ATCG','TAGC'),:1,1,0)FROM DUAL; 

Jeto

Posted 2016-04-24T23:22:55.183

Reputation: 1 601

2With SQL like that, I'm confident you must have written reports for some of my projects before... – corsiKa – 2016-04-25T22:20:47.817

3

J - 21 bytes

0=[:+/5|[:(+|.)8|3&u:

Based on Dennis' method

Usage

   f =: 0=[:+/5|[:(+|.)8|3&u:
   f 'ATCGCGAT'
1
   f 'AGT'
0
   f 'GTGACGTCAC'
1
   f 'GCAGTGA'
0
   f 'GCGC'
1
   f 'AACTGCGTTTAC'
0
   f 'ACTG'
0

Explanation

0=[:+/5|[:(+|.)8|3&u:
                 3&u:    - Convert from char to int
               8|        - Residues from division by 8 for each
            |.           - Reverse the list
           +             - Add from the list and its reverse element-wise
        [:               - Cap, compose function
      5|                 - Residues from division by 5 for each
    +/                   - Fold right using addition to create a sum
  [:                     - Cap, compose function
0=                       - Test the sum for equality to zero

miles

Posted 2016-04-24T23:22:55.183

Reputation: 15 654

3

Bash + coreutils, 43 32 bytes

[ `tr ATCG TAGC<<<$1|rev` = $1 ]

Tests:

for i in ATCGCGAT AGT GTGACGTCAC GCAGTGA GCGC AACTGCGTTTAC; do ./78410.sh $i && echo $i = true || echo $i = false; done
ATCGCGAT = true
AGT = false
GTGACGTCAC = true
GCAGTGA = false
GCGC = true
AACTGCGTTTAC = false

Toby Speight

Posted 2016-04-24T23:22:55.183

Reputation: 5 058

3

Julia 0.4, 22 bytes

s->s$reverse(s)⊆""

The string contains the control characters EOT (4) and NAK (21). Input must be in form of a character array.

This approach XORs the characters of the input with the corresponding characters in the reversed input. For valid pairings, this results in the characters EOT or NAK. Testing for inclusion in the string of those characters produces the desired Boolean.
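
The XOR idea is easy to reproduce elsewhere. A minimal Python sketch, assuming uppercase ASCII input (the function name is invented for this example):

```python
def is_wc_palindrome_xor(dna: str) -> bool:
    # 'A' ^ 'T' == 21 (NAK) and 'C' ^ 'G' == 4 (EOT).
    # Every other pairing of bases XORs to a different value.
    return all(ord(a) ^ ord(b) in (4, 21)
               for a, b in zip(dna, reversed(dna)))
```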

Try it online!

Dennis

Posted 2016-04-24T23:22:55.183

Reputation: 196 637

3

Labyrinth, 42 bytes

_8
,%
;
"}{{+_5
"=    %_!
 = """{
 ;"{" )!

Terminates with a division-by-zero error (error message on STDERR).

Try it online!

The layout feels really inefficient but I'm just not seeing a way to golf it right now.

Explanation

This solution is based on Dennis's arithmetic trick: take all character codes modulo 8, add a pair from both ends and make sure it's divisible by 5.

Labyrinth primer:

  • Labyrinth has two stacks of arbitrary-precision integers, main and aux(iliary), which are initially filled with an (implicit) infinite amount of zeros.
  • The source code resembles a maze, where the instruction pointer (IP) follows corridors when it can (even around corners). The code starts at the first valid character in reading order, i.e. in the top left corner in this case. When the IP comes to any form of junction (i.e. several adjacent cells in addition to the one it came from), it will pick a direction based on the top of the main stack. The basic rules are: turn left when negative, keep going ahead when zero, turn right when positive. And when one of these is not possible because there's a wall, then the IP will take the opposite direction. The IP also turns around when hitting dead ends.
  • Digits are processed by multiplying the top of the main stack by 10 and then adding the digit.

The code starts with a small 2x2, clockwise loop, which reads all input modulo 8:

_   Push a 0.
8   Turn into 8.
%   Modulo. The last three commands do nothing on the first iteration
    and will take the last character code modulo 8 on further iterations.
,   Read a character from STDIN or -1 at EOF. At EOF we will leave loop.

Now ; discards the -1. We enter another clockwise loop which moves the top of the main stack (i.e. the last character) down to the bottom:

"   No-op, does nothing.
}   Move top of the stack over to aux. If it was at the bottom of the stack
    this will expose a zero underneath and we leave the loop.
=   Swap top of main with top of aux. The effect of the last two commands
    together is to move the second-to-top stack element from main to aux.
"   No-op.

Now there's a short linear bit:

{{  Pull two characters from aux to main, i.e. the first and last (remaining)
    characters of the input (mod 8).
+   Add them.
_5  Push 5.
%   Modulo.

The IP is now at a junction which acts as a branch to test divisibility by 5. If the result of the modulo is non-zero, we know that the input is not a Watson-Crick palindrome and we turn east:

_   Push 0.
!   Print it. The IP hits a dead end and turns around.
_   Push 0.
%   Try to take modulo, but division by zero fails and the program terminates.

Otherwise, we need to keep checking the rest of the input, so the IP keeps going south. The { pulls over the bottom of the remaining input. If we've exhausted the input, then this will be a 0 (from the bottom of aux), and the IP continues moving south:

)   Increment 0 to 1.
!   Print it. The IP hits a dead end and turns around.
)   Increment 0 to 1.
{   Pull a zero over from aux, IP keeps moving north.
%   Try to take modulo, but division by zero fails and the program terminates.

Otherwise, there are more characters in the string to be checked. The IP turns west and moves into the next (clockwise) 2x2 loop which consists largely of no-ops:

"   No-op.
"   No-op.
{   Pull one value over from aux. If it's the bottom of aux, this will be
    zero and the IP will leave the loop eastward.
"   No-op.

After this loop, we've got the input on the main stack again, except for its first and last character and with a zero on top. The ; discards the 0 and then = swaps the tops of the stacks, but this is just to cancel the first = in the loop, because we're now entering the loop in a different location. Rinse and repeat.

Martin Ender

Posted 2016-04-24T23:22:55.183

Reputation: 184 808

3

C#, 65 bytes

bool F(string s)=>s.SequenceEqual(s.Reverse().Select(x=>"GACT"[("GACT".IndexOf(x)+2)%4]));

.NET has some fairly long framework method names at times, which doesn't necessarily make for the best code golf framework. In this case, framework method names make up 33 characters out of 90. :)

Based on the modulus trick from elsewhere in the thread:

bool F(string s)=>s.Zip(s.Reverse(),(a,b)=>a%8+b%8).All(x=>x%5==0);

Now weighs in at 67 characters whereof 13 are method names.

Another minor optimization to shave off a whopping 2 characters:

bool F(string s)=>s.Zip(s.Reverse(),(a,b)=>(a%8+b%8)%5).Sum()<1;

So, 65 of which 13 are framework names.

Edit: Omitting some of the limited "boilerplate" from the solution and adding a couple of conditions leaves us with the expression

s.Zip(s.Reverse(),(a,b)=>(a%8+b%8)%5).Sum()

Which gives 0 if and only if the string s is a valid answer. As cat points out, "bool F(string s)=>" is actually replaceable with "s=>" if it's otherwise clear in the code that the expression is a Func<string,bool>, i.e. maps a string to a boolean.

robhol

Posted 2016-04-24T23:22:55.183

Reputation: 31

1Welcome to PPCG, nice first answer! :D – cat – 2016-04-27T12:26:42.740

@cat Thanks for that! :) – robhol – 2016-04-27T20:37:34.407

1I don't really know C#, but if this is a lambda, then you can leave out its type and assigning it, as anonymous functions are fine as long as they are assignable. – cat – 2016-04-27T21:29:10.523

1Also, can't you do !s.Zip... instead of s.Zip...==0? (Or can't you ! ints in C#?) Even if you can't boolean-negate it, you can leave out any sort of inversion and state in your answer that this returns <this thing> for falsy and <this other deterministic, clearly discernable thing> for truthy. – cat – 2016-04-27T21:35:24.637

1@cat: You're right about dropping the type. I thought the code had to be directly executable, but making simple assumptions about input and output makes it a bit easier.

The other thing won't work, however - rightly so, in my opinion, since a boolean operation has no logical (hue hue) way to apply to a number. Assigning 0 and 1 the values of false and true is, after all, just convention. – robhol – 2016-04-30T00:24:30.610

3

sed, 67 61 bytes

G;H;:1;s/\(.\)\(.*\n\)/\2\1/;t1;y/ACGT/TGCA/;G;s/^\(.*\)\1$/1/;t;c0

(67 bytes)

Test

for line in ATCGCGAT AGT GTGACGTCAC GCAGTGA GCGC AACTGCGTTTAC ACTG
do echo -n "$line "
    sed 'G;H;:1;s/\(.\)\(.*\n\)/\2\1/;t1;y/ACGT/TGCA/;G;s/^\(.*\)\1$/1/;t;c0' <<<"$line"
done

Output

ATCGCGAT 1
AGT 0
GTGACGTCAC 1
GCAGTGA 0
GCGC 1
AACTGCGTTTAC 0
ACTG 0

By using extended regular expressions, the byte count can be reduced to 61.

sed -r 'G;H;:1;s/(.)(.*\n)/\2\1/;t1;y/ACGT/TGCA/;G;s/^(.*)\1$/1/;t;c0'

PM 2Ring

Posted 2016-04-24T23:22:55.183

Reputation: 469

If you can do it in 61 bytes, then that's your score -- there's nothing against NFA or turing-complete regexp on this particular challenge. Some challenges disallow regex in full, but usually only [tag:regex-golf] will disallow non regular-expressions. – cat – 2016-04-27T20:58:26.003

2

REXX 37

s='ATCGCGAT';say s=translate(reverse(s),'ATCG','TAGC')

aja

Posted 2016-04-24T23:22:55.183

Reputation: 141

2

R, 101 bytes

g=function(x){y=unlist(strsplit(x,""));all(sapply(rev(y),switch,"C"="G","G"="C","A"="T","T"="A")==y)}

Test Cases

g("ATCGCGAT")
[1] TRUE
g("AGT")
[1] FALSE
g("GTGACGTCAC")
[1] TRUE
g("GCAGTGA")
[1] FALSE
g("GCGC")
[1] TRUE
g("AACTGCGTTTAC")
[1] FALSE
g("ACTG")
[1] FALSE

syntonicC

Posted 2016-04-24T23:22:55.183

Reputation: 329

strsplit(x,"")[[1]] is 3 bytes shorter than unlist(strsplit(x,"")) and, here, is equivalent since x is always a single string of character. – plannapus – 2016-04-26T07:30:24.077

2

Octave, 52 bytes

f=@(s) prod(mod((i=mod(toascii(s),8))+flip(i),5)==0)

Following Dennis's trick ... take the ASCII values mod 8, flip and add together; if every sum is a multiple of five, you're golden.

dcsohl

Posted 2016-04-24T23:22:55.183

Reputation: 641

That one whitespace is significant? That's... odd. – cat – 2016-04-27T21:36:55.503

Also, you can leave out the f= assignment; unnamed functions are okay. – cat – 2016-04-27T21:37:23.350

1

J, 19 bytes

|.-:-&.('+AGCT'i.])

Try it online!

FrownyFrog

Posted 2016-04-24T23:22:55.183

Reputation: 3 112

1

Wolfram Language (Mathematica), 45 bytes

17-#==Reverse@#&@Mod[ToCharacterCode@#,11,2]&

Try it online!

Converts A to 10, T to 7, C to 12, and G to 5, by taking the ASCII codes mod 11 with offset 2. Then checks if the resulting list and its reverse add to 17 in each coordinate.
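
The offset modulus may be unfamiliar: Mod[n, 11, 2] returns the value congruent to n that lies between 2 and 12. A Python sketch of the same check, assuming uppercase ASCII input, where offset_mod is a hypothetical helper standing in for the three-argument Mod:

```python
def offset_mod(n: int, m: int, d: int) -> int:
    # Value congruent to n modulo m that lies in the range d .. d + m - 1.
    return (n - d) % m + d


def is_wc_palindrome_17(dna: str) -> bool:
    # 'A' -> 10, 'T' -> 7, 'C' -> 12, 'G' -> 5.
    codes = [offset_mod(ord(ch), 11, 2) for ch in dna]
    # Complementary bases are the only pairs that add up to 17.
    return all(a + b == 17 for a, b in zip(codes, reversed(codes)))
```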

Misha Lavrov

Posted 2016-04-24T23:22:55.183

Reputation: 4 846

1

Clojure/ClojureScript, 49 chars

#(=(list* %)(map(zipmap"ATCG""TAGC")(reverse %)))

Works on strings. If the requirements are loosened to allow lists, I can take off the (list* ) and save 7 chars.

MattPutnam

Posted 2016-04-24T23:22:55.183

Reputation: 521

1

R, 70 bytes

f=function(x)all(chartr("GCTA","CGAT",y<-strsplit(x,"")[[1]])==rev(y))

Usage:

> f=function(x)all(chartr("GCTA","CGAT",y<-strsplit(x,"")[[1]])==rev(y))
> f("GTGACGTCAC")
[1] TRUE
> f("AACTGCGTTTAC")
[1] FALSE
> f("AGT")
[1] FALSE
> f("ATCGCGAT")
[1] TRUE

plannapus

Posted 2016-04-24T23:22:55.183

Reputation: 8 610

1

C, 71 bytes

Requires ASCII codes for the relevant characters, but accepts uppercase, lowercase or mixed-case input.

f(char*s){char*p=s+strlen(s),b=0;for(;*s;b&=6)b|=*--p^*s++^4;return!b;}

This code maintains two pointers, s and p, traversing the string in opposite directions. At each step, we compare the corresponding characters, setting b true if they don't match. The matching is based on XOR of the character values:

'A' ^ 'T' = 10101
'C' ^ 'G' = 00100

'C' ^ 'T' = 10111
'G' ^ 'A' = 00110
'A' ^ 'C' = 00010
'T' ^ 'G' = 10011
 x  ^  x  = 00000

We can see in the above table that we want to record success for xx10x and failure for anything else, so we XOR with 00100 (four) and mask with 00110 (six) to get zero for AT or CG and non-zero otherwise. Finally, we return true if all the pairs accumulated a zero result in b, false otherwise.
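
The XOR-and-mask test can be tried out in Python as well. This sketch mirrors the logic described above rather than the pointer arithmetic of the C code; the function name is made up for illustration.

```python
def is_wc_palindrome_mask(dna: str) -> bool:
    bad = 0
    for a, b in zip(dna, reversed(dna)):
        # XOR the pair, flip bit 2, keep bits 1 and 2.
        # The result is zero only for an A/T or C/G pairing.
        bad |= (ord(a) ^ ord(b) ^ 4) & 6
    return bad == 0
```

Because the mask discards bit 5, which is the bit that distinguishes uppercase from lowercase ASCII letters, mixed-case input such as AtCgcGaT is handled the same way as ATCGCGAT.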

Test program:

#include <stdio.h>
int main(int argc, char **argv)
{
    while (*++argv)
        printf("%s = %s\n", *argv, f(*argv)?"true":"false");
}

Toby Speight

Posted 2016-04-24T23:22:55.183

Reputation: 5 058

1

𝔼𝕊𝕄𝕚𝕟, 13 chars / 17 bytes

⟮ïĪ`ACGT”⟯ᴙ≔Ⅰ

Try it here (Firefox only).

Explanation

Transliterate input from ACGT to TGCA and check if the resulting string is a palindrome.

Mama Fun Roll

Posted 2016-04-24T23:22:55.183

Reputation: 7 234