Reduce string to a snippet of the alphabet

25

2

Given a non-empty string consisting of only lower and upper case alphabetical characters and spaces ([a-zA-Z ]), reduce it to a snippet of the alphabet, starting with the first character.

To reduce a string, begin with the first alphabetical character, then remove every character after it that is not the next letter of the alphabet. Continue doing this until you reach the end of the string.

For example codegolf:

Begin with c, remove o as it is not the next letter of the alphabet.
Keep d as it is the next letter of the alphabet, and keep e as it is the next letter too.
Remove g, o, and l, and keep f.

Your final snippet would then be cdef

Rules

  • Capitalisation should be maintained, so CodEgolF would result in CdEF
  • Space is not a letter of the alphabet, and thus should always be removed, even if it is the start of the string
  • Due to the nature of the reduction, the first alphabetical character of the input will always be the first character of the output.
  • zZ is the last letter of the alphabet. There are no letters after it, the alphabet does not loop.

Test Cases

codegolf -> cdef
CodEgolf -> CdEf
 codeolfg -> cdefg
ProgrammingPuzzles -> P
Stack Exchange -> St
The quick red fox jumped over the lazy brown dog -> Tuvw
Zebra -> Z
Abcdegfhijkl -> Abcdef

Scoring

This is , so fewest bytes in each language wins!

Skidsdev

Posted 2017-08-17T10:00:49.740

Reputation: 9 656

From the second last test case, i see that if we reach z We just stop, right? – Mr. Xcoder – 2017-08-17T10:05:10.327

@Mr.Xcoder Correct, see the last point under "Rules" – Skidsdev – 2017-08-17T10:08:46.450

2Please add a test case with a space at the beginning. Like: <space>codegolf – Mr. Xcoder – 2017-08-17T10:20:19.580

Can I return an array of the output letters? – TheLethalCoder – 2017-08-17T10:32:21.963

Can we return the letters separated by a newline or as an array? – Mr. Xcoder – 2017-08-17T10:33:30.150

1@Mr.Xcoder yes you can – Skidsdev – 2017-08-17T10:43:34.303

Answers

12

JavaScript (ES6), 66 79 68 67 bytes

f=([c,...s],p)=>c?(p?~parseInt(c+p,36)%37:c<'!')?f(s,p):c+f(s,c):''

How?

Testing consecutive letters

Because converting two characters to their ASCII codes would be a rather lengthy operation in JS, we use the following formula instead:

~parseInt(b + a, 36) % 37

Provided that both a and b are in [a-zA-Z ], the above expression equals 0 if and only if a and b are consecutive letters (i.e. consecutive digits in base 36), no matter the case of the characters.

For instance:

~parseInt("Y" + "x", 36) = ~(36 * parseInt("Y", 36) + parseInt("x", 36))
                         = ~(36 * 34 + 33)
                         = -(36 * 34 + 33 + 1)
                         = -(37 * 34)

Formatted and commented

f = ([c,                              // c = current character
         ...s],                       // s = array of remaining characters
                p) =>                 // p = previous matching letter
  c ? (                               // if there's still at least 1 character to process:
      p ?                             //   if p was already defined:
        ~parseInt(c + p, 36) % 37     //     test if p and c are NON-consecutive letters
      :                               //   else:
        c < '!'                       //     test if c is a space character
    ) ?                               //   if the above test passes:
      f(s, p)                         //     ignore c and keep the current value of p
    :                                 //   else:
      c + f(s, c)                     //     append c to the final result and update p to c
  :                                   // else:
    ''                                //   stop recursion

Test cases

f=([c,...s],p)=>c?(p?~parseInt(c+p,36)%37:c<'!')?f(s,p):c+f(s,c):''

console.log(f("codegolf")) // -> cdef
console.log(f("CodEgolf")) // -> CdEf
console.log(f(" codeolfg")) // -> cdefg
console.log(f("ProgrammingPuzzles")) // -> P
console.log(f("Stack Exchange")) // -> St
console.log(f("The quick red fox jumped over the lazy brown dog")) // -> Tuvw
console.log(f("Zebra")) // -> Z
console.log(f("Abcdegfhijkl")) // -> Abcdef

Arnauld

Posted 2017-08-17T10:00:49.740

Reputation: 111 334

7

Python 3, 75 85 84 91 81 77 75 bytes

I think this is as short as it can get in Python 3. It can be shortened by a few bytes in Python 2, as shown in Sisyphus' submission.

  • EDIT: +10 for fixing a bug
  • EDIT: -1 by fixing another bug
  • EDIT: +7 for fixing another bug
  • EDIT: -10 bytes with help from @Ruud
  • EDIT: -4 bytes since the OP allowed us to output the letters separated by a newline
  • EDIT: -2 bytes thanks to @Ruud, back to the original byte count!
s=input().strip();k=0
for i in s:
 if(ord(i)-ord(s[0]))%32==k:k+=1;print(i)

Try it online!

Mr. Xcoder

Posted 2017-08-17T10:00:49.740

Reputation: 39 774

I have ideas for improvement, golfing on mobile soon. – Mr. Xcoder – 2017-08-17T10:45:43.077

281 bytes. Uppercase and lowercase letters conveniently match when modulated by 32 – Arfie – 2017-08-17T10:48:11.503

@Ruud Those are exactly the things I was talking about in my comment, editing. – Mr. Xcoder – 2017-08-17T10:49:05.083

79 bytes – Arfie – 2017-08-17T10:53:50.883

Also, you can probably switch to Python 2 now and save a byte on the print – Arfie – 2017-08-17T10:55:40.530

@Ruud Thanks for all your help. I will keep it in Python 3 though – Mr. Xcoder – 2017-08-17T10:57:44.700

8I am waiting for the downvoter to explain their reasons. – Mr. Xcoder – 2017-08-17T12:58:07.173

7

Python 2, 69 bytes

lambda s:reduce(lambda x,y:x+y*((ord(y)-ord(x[~0]))%32==1),s.strip())

Try it online!

A simple reduction of the string. We simply concatenate the next character if and only if (ord(y)-ord(x[~0]))%32==1. Very ugly check - I am sure it can be improved, but I'm not sure how!

Sisyphus

Posted 2017-08-17T10:00:49.740

Reputation: 1 521

Clever solution! Too bad it's Python 2-only :P – Mr. Xcoder – 2017-08-17T12:34:13.150

You can make it compatible with Python 3 with from functools import*. – totallyhuman – 2017-08-17T12:52:20.427

1

@ThomasWard totallyhuman was just telling others how to make it Python 3 compatible. Btw, import functools as f and f. is much longer than from functools import* for sure, even used once. See this thread for more information.

– Mr. Xcoder – 2017-08-17T15:20:49.850

6

05AB1E, 13 bytes

áćsv¤yìuÇÆiy«

Try it online!

-1 thanks to Adnan

Erik the Outgolfer

Posted 2017-08-17T10:00:49.740

Reputation: 38 134

Can you replace ðK by á? – Adnan – 2017-08-17T11:29:47.913

@Adnan I think so... – Erik the Outgolfer – 2017-08-17T11:52:02.193

4

Brachylog, 15 bytes

;ṢxS⊇.ḷ~sẠ∧Sh~h

Try it online!

This would be 10 bytes: ⊇.ḷ~sẠ&h~h, if it weren't for the fairly uninteresting "strings can start with spaces" constraint.

Explanation

;ṢxS               S is the Input with all spaces removed
   S⊇.             The Output is an ordered subset of the Input
     .ḷ            The Output lowercased…
        ~sẠ          …is a substring of "abcdefghijklmnopqrstuvwxyz"
           ∧
            Sh     The first char of S…
              ~h   …is the first char of the Output

Since this is fairly declarative, this is also really slow.

Fatalize

Posted 2017-08-17T10:00:49.740

Reputation: 32 976

Well, at least it beats Jelly! And, on the plus side, I don't think you can really outgolf this... – Erik the Outgolfer – 2017-08-17T16:21:35.983

3

MATL, 18 16 15 bytes

Thanks to Mr.Xcoder for pointing out a mistake, now corrected

Xz1&)"t@hkd1=?@

Letters in the output are separated by newlines.

Try it online! Or verify all test cases (the footer code displays all output letters on the same line for clarity).

Explanation

Xz       % Implicitly input a string. Remove spaces
1&)      % Push first character and then the remaining substring
"        % For each
  t      %   Duplicate previous character
  @      %   Push current character
  h      %   Concatenate both characters
  k      %   Convert to lowercase
  d      %   Consecutive difference. Gives a number
  1=     %   Is it 1?
  ?      %   If so
    @    %     Push current char
         %   End (implicit)
         % End (implicit)
         % Display stack (implicit)

Luis Mendo

Posted 2017-08-17T10:00:49.740

Reputation: 87 464

You forgot to remove the spaces when they are at the beginning of the string: Space is not a letter of the alphabet, and thus should always be removed, even if it is the start of the string. – Mr. Xcoder – 2017-08-17T10:19:06.117

@Mr.Xcoder Thanks! Corrected – Luis Mendo – 2017-08-17T10:21:33.370

3

Java (OpenJDK 8), 102 101 74 bytes

s->{char c=0;for(char x:s)if(c<1&x>32|~-x%32==c%32)System.out.print(c=x);}

Try it online!

-27 bytes thanks to @Olivier Grégoire

Nevay

Posted 2017-08-17T10:00:49.740

Reputation: 421

175 bytes: s->{char c=0;for(char x:s)if(c<33&x>33|~-x%32==c%32)System.out.print(c=x);} (with char[] as input). – Olivier Grégoire – 2017-08-18T09:33:01.720

2

C# (Mono), 129 107 93 91 87 bytes

s=>{var r=s.Trim()[0]+"";foreach(var c in s)if(r[r.Length-1]%32==~-c%32)r+=c;return r;}

Saved 2 bytes thanks to @Mr. Xcoder.
Saved 4 bytes thanks to @jkelm.

Try it online!

TheLethalCoder

Posted 2017-08-17T10:00:49.740

Reputation: 6 930

Fails on leading spaces – Skidsdev – 2017-08-17T10:44:51.877

@Mayube Woops didn't see that, fixed. – TheLethalCoder – 2017-08-17T10:47:21.917

291 bytes. In C-like languages and Python, (c-1)%32 is ~-c%32 – Mr. Xcoder – 2017-08-17T11:40:09.763

187 bytes You don't need to reassign the trimmed string because of the checks in the for loop – jkelm – 2017-08-17T12:16:44.760

2

Pyth, 23 22 21 20 bytes

-1 byte indirectly thanks to @Erik the Outgolfer's trick (-Qd).

-1 byte thanks to @Erik the Outgolfer.

VQIqZ%-CNCh-Qd32N=hZ

Try it here.

Mr. Xcoder

Posted 2017-08-17T10:00:49.740

Reputation: 39 774

2

Pyth, 21 20 18 bytes

ef&qhThQhxGrT0tyr6

Try it here.

Way more efficient 20-byte version:

.U+b?t-CrZ1Creb1kZr6

Try it here.

-1 thanks to Mr. Xcoder (indirectly).

Erik the Outgolfer

Posted 2017-08-17T10:00:49.740

Reputation: 38 134

Equivalent: .U+b?tlrreb1rZ1kZrz6 (I think). That trick helped me though. – Mr. Xcoder – 2017-08-17T12:43:29.413

@Mr.Xcoder If that was an equivalent I could've saved a byte with .U+b?tlrreb1rZ1kZr6 but unfortunately r <str> 6 means A.strip(), not removing non-leading-or-trailing whitespace. – Erik the Outgolfer – 2017-08-17T12:49:13.387

Oh yeah, I didn't see your solution relies on all the spaces being removed (mine doesn't) – Mr. Xcoder – 2017-08-17T12:50:46.417

@Mr.Xcoder Umm, you should remove all spaces. – Erik the Outgolfer – 2017-08-17T12:51:05.887

No, I shouldn't, since space has an ASCII value of 32, whilst all the letters have > 64, and thus not affecting the functionality. I think this applies to your answer too. – Mr. Xcoder – 2017-08-17T12:53:18.330

@Mr.Xcoder Hmm, you may be right. – Erik the Outgolfer – 2017-08-17T12:53:49.070

I am quite sure about that, because I am then guaranteed to take the first letter, and thus spaces will always be falsy. It is a pure port of my Python solution. – Mr. Xcoder – 2017-08-17T12:55:03.300

Let us continue this discussion in chat.

– Erik the Outgolfer – 2017-08-17T12:55:37.007

2

PHP, 64+1 bytes

while($c=$argn[$i++])$c<A||$n&&($c&_)!=$n||(print$c)&$n=++$c&__;

Run as pipe with -nR or try it online.


Apart from the usual tricks: When $c reaches Z, ++$c results in AA,
and &__ keeps that length untouched; so $n will match no further $c.

Titus

Posted 2017-08-17T10:00:49.740

Reputation: 13 814

2

Haskell, 106 105 97 bytes

import Data.Char
import Data.List
z=ord.toUpper
a%b|z a+1==z b=b|0<3=a
nub.scanl1(%).filter(>' ')

I tried to use fromEnum + char arithmetic instead of importing Data.Char, but that ended up being longer...

Saved 8 bytes thanks to H.PWiz!

Try it online.

Cristian Lupascu

Posted 2017-08-17T10:00:49.740

Reputation: 8 369

102 bytes with filter – H.PWiz – 2017-08-17T13:41:39.483

Or 100 bytes with Data.List

– H.PWiz – 2017-08-17T13:47:38.333

@H.PWiz Great! Thanks! – Cristian Lupascu – 2017-08-17T14:48:34.240

1

Jelly, 17 bytes

ḟ⁶;ð,ŒuṪ€O_/⁼-ð¡/

Try it online!

Erik the Outgolfer

Posted 2017-08-17T10:00:49.740

Reputation: 38 134

1

q/kdb+, 47 45 bytes

Solution:

{10h$({(x;x,y)1=mod[y-last x;32]}/)7h$trim x}

Examples:

q){"c"$({(x;x,y)1=mod[y-last x;32]}/)7h$trim x}"CodEgolf"
"CdEf"
q){"c"$({(x;x,y)1=mod[y-last x;32]}/)7h$trim x}" codeolfg"
"cdefg"
q){"c"$({(x;x,y)1=mod[y-last x;32]}/)7h$trim x}"ProgrammingPuzzles"
"P"
q){"c"$({(x;x,y)1=mod[y-last x;32]}/)7h$trim x}"The quick red fox jumped over the lazy brown dog"
"Tuvw"

Explanation:

Leveraging the mod 32 trick from existing solutions along with converge function. Iterate over the string, if the difference between the last element of the result (e.g. starts with T for "The quick red fox...") and the current character is 1 (after being mod'd with 32), then we add this to the result (hence taking why we take last x), then cast everything back to a string.

{10h$({(x;x,y)1=mod[y-last x;32]}/)7h$trim x} / the solution
{                                           } / lambda function
                                      trim x  / trim whitespace (leading/trailing)
                                   7h$        / cast string to ASCII (a -> 97)
     ({                         }/)           / converge
                    y-last x                  / y is the next item in the list, x contains results so far
              1=mod[        ;32]              / is the result mod 32 equal to 1
       (x;x,y)                                / if false, return x, if true return x concatenated with y
 10h$                                         / cast back to characters

streetster

Posted 2017-08-17T10:00:49.740

Reputation: 3 635

1

Japt, 18 17 16 bytes

Saved 1 byte thanks to @Shaggy

x
c
Çc %H¥V%H©V°

Test it online!

Was thinking this would be a good bit shorter, but... Such is life...

Explanation

x    First line: set U to the result.
x    Trim all spaces off of the input. Only necessary to remove leading spaces.

c    Second line: set V to the result.
c    Take the charcode of the first character in U.

 Ç   c %H¥ V%H© V°
UoZ{Zc %H==V%H&&V++}   Final line: output the result.
UoZ{               }   Filter to only the chars in Z where
    Zc                   the charcode of Z
       %H                mod 32
         ==V%H           equals V mod 32.
              &&V++      If true, increment V for the next letter.

ETHproductions

Posted 2017-08-17T10:00:49.740

Reputation: 47 880

Shorter than my 28 byte travesty, at least! :D It looks like you can replace rS with x. – Shaggy – 2017-08-17T15:00:25.343

1

C# (.NET Core), 70 60 + 18 bytes

-10 bytes thanks to TheLethalCoder

a=>{var c=a.Trim()[0];return a.Where(x=>x%32==c%32&&++c>0);}

Byte count also includes:

using System.Linq;

Try it online!

1 byte longer (currently) (not anymore) than TheLethalCoder's so posting for the sake of fun. Different approach, with LINQ.

This takes advantage of two C-like features in C# - a character char variable is implicitly behaving the same as an integer int, and boolean AND operator && doesn't execute right operation if left returns a false. Code explanation:

a =>                                  // Take string as input
{
    var c = a.Trim()[0];              // Delete leading spaces and take first letter
    return a.Where(                   // Filter out characters from the string, leaving those that:
               x => x % 32 == c % 32  // it's the next character in alphabet case-insensitive (thanks to modulo 32 - credits to previous answers)
               && ++c > 0             // If it is, go to the subsequent character in alphabet (and this always has to return true)
           );
}

Grzegorz Puławski

Posted 2017-08-17T10:00:49.740

Reputation: 781

Remove the .ToArray() by returning as an IEnumerable<char> to save bytes. – TheLethalCoder – 2017-08-17T15:40:59.470

@TheLethalCoder right, I just saw the comment under the challenge. Thank you! – Grzegorz Puławski – 2017-08-17T17:56:33.660

1

Perl 5, 30 + 1 (-n) = 31 bytes

/$b/i&&(print,$b=++$_)for/\S/g

Try it online!

How?

/$b/i        # check if this letter equals the one in $b, ignore case
&&(print,    # output it if so
$b=++$_)     # store the next character to find
for/\S/g     # Looping over all non-whitespace characters

Xcali

Posted 2017-08-17T10:00:49.740

Reputation: 7 671

1

Perl 6, 51 bytes

{S:i:g/\s|(\w){}<([<!before "{chr $0.ord+1}">.]+//}

Test it

Expanded:

{  # bare block lambda with implicit parameter $_

  S                          # substitute implicitly on $_, not in-place
  :ignorecase
  :global
  /

    |  \s                    # match any space

    |  (\w)                  # match a word character
       {}                    # make sure $/ is updated (which $0 uses)

       <(                    # ignore everything before this

       [

           <!before "{       # make sure this won't match after this point
             chr $0.ord + 1  # the next ASCII character
           }">

           .                 # any character

       ]+                    # match it at least once

  //                         # remove what matched
}

Note that <!before …> is a zero width assertion

Brad Gilbert b2gills

Posted 2017-08-17T10:00:49.740

Reputation: 12 713

0

Retina, 76 bytes

 

^.
$&$&$&¶
{T`@@L@l`@l@l@`..¶
T`l`L`.¶
(.)(.)((¶).*?(\1|\2)|¶.*)
$5$5$5$4

Try it online! Link includes test cases. Explanation:

 

Delete spaces.

^.
$&$&$&¶

Triplicate the first character and insert a separator.

{T`@@L@l`@l@l@`..¶
T`l`L`.¶

Convert the second and third characters to lower case and increment them. Convert the latter to upper case. These are now the search characters.

(.)(.)((¶).*?(\1|\2)|¶.*)
$5$5$5$4

Try to match either of the search characters. If found, then triplicate the match, which restarts the loop for the next search. Otherwise, just delete the search characters and the rest of the input.

Neil

Posted 2017-08-17T10:00:49.740

Reputation: 95 035

0

8th, 114 bytes

Code

: z dup n:1+ 32 bor >r "" swap s:+ . ; 
: f s:trim 0 s:@ z ( nip dup 32 bor r@ n:= if rdrop z then ) s:each rdrop ;

Explanation

: z             \ n -- (r: x)
                \ print letter and save on r-stack OR-bitwised ASCII code of following letter
  dup           \ duplicate item on TOS
  n:1+          \ get ASCII code of the following letter
  32 bor        \ bitwise OR of ASCII code and 32 
  >r            \ save result on r-stack
  "" swap s:+ . \ print letter
;

: f        \ s -- 
  s:trim   \ remove trailing whitespace
  0 s:@    \ get 1st letter
  z        \ print 1st letter and save on r-stack OR-bitwised ASCII code of following letter
  ( nip    \ get rid of index
    dup    \ duplicate item on TOS
    32 bor \ bitwise OR of current ASCII code and 32 
    r@     \ get value stored on r-stack
    n:=    \ compare values to see if letter is printable or not
    if 
      rdrop \ clean r-stack
      z     \ print letter and save on r-stack OR-bitwised ASCII code of following letter
    then 
  ) 
  s:each    \ handle each character in string
  rdrop     \ clean r-stack
;

Example

ok> " The quick red fox jumped over the lazy brown dog" f
Tuvw

Chaos Manor

Posted 2017-08-17T10:00:49.740

Reputation: 521

0

C (gcc), 79 78 75 70 bytes

c;f(char*s){for(c=0;*s;s++)*s^32&&!c|c==(*s|32)-1?c=32|putchar(*s):0;}

Try it online!

cleblanc

Posted 2017-08-17T10:00:49.740

Reputation: 3 360

0

Proton, 59 bytes

Port of the Python 2 submission.

s=>reduce((x,y)=>x+y*((ord(y)-ord(x[-1]))%32==1),s.strip())

Try it online!

Mr. Xcoder

Posted 2017-08-17T10:00:49.740

Reputation: 39 774

0

Pyth, 15 bytes

eo,}r0NG_xQhNty

Test suite

Unlike all of the other answers, this doesn't stick together the output, it generates all subsequences of the input, and then orders them to put the desired string at the end, and outputs it.

isaacg

Posted 2017-08-17T10:00:49.740

Reputation: 39 268

I think you have to check if the first letter of the output is the first letter of the input too. And I think initial order matters. – Erik the Outgolfer – 2017-08-18T09:14:55.637

@EriktheOutgolfer Sorry, are you saying the answer is wrong? I make sure the sequence whose first character is earliest in the input among all subsequence that are in alphabetical is the one sorted to the end. See the test case starting with a space. – isaacg – 2017-08-18T21:20:16.967

Can you add an explanation please? I may have misunderstood or something... – Erik the Outgolfer – 2017-08-19T10:39:55.410

0

J, partial solution

I'm posting this for feedback and ideas for improvement more than anything else. It works, but doesn't handle the capitalization and space edge cases, and is already long for J.

First a dyadic helper verb that tells you if the left and right args are alphabetically adjacent:

g=.(= <:)&(a.&i.)  NB. could save one char with u:

Next a verb that removes the first element which is not part of an alphabetic streak starting from the first element:

f=.({~<^:3@>:@i.&0@(0,~2&(g/\))) ::]

Note we use Adverse :: to return the entire argument unchanged if there is no non-streak element found (ie, if the entire argument is a valid alphabetic streak).

Finally, the solution is given by applying f until convergence:

f^:_ 'codegolf'  NB. => 'cdef'

Try it online!


And here is a parsed version of f for easier reading:

           ┌─ ~ ─── {                         
           │                              ┌─ <
           │                       ┌─ ^: ─┴─ 3
           │                 ┌─ @ ─┴─ >:      
       ┌───┤           ┌─ @ ─┴─ i.            
       │   │     ┌─ & ─┴─ 0                   
       │   │     │                            
       │   └─ @ ─┤     ┌─ 0                   
── :: ─┤         │     ├─ ~ ─── ,             
       │         └─────┤                      
       │               │     ┌─ 2             
       │               └─ & ─┴─ \ ─── / ──── g
       └─ ]         

Side Question: why do the box characters not align perfectly when display on SO (they work in my console):

Jonah

Posted 2017-08-17T10:00:49.740

Reputation: 8 729