Split a Shakespeare Script

13

4

Mr. William Shakespeare wrote plays. A lot of plays. In this tarball containing every single one of his works, each one of his plays is one long file.
Each play needs to be split into scenes for a stage production. Because actors are impatient, your code needs to be as short as possible.

Task:

Your task is to write a program or function to split the plays contained in this file into separate files, numbered sequentially starting from 1, where each one contains a scene. You should retain all whitespace and titles.

Input:

Input will be a single play via stdin, or the filename as a parameter. You can choose. The play will look something like:

 TITUS ANDRONICUS


    DRAMATIS PERSONAE


SATURNINUS  son to the late Emperor of Rome, and afterwards
    declared Emperor.

BASSIANUS   brother to Saturninus; in love with Lavinia.

TITUS ANDRONICUS    a noble Roman, general against the Goths.

MARCUS ANDRONICUS   tribune of the people, and brother to Titus.


LUCIUS  |
    |
QUINTUS |
    |  sons to Titus Andronicus.
MARTIUS |
    |
MUTIUS  |


Young LUCIUS    a boy,
[...]
ACT I



SCENE I Rome. Before the Capitol.


    [The Tomb of the ANDRONICI appearing; the Tribunes
    and Senators aloft. Enter, below, from one side,
    SATURNINUS and his Followers; and, from the other
    side, BASSIANUS and his Followers; with drum and colours]

SATURNINUS  Noble patricians
[...]
ACT I



SCENE II    A forest near Rome. Horns and cry of hounds heard.


    [Enter TITUS ANDRONICUS, with Hunters, &c., MARCUS,
    LUCIUS, QUINTUS, and MARTIUS]

TITUS ANDRONICUS    The hunt is up, the morn is bright and grey,
    The fields are
[...]
ACT II



SCENE I Rome. Before the Palace.


    [Enter AARON]

AARON   Now climbeth Tamora
[...]

Output:

The output should look something like this:

ACT I



SCENE I Rome. Before the Capitol.


    [The Tomb of the ANDRONICI appearing; the Tribunes
    and Senators aloft. Enter, below, from one side,
    SATURNINUS and his Followers; and, from the other
    side, BASSIANUS and his Followers; with drum and colours]

SATURNINUS  Noble patricians...
ACT I



SCENE II    A forest near Rome. Horns and cry of hounds heard.


    [Enter TITUS ANDRONICUS, with Hunters, &c., MARCUS,
    LUCIUS, QUINTUS, and MARTIUS]

TITUS ANDRONICUS    The hunt is up, the morn is bright and grey,
    The fields are...
ACT II



SCENE I Rome. Before the Palace.


    [Enter AARON]

AARON   Now climbeth Tamora ...

etc.

Output either into numbered files, or to the stdout stream (returning for functions) with a delimiter of your choice.
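For reference, here is an ungolfed sketch of the task in Python. It assumes that every scene starts at a line beginning with "ACT " (as in the sample above) and that nothing else in the play does; the function name `split_scenes` is made up for illustration.

```python
import re

def split_scenes(play):
    """Return [preamble, scene 1, scene 2, ...] without dropping any text."""
    # Split just before every line that begins with "ACT ".
    # Splitting on a zero-width match needs Python 3.7 or newer.
    return re.split(r"(?m)^(?=ACT )", play)

sample = "TITLE\n\nACT I\n\nSCENE I\nfoo\nACT I\n\nSCENE II\nbar\n"
for number, scene in enumerate(split_scenes(sample)):
    print(number, repr(scene))
```

Item 0 is whatever precedes the first ACT line, and it is simply an empty string when the play starts with an ACT line. Joining the pieces back together reproduces the input, so whitespace and titles are retained.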

Bonuses:

  • 10% If you save the bit before Act 1 into file 0. Note: It must not break if the bit before Act 1 is empty.
  • 15% If you can take both stdin and a file path parameter inputs
  • 20% If you can output both into files and to stdout / return.
  • 200 reputation if you can make the smallest SPL program. This bounty has been awarded.
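As the comments on this question confirm, the percentage bonuses combine as products, not sums. A small Python sketch of the arithmetic (the name `final_score` is hypothetical, chosen only for this example):

```python
def final_score(byte_count, bonus_percents):
    """Apply each percentage bonus multiplicatively to a raw byte count."""
    score = float(byte_count)
    for percent in bonus_percents:
        score *= 1 - percent / 100
    return score

# A 51-byte answer claiming all three bonuses (10%, 15% and 20%).
print(round(final_score(51, [10, 15, 20]), 3))
```

Claiming all three bonuses multiplies the byte count by 0.9 × 0.85 × 0.8 = 0.612, which is a 38.8% reduction.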

Leaderboards

Here is a Stack Snippet to generate both a regular leaderboard and an overview of winners by language.

To make sure that your answer shows up, please start your answer with a headline, using the following Markdown template:

# Language Name, N bytes

where N is the size of your submission. If you improve your score, you can keep old scores in the headline, by striking them through. For instance:

# Ruby, <s>104</s> <s>101</s> 96 bytes

If you want to include multiple numbers in your header (e.g. because your score is the sum of two files or you want to list interpreter flag penalties separately), make sure that the actual score is the last number in the header:

# Perl, 43 + 2 (-p flag) = 45 bytes

You can also make the language name a link which will then show up in the leaderboard snippet:

# [><>](http://esolangs.org/wiki/Fish), 121 bytes


wizzwizz4

Posted 2016-01-09T14:18:21.497

Reputation: 1 895

I will give out a +200 bounty to the first valid Shakespeare Programming Language submission. – cat – 2016-01-09T14:28:05.040

Come on, everyone knows that CodeGolfs aren't fast. Why not say that "The tarball is already pretty full, so your code should be as short as possible"? – J_F_B_M – 2016-01-09T14:30:38.943

@LegionMammal978 "With a deliminator of your choice". – J_F_B_M – 2016-01-09T14:31:18.180

Could you include some contents of the relevant file in the question instead of just giving a link? – nicael – 2016-01-09T14:54:06.210

The output was quite clear though, I was primarily asking to include some key parts of the input file – nicael – 2016-01-09T15:03:05.970

Are functions allowed? – Downgoat – 2016-01-09T16:44:53.330

Do the bonuses combine as sums or products? That is, does achieving all 3 bonuses result in 10 + 15 + 20 = 45% reduction or 90% of 85% of 80% = 61.2% (a 38.8% reduction)? – trichoplax – 2016-01-10T19:05:14.377

@trichoplax Products. I thought that was how everybody did it! crosses out in notebook – wizzwizz4 – 2016-01-10T19:32:56.037

The leaderboard stack snippet is broken, and I can't fix it because it's only 4 characters. Can someone with 2k+ rep change (\d+) to ([\d\.]+) in SCORE_REG please? – Shelvacu – 2016-01-10T20:34:46.703

@shelvacu I would ask on the Meta page, linked to in the question. That is where the leaderboard is maintained. – wizzwizz4 – 2016-01-10T21:08:01.800

@cat Here you go! http://codegolf.stackexchange.com/a/69360/43394 – wizzwizz4 – 2016-01-13T19:46:49.817

@wizzwizz4 I have to wait 24 hours to award it :( – cat – 2016-01-13T19:49:27.023

@cat Could you wait 7 days? The question won't go anywhere, and neither will the answer... (Or maybe Robert deserves it ASAP. d:-D ) – wizzwizz4 – 2016-01-13T20:24:42.820

@wizzwizz4 sure, I'll wait 7 days and accept the shortest next wednesday. – cat – 2016-01-13T20:35:50.970

@cat -- Leave it open; I'm sure there are smaller Shakespeare solutions than mine. Mine is as fat as the sum of a big big big cat and a cat. – Robert Fraser – 2016-01-15T22:59:44.280

@RobertFraser I can't stop giggling at that -- I think you've invented William Suess ;) – cat – 2016-01-16T02:15:26.980

Answers

38

Shakespeare Programming Language 1.2.1, 930 895 887 - 10% = 798.3 bytes

G.Ajax,a.Puck,a.Page,a.Ford,a.Act I:a.Scene I:a.[Enter Ajax and Puck]Puck:Open thy mind.Ajax:Open thy mind.[Exit Puck][Enter Page]Ajax:Open thy mind.SCENE II:b.[Exeunt][Enter Puck and Ajax]Ajax:Am I as fat as the sum of the cube of a big big cat and a cat?Puck:If not,let us return to scene III.Am I as fat as the sum of you and a big cat?[Exit Puck][Enter Page]Page:If not,let us return to scene III.Am I as fat as the sum of the sum of the cube of a big big cat and a big big big big cat and a big big cat?[Exit Page][Enter Ford]Ajax:If not,let us return to scene III.You is a big big big big big big cat.Speak thy mind.Scene III:c.[Exeunt][Enter Ajax and Puck]Puck:Speak thy mind.You is as fat as I.[Exit Ajax][Enter Page]Page:You is as fat as I.Puck:Open thy mind.Is you as fat as a hog?[Exit Page][Enter Ajax]Puck:If not,let us return to Scene II.Speak thy mind.Ajax:Speak thy mind.

Ungolfed and rewritten in Sharkspearean language:

Four Gentlemen of Verona.

Ajax, a master code-golfer with years of experience.
Puck, a young Java programmer and a strong believer in object-oriented design patterns.
Page, a rapscallion of ill repute.
Ford, a car manufacturer.

Act I: A one-act masterpiece.

Scene I: In which many minds are opened, possibly via the consumption of psychedelic drugs.
[Enter Ajax and Puck]
Puck: Open thy mind.
Ajax: Open thy mind.
[Exit Puck]
[Enter Page]
Ajax: Open thy mind.

SCENE II: In which things are compared.
[Exeunt]
[Enter Puck and Ajax]
Ajax: Am I as hairy as the sum of the cube of a furry purple chihuahua and a summer's day?
Puck: If not, let us proceed to scene III. Am I as half-witted as the sum of you and a cunning squirrel?
[Exit Puck]
[Enter Page]
Page: If not,let us proceed to scene III. Am I as delicious as the sum of the sum of the cube of a warm healthy hamster and a proud handsome charming noble nose and a big old aunt?
[Exit Page]
[Enter Ford]
Ajax: If not, let us proceed to scene III. You are the cube of a tiny small pony. Speak thy mind.

Scene III: In which minds are spoken.
[Exeunt]
[Enter Ajax and Puck]
Puck: Speak thy mind. You are as smelly as I.
[Exit Ajax]
[Enter Page]
Page: You are as oozing as I.
Puck: Open thy mind. Are you as disgusting as a Microsoft?
[Exit Page]
[Enter Ajax]
Puck: If not,let us return to Scene II. Speak thy mind.
Ajax:Speak thy mind.

In C-like pseudocode:

Scene_I:
    Ajax = getchar()
    Puck = getchar()
    Page = getchar()
Scene_II:
    if(Ajax != 'A')
        goto Scene_III
    if(Puck != 'C')
        goto Scene_III
    if(Page != 'T')
        goto Scene_III
    Ford = '@'
    putchar(Ford)
Scene_III:
    putchar(Ajax)
    Ajax = Puck
    Puck = Page
    Page = getchar()
    if(Page != -1)
        goto Scene_II
    putchar(Ajax)
    putchar(Puck)

Requires that the input file contain at least 3 characters. Uses "@" as a delimiter and writes the result to stdout. I'm taking the 10% bonus since the part before the first scene will be before the first "@", much like Martin Büttner's solution above.

The way it works is to put a "@" if it sees three characters "ACT" in a row. Note this means it would transform "ENACTED" into "EN@ACTED". This can be fixed at the cost of a few hundred bytes, but luckily it seems every "ACT" in the given plays (at least the few I checked) was the beginning of a scene.
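The same three-character sliding window can be sketched in Python. This is a transliteration of the pseudocode above, not the SPL program itself; the name `mark_acts` and the guard for inputs shorter than 3 characters are additions for illustration.

```python
def mark_acts(text, marker="@"):
    """Emit `marker` before every "ACT", using a 3-character window."""
    if len(text) < 3:
        # The SPL program assumes at least 3 characters; this guard is an addition.
        return text
    out = []
    a, b, c = text[0], text[1], text[2]      # Ajax, Puck, Page
    for nxt in list(text[3:]) + [None]:      # None plays the role of EOF (-1)
        if (a, b, c) == ("A", "C", "T"):
            out.append(marker)               # Ford speaks his mind
        out.append(a)                        # the oldest character leaves the window
        if nxt is None:
            break
        a, b, c = b, c, nxt                  # shift the window by one character
    out.extend([b, c])                       # flush the last two characters
    return "".join(out)

print(mark_acts("xxACT I"))
print(mark_acts("ENACTED"))
```

The second call shows the caveat described above: the window has no notion of line starts, so any "ACT" inside a longer word gets a marker too.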

Tested with the 1.2.1 SPL linked above. I'm not sure if it will work on the web interpreter. The script used for testing was:

#!/bin/bash
set -e
SCRIPT_DIR=`dirname "$0"`
cd "$SCRIPT_DIR"
spl/bin/spl2c <splits.spl >splits.c
gcc -O2 -Wall -Wno-unused -I./spl/include -L./spl/lib -lm -lspl -o splits splits.c
./splits <measureforemeasure >measure.split.txt

The "esoteric" parts of the SPL once you get past the syntax are the shuffling of variables on "stage" (generally, you only want to have two characters on stage at a time) and the representation of constant numbers. There are 6 word lists of import that come with the distribution: positive adjectives, neutral adjectives, negative adjectives, positive nouns, neutral nouns, and negative nouns. A positive/neutral noun (e.g. plum or stone wall) is 1, and a negative noun (e.g. flirt-gill or Microsoft) is -1. Positive/neutral adjectives (e.g. embroidered or bottomless) multiply the number by 2, and negative adjectives (e.g. fat-kidneyed or fatherless) multiply by -2. The word lists are sadly rather limited, with only 10-20 entries each.
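To make the constant arithmetic concrete, here is a short Python sketch that evaluates the constants used in the golfed program. It only covers positive/neutral words ("cat" counts as 1 and each "big" doubles the value); `noun_phrase_value` is a hypothetical helper, not part of any SPL tool.

```python
def noun_phrase_value(adjective_count, noun_sign=1):
    """Value of a noun preceded by `adjective_count` positive/neutral adjectives."""
    return noun_sign * 2 ** adjective_count

big_big_cat = noun_phrase_value(2)                       # "a big big cat" = 4

# "the sum of the cube of a big big cat and a cat"
ajax_target = big_big_cat ** 3 + noun_phrase_value(0)
# "the sum of you and a big cat" (spoken to Ajax, so "you" is Ajax's target)
puck_target = ajax_target + noun_phrase_value(1)
# "the sum of the sum of the cube of a big big cat and a big big big big cat and a big big cat"
page_target = big_big_cat ** 3 + noun_phrase_value(4) + noun_phrase_value(2)
# "a big big big big big big cat"
ford_value = noun_phrase_value(6)

print(chr(ajax_target), chr(puck_target), chr(page_target), chr(ford_value))
```

The values come out as 65, 67, 84 and 64, which are the character codes the pseudocode compares against ('A', 'C', 'T') and the '@' delimiter it prints.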

At my next meeting, I'll be suggesting we move all our production code to Shakespeare because it's far more expressive than Scala.

Robert Fraser

Posted 2016-01-09T14:18:21.497

Reputation: 912

Golf this as much as you can. Please! – wizzwizz4 – 2016-01-13T19:45:44.893

Holy cats, I didn't think anyone actually would! I'll award this in 24 hours, which is as soon as I can :) – cat – 2016-01-13T19:50:37.487

@wizzwizz4 - definitely; i'll give it a shot when i don't have real work to do :-). it'll be as succinct as if it were written by the bard himself – Robert Fraser – 2016-01-13T19:57:10.227

The bounty now counts as a bonus. Add a +200 reputation to the title if you want! :-) – wizzwizz4 – 2016-01-13T21:19:35.537

Now, who can I hire to perform this on stage? – cat – 2016-01-16T02:17:30.203

I reckon you've probably got the bounty by now. Somebody else could take it, but I doubt it. :-) – wizzwizz4 – 2016-01-19T18:21:27.377

@cat - Just get three people who are insecure about their weight and show them a picture of a cat. – Robert Fraser – 2016-01-19T19:10:10.117

You can now add a try it online link. There's also a tip on the tips thread for SPL you didn't use. – NieDzejkob – 2018-05-08T13:50:44.360

Some quick golfing can get this down to 717 bytes – Jo King – 2018-05-27T08:49:18.650

706 bytes – Hello Goodbye – 2019-12-14T21:23:35.123

12

Retina, 9 - 10% = 8.1 bytes

Byte count assumes ISO 8859-1 encoding.

¶ACT 
=$0

Inserts a = (as a delimiter) in front of every ACT that is preceded by a linefeed and followed by a space.
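For readers who don't know Retina, the substitution can be sketched with Python's `re` module. This is an approximation of the behaviour under the assumption that `¶` stands for a linefeed and `$0` for the whole match; `mark_act_lines` is a made-up name.

```python
import re

def mark_act_lines(text):
    """Put "=" in front of every linefeed that is followed by "ACT "."""
    # The match includes the linefeed, so the "=" lands just before it.
    return re.sub(r"\nACT ", lambda m: "=" + m.group(0), text)

print(repr(mark_act_lines("intro\nACT I\nfoo\nACT II\n")))
```

Because the pattern requires a preceding linefeed and a trailing space, a word such as "ENACTED" is left alone. A play whose very first bytes are "ACT " would not get a delimiter in this sketch, since no linefeed precedes it.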

Try it online! (But you'll have to copy in the input yourself due to its size.)

Martin Ender

Posted 2016-01-09T14:18:21.497

Reputation: 184 808

Congratulations to answer 69000 (according to the share-link). – J_F_B_M – 2016-01-11T01:00:15.020

@J_F_B_M Post 69000. That's questions and answers. – wizzwizz4 – 2016-01-13T20:25:26.690

4

awk, 51 * .9 * .85 * .8 = 31.2

Splits into multiple files. Outputs on stdout separated by a =.

/^ACT/{f++;$0="="$0}{system("echo \""$0"\">>"f*1)}1
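An ungolfed Python sketch of roughly the same idea, for readers less familiar with awk: keep a scene counter, bump it on every line starting with ACT, append each line to the file named after the counter, and echo it to stdout. The `out_dir` parameter and the function name are hypothetical additions to keep the example self-contained.

```python
import os
import sys

def split_to_files(lines, out_dir="."):
    """Append each line to a file named after the current scene number and echo it."""
    scene = 0  # file "0" collects everything before the first ACT line
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("ACT"):
            scene += 1
            # As in the awk one-liner, the "=" prefix also ends up in the file.
            line = "=" + line
        with open(os.path.join(out_dir, str(scene)), "a") as handle:
            handle.write(line + "\n")
        sys.stdout.write(line + "\n")
    return scene

# Typical use: split_to_files(sys.stdin)
```

Unlike the awk version, this writes the files directly instead of shelling out to `echo`, so quotes and other shell metacharacters in the text need no special handling.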

Rainer P.

Posted 2016-01-09T14:18:21.497

Reputation: 2 457

+1 All you need to do now is to output all of the files to stdout separated by a specific character, and you're done! – wizzwizz4 – 2016-01-09T15:49:38.423

Done. With the bonus it's almost the same length. – Rainer P. – 2016-01-09T15:57:32.620

+2... +2........ +2............ No. The system doesn't allow it :-( I would however recommend separating them with a character that is even less common, such as ¬ or ¦. – wizzwizz4 – 2016-01-09T16:23:37.273

3

JavaScript ES6, 28 - 10% = 25.2 bytes

s=>s.replace(/\nACT/g,"=$&")

Not even the JS shell has file I/O so this can't qualify for the -20% bonus

Try it online here (you'll have to paste the input in yourself)

Downgoat

Posted 2016-01-09T14:18:21.497

Reputation: 27 116

I think that you can take out the T for one byte saved. – Mama Fun Roll – 2016-01-10T15:38:53.593

Doesn't replace remove the ACT line? – wizzwizz4 – 2016-01-10T21:09:32.183

@wizzwizz4 because I have the $& it won't – Downgoat – 2016-01-10T21:10:03.757

@Doᴡɴɢᴏᴀᴛ You learn something new every day! – wizzwizz4 – 2016-01-10T21:35:04.000

3

Perl, 66 - 10% - 20% = 47.52 bytes

BEGIN{open(S,">0");}++$?,open(S,">$?"),print"=\n"if/^ACT/;print S

Added one for the -p option.

Neil

Posted 2016-01-09T14:18:21.497

Reputation: 95 035

1

Ruby, 30 - 10% - 15% = 23.715 22.95 bytes

Uses $ as the delimiter between scenes. The 15% bonus applies because Ruby redirects $< to point at the file passed into ARGV by default if it's supplied, or STDIN if not.

-1 byte by using gsub, similar to @Downgoat's ES6 solution. I'm still relying on the hope that ACT only ever appears in the ACT labels and not inside any other word.

$><<$<.read.gsub("ACT","$ACT")

Also, my 41.004 (originally 67) byte solution that also does file output. Starring probably the only time the each command saves bytes over map in Ruby, because each returns the array passed in unadulterated after running its block, unlike map.

i=-1;$><<$<.read.split(/(?=ACT)/).each{|s|open("#{i+=1}",?w)<<s}*?$

Value Ink

Posted 2016-01-09T14:18:21.497

Reputation: 10 608