Scan dactylic hexameter in a unique puzzle

As a terrible Latin student for several years, I have learned to hate many things about Latin. However, there is one thing I love.

Latin scansion.

Scansion is the act of determining the meter of a particular line of poetry. For Latin this means demarcating each syllable in the line as "light" or "heavy".

Latin scansion has many rules. However, unlike English scansion, it is fairly regular and often requires no knowledge of Latin vocabulary or grammar. For this problem we will be using a simplified subset of those rules (real Latin does not have a neat spec).

Before you begin the scansion you must elide. Elision is the dropping of syllables between words to ease pronunciation (e.g. "he is" -> "he's"). Unlike English, Latin elision follows very nice rules.

  • The final vowel of a word ending with a vowel is omitted if the next word begins with a vowel.

    NAUTA EST -> NAUTEST

  • The same goes for words ending in a vowel followed by "m".

    FIDUM AGRICOLAM -> FIDAGRICOLAM

  • Word-initial "h" followed by a vowel counts as a single vowel for elision purposes and is always dropped when elided.

    MULTAE HORAE -> MULTORAE

    or

    MULTAM HORAM -> MULTORAM
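The three elision rules can be sketched in Python as a single regular expression. This is only an illustration, not part of the challenge: the name `elide` is hypothetical, and it assumes that a word-final diphthong is dropped as a whole (as in the MULTAE HORAE example) and that the "u" in "qu" is never part of a diphthong.

```python
import re

# Hypothetical helper, not from the original post.
# A word-final vowel or diphthong, an optional "M", the space, and an optional
# "H" are dropped whenever a vowel follows. The lookahead leaves that vowel in place.
_ELISION = re.compile(r"(?:AE|AU|EI|EU|OE|(?<!Q)UI|[AEIOU])M? H?(?=[AEIOU])")


def elide(line: str) -> str:
    """Apply the elision rules to an upper-case line of A-Z and spaces."""
    return _ELISION.sub("", line)


if __name__ == "__main__":
    for phrase in ("NAUTA EST", "FIDUM AGRICOLAM", "MULTAE HORAE", "MULTAM HORAM"):
        print(phrase, "->", elide(phrase))
```

Run on the four examples above, this should reproduce NAUTEST, FIDAGRICOLAM, MULTORAE and MULTORAM.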

After elision we can begin scansion. Scansion is done to a specific meter. The meter for this challenge is dactylic hexameter. Dactylic hexameter has six "feet"; each foot consists of two or three syllables. Syllables can be either long or short depending on the vowel. Each of the first five feet will be either a dactyl (a long syllable followed by two short ones) or a spondee (two long syllables). The last foot will be a long followed by an anceps (short or long; for this problem you will not have to determine which).
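One consequence of this structure is that the syllable count alone fixes how many dactyls a line contains: five feet of two or three syllables plus a two-syllable final foot give 12 syllables plus one per dactyl. A small Python sketch of that arithmetic (the function name `foot_counts` is made up for illustration):

```python
from typing import Tuple


def foot_counts(syllables: int) -> Tuple[int, int]:
    """Return (dactyls, spondees) among the first five feet of a hexameter line."""
    # All-spondee line: 5 * 2 + 2 = 12 syllables; each dactyl adds one more.
    dactyls = syllables - 12
    if not 0 <= dactyls <= 5:
        raise ValueError("a dactylic hexameter line has 12 to 17 syllables")
    return dactyls, 5 - dactyls
```

For instance, a 15-syllable line must contain three dactyls and two spondees; only their positions remain to be determined.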

  • A vowel in Latin can be either short or long

  • An "i" sandwiched between two vowels (e.g. eiectum) is a consonant (i.e. a "j")

  • An "i" beginning a word followed by a vowel (e.g. Iactus) is also a consonant

  • A "u" after a "q" is also a consonant (i.e. a "v")

  • Diphthongs (ae, au, ei, eu, oe, and ui) are made up of two vowels but count as one vowel and are always long

  • A vowel with two or more consonants between it and the next vowel is always long

  • For the previous rule, an "l" or an "r" after a "b", "c", "d", "g", "p", or "t" does not count as a consonant

  • "x" counts as two consonants

  • "ch", "ph", "th", and "qu" count as one consonant

  • The syllable "que" at the end of a word (after elision) is always short

  • If a vowel is not forced by one of the previous rules it can be either long or short; this will depend on the meter
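The rules in this list can be sketched as a chain of substitutions that reduces an already-elided line to one symbol per syllable. This is a rough Python illustration under stated assumptions: the name `quantities` and the `?` marker for an undetermined vowel are invented here, the order of the steps is one possible choice, and "qu" is collapsed before diphthongs are detected so that the "ui" in QUI is not forced long.

```python
import re


def quantities(elided: str) -> str:
    """Map an elided upper-case line to one symbol per syllable:
    '-' forced long, 'v' forced short, '?' not decided by the rules."""
    s = elided.replace("X", "cc")                             # "x" counts as two consonants
    s = re.sub(r"QUE( |$)", r"cv\1", s)                       # word-final "que" is short
    s = s.replace("QU", "c")                                  # other "qu" is one consonant
    s = re.sub(r"(^|[ AEIOU])I(?=[AEIOU])", r"\1c", s)        # consonantal "i"
    s = re.sub(r"AE|AU|EI|EU|OE|UI", "-", s)                  # diphthongs are long
    # "ch"/"ph"/"th" and mute + liquid count once; every other consonant counts once too.
    s = re.sub(r"[CPT]H|[BCDGPT][LR]|[B-DF-HJ-NP-TV-Z]", "c", s)
    s = re.sub(r"[AEIOU]", "?", s)                            # remaining vowels are undecided
    s = s.replace(" ", "")                                    # consonants count across word gaps
    # An undecided vowel followed by two or more consonants is long. This also fires
    # at the end of the line, which is harmless because the last syllable is anceps.
    s = re.sub(r"\?(?=cc)", "-", s)
    return s.replace("c", "")
```

For the first test line, ARMA VIRUMQUE CANO TROIAE QUI PRIMUS AB ORIS, this sketch yields `-??-v???-??????`: fifteen syllables, of which only four are fixed by the rules.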

Your task will be to take a line of Latin and produce its scansion. You will take in the line as a string via standard input and output a string representing the final scansion.

The input will contain only spaces and characters A-Z.

To represent the scansion you will output all of the syllables, with | marking the separation of feet. A long syllable will be represented by a -, a short syllable by a v, and an anceps (the last syllable of every line) by an x. If there are multiple solutions, as there often will be, you may output any one of them.
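Once each syllable is known to be forced long, forced short, or free, producing the output amounts to finding a foot pattern that agrees with those constraints. The Python sketch below tries all 32 combinations of dactyls and spondees; the function name `fit` and the constraint notation (`-` forced long, `v` forced short, `?` free) are assumptions made for this example, not part of the challenge.

```python
from itertools import product
from typing import Optional


def fit(constraints: str) -> Optional[str]:
    """Return one scansion in the required output format, or None if none fits."""
    for feet in product(("-vv", "--"), repeat=5):
        marks = "".join(feet) + "-x"          # the final foot is long + anceps
        if len(marks) != len(constraints):
            continue                          # wrong number of syllables
        # Each syllable must be free, be the anceps, or agree with the pattern.
        if all(c == "?" or m == "x" or c == m for c, m in zip(constraints, marks)):
            return "|".join(feet) + "|-x"
    return None


if __name__ == "__main__":
    print(fit("-??-v???-??????"))
```

With the constraint string `-??-v???-??????` this returns `-vv|-vv|--|-vv|--|-x`, the alternative scansion listed for the first test case below. Because any valid answer is accepted, the sketch simply returns the first pattern that fits.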

Test Cases

The start of Virgil's Aeneid.

 ARMA VIRUMQUE CANO TROIAE QUI PRIMUS AB ORIS     -> -vv|-vv|--|--|-vv|-x (or -vv|-vv|--|-vv|--|-x)
 ITALIAM FATO PROFUGUS LAVINIAQUE VENIT           -> -vv|--|-vv|-vv|-vv|-x
 LITORA MULTUM ILLE ET TERRIS IACTATUS ET ALTO    -> -vv|--|--|--|-vv|-x
 VI SUPERUM SAEVAE MEMOREM IUNONIS OB IRAM        -> -vv|--|-vv|--|-vv|-x (or -vv|--|-vv|-vv|--|-x)
 MULTA QUOQUE ET BELLO PASSUS DUM CONDERET URBEM  -> -vv|--|--|--|-vv|-x
 INFERRETQUE DEOS LATIO GENUS UNDE LATINUM        -> --|-vv|-vv|-vv|-vv|-x
 ALBANIQUE PATRES ATQUE ALTAE MOENIA ROMAE        -> --|-vv|--|--|-vv|-x

Further stipulations

In the proper fashion of Latin poetry all answers must begin with an invocation to the muses.

Latin has only two one-letter words, "e" and "a". You may assume that no other one-letter words will appear as input.

Post Rock Garf Hunter

Posted 2016-09-02T00:28:10.510

Reputation: 55 382

Oh god this brings back memories... – ThreeFx – 2016-09-02T10:03:04.613

An "i" preceding another vowel is a consonant (i.e. a "j"). In Lavinjaque (--vv) it is, but in Italiam (-vv-) in the same verse it isn't. Maybe put Js in the input? Do you actually have a working solution generating this output? – Lynn – 2016-09-03T08:27:47.660

Oh, the penultimate foot is always a dactyl, classically. You should specify whether answers can assume so. – Lynn – 2016-09-03T08:31:12.610

@Lynn Since the penultimate foot is not always a dactyl I have intentionally left it ambiguous. It may be either. – Post Rock Garf Hunter – 2016-09-03T14:35:20.447

@Dave 1) yes you are right 2) It must end the entire word. I will fix these shortly – Post Rock Garf Hunter – 2016-09-03T15:48:11.133

How does "QUI" parse? must the "I" vowel be long? or to put it another way, does rule 4/9 on the "QU" mean that rule 5 on the "UI" does not apply? – Dave – 2016-09-03T16:10:43.257

@Dave "QUI" can be long or short. Rule 5 does not apply. – Post Rock Garf Hunter – 2016-09-03T16:14:52.843

Answers

sed, 402 392 374 359 363 334 333 bytes

“Sing, goddess, the anger of Peleus’ son Achilleus and its devastation, which put pains thousandfold upon the Achaians, hurled in their multitudes to the house of Hades strong souls of heroes, but gave their bodies to be the delicate feasting of dogs, of all birds, and the will of Zeus was accomplished since that time when first there stood in division of conflict Atreus’ son the lord of men and brilliant Achilleus.”

— Homer (The Iliad); confused why this quote is here? check the rules.

sed -E 's/[AEIOU]M? H?([AEIOU])/\1/g;s/X/cc/g;s/(^|[ AEIOU])I([AEIOU])/\1c\2/g;s/QUE( |$)/cv/g;s/A[EU]|E[IU]|OE|UI/-/g;s/[CPT]H|[BCDGPT][LR]|QU|[^-vAEIOU ]/c/g;s/[A-Z]/u/g;s/ //g;s/ucc+/-/g;s/c//g;s/^[-u]([-u]|[vu]{2})[-u]([-u]|[vu]{2})[-u]([-u]|[vu]{2})[-u]([-u]|[vu]{2})[-u]([-u]|[vu]{2})[-u].$/-\1|-\2|-\3|-\4|-\5|-x/;s/[uv]/-/g;s/---/-vv/g'

Not exactly golfed, but this implements all the given rules in the form of regular expressions, which sed just runs one by one to reach the solution. This handles each line independently, so it can process an entire multi-line input.

Usage:

printf 'ARMA VIRUMQUE CANO TROIAE QUI PRIMUS AB ORIS
ITALIAM FATO PROFUGUS LAVINIAQUE VENIT
LITORA MULTUM ILLE ET TERRIS IACTATUS ET ALTO
VI SUPERUM SAEVAE MEMOREM IUNONIS OB IRAM
MULTA QUOQUE ET BELLO PASSUS DUM CONDERET URBEM
INFERRETQUE DEOS LATIO GENUS UNDE LATINUM
ALBANIQUE PATRES ATQUE ALTAE MOENIA ROMAE' | sed -E '<...>';

Breakdown:

sed -E "
# Apply Elision
 s/[AEIOU]M? H?([AEIOU])/\1/g;

# Convert into vowels (u, v or -) and consonants (c) according to the rules given
 s/X/cc/g;
 s/(^|[ AEIOU])I([AEIOU])/\1c\2/g;
 s/QUE( |\$)/cv/g;
 s/A[EU]|E[IU]|OE|UI/-/g;
 s/[CPT]H|[BCDGPT][LR]|QU|[^-vAEIOU ]/c/g;
 s/[A-Z]/u/g; # all remaining vowels are unknown

# Remove all spaces
 s/ //g;

# A vowel followed by 2 consonants before the next vowel is long
# (and we don't care if the last vowel is long or short)
 s/ucc+/-/g;

# Remove all consonants
 s/c//g;

# Look for a matching dactylic hexameter and insert pipe separators
 s/^\
[-u]([-u]|[vu]{2})\
[-u]([-u]|[vu]{2})\
[-u]([-u]|[vu]{2})\
[-u]([-u]|[vu]{2})\
[-u]([-u]|[vu]{2})\
[-u].\$/-\1|-\2|-\3|-\4|-\5|-x/;

# Substitute identified feet with the necessary long/short vowels
 s/[uv]/-/g;
 s/---/-vv/g
"

Results for test cases:

-vv|-vv|--|--|-vv|-x
-vv|-vv|--|-vv|-vv|-x
-vv|--|--|--|-vv|-x
-vv|--|-vv|-vv|--|-x
-vv|--|--|--|-vv|-x
--|-vv|-vv|-vv|-vv|-x
--|-vv|--|--|-vv|-x

Dave

Posted 2016-09-02T00:28:10.510

Reputation: 7 519

Worth noting that I get different results for test cases 2 & 3, which appear to be alternative solutions not included in the question. Could be that I misinterpreted a rule though. – Dave – 2016-09-03T19:02:11.743

I don't think your scansion for test case 2 works. The last "U" in "PROFUGUS" must be long because there are two consonants ("S" and "L") before the next vowel. In your scansion you have it short. I am checking the third one now. Nice answer anyway :) – Post Rock Garf Hunter – 2016-09-03T23:49:35.197

@WheatWizard ah ok, that was a rule I was wondering about (should have asked) — I took it to mean 2 consonants without spaces. Easy enough to fix. I'll post an update soon. – Dave – 2016-09-04T08:03:02.363

Looks like I also had a bug where vccvccv would become -?? instead of --? — fixed now. Looks like it agrees with your samples on all but case #2 now. – Dave – 2016-09-04T09:45:33.663