Practical Golf - US States

11

2

My family has an e-commerce business. On our own site, we force people to choose their state from a dropdown menu when they enter their address, but through some other channels we use, customers can enter anything they want into the box.

My mom loves the invoice templates I made for her, which are automagically generated. But because they're so pretty and balanced, she can't stand it when people WRITE OUT the names of their states, or worse, write something like "new jersey." She says it ruins the look.

My dad likes code to be lightweight. So rather than using a switch-case block, he wants a leaner solution.

So the challenge is to make a short function that takes the possible inputs and returns a two-letter abbreviation (capitalized, for Mom). We're going to make a (faulty) assumption that our users can spell and always put a space in the name (where needed) or pass in the correct abbreviation. The scope is the 50 US states.

  • New York
  • new york
  • NY
  • ny

are all acceptable inputs for New York, and should output NY.

If something like New Yrok is passed in, the function can return the original value.
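
For reference, the contract described above can be sketched as a plain lookup. This is only a baseline illustration of the expected inputs and outputs, not one of the submitted answers; `SAMPLE_STATES` and `to_abbreviation` are hypothetical names, the table is a deliberately tiny subset, and trimming surrounding whitespace is an extra assumption not stated in the challenge.

```python
# Illustrative subset only; a real table would list all 50 states.
SAMPLE_STATES = {"NEW YORK": "NY", "NEW JERSEY": "NJ", "TEXAS": "TX"}


def to_abbreviation(text):
    """Return the two-letter abbreviation, or the input unchanged if unknown."""
    key = text.strip().upper()  # assumption: ignore case and outer whitespace
    if key in SAMPLE_STATES:
        return SAMPLE_STATES[key]  # full name was given
    if key in SAMPLE_STATES.values():
        return key  # abbreviation was given
    return text  # misspelled or unknown: hand back the original value
```

With this sketch, "New York", "new york", "NY" and "ny" all map to NY, while "New Yrok" comes back untouched.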

You can use any common language. This is a popularity contest, so the one with the most votes at the end of a week wins. I assume that this will favor novelty and utility.

EDIT: The description is story fluff, but I was working on a similar project and thought that there must be a more interesting way to do it. I can do the project myself (already did) but I thought this was a good place for a more interesting challenge. By "Any common language" I was excluding custom languages/libraries designed for this challenge - I was trying to look for novel methods, rather than free code help. I figure everyone has done this at some point, but it would be fun to do it in an unusual way. I find that the most interesting projects are the ones where you tackle everyday tasks in new and interesting ways - that's why this is a popularity contest rather than golf.

Josiah

Posted 2014-10-03T18:23:05.850

Reputation: 253

Question was closed 2014-10-09T19:46:46.440

I'm disappointed it is closed. It looks like everybody managed to get an answer in except me... – Jerry Jeremiah – 2015-08-09T23:26:22.927

Not sure if +1 for your mom's thoughts on aesthetics or -1 for your dad's thoughts on short code. – Ingo Bürk – 2014-10-03T18:32:29.157

Would ny output NY? – hmatt1 – 2014-10-03T18:37:03.207

Yes, it should. I'll make that edit. – Josiah – 2014-10-03T18:38:37.357

I'm unsure why this is a popularity contest instead of code golf (especially since the name includes 'golf' and your dad favors short code). – Geobits – 2014-10-03T18:47:55.227

I agree with Geobits, this would work better as a code-golf I think. – James Williams – 2014-10-03T18:52:45.857

@Geobits: For actual production code, you would want the code to be lean but reasonable, not golfed... unless you want to maintain code that looks like something an electric cat coughed up (X⌿⍨Y^.=⍨X↑⍨(⊃⍴X),⍴Y)? – Claudiu – 2014-10-03T19:26:26.237

@Claudiu True, but this site isn't intended for production code... – Geobits – 2014-10-03T19:27:41.400

@Geobits: Isn't it? "My family has an e-commerce business. On our own site [...] My dad likes code to be lightweight. So rather than using a switch-case block, he wants a leaner solution." It sounds like he plans to use this code on the e-commerce site. – Claudiu – 2014-10-03T20:39:46.180

@Claudiu I honestly assumed that was "story fluff" of the sort that usually goes with these challenges. Either way, when I said "this site..." I meant PP&CG, as most code here is explicitly not intended for use in production. Honestly, if he's looking for actual code to use on his site, it would be more ethical to do it himself or contract it out ;) – Geobits – 2014-10-03T20:52:43.297

@Geobits I thought it was fluff as well. Based on his profile, it sounds like OP is actually looking for code since his parents have an online toy store. I deleted my answer and am voting to close as off-topic. This isn't a code challenge, it's free contracting. – hmatt1 – 2014-10-03T21:18:33.603

It sounds like OP is asking us to write code for him, but trying to frame it as a code-challenge. – hmatt1 – 2014-10-03T21:19:15.940

@chilemagic you can use any code ... so OP will rewrite his site to use your APL/CJAM/GolfScript solution? It's a challenge based on a true story. I vote up – edc65 – 2014-10-03T21:23:49.430

It's a pretty trivial task, why would OP go to all the effort of typing up a question when it would be easier just to code it himself? Either way, I enjoyed giving it a go. – James Williams – 2014-10-03T21:47:53.700

@edc65 IMO, There's a big difference between what you quoted ("you can use any code") and what the OP actually said ("you can use any common language"). – Geobits – 2014-10-03T23:10:52.063

@Geobits good point, I honestly misread. On the other hand, as you said, that's not the place for production code. I'm here to have fun. – edc65 – 2014-10-03T23:22:43.903

Grossly unsuitable winning criterion – None – 2014-10-04T14:54:09.337

Answers

27

Ruby

Thought it would be interesting to extract the state abbreviations without writing any of the names or abbreviations explicitly. This one does not take misspelling of the input into consideration, because we don't care about such things here on codegolf.SE, rihgt?

def f(s)
  [
    /(.).* (.)/,              # two words
    /^([CDGHKLPV]).*(.)$/,    # first and last letter
    /^(.).*([ZVX])/,          # unique letter
    /^([NFOUW])(.)/,          # two first letters
    /^(.)([DNR])/,            # unique second letter
    /^(.).*(L|N)\2/,          # double letters
    /^(.).SS(A|O)/,           # double S before the one used
    /^(.).*?[SNW](.)/,        # identified by the letters before them
    /(.)(.)/                  # two first letters

  ].find { |r| r =~ s.upcase }
  $1+$2
end

It took a considerable time to figure out clever patterns to match all the states. The order of the patterns is important -- each consecutive pattern applies to the remaining states that were not matched by a previous pattern:

All states with two words in them use the initial letters of the two words:

New Hampshire, New Jersey, New Mexico, New York, North Carolina, North Dakota, Rhode Island, South Carolina, South Dakota, West Virginia

All states beginning with any letter in {CDGHKLPV} use the first and last letter in the name:

California, Colorado, Connecticut, Delaware, Georgia, Hawaii, Kansas, Kentucky, Louisiana, Pennsylvania, Virginia, Vermont

Of the remaining states, the letters {ZVX} are unique:

Arizona, Nevada, Texas

All remaining states beginning with {FNOUW} use the two first letters.

Florida, Nebraska, Ohio, Oklahoma, Oregon, Utah, Washington, Wisconsin, Wyoming

Then, {DNR} are unique as second letters:

Arkansas, Indiana, Idaho

It's really getting hard to make general patterns, but...

Only three remaining states use double N or L, and the double letter is used in the state abbreviation:

Tennessee, Minnesota, Illinois

A or O after double S is unique to

Massachusetts and Missouri

Whenever {SNW} appear before other letters in the remaining state names, the letters after them are used in the abbreviations:

Alaska, Maryland, Maine, Mississippi, Montana, Iowa

Two left. These use the two first letters:

Alabama, Michigan
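
The ordered cascade described above can also be sketched in Python. This is a direct port of the nine Ruby patterns, offered only as an illustration; `PATTERNS` and `abbreviate` are names chosen here, and returning the input when nothing matches is an assumption (the Ruby version does not handle that case).

```python
import re

# Same patterns, same order as the Ruby answer; the order matters because
# each pattern is only correct for the states left over by the earlier ones.
PATTERNS = [
    r"(.).* (.)",            # two words: initials of each word
    r"^([CDGHKLPV]).*(.)$",  # first and last letter
    r"^(.).*([ZVX])",        # unique letter Z, V or X
    r"^([NFOUW])(.)",        # first two letters
    r"^(.)([DNR])",          # unique second letter
    r"^(.).*(L|N)\2",        # doubled L or N
    r"^(.).SS(A|O)",         # A or O after a double S
    r"^(.).*?[SNW](.)",      # letter following the first S, N or W
    r"(.)(.)",               # fallback: first two letters
]


def abbreviate(name):
    """Return the captured pair from the first pattern that matches."""
    upper = name.upper()
    for pattern in PATTERNS:
        match = re.search(pattern, upper)
        if match:
            return match.group(1) + match.group(2)
    return name  # assumption: nothing matched (e.g. a one-character input)
```

As in the Ruby original, a misspelled name is not detected: "New Yrok" still matches the two-word pattern and yields NY.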


It can be golfed of course:

Ruby 2 – 154 characters (previously 191, then 165)

Another 26 characters off by uglifying the regexes a bit. Also, one of the original regexes turned out to be redundant!

gets;[/.* (.)/,/^[CDGHKLPV].*(.)$/,/.*([ZVX])/,/^[NFOUW](.)/,/^.([DNR])/,/.*(L|N)\1/,
/.*SS(A|O)/,/.*?[SNW](.)/,/.(.)/].find{|r|$_.upcase=~r}
puts $&[0]+$1

daniero

Posted 2014-10-03T18:23:05.850

Reputation: 17 193

"Currently less than a third of the size of the Golfscript entry!" :P Keep in mind, Golfscript doesn't use Regexes. – Josiah Winslow – 2014-10-04T19:22:24.147

And I changed the size. :P – Josiah Winslow – 2014-10-04T19:39:07.860

@JosiahWinslow No harm meant; The reason I pointed it out was just that it's quite rare that you see a normal language beat golfscript in codegolf by a factor of 3 ;) – daniero – 2014-10-04T19:40:03.680

I didn't think it meant any harm. I was just trying to make a joke. If Golfscript used regexes...if ONLY. (Oh, and the factor is more like 3.41884817 :P) – Josiah Winslow – 2014-10-04T19:41:16.877

@JosiahWinslow hehe, good :P And I can't imagine what would happen if someone were to mix regex with golfscript. I'm not sure the world is ready for such a thing.. – daniero – 2014-10-04T20:14:32.460

(@JosiahWinslow and oh, make that 3.9575757575... :P) – daniero – 2014-10-04T20:20:50.870

I'm trying to do something with embedding this into Golfscript, just as a test...and I'm having trouble working the input in. gets didn't work, neither did string concatenation. Help? – Josiah Winslow – 2014-10-04T21:22:48.243

lol for the boobs regex in the explanation that didn't survive compression – masterX244 – 2014-10-04T22:11:50.300

This is the type of thing I was looking for - and shorter than writing out each state's name! I would suggest that your explanation for {CDGHKLPV} should read "first and last" rather than "first and second" but the regex is right. – Josiah – 2014-10-05T00:18:19.987

Wow, what are the odds? Some user with the same name as me... :\ Also, I got Golfscript down to 185 bytes! (with embedded Ruby :P ) – Josiah Winslow – 2014-10-05T16:51:06.383

@Josiah You're right, that was a typo. It's fixed now, and also one regex could be removed. – daniero – 2014-10-05T21:31:36.397

I like this answer, but it is not valid as it can not spot invalid input (as you say). There is even a specific example If something like New Yrok is passed in, the function should return the original value. – edc65 – 2014-10-06T15:56:48.357

I agree with @edc65. This is awesome but doesn't follow the requirements – Brandon – 2014-10-06T16:34:05.957

4

C#

I used characters already in the states for the abbreviations to shorten up the state string.

public string GetAbbr(string state)
            {

                var states =
                    new[] {
                        "AlasKa", "ALabama", "AriZona", "ARkansas", "CAlifornia", "COlorado", "ConnecticuT",
                        "DElaware", "FLorida", "GeorgiA", "HawaiI", "IDaho", "ILlinois", "INdiana", "IowA", "KansaS",
                        "KentuckY", "LouisianA", "MainE", "MarylanD", "MAssachusetts", "MIchigan", "MiNnesota",
                        "MiSsissippi", "MissOuri", "MonTana", "NEbraska", "NeVada", "New Hampshire", "New Jersey",
                        "New Mexico", "New York", "North Carolina", "North Dakota", "OHio", "OKlahoma", "ORegon",
                        "PennsylvaniA", "Rhode Island", "South Carolina", "South Dakota", "TeNnessee", "TeXas", "UTah",
                        "VermonT", "VirginiA", "WAshington", "washington D.C.", "West Virginia", "WIsconsin", "WYoming"
                    };
                var all = states.ToDictionary(st => string.Concat(st.Where(char.IsUpper)));

                var wanted = all.FirstOrDefault(pair => state.ToUpper().Equals(pair.Value.ToUpper()) || state.ToUpper().Equals(pair.Key));

                return wanted.Key ?? state;
            }
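
The capital-letter trick above can be sketched in Python as well. This is an illustration under stated assumptions rather than a translation of the full answer: `SAMPLE`, `TABLE` and `get_abbr` are hypothetical names, and only four states are listed to keep it short.

```python
# Each entry spells the state name; its capital letters spell the abbreviation.
# Illustrative subset only; the C# answer lists every state this way.
SAMPLE = ["AlasKa", "ALabama", "New York", "TeXas"]

# Map abbreviation -> upper-cased full name, e.g. "AK" -> "ALASKA".
TABLE = {"".join(c for c in s if c.isupper()): s.upper() for s in SAMPLE}


def get_abbr(state):
    """Match either the full name or the abbreviation, ignoring case."""
    wanted = state.upper()
    for abbr, full in TABLE.items():
        if wanted == abbr or wanted == full:
            return abbr
    return state  # unknown input is returned unchanged
```

The design choice is that no separate abbreviation list has to be stored; the cost is that the casing of every entry has to be exactly right.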

Brandon

Posted 2014-10-03T18:23:05.850

Reputation: 257

Nice workaround! – Beta Decay – 2014-10-06T15:46:40.610

2

JavaScript (ES6)

Here the bulk is the list of names, using the camelCase trick to shorten a bit. Golfed, 617 bytes.

F=i=>
  "AkAlAzArCaCoCtDeFlGaHiIdIlInIaKsKyLaMeMdMaMiMnMsMoMtNeNvNhNjNmNyNcNdOhOkOrPaRiScSdTnTxUtVtVaWaWvWiWyAlaskaAlabamaArizonaArkansasCaliforniaColoradoConnecticutDelawareFloridaGeorgiaHawaiiIdahoIllinoisIndianaIowaKansasKentuckyLouisianaMaineMarylandMassachusettsMichiganMinnesotaMississippiMissouriMontanaNebraskaNevadaNew hampshireNew jerseyNew mexicoNew yorkNorth carolinaNorth dakotaOhioOklahomaOregonPennsylvaniaRhode islandSouth carolinaSouth dakotaTennesseeTexasUtahVermontVirginiaWashingtonWest virginiaWisconsinWyoming"
  .match(/.[^A-Z]*/g).map((w,q)=>U(w,U(w)==U(i)?p=q%50:p),U=s=>s.toUpperCase(),p=-1)[p]||i
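
For readers who find the one-liner hard to follow, here is a rough Python sketch of the same idea: split one packed string at every capital letter, then map any hit back to the abbreviation with a modulo. `PACKED`, `TOKENS`, `COUNT` and `lookup` are names chosen for illustration, and the packed string holds just four states instead of fifty.

```python
import re

# Abbreviations first, then full names in the same order (illustrative subset).
# Second words stay lower-case so the split only happens at state boundaries.
PACKED = "AkAlNyTxAlaskaAlabamaNew yorkTexas"

# Same regex as the JavaScript answer: one character, then any non-capitals.
TOKENS = re.findall(r".[^A-Z]*", PACKED)
COUNT = len(TOKENS) // 2  # 50 in the full answer, 4 in this sketch


def lookup(text):
    """Find the token matching the input and fold its index onto the abbreviation."""
    wanted = text.upper()
    for position, token in enumerate(TOKENS):
        if token.upper() == wanted:
            return TOKENS[position % COUNT].upper()
    return text  # no token matched: return the input as given
```

Because names and abbreviations share one list, an abbreviation at index i and its full name at index i + COUNT both reduce to the same slot.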

edc65

Posted 2014-10-03T18:23:05.850

Reputation: 31 086

0

Python

Decided just to do this as a code-golf challenge. Got it down to 694 chars (from 906, then 713) with the help of daniero and hsl:

s='AK,AL,AZ,AR,CA,CO,CT,DE,FL,GA,HI,ID,IL,IN,IA,KS,KY,LA,ME,MD,MA,MI,MN,MS,MO,MT,NE,NV,NH,NJ,NM,NY,NC,ND,OH,OK,OR,PA,RI,SC,SD,TN,TX,UT,VT,VA,WA,WV,WI,WY,ALASKA,ALABAMA,ARIZONA,ARKANSAS,CALIFORNIA,COLORADO,CONNECTICUT,DELAWARE,FLORIDA,GEORGIA,HAWAII,IDAHO,ILLINOIS,INDIANA,IOWA,KANSAS,KENTUCKY,LOUISIANA,MAINE,MARYLAND,MASSACHUSETTS,MICHIGAN,MINNESOTA,MISSISSIPPI,MISSOURI,MONTANA,NEBRASKA,NEVADA,NEW HAMPSHIRE,NEW JERSEY,NEW MEXICO,NEW YORK,NORTH CAROLINA,NORTH DAKOTA,OHIO,OKLAHOMA,OREGON,PENNSYLVANIA,RHODE ISLAND,SOUTH CAROLINA,SOUTH DAKOTA,TENNESSEE,TEXAS,UTAH,VERMONT,VIRGINIA,WASHINGTON,WEST VIRGINIA,WISCONSIN,WYOMING'.split(",")
x=input().upper()
print(s[s.index(x)%50]if x in s else x)

However, if modules are allowed (like the us module), I can get it down to 130 chars:

import us
i=raw_input()
x=us.states.lookup(i)
print x.abbr if x else i

And if you didn't have to return the original value when the state doesn't exist I could get it down to 50 chars:

import us
print us.states.lookup(raw_input()).abbr

James Williams

Posted 2014-10-03T18:23:05.850

Reputation: 1 735

You can save roughly 200 characters on the first one by letting s be one large string, then split it on commas (,); No need for all the single-quotes. – daniero – 2014-10-03T19:29:37.297

@daniero Can't believe I didn't think of that! Will do now. – James Williams – 2014-10-03T19:34:12.373

You can remove Washington, D.C., as it isn't a U.S. state. – NinjaBearMonkey – 2014-10-03T19:37:20.723

@hsl Thanks. I took the list from a list of states I found online, didn't realise Washington D.C. was in there. – James Williams – 2014-10-03T19:46:00.227

0

Javascript

I know this isn't code golf, but I want to golf it anyway. :)

var r=new XMLHttpRequest
r.open("GET","https://gist.githubusercontent.com/mshafrir/2646763/raw/f2a89b57193e71010386a73976df92d32221d7ba/states_hash.json",0)
r.send()
var o=r.responseText,m=prompt(),a=m
o=JSON.parse(o)
for(var i in o)if(o[i].toLowerCase()==m.toLowerCase())a=i
alert(a)

Yay for new things! (Stack Snippets)

Beta Decay

Posted 2014-10-03T18:23:05.850

Reputation: 21 478

This is a standard loophole and standard loopholes apply without having to be mentioned explicitly. – Ingo Bürk – 2014-10-04T15:03:51.330

@IngoBürk I don't believe this falls under the standard loopholes... It's getting the required data from the internet in the same way as reading from a file. – Beta Decay – 2014-10-04T15:10:44.170

So is eval(open('a.txt')) also valid? If you use a file of any kind, you must also include that file and its file name in your character count. (This isn't code golf, so it actually doesn't really matter in this case anyway.) – Doorknob – 2014-10-04T18:55:27.843

@Doorknob Since you raise the point that this isn't code golf, I don't see why I'm getting downvotes... I haven't violated any rules of pop cons. – Beta Decay – 2014-10-04T18:57:37.940

@Doorknob But I do understand your point that if this was code golf, I'd be in the wrong for not including it in my byte count. – Beta Decay – 2014-10-04T18:58:36.643

No reason to downvote, it's perfectly in the spirit of the question - favor novelty and utility - and fun – edc65 – 2014-10-04T19:04:22.517

0

Golfscript - 653 (previously 750)

The bulk is in the state names and abbreviations.

{.96>32*-}%.,2>{"ALABAMA,AL,ALASKA,AK,ARIZONA,AZ,ARKANSAS,AR,CALIFORNIA,CA,COLORADO,CO,CONNECTICUT,CT,DELAWARE,DE,FLORIDA,FL,GEORGIA,GA,HAWAII,HI,IDAHO,ID,ILLINOIS,IL,INDIANA,IN,IOWA,IA,KANSAS,KS,KENTUCKY,KY,LOUISIANA,LA,MAINE,ME,MARYLAND,MD,MASSACHUSETTS,MA,MICHIGAN,MI,MINNESOTA,MN,MISSISSIPPI,MS,MISSOURI,MO,MONTANA,MT,NEBRASKA,NE,NEVADA,NV,NEW HAMPSHIRE,NH,NEW JERSEY,NJ,NEW MEXICO,NM,NEW YORK,NY,NORTH CAROLINA,NC,NORTH DAKOTA,ND,OHIO,OH,OKLAHOMA,OK,OREGON,OR,PENNSYLVANIA,PA,RHODE ISLAND,RI,SOUTH CAROLINA,SC,SOUTH DAKOTA,SD,TENNESSEE,TN,TEXAS,TX,UTAH,UT,VERMONT,VT,VIRGINIA,VA,WASHINGTON,WA,WEST VIRGINIA,WV,WISCONSIN,WI,WYOMING,WY"","/.@?)=}{}if

Explanation:

{        }%                         Map this to every character in the input string:
 .96>32*-                             Subtract 32 from the ASCII value if it's from "a" onwards.
                                      This turns every lowercase letter into an uppercase letter.
           .,2>                     Check if the input length is greater than 2.
               {              }     If it is, they inputted the full name.
                "..."                 Our string is in the form "STATE NAME,STATE ABBREVIATION".
                     ","/             We split the string at every comma to turn it into an array.
                         .@?          Then we see where the input string is in the array...
                            )=        ...then we return the value right next to it.
                               {}   If not, they inputted the abbreviation.
                                      ...do nothing.
                                 if EndIf
                                    (implied) Print the abbreviation
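
The steps in the explanation above can be sketched in Python for comparison. This is an illustration, not the author's code: `PAIRS` and `abbreviate` are hypothetical names, the list is cut down to three states, and returning unknown long inputs unchanged is an assumption added here to match the challenge text.

```python
# "STATE NAME,STATE ABBREVIATION" pairs, split on commas (illustrative subset).
PAIRS = "ALABAMA,AL,ALASKA,AK,NEW YORK,NY".split(",")


def abbreviate(text):
    upper = text.upper()  # the {.96>32*-}% step: upper-case every letter
    if len(upper) > 2:  # longer than two characters: treat as a full name
        if upper in PAIRS:
            # the abbreviation sits right after the name in the list
            return PAIRS[PAIRS.index(upper) + 1]
        return text  # assumption: unknown names come back unchanged
    return upper  # two characters or fewer: assume it is an abbreviation
```

As in the original, a short input is only upper-cased and never checked against the list.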

Josiah Winslow

Posted 2014-10-03T18:23:05.850

Reputation: 725

Sorry, but I just don't see the point of taking my whole script and adding nothing but a few bytes of boilerplate; It simply brings nothing. But thanks for the credits I guess... Yours truly, "the other guy". – daniero – 2014-10-05T21:18:14.743

Sorry, troll entry. I know it's not a real entry. – Josiah Winslow – 2014-10-05T22:37:57.477

Well, consider me trolled then ;) – daniero – 2014-10-06T16:49:18.997

@daniero Hey, at least I know it's possible to have regexes in Golfscript! That's actually the only reason I did that lol :p – Josiah Winslow – 2014-10-06T21:43:10.343

0

bash + sed, 291 bytes

Shameless conversion of Daniero's Ruby solution to sed:

echo $*|tr a-z A-Z|sed -e\
"/\(.\).* \(.\).*/b1;/^\([CDGHKLPV]\).*\(.\)$/b1;/^\(.\).*\([ZVX]\).*/b1;\
/^\([NFOUW]\)\(.\).*/b1;/^\(.\)\([DNR]\).*/b1;/^\(.\).*\([LN]\)[LN].*/b1;\
/^\(.\).*SS\([AO]\).*/b1;/^\(.\).*\([ED]\)$/b1;/^\(.\).*[SNW]\(.\).*/b1;\
/\(.\)\(.\).*/b1;:1 s//\1\2/"

Glenn Randers-Pehrson

Posted 2014-10-03T18:23:05.850

Reputation: 1 877

0

PHP

My attempt, which was not as successful as I had hoped, uses string length and some specific character placement to extract the abbreviation from the state name. Probably some better sequencing of name elimination is possible.

function findAbb ($state) {
    $first = substr($state, 0, 1);
    $last = substr($state, -2,1);
    $state = strtolower($state);
    if (strlen($state) < 4) {
        return strtoupper($state);
    }
    if (strpos($state, ' ')) { //if it contains a space, return the first letter of each word.
        $space_index = strpos($state, ' ');
        $state = explode(' ', $state);
        return strtoupper(substr($state[0], 0, 1) . substr($state[1], 0, 1));
    }
    if (startsWith($state, 'io')) { //iowa is annoying, get rid of it.
        return strtoupper($first . $last);
    }
    if (startsWith($state, 'w,i')) { //if it starts with a W or an I, return the first 2.
        return strtoupper(substr($state, 0, 2));
    }
    if (strlen($state) < 7 && strpos($state, 'm')===false) { //matches texas, ohio, and utah.
        return strtoupper($first . substr($state, -4,1));
    }
    if (strlen($state) < 7 && substr($state, 0, 1) > 'j' && substr($state, 0, 1) < 'n') { //matches maine, kansas, and hawaii
        return strtoupper($first . $last);
    }
    if (startsWith($state, 'c,d,k,l,p,v,g,h')) { //some unique states
        return strtoupper($first . $last);
    }
    if (strpos($state, 'sk')) {
        return strtoupper ('ak');
    }
    if (startsWith($state, 'k,l', 1)) {
        return strtoupper(substr($state, 0, 2));
    }
    if (startsWith($state, 'n')) {
        return strtoupper($first . substr($state, 2, 1));
    }
    if (startsWith($state, 'n', 2) || startsWith($state, 'z', 3)) { //montana, tennessee, minnesota, and arizona
        return strtoupper($first . substr($state, 3, 1));
    }
    if (startsWith($state, 'm') && ($last == 's') || ($last == 'n')) {
        return strtoupper(substr($state, 0, 2));
    }
    if (strpos($state,'o')) {
        return strtoupper($first . 'o');
    }
    if (strpos($state,'y')) {
        return strtoupper($first . 'd');
    }
    if (strpos($state,'r')) {
        return strtoupper($first . 'r');
    }
    if (strpos($state,'ss')) {
        return strtoupper($first . 's');
    }

    return $state; //otherwise return the name of the state (it was misspelled).
}

function startsWith ($state, $letters, $index = 0) { //takes a comma-separated list and checks whether any item occurs at position $index.
    $letters = explode(',', $letters); //explode() replaces split(), which was removed in PHP 7.
    for ($q = 0; $q<count($letters); $q++) {
        if (strpos($state,$letters[$q]) === $index) {
            return true;
        }
    }
    return false;
}

Of course, it can be golfed. This is my first golfing attempt, so insight appreciated. (911)

function t($s){$s=u($s);$f=b($s,0,1);$l=b($s,-2,1);
if(strlen($s)<4)return $s;if(strpos($s,' '))$s=split(' ',$s);
return b($s[0],0,1).b($s[1],0,1);
if(w($s,'IO'))return $f.$l;
if(w($s,'W,I'))return b($s,0,2);
if(strlen($s)<7 && strpos($s,'M')===false)return $f.b($s,-4,1);
if(strlen($s)<7 && b($s,0,1)>'I' && b($s,0,1)<'N')return $f.$l;
if(w($s,'C,D,K,L,P,V,G,H'))return $f.$l;if(strpos($s, 'SK'))return 'AK';
if(w($s,'K,L',1))return b($s,0,2);if(w($s,'N'))return $f.b($s,2,1);
if(w($s,'N',2) || w($s,'Z',3))return $f.b($s,3,1);
if(w($s,'M') && ($l=='S') || ($l=='N'))return b($s,0,2);
if(strpos($s,'O'))return $f.'O';
if(strpos($s,'Y'))return $f.'D';if(strpos($s,'R'))return $f.'R';
if(strpos($s,'SS'))return $f.'S';return $s;}function w($s,$l,$i=0){$l=split(',',$l);
for($q=0;$q<count($l);$q++)if(strpos($s,$l[$q])===$i)return 1;return 0;}
function u($z){return strtoupper($z);}
function b($v,$x,$y){return substr($v,$x,$y);}

Josiah

Posted 2014-10-03T18:23:05.850

Reputation: 253