Convert human readable time interval to date components

16

4

Challenge

Write the shortest program that converts a human readable time interval to date components of the form:

{±YEARS|±MONTHS|±DAYS|±HOURS|±MINUTES|±SECONDS}

Sample cases

Each test case is two lines, input followed by output:

1 year 2 months 3 seconds
{1|2|0|0|0|3}

-2 day 5 year 8months
{5|8|-2|0|0|0}

3day 9     years 4 seconds -5 minute 4 years 4 years -3seconds
{17|0|3|0|-5|1}

Rules

  • You cannot use strtotime or any other built-in function that does the whole job.
  • Shortest code wins (bytes)
  • You can print your output to stdout or a file, or return the result from a function; it's up to you
  • The token can be in singular or plural form.
  • The components may appear in any order
  • There may be no white space between the number and the token
  • Sign is optional when the time interval is positive (input and output)
  • If a component appears more than once the values should be added
  • Each component has its own sign
  • The components should be handled separately (e.g. 80 minutes remains as 80 in the output)
  • The input is guaranteed to be lower case
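
For reference, the rules above can be sketched as an ungolfed Python 3 function. This is only one reading of the spec, not an official solution: the names UNITS, TOKEN and to_components are made up for illustration, and accepting an optional leading "+" is an assumption, since the rules only say the sign is optional for positive values.

```python
import re

# Hypothetical helper names; output order follows the challenge format.
UNITS = ["year", "month", "day", "hour", "minute", "second"]

# A signed integer, optional whitespace, a unit name, optional plural "s".
# Accepting "+" is an assumption about the "sign is optional" rule.
TOKEN = re.compile(r"([+-]?\d+)\s*(year|month|day|hour|minute|second)s?")

def to_components(text):
    totals = dict.fromkeys(UNITS, 0)
    for number, unit in TOKEN.findall(text):
        totals[unit] += int(number)  # repeated components are summed
    # Components are kept separate: 80 minutes stays 80, no carrying.
    return "{" + "|".join(str(totals[u]) for u in UNITS) + "}"

print(to_components("-2 day 5 year 8months"))  # {5|8|-2|0|0|0}
```

Spelling out the full unit names costs bytes, but it makes the intended behaviour easy to check against the sample cases.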

Happy Golfing!

fpg1503

Posted 2015-02-09T11:24:01.523

Reputation: 330

I like this challenge but I'm having a hard time coming up with anything that isn't long and messy in languages which are ill-suited for code golf. :/ – Alex A. – 2015-02-09T22:36:30.453

Does output format matter? – Titus – 2017-01-07T12:14:37.563

"Sign is optional when the time interval is positive": does that mean that input may contain + signs? – Titus – 2017-01-07T12:23:02.483

Answers

3

CJam, 60 bytes

After being stuck in the 60's for a long time, I finally managed to squeeze this down to 60 bytes. Good enough! Ship it!

Try it online

Squished:

'{0a6*q[{_A,s'-+#)!{"ytdhic"#:I){]'0+iA/I_3$=@+t[}*}*}/'|*'}

Expanded and commented:

'{              "Add '{' to output";
0a6*            "Initialize time to a list of 6 zeros";
q               "Read the input";
[               "Open an empty numeric character buffer";
{               "For each character in the input:";
  _               "Append the character to the numeric character buffer";
  A,s'-+#)!       "Check if the character is not part of a number";
  {               "If so:";
    "ytdhic"#:I     "Remove the character from the numeric character buffer and
                     convert it to the corresponding time unit index, or -1 if
                     not recognized
                     (Time units are recognized by a character in their name
                     that does not appear before the recognition character
                     in any other name)";
    ){              "Repeat (time unit index + 1) times:";
      ]'0+iA/         "Close the numeric character buffer and parse it as an
                       integer (empty buffer is parsed as 0)";
      I_3$=@+t        "Add the integer to the value of the indexed time unit";
      [               "Open an empty numeric character buffer";
    }*              "End repeat
                     (This is used like an if statement, taking advantage of
                     the fact that iterations after the first have no effect)";
  }*              "End if";
}/              "End for";
'|*             "Insert a '|' between each time unit value (implicitly added to
                 output)";
'}              "Add '}' to output";

I initially started using a token-based approach, but that got pretty firmly stuck at... 61 bytes. Sigh. So I totally changed gears and switched to this character-based approach, which is much more interesting anyways.

My parsing method works by adding any valid numeric characters reached (0-9 and -) to a buffer and parsing the buffer as an integer when a certain character from one of the time unit names is reached. Those characters are y, t, d, h, i, and c, which all satisfy the conditions that they appear in a time unit name and don't appear before the recognition character in any other time unit name. In other words, when one of these time unit recognition characters is reached, the numeric buffer will be filled with the last number seen if this actually signals a time unit, or the numeric buffer will be empty if this just appears in, but shouldn't signal, some other time unit. In either case, the numeric buffer is parsed as an integer, or 0 if it was empty, and this is added to the corresponding time unit value. Thus recognition characters appearing in other time units after their recognition character have no effect.
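
For readers who don't speak CJam, the character-based method described above can be sketched roughly like this in Python 3. It is a loose port, not a translation: the function name parse_by_characters and the explicit string buffer are illustrative choices, while the recognition characters "ytdhic" and the append-"0"-then-divide-by-10 parsing trick are taken from the commented CJam code.

```python
def parse_by_characters(text):
    # Recognition characters in output order:
    # (y)ears, mon(t)hs, (d)ays, (h)ours, m(i)nutes, se(c)onds.
    recognition = "ytdhic"
    totals = [0] * 6
    buffer = ""  # numeric characters seen since the last recognition character
    for ch in text:
        if ch in "0123456789-":
            buffer += ch
            continue
        index = recognition.find(ch)
        if index >= 0:
            # Appending "0" and dividing by 10 mirrors the CJam code:
            # an empty buffer becomes "0" and therefore parses as 0.
            totals[index] += int(buffer + "0") // 10
            buffer = ""
    return "{" + "|".join(map(str, totals)) + "}"

print(parse_by_characters("1 year 2 months 3 seconds"))  # {1|2|0|0|0|3}
```

Note how the "h" in "months" does trigger the hours slot, but only after the "t" has already emptied the buffer, so it just adds 0.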

Other crazy hacks include:

  • Abusing loops so numeric characters are left on the stack (which acts as the numeric character buffer) "for free."
  • Repeating a block zero or multiple times instead of conditionally because the loop is more compact than an if statement, and iterations after the first have no effect.

For anyone curious about my token-based solution that got stuck at 61 bytes, I'll post it here as well. I never got around to expanding or commenting it, though.

CJam, 61 bytes

'{0a6*q'm-'{,64/~m*{:X/XS**}/S%2/{~0="yodhis"#_3$=@i+t}/'|*'}

Runer112

Posted 2015-02-09T11:24:01.523

Reputation: 3 636

+1 This definitely deserves more upvotes. – oopbase – 2015-02-13T13:01:12.033

@Forlan07 Thanks for the support. :) But I was a bit late to answer, so it's not unexpected. The process of producing this answer was satisfying enough anyways. – Runer112 – 2015-02-13T13:42:36.663

10

Perl: 61 characters

Thanks to @nutki.

s/-?\d+ *m?(.)/$$1+=$&/ge;$_="{y|o|d|h|i|s}";s/\w/${$&}+0/ge

Sample run:

bash-4.3$ perl -pe 's/-?\d+ *m?(.)/$$1+=$&/ge;$_="{y|o|d|h|i|s}";s/\w/${$&}+0/ge' <<< '1 year 2 months 3 seconds'
{1|2|0|0|0|3}

bash-4.3$ perl -pe 's/-?\d+ *m?(.)/$$1+=$&/ge;$_="{y|o|d|h|i|s}";s/\w/${$&}+0/ge' <<< '-2 day 5 year 8months'
{5|8|-2|0|0|0}

bash-4.3$ perl -pe 's/-?\d+ *m?(.)/$$1+=$&/ge;$_="{y|o|d|h|i|s}";s/\w/${$&}+0/ge' <<< '3day 9     years 4 seconds -5 minute 4 years 4 years -3seconds'
{17|0|3|0|-5|1}
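
For anyone who finds the Perl hard to read, here is a rough Python 3 sketch of what the 61-character version appears to do in a single pass. The function name parse_single_pass and the dictionary are illustrative only (the Perl uses symbolic variables named after the captured letter), and the membership guard is an addition that the golfed code does not have.

```python
import re

def parse_single_pass(text):
    # One accumulator per key letter, in output order.
    totals = dict.fromkeys("yodhis", 0)
    # -?\d+  the signed number
    # " *"   optional spaces between number and token
    # m?     skips a leading "m", so months and minutes are keyed by
    #        their second letter ("o" and "i") and stay distinguishable
    # (.)    the key letter itself
    for number, key in re.findall(r"(-?\d+) *m?(.)", text):
        if key in totals:  # guard added for safety; not in the Perl original
            totals[key] += int(number)
    return "{" + "|".join(str(totals[k]) for k in "yodhis") + "}"

print(parse_single_pass("-2 day 5 year 8months"))  # {5|8|-2|0|0|0}
```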

My poor efforts: 78 77 characters

s/([+-]?\d+) *(..)/$a{$2}+=$1/ge;$_="{ye|mo|da|ho|mi|se}";s/\w./$a{$&}||0/ge

manatwork

Posted 2015-02-09T11:24:01.523

Reputation: 17 865

Some improvements I could find: s/(-?\d+) *(..)/$$2+=$1/ge;$_="{ye|mo|da|ho|mi|se}";s/\w./${$&}+0/ge – nutki – 2015-02-09T13:52:39.003

Another 4 chars: s/-?\d+ *(m.|.)/$$1+=$&/ge;$_="{y|mo|d|h|mi|s}";s/\w+/${$&}+0/ge – nutki – 2015-02-09T13:54:44.347

Wow. Great tricks, @nutki. – manatwork – 2015-02-09T13:57:11.457

Also found in other solutions, (m.|.) -> m?(.) saves extra 4. – nutki – 2015-02-09T14:03:12.077

Doh. I was about to try that out now. So it works. :) – manatwork – 2015-02-09T14:07:00.377

5

Python 2, 99 bytes

import re
f=lambda I:"{%s}"%"|".join(`sum(map(int,re.findall("(-?\d+) *m?"+t,I)))`for t in"yodhis")

This is a lambda function which takes in a string and simply uses a regex to extract the necessary numbers.

Thanks to Martin for pointing out that \s* could just be <space>*. It's easy to forget that regexes match spaces literally...
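
Since the backtick repr syntax no longer exists in Python 3, here is an ungolfed Python 3 sketch of the same idea, for anyone who wants to run it today. The name f_py3 is made up for this example; the regex is the one from the lambda, just written as a raw string.

```python
import re

def f_py3(text):
    parts = []
    # One key letter per unit; "m?" lets "o" and "i" pick out
    # months and minutes by skipping their shared first letter.
    for t in "yodhis":
        # findall returns only the captured group, i.e. the signed numbers.
        numbers = re.findall(r"(-?\d+) *m?" + t, text)
        parts.append(str(sum(map(int, numbers))))
    return "{%s}" % "|".join(parts)

print(f_py3("1 year 2 months 3 seconds"))  # {1|2|0|0|0|3}
```

This scans the input six times, once per unit, which is wasteful but keeps the code short.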

Sp3000

Posted 2015-02-09T11:24:01.523

Reputation: 58 729

5

Ruby, 119 106 86 85 84 bytes

One byte saved thanks to Sp3000.

->i{?{+"yodhis".chars.map{|w|s=0;i.scan(/-?\d+(?= *m?#{w})/){|n|s+=n.to_i};s}*?|+?}}

This is an unnamed function, which takes the input as a string, and returns the result (also as a string). You can test it by assigning it to f, say, and calling it like

f["3day 9     years 4 seconds -5 minute 4 years 4 years -3seconds"]

Martin Ender

Posted 2015-02-09T11:24:01.523

Reputation: 184 808

4

JavaScript 100 105 112

Edit Adding template strings (first implemented Dec 2014, so valid for this challenge); at the time I was not aware of them

Edit Eureka, at last I got the meaning of m? in all the other answers!

s=>s.replace(/(-?\d+) *m?(.)/g,(a,b,c)=>o['yodhis'.search(c)]-=-b,o=[0,0,0,0,0,0])&&`{${o.join`|`}}`

Test

F=
s=>s.replace(/(-?\d+) *m?(.)/g,(a,b,c)=>o['yodhis'.search(c)]-=-b,o=[0,0,0,0,0,0])&&`{${o.join`|`}}`

;['1 year 2 months 3 seconds','-2 day 5 year 8months'
,'3day 9     years 4 seconds -5 minute 4 years 4 years -3seconds']
.forEach(i=>console.log(i,F(i)))

edc65

Posted 2015-02-09T11:24:01.523

Reputation: 31 086

3

R, 197 bytes

I realize this isn't a competitive entry at all; I mostly just wanted to come up with a solution in R. Any help shortening this is of course welcome.

function(x){s="{";for(c in strsplit("yodhis","")[[1]])s=paste0(s,ifelse(c=="y","","|"),sum(as.numeric(gsub("[^0-9-]","",str_extract_all(x,perl(paste0("(-?\\d+) *m?",c)))[[1]]))));s=paste0(s,"}");s}

Like Martin's answer, this is an unnamed function. To call it, assign it to f and pass a string.

This is pretty hideous, so let's take a look at an un-golfed version.

library(stringr)  # provides str_extract_all() and perl()

function(x) {
    s <- "{"
    for (c in strsplit("yodhis", "")[[1]]) {
        matches <- str_extract_all(x, perl(paste0("(-?\\d+) *m?", c)))[[1]]
        nums <- gsub("[^0-9-]", "", matches)
        y <- sum(as.numeric(nums))
        s <- paste0(s, ifelse(c == "y", "", "|"), y)
    }
    s <- paste0(s, "}")
    return(s)
}

Based on the structure alone it's easy to see what's going on, even if you aren't too familiar with R. I'll elaborate on some of the stranger looking aspects.

paste0() is how R combines strings with no separator.

The str_extract_all() function comes from Hadley Wickham's stringr package. R's handling of regular expressions in the base package leaves much to be desired, which is where stringr comes in. This function returns a list of regular expression matches in the input string. Notice how the regex is wrapped in a call to perl(); this just says that the regex is Perl-style, not R-style.

gsub() does a find-and-replace using a regex for each element of the input vector. Here we're telling it to replace everything that isn't a number or minus sign with an empty string.

And there you have it. Further explanation will be gladly provided upon request.

Alex A.

Posted 2015-02-09T11:24:01.523

Reputation: 23 761

I don’t think outsourcing string extraction to an external package is a good idea. Isn’t it a loophole when an external community-supported library is used? Even if it is OK, why did you not include library(stringr) in your source? – Andreï Kostyrka – 2016-08-11T12:54:49.290

2

Cobra - 165

def f(s='')
    l=int[](6)
    for i in 6,for n in RegularExpressions.Regex.matches(s,'(-?\\d+) *m?['yodhis'[i]]'),l[i]+=int.parse('[n.groups[1]]')
    print'{[l.join('|')]}'

Οurous

Posted 2015-02-09T11:24:01.523

Reputation: 7 916

2

C++14, 234 229 bytes

Edit: cut down 5 bytes by using old style declaration instead of auto.

I know the winner has already been chosen, and that this would be the longest submission so far, but I just had to post a C++ solution, because I bet nobody expected one at all :)

To be honest, I'm pretty happy with how short it turned out to be (by C++ standards, of course), and I'm sure it can't get any shorter than this (with just one remark, see below). It is also quite a nice collection of features new to C++11/14.

No third-party libraries here; only the standard library is used.

The solution takes the form of a lambda function:

[](auto&s){sregex_iterator e;auto r="{"s;for(auto&t:{"y","mo","d","h","mi","s"}){int a=0;regex g("-?\\d+ *"s+t);decltype(e)i(begin(s),end(s),g);for_each(i,e,[&](auto&b){a+=stoi(b.str());});r+=to_string(a)+"|";}r.back()='}';s=r;};

Ungolfed:

[](auto&s)
{
    sregex_iterator e;
    auto r="{"s;
    for(auto&t:{"y","mo","d","h","mi","s"})
    {
        int a=0;
        regex g("-?\\d+\\s*"s+t);
        decltype(e)i(begin(s),end(s),g);
        for_each(i,e,[&](auto&b)
        {
            a+=stoi(b.str());
        });
        r+=to_string(a)+"|";
    }
    r.back()='}';
    s=r;
}

For some reason, I had to write

regex g("-?\\d+\\s*"s+t);
decltype(e)i(begin(s),end(s),g);

instead of just

decltype(e)i(begin(s),end(s),regex("-?\\d+\\s*"s+t));

because the iterator would return only one match if I passed in a temporary object. This doesn't seem right to me, so I wonder whether there's a problem with GCC's regex implementation.

Full test file (compiled with GCC 4.9.2 with -std=c++14):

#include <iostream>
#include <string>
#include <regex>
#include <algorithm>

using namespace std;

int main()
{
    string arr[] = {"1 year 2 months 3 seconds",
                    "-2 day 5 year 8months",
                    "3day 9     years 4 seconds -5 minute 4 years 4 years -3seconds"};
    for_each(begin(arr), end(arr), [](auto&s){sregex_iterator e;auto r="{"s;for(auto&t:{"y","mo","d","h","mi","s"}){int a=0;auto g=regex("-?\\d+ *"s+t);decltype(e)i(begin(s),end(s),g);for_each(i,e,[&](auto&b){a+=stoi(b.str());});r+=to_string(a)+"|";}r.back()='}';s=r;});
    for(auto &s : arr) {cout << s << endl;}
}

Output:

{1|2|0|0|0|3}
{5|8|-2|0|0|0}
{17|0|3|0|-5|1}

Alexander Revo

Posted 2015-02-09T11:24:01.523

Reputation: 270

0

PHP, 141 bytes

preg_match_all("#(.?\d+)\s*m?(.)#",$argv[1],$m);$r=[0,0,0,0,0,0];foreach($m[1]as$i=>$n)$r[strpos(yodhis,$m[2][$i])]+=$n;echo json_encode($r);

Takes input from the first command-line argument and uses [,] for output instead of {|}. Run with -r.

Breakdown

preg_match_all("#(.?\d+)\s*m?(.)#",$argv[1],$m);    # find intervals.
# (The initial dot will match the sign, the space before the number or a first digit.)
$r=[0,0,0,0,0,0];                   # init result
foreach($m[1]as$i=>$n)              # loop through matches
    $r[strpos(yodhis,$m[2][$i])]+=$n;   # map token to result index, increase value
echo json_encode($r);               # print result: "[1,2,3,4,5,6]"

Titus

Posted 2015-02-09T11:24:01.523

Reputation: 13 814