Count the timespans

17

Inspired by a real-life scenario, which I have asked for an answer to here: https://superuser.com/questions/1312212/writing-a-formula-to-count-how-many-times-each-date-appears-in-a-set-of-date-ran

Given an array of timespans (or startdate-enddate pairs), output a count of how many timespans cover each day, for all days in the total range.

For example:

  #      Start      End
  1    2001-01-01 2001-01-01
  2    2001-01-01 2001-01-03
  3    2001-01-01 2001-01-02
  4    2001-01-03 2001-01-03
  5    2001-01-05 2001-01-05

Given the above data, the results should be as follows:

2001-01-01: 3 (Records 1,2,3)
2001-01-02: 2 (Records 2,3)
2001-01-03: 2 (Records 2,4)
2001-01-04: 0
2001-01-05: 1 (Record 5)

You only need to output the counts for each day (in order, sorted oldest-newest); not which records they appear in.

You can assume that each timespan only contains dates, not times; and so whole days are always represented.

I/O

Input can be any format that represents a set of timespans - so either a set of pairs of times, or a collection of (builtin) objects containing start- and end-dates. The date-times are limited to between 1901 and 2099, as is normal for PPCG challenges.

You can assume the input is pre-sorted however you like (specify in your answer). Input dates are inclusive (so the range includes the whole of the start and end dates).

You can also assume that, of the two dates in any given range, the first will be older or equal to the second (i.e. you won't have a negative date range).

Output is an array containing the count for each day, from the oldest to the newest in the input when sorted by Start Date.

So, the output for the above example would be {3,2,2,0,1}

It is possible that some days are not included in any time range, in which case 0 is output for that date.

Winning Criteria

This is code-golf, so lowest bytes wins. Usual exclusions apply

Pseudo-algorithm example

For each time range in input
    If start is older than current oldest, update current oldest
    If end is newer than current newest, update current newest
End For
For each day in range oldest..newest
   For each time range
       If timerange contains day
            add 1 to count for day
End For
Output count array

Other algorithms to get to the same result are fine.

simonalexander2005

Posted 2018-06-13T15:23:43.017

Reputation: 1 157

3Is an array of integers required, or are we allowed to return something else, say a dictionary with keys as each date? If we are allowed to return a dictionary, then can we omit dates that are not in any of the timespans? – JungHwan Min – 2018-06-13T16:54:15.677

1can we take input as two lists, one with start dates and the other with corresponding end dates? – Giuseppe – 2018-06-13T20:15:13.160

Yeah, all of those things are fine, except omitting a date - I explicitly say that 0 should be output in that case – simonalexander2005 – 2018-06-13T21:01:49.807

3May I ask why 0 should be in a dictionary? It only appears to force the user to iterate from min(input) to max(input), which doesn't seem to add anything to the core of the challenge (computing timespans). – JungHwan Min – 2018-06-13T21:43:00.843

Has this been in the Sandbox for quite a while (more than a few months)? I could have sworn I've seen a similar challenge before, but am unable to find it.. – Kevin Cruijssen – 2018-06-14T08:05:40.750

Yeah, it's been there a while - I only just got around to moving it here as it had a few upvotes so I thought it worthwhile – simonalexander2005 – 2018-06-14T08:38:13.110

2@JungHwanMin I guess it doesn't alter it; but because I explicitly had it in the spec when I posted it, I don't want to go messing with it and make someone else redo their answer – simonalexander2005 – 2018-06-14T08:39:36.640

@simonalexander2005 Making 0 optional wouldn't invalidate anything. – JungHwan Min – 2018-06-14T12:20:31.897

@JungHwanMin making 0 optional could make some valid answer longer than necessary – edc65 – 2018-06-14T13:13:08.590

@edc65 How so? They could include the 0 if the version that has the 0 is shorter. – JungHwan Min – 2018-06-14T13:26:23.217

@JungHwanMin now the 0s are not optional. There are now valid answers that could be made shorter if 0 is optional. So people should modify their now valid answer to comply with the new spec, or leave a sub optimal answer. – edc65 – 2018-06-14T13:41:59.843

If returning a dictionary/object, does it need to be sorted by date or is it enough that the keys indicate with date is which? – Shaggy – 2018-06-15T09:37:42.493

Suggested test case: 2000-02-27 2000-03-02 – Adám – 2018-06-15T09:49:00.660

@Jakob correct, added. – simonalexander2005 – 2018-06-15T12:57:58.360

@Shaggy no sort needed – simonalexander2005 – 2018-06-15T12:58:11.610

Answers

3

APL (Dyalog Unicode), 32 bytesSBCS

Full program. Prompts stdin for a list of pairs of International Date Numbers (like what Excel and MATLAB use). Both list and pairs may be given in any order, e.g. (End,Start). Prints list of counts to stdout.

¯1+⊢∘≢⌸(R,⊢)∊(R←⌊/,⌊/+∘⍳⌈/-⌊/)¨⎕Try it online!

If this is invalid, a list of (Y M D) pairs can be converted for an additional 21 bytes, totalling 53:

¯1+⊢∘≢⌸(R,⊢)∊(R⌊/,⌊/+∘⍳⌈/-⌊/)¨{2⎕NQ#'DateToIDN'⍵}¨¨⎕Try it online!


 prompt console for evaluated input

( apply the following tacit function to each pair

⌊/ the minimum (lit. min-reduction), i.e. the start date

⌈/- the maximum (i.e. end date) minus that

⌊/+∘⍳ the start date plus the range 1-through-that

⌊/, the start date prepended to that

R← assign this function to R (for Range)

ϵnlist (flatten) the list of ranges into a single list

() apply the following tacit function to that:

R,⊢ the result of applying R (i.e. the date range) followed by the argument
  (this ensures that each date in the range is represented at least once, and that the dates appear in sorted order)

 for each pair of unique (date, its indices of occurrence in the input), do:

⊢∘≢ ignore the actual date in favour of the tally of indices

¯1+ add -1 to those tallies (because we prepended one of each date in the range)

Adám

Posted 2018-06-13T15:23:43.017

Reputation: 37 779

9

JavaScript (ES6), 85 bytes

Takes input as a list of Date pairs. Expects the list to be sorted by starting date. Returns an array of integers.

f=(a,d=+a[0][0])=>[a.map(([a,b])=>n+=!(r|=d<b,d<a|d>b),r=n=0)|n,...r?f(a,d+864e5):[]]

Try it online!

or 84 bytes if we can take JS timestamps as input (as suggested by @Shaggy)

Arnauld

Posted 2018-06-13T15:23:43.017

Reputation: 111 334

Ah, nuts! – Shaggy – 2018-06-13T16:39:04.287

Save a byte by taking the primitive values as input: TIO

– Shaggy – 2018-06-13T18:32:20.553

7

JavaScript, 75 73 bytes

Takes input as a sorted array of arrays of date primitive pairs, outputs an object where the keys are the primitives of each date and the values the counts of those dates in the ranges.

a=>a.map(g=([x,y])=>y<a[0][0]||g([x,y-864e5],o[y]=~~o[y]+(x<=y)),o={})&&o

Try it


I was working on this 60 byte version until it was confirmed that dates that don't appear in any of the ranges must be included so hastily updated it to the solution above.

a=>a.map(g=([x,y])=>x>y||g([x+864e5,y],o[x]=-~o[x]),o={})&&o

Try it online (or with human readable dates in the output)

Shaggy

Posted 2018-06-13T15:23:43.017

Reputation: 24 623

It looks like ES6 does define a key order for JS objects (https://stackoverflow.com/a/31102605/8127), basicaly insertion order for string and symbol keys (and TIO's Nodejs does seem to follow that: https://tinyurl.com/ybjqtd89). And in general my opinion is that an implementation detail (which is what the object here is) shouldn't dictate the interpretation of the challenge rules, but I'll wait for the Meta post.

– sundar - Reinstate Monica – 2018-06-15T09:07:16.457

6

Octave, 63 bytes

@(x)histc(t=[cellfun(@(c)c(1):c(2),x,'un',0){:}],min(t):max(t))

Try it online!

Now that was ugly!

Explanation:

Takes the input as a cell array of datenum elements (i.e. a string "2001-01-01" converted to a numeric value, looking like this:

{[d("2001-01-01") d("2001-01-01")]
[d("2001-01-01") d("2001-01-03")]
[d("2001-01-01") d("2001-01-02")]
[d("2001-01-03") d("2001-01-03")]
[d("2001-01-05") d("2001-01-05")]};

where d() is the function datenum. We then use cellfun to create cells with the ranges from the first column to the second for each of those rows. We concatenate these ranges horizontally, so that we have a long horizontal vector with all the dates.

We then create a histogram using histc of these values, with bins given by the range between the lowest and the highest date.

Stewie Griffin

Posted 2018-06-13T15:23:43.017

Reputation: 43 471

5

R, 75 bytes

function(x,u=min(x):max(x))rowSums(outer(u,x[,1],">=")&outer(u,x[,2],"<="))

Try it online!

Input is a matrix whose first column is Start and second column is End. Assumes Start<=End but does not require Start dates to be sorted.

JayCe

Posted 2018-06-13T15:23:43.017

Reputation: 2 655

This is as far as i was able to go trying to replicate the Octave answer by Stewie Griffin... what am I doing wrong? – JayCe – 2018-06-13T20:35:56.583

it's because of the way R does its bins in hist; you could do c(-25668,min(x):max(x)) since -25568 is before 1900 but this ends up being longer than your suggested answer. That being said, there's a better way to generate the dates than apply; I have one that's at 68 bytes and I just haven't found the time to post it myself. – Giuseppe – 2018-06-13T20:47:56.743

Ah, no, actually, use (min(x)-1):max(x) and it should work as expected; then if you can find the not-apply way to generate the dates you can get this to 63 bytes and tie the Octave answer. – Giuseppe – 2018-06-13T20:52:56.483

@Giuseppe You should post it as a separate answer :) – JayCe – 2018-06-13T21:02:59.400

posted :-) I have to admit, I was using table and factor before which was my original use of Map for 68 bytes, but hist is a neat approach that I always forget about, probably because it's annoying to get the bins just right (as we have seen) – Giuseppe – 2018-06-13T21:09:37.303

@Giuseppe very clever use of Map. I don’t think I have ever used it for golfing. – JayCe – 2018-06-13T22:33:04.287

4

Python 2, 114 87 93 bytes

-27 bytes thanks to Jonathan Allan
+6 bytes thanks to sundar

Takes input as list of pairs of datetime objects.
Assumes that first pair starts with the lowest date.

def F(I):
 d=I[0][0]
 while d<=max(sum(I,[])):print sum(a<=d<=b for a,b in I);d+=type(d-d)(1)

Try it online!

Dead Possum

Posted 2018-06-13T15:23:43.017

Reputation: 3 256

days is the default argument for timedelta. – Jonathan Allan – 2018-06-14T09:38:42.853

...in fact I think you can drop the from datetime import* and replace d+=timedelta(days=1) with d+=type(d-d)(1) since the inputs are already dates. 87 bytes

– Jonathan Allan – 2018-06-14T09:44:17.667

1This seems to assume that the start of the first range is the lowest date AND the end of the last range is the highest - but I think that's sometimes not possible even if the OP allows us to take sorted input. For eg. if input is [(2001-01-01, 2001-01-05), (2001-01-02, 2001-01-03)]. Unless OP allows us to split and rearrange these ranges during preprocessing (which seems unlikely), this input can't be processed by this code properly. – sundar - Reinstate Monica – 2018-06-14T17:26:30.983

@sundar Yes, I see what you're talking about. I've updated solution to handle this. Thanks! – Dead Possum – 2018-06-15T11:37:46.680

4

Groovy, 142 bytes

{a={Date.parse('yyyy-mm-dd',it)};b=it.collect{a(it[0])..a(it[1])};b.collect{c->b.collect{it}.flatten().unique().collect{it in c?1:0}.sum()}}

Formatted out:

 {                                   // Begin Closure
    a={Date.parse('yyyy-mm-dd',it)}; // Create closure for parsing dates, store in a().
    b=it.collect{                    // For each input date pair...
        a(it[0])..a(it[1])           // Parse and create date-range.
    };
    b.collect{                       // For each date range...
        c->
        b.collect{                   // For each individual date for that range...
           it
        }.flatten().unique().collect{ // Collect unique dates.
            it in c?1:0
        }.sum()                      // Occurrence count.
    }
}

Magic Octopus Urn

Posted 2018-06-13T15:23:43.017

Reputation: 19 422

4

Red, 174 bytes

func[b][m: copy #()foreach[s e]b[c: s
until[m/(c): either none = v: m/(c)[1][v + 1]e < c: c + 1]]c: first sort b
until[print[either none = v: m/(c)[0][v]](last b)< c: c + 1]]

Quite long and literal implementation.

Try it online!

Readable:

f: func [ b ] [
    m: copy #()
    foreach [ s e ] b [
        c: s
        until [
            m/(c): either none = v: m/(c) [ 1 ] [ v + 1 ]   
            e < c: c + 1
        ]      
    ]
    c: first sort b
    until[
        print [ either none = v: m/(c) [ 0 ] [ v ] ]
        ( last b ) < c: c + 1
    ]      
]

Galen Ivanov

Posted 2018-06-13T15:23:43.017

Reputation: 13 815

3

R, 65 63 bytes

function(x)hist(unlist(Map(`:`,x[,1],x[,2])),min(x-1):max(x))$c

Try it online!

This is a collaboration between JayCe and myself, porting Stewie Griffin's answer over to R.

To quote JayCe:

Input is a matrix whose first column is Start and second column is End. Assumes Start<=End but does not require Start dates to be sorted.

Possibly, $c is unnecessary but it's not quite in the spirit of the challenge so I've included it.

Giuseppe

Posted 2018-06-13T15:23:43.017

Reputation: 21 077

1Min(x-1) for 2 bytes ? – JayCe – 2018-06-13T21:45:00.100

^ By which I mean this

– JayCe – 2018-06-14T12:49:44.823

@JayCe yes, nice! I meant to come back to this earlier but I forgot. – Giuseppe – 2018-06-14T12:51:44.397

3

Wolfram Language (Mathematica), 62 bytes

Lookup[d=DayRange;Counts[Join@@d@@@#],#[[1,1]]~d~#[[-1,1]],0]&

Try it online!

+35 bytes because OP specified that 0 must be included in the output.

If omitting an entry in a dictionary were allowed, 27 bytes

Counts[Join@@DayRange@@@#]&

Try it online!

The built-in DayRange accepts two DateObjects (or a string equivalent) and outputs a list of Dates between those dates (inclusive).

JungHwan Min

Posted 2018-06-13T15:23:43.017

Reputation: 13 290

3

Powershell, 122 121 118 113 bytes

filter d{0..($_[-1]-($s=$_[0])).Days|%{$s.AddDays($_)}}$c=@{};$args|d|%{++$c.$_};,($c.Keys.Date|sort)|d|%{+$c.$_}

save it as count-timespan.ps1. Test script:

.\count-timespan.ps1 `
    @([datetime]"2001-01-01", [datetime]"2001-01-01")`
    @([datetime]"2001-01-01", [datetime]"2001-01-03")`
    @([datetime]"2001-01-01", [datetime]"2001-01-02")`
    @([datetime]"2001-01-03", [datetime]"2001-01-03")`
    @([datetime]"2001-01-05", [datetime]"2001-01-05")

Explanation

filter d{                           # define a function with a pipe argument (it's expected that argument is an array of dates)
    0..($_[-1]-($s=$_[0])).Days|%{  # for each integer from 0 to the Days
                                    # where Days is a number of days between last and first elements of the range
                                    # (let $s stores a start of the range)
        $s.AddDays($_)              # output to the pipe a date = first date + number of the current iteration
    }                               # filter returns all dates for each range
}                                   # dates started from first element and ended to last element
$c=@{}                              # define hashtable @{key=date; value=count}
$args|d|%{++$c.$_}                  # count each date in a array of arrays of a date
,($c.Keys.Date|sort)|d|%{+$c.$_}    # call the filter via pipe with the array of sorted dates from hashtable keys

# Trace:
# call d @(2001-01-01, 2001-01-01) @(2001-01-01, 2001-01-03) @(2001-01-01, 2001-01-02) @(2001-01-03, 2001-01-03) @(2001-01-05, 2001-01-05)
# [pipe]=@(2001-01-01, 2001-01-01, 2001-01-02, 2001-01-03, 2001-01-01, 2001-01-02, 2001-01-03, 2001-01-05)
# $c=@{2001-01-03=2; 2001-01-01=3; 2001-01-05=1; 2001-01-02=2}
# call d @(2001-01-01, 2001-01-02, 2001-01-03, 2001-01-05)
# [pipe]=@(2001-01-01, 2001-01-02, 2001-01-03, 2001-01-04, 2001-01-05)
# [output]=@(3, 2, 2, 0, 1)

mazzy

Posted 2018-06-13T15:23:43.017

Reputation: 4 832

Thanks! $cnt.Keys.Date of course. – mazzy – 2018-06-14T19:01:33.970

-3 bytes: function replaced with scriptblock. golfed and ungolfed codes are tested. – mazzy – 2018-06-14T21:14:32.793

-5 bytes: scriptblock replaced on filter. Call of a filter is more compact. – mazzy – 2018-06-18T08:09:10.303

3

J, 43 bytes

(],.[:+/@,"2="{~)&:((>./(]+i.@>:@-)<./)"1),

the input is a list of pairs of integers, where each integer is the offset from any arbitrary common 0-day.

ungolfed

(] ,. [: +/@,"2 ="{~)&:((>./ (] + i.@>:@-) <./)"1) ,

explanation

structure is:

(A)&:(B) C
  • C creates a hook that gives the main verb A&:B the input on the left and the input flattened on the right
  • B aka ((>./ (] + i.@>:@-) <./)"1) takes the min and max of a list and returns the resulting range, and acts with rank 1. hence it gives the total range on the right, and the individual ranges on the left.
  • A then uses = with rank "0 _ (ie, rank of {) to count how many times each input appears in any of the ranges. finally it zips every year with those counts.

Try it online!

Jonah

Posted 2018-06-13T15:23:43.017

Reputation: 8 729

2

JavaScript (Node.js), 80 bytes

(a,u=[])=>a.map(g=([p,q])=>p>q||g([p,q-864e5],u[z=(q-a[0][0])/864e5]=-~u[z]))&&u

Try it online!

undefined means zero; First element should start earliest

(a,u=[])=>a.map(g=([p,q])=>p>q||g([p,q-1],u[z=(q-a[0][0])/864e5]=-~u[z]))&&u is shorter if you only see elements and use more stack

l4m2

Posted 2018-06-13T15:23:43.017

Reputation: 5 985

6You should ask for confirmation that substituting another value for 0 is acceptable. – Shaggy – 2018-06-13T17:52:06.010

1

Julia 0.6, 77 bytes

M->[println(sum(d∈M[r,1]:M[r,2]for r∈1:size(M,1)))for d∈M[1]:max(M...)]

Try it online!

Inspired by @DeadPossum's Python solution.

Takes input as a matrix, where each row has two dates: the starting and ending dates of an input range. Assumes the input has the earliest date first, and that each row has the starting date first, but assumes no sorting beyond that between different rows.


Older solution:

Julia 0.6, 124 bytes

R->(t=Dict();[[d∈keys(t)?t[d]+=1:t[d]=1 for d∈g]for g∈R];[d∈keys(t)?t[d]:0 for d∈min(keys(t)...):max(keys(t)...)])

Try it online!

Accepts input as an array of date ranges. Does not assume any sorting among the different ranges in the array.

sundar - Reinstate Monica

Posted 2018-06-13T15:23:43.017

Reputation: 5 296

1

Ruby, 70 bytes

->s{f,=s.min;(p s.count{|a,b|(f-a)*(f-b)<=0};f+=86400)until f>s[0][1]}

Try it online!

Input:

Array of pairs of dates, sorted by ending date descending.

G B

Posted 2018-06-13T15:23:43.017

Reputation: 11 099

1

R (70)

function(x)(function(.)tabulate(.-min(.)+1))(unlist(Map(seq,x$S,x$E,"d")))

Presumes a data frame x with two columns (Start and End or possibly S and E) with dates (class Date).

Try it online

lebatsnok

Posted 2018-06-13T15:23:43.017

Reputation: 383

Hi, could you include TIO links (see other answers) with an example input/output? It's not cheating to include a package, but library(magrittr) needs to be included in the byte counts. – JayCe – 2018-06-15T14:50:12.057

Also as per consensus submissions need to be full functions or programs, not snippets, so if you go with a function whose sole argument is xyour answer starts with function(x) and then the body of the function. – JayCe – 2018-06-15T14:55:31.767