A Free Sample of Autocorrelation

11

1

Consider a 1-dimensional, real-valued vector x that represents observations of some process measured at equally spaced intervals over time. We call x a time series.

Let n denote the length of x and denote the arithmetic mean of x. The sample autocovariance function is defined as

autocovariance

for all -n < h < n. This measures the linear dependence between two points on the same series observed at different times.

The sample autocorrelation function, or ACF, is defined as

autocorrelation

This measures the linear predictability of the series x at time t, which we denote xt, using only the value xt+h.

Note that these sample estimates do not match the naïve calculations based on the theoretical properties. That is, the sample autocorrelation function is not equal to the Pearson correlation coefficient of x with the h-step lag of x.

Task

Given an array x and a nonnegative integer h, print or return the first h+1 lag autocorrelations of x, beginning with lag 0. The lag autocorrelations are those corresponding to negative inputs in the above formulas.

You can assume that 0 < h < n, where n is the length of x, and that 2 < n < 256.

Output should be correct to within 1E-4. There are no restrictions on the use of built-in functions or running time.

Examples

h, x -> output
--------------
5, [2.4, 2.4, 2.4, 2.2, 2.1, 1.5, 2.3, 2.3, 2.5, 2] -> [1.00000000,  0.07659298, -0.06007802, -0.51144343, -0.02912874, -0.10468140]
1, [2134, 1863, 1877, 1877, 1492, 1249] -> [1.0000000, 0.3343041]
2, [13067.3, 13130.5, 13198.4] -> [1.0000000000, -0.0002854906, -0.4997145094]

Alex A.

Posted 2016-02-17T02:09:44.830

Reputation: 23 761

Answers

4

Jelly, 26 25 24 23 20 bytes

×L_SµḊ;0µƓС׹S€µF÷Ḣ

Try it online!

How it works

×L_SµḊ;0µƓС׹S€µF÷Ḣ  Main link. Input: x (list) STDIN: h (integer)

×L                    Multiply all items in x by the length of x.
  _S                  Subtract the sum of x from all products.
                      Let's call the result X.
    µ                 Begin a new monadic chain. Argument: t (list)
     Ḋ                Remove the first element of t.
      ;0              Append a 0 to the result.
        µ             Push the previous chain and begin a new one.
                      Argument: X
         Ɠ            Read h from STDIN.
          С          Repeat the Ḋ;0 chain h times, collecting the h+1 intermediate
                      results in a list A.
            ×¹        Multiply the vectors in A by X.
              S€      Compute the sum of each vectorized product.
                µ     Begin a new, monadic chain. Argument: S (sums)
                 F    Flatten S. This does nothing except creating a deep copy.
                   Ḣ  Pop the first element of S.
                  ÷   Divide all elements of the copy by the first element.

Dennis

Posted 2016-02-17T02:09:44.830

Reputation: 196 637

6

R, 3 31 25 bytes

# changes the builtin to only return the acf
body(acf)=body(acf)[1:18]

Usage (returns an array with the autocorrelations)

(acf(c(2.4, 2.4, 2.4, 2.2, 2.1, 1.5, 2.3, 2.3, 2.5, 2),5))
# , , 1
#
#             [,1]
# [1,]  1.00000000
# [2,]  0.07659298
# [3,] -0.06007802
# [4,] -0.51144343
# [5,] -0.02912874
# [6,] -0.10468140

Background:

The 31 byte solution based on the original acf built in

function(n,h)c(acf(n,h,,F)$acf)

Note that the 3 byte option acf is the originalbuilt in that will plot (and print to 3 digits) while returning the required autocorrelation as an element in a list.

usage

 acf(c(2.4, 2.4, 2.4, 2.2, 2.1, 1.5, 2.3, 2.3, 2.5, 2),5)

output:

#    Autocorrelations of series ‘c(2.4, 2.4, 2.4, 2.2, 2.1, 1.5, 2.3, 2.3, 2.5, 2)’, by lag
#
#     0      1      2      3      4      5 
# 1.000  0.077 -0.060 -0.511 -0.029 -0.105 

If we want the correlations to display to more than 3 decimal places then 28 bytes will do it (or 31, if we want to suppress the plot)

# will still plot (28 bytes)
function(n,h)c(acf(n,h)$acf)
# won't plot (31 bytes)
function(n,h)c(acf(n,h,,F)$acf)

mnel

Posted 2016-02-17T02:09:44.830

Reputation: 826

This is probably the cleverest trick I have ever seen. I had no idea one could do that. We're trying to get R selected as Language of The Month - you can upvote this meta answer to make it happen.

– JayCe – 2018-08-16T16:37:00.340

3

Python 3, 147 130 126 120 bytes

def p(h,x):x=[t-sum(x)/len(x)for t in x];return[sum(s*t for s,t in zip(x[n:],x))/sum(b*b for b in x)for n in range(h+1)]

This solution is probably going to be golfed further, It's just a start.

You can call it with:

p(5,[2.4, 2.4, 2.4, 2.2, 2.1, 1.5, 2.3, 2.3, 2.5, 2])

Cameron Aavik

Posted 2016-02-17T02:09:44.830

Reputation: 723

2

MATL, 20 bytes

tYm-tPX+tX>/GnqiQ:+)

EDIT (May 20, 2016): as of version 18.0.0 of the language, use Y+ instead of X+. The link includes this change.

Try it online!

Correlation is closely related to convolution. We normalize by subtracting mean, then convolve, normalize again by dividing by maximum value, and then pick the desired lags.

tYm-       % implicit input. Duplicate and subtract mean
tPX+       % duplicate, flip, convolve
tX>/       % duplicate, divide by maximum value
Gnq        % length of array, n. Subtract 1
iQ:        % input number h. Generate vector [1,2,...,h+1]
+          % add to obtain vector [n,n+1,...,n+h]
)          % use that vector as an index. Implicitly display

Luis Mendo

Posted 2016-02-17T02:09:44.830

Reputation: 87 464

1

Mathematica, 27 bytes

Thanks to LegionMammal978 for saving 1 byte.

We could beat Jelly if function names were not so long.

#2~CorrelationFunction~{#}&

Test case

%[5,{2.4,2.4,2.4,2.2,2.1,1.5,2.3,2.3,2.5,2}]
(* {1.,0.076593,-0.060078,-0.511443,-0.0291287,-0.104681} *)

njpipeorgan

Posted 2016-02-17T02:09:44.830

Reputation: 2 992

I was about to post this before my Internet went down... You can save a byte with #2~CorrelationFunction~{#}&. – LegionMammal978 – 2016-02-17T11:02:27.787

1

Octave, 47 37 bytes

@(h,x)xcov(x,'coeff')(numel(x)+(0:h))

Rainer P.

Posted 2016-02-17T02:09:44.830

Reputation: 2 457

@Rainer You can maybe save a few bytes using an anonymous function (and in that case I think you can skip disp, because you are returnig a function output) – Luis Mendo – 2016-02-17T17:09:41.753

@LuisMendo You're right. Saved 10 bytes, even without counting the disp. – Rainer P. – 2016-02-17T20:54:29.380