Effective sample size

In statistics, effective sample size is a notion defined for a sample from a distribution when the observations in the sample are correlated or weighted.[1]

Correlated observations

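Suppose a sample of $n$ independent observations $y_1, \dots, y_n$ is drawn from a distribution with mean $\mu$ and standard deviation $\sigma$. Then the mean of this distribution is estimated by the mean of the sample:

$$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} y_i.$$

In that case, the variance of $\hat{\mu}$ is given by

$$\operatorname{Var}(\hat{\mu}) = \frac{\sigma^2}{n}.$$

However, if the observations in the sample are positively correlated, then $\operatorname{Var}(\hat{\mu})$ is higher. For instance, if all observations in the sample are completely correlated ($\rho_{ij} = 1$ for all $i, j$), then $\operatorname{Var}(\hat{\mu}) = \sigma^2$ regardless of $n$.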

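The effective sample size $n_{\text{eff}}$ is the unique value (not necessarily an integer) such that

$$\operatorname{Var}(\hat{\mu}) = \frac{\sigma^2}{n_{\text{eff}}}.$$

$n_{\text{eff}}$ is a function of the correlation between observations in the sample. Suppose that all the correlations are the same and nonnegative, i.e. if $i \neq j$, then $\rho_{ij} = \rho \geq 0$. In that case, if $\rho = 0$, then $n_{\text{eff}} = n$. Similarly, if $\rho = 1$ then $n_{\text{eff}} = 1$. More generally,

$$n_{\text{eff}} = \frac{n}{1 + (n - 1)\rho}.$$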
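
The uniform-correlation case can be checked numerically. The following is a minimal sketch, not a reference implementation: the function names are made up for illustration, and the second function assumes the variance of the plain sample mean is $\sigma^2$ times the sum of all entries of the correlation matrix divided by $n^2$, which follows from expanding the variance of a sum.

```python
def n_eff_uniform(n, rho):
    """Effective sample size for n observations sharing one pairwise correlation rho."""
    return n / (1.0 + (n - 1) * rho)


def uniform_corr(n, rho):
    """Correlation matrix (list of rows) with 1 on the diagonal and rho elsewhere."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def n_eff_from_corr(R):
    """Effective sample size of the plain sample mean for a correlation matrix R.

    Var(mean) = sigma^2 * sum(R) / n^2, hence n_eff = n^2 / sum(R).
    """
    n = len(R)
    total = sum(sum(row) for row in R)
    return n * n / total


if __name__ == "__main__":
    for rho in (0.0, 0.2, 0.5, 1.0):
        closed_form = n_eff_uniform(10, rho)
        from_matrix = n_eff_from_corr(uniform_corr(10, rho))
        print(rho, closed_form, from_matrix)
```

Both routes agree for every value of the correlation, and the two boundary cases from the text fall out directly: no correlation gives back $n$, and complete correlation gives 1.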

The case where the correlations are not uniform is somewhat more complicated. Note that if the correlation is negative, the effective sample size may be larger than the actual sample size. If we allow the more general form $\hat{\mu} = \sum_i a_i y_i$ (where $\sum_i a_i = 1$), then it is possible to construct correlation matrices that have an $n_{\text{eff}} > n$ even when all correlations are positive. Intuitively, the maximal value of $n_{\text{eff}}$ over all choices of the coefficients $a_i$ may be thought of as the information content of the observed data.
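
The weighted form can be sketched in a few lines. The sketch assumes all observations share the same variance $\sigma^2$, so that the variance of $\sum_i a_i y_i$ is $\sigma^2 a^T R a$ for correlation matrix $R$ and therefore $n_{\text{eff}} = 1 / (a^T R a)$. It also uses the standard result that the minimising weights are proportional to $R^{-1}\mathbf{1}$. The function names and the example matrix are invented for illustration and do not come from the cited sources.

```python
def n_eff_weighted(a, R):
    """n_eff of the estimator sum(a_i * y_i), assuming sum(a) == 1 and equal variances."""
    n = len(a)
    quad = sum(a[i] * R[i][j] * a[j] for i in range(n) for j in range(n))
    return 1.0 / quad


def solve(R, b):
    """Solve R x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in R[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def optimal_weights(R):
    """Weights minimising a^T R a subject to sum(a) == 1 (proportional to R^{-1} 1)."""
    x = solve(R, [1.0] * len(R))
    total = sum(x)
    return [xi / total for xi in x]


# Hypothetical correlation matrix: every off-diagonal entry is positive.
R = [
    [1.00, 0.10, 0.72],
    [0.10, 1.00, 0.72],
    [0.72, 0.72, 1.00],
]
n = len(R)
equal = [1.0 / n] * n
best = optimal_weights(R)

if __name__ == "__main__":
    print("equal weights:  ", n_eff_weighted(equal, R))
    print("optimal weights:", best, n_eff_weighted(best, R))
```

With this made-up matrix, equal weights give an effective sample size of roughly 1.5, while the optimal weights give roughly 3.5, which exceeds $n = 3$. The optimal solution places a negative weight on the third observation, which is how the estimator exploits the strong positive correlations.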

Weighted samples

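If the data has been weighted, then several of the observations composing the sample have in effect been drawn from the distribution with 100% correlation with some earlier observation. In this case, the resulting quantity is known as Kish's effective sample size[2], where $w_i$ is the weight of observation $i$:

$$n_{\text{eff}} = \frac{\left(\sum_{i=1}^{n} w_i\right)^2}{\sum_{i=1}^{n} w_i^2}.$$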
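
Kish's effective sample size is the squared sum of the weights divided by the sum of the squared weights, which makes it a one-line computation. The sketch below is a minimal illustration; the function name is an assumption and the weights in the usage section are invented.

```python
def kish_effective_sample_size(weights):
    """Kish's effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    sum_of_squares = sum(w * w for w in weights)
    if sum_of_squares == 0:
        raise ValueError("at least one weight must be non-zero")
    return total * total / sum_of_squares


if __name__ == "__main__":
    # Equal weights: the effective size equals the actual size.
    print(kish_effective_sample_size([2.0, 2.0, 2.0, 2.0]))
    # All weight on one observation: the effective size collapses to 1.
    print(kish_effective_sample_size([1.0, 0.0, 0.0, 0.0]))
    # Unequal weights: the effective size lies strictly between 1 and n.
    print(kish_effective_sample_size([1.0, 1.0, 2.0]))
```

The result does not change if every weight is multiplied by the same constant, so the weights do not need to be normalised first.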


References

  1. Tom Leinster (December 18, 2014). "Effective Sample Size" (html).
  2. "Design Effects and Effective Sample Size" (html).

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.