Inference for Sample Correlation When Data Are Subjected to Type II Censoring

Student: James Konoske
Faculty Mentor: Scott Linder (OWU Department of Mathematics and Computer Science)

In the social and natural sciences, researchers draw samples in order to estimate parameters of a population, such as the correlation between two variables. This project studies the sampling distribution of the sample correlation when the data are subjected to type II censoring. Our approach lets a researcher approximate that distribution from the sample itself, without knowing the population parameter rho, which is normally unknown. The population correlation can thus be estimated without many time-consuming calculations.


In the social and natural sciences, a frequent goal is to estimate the correlation between two quantitative variables. This requires knowing the margin of error, which in turn depends on knowing the sampling distribution of the sample correlation coefficient (r). However, when bivariate data are subjected to type II censoring on one of the variates, this sampling distribution is unknown.

We approximate the sampling distribution of |r| by a Beta distribution whose parameters depend on the experimental conditions: the original sample size (n), the number of actual observations (p), and the actual population correlation (ρ). Specifically, for each combination of (n, p, ρ), we simulate values of |r|, then compute Method of Moments estimates of the parameters of the Beta distribution fit to them. We then construct least-squares regression functions relating these parameter estimates to n, p, and ρ.
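The simulation and fitting step described above can be sketched as follows. This is a minimal illustration, not the project's actual code. It assumes a standard bivariate normal population and assumes that type II censoring keeps the p smallest values of the first variate together with their paired values of the second. The function names (pearson_r, simulate_abs_r, beta_method_of_moments) are hypothetical, and the regression step is not shown.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_abs_r(n, p, rho, reps, rng):
    """Simulate |r| for `reps` censored samples.

    Assumption: (X, Y) is standard bivariate normal with correlation rho,
    and only the p smallest X values (with their Y partners) are observed.
    """
    results = []
    for _ in range(reps):
        pairs = []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            y = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
            pairs.append((x, y))
        pairs.sort(key=lambda pair: pair[0])  # order by the censored variate
        xs, ys = zip(*pairs[:p])              # keep the first p order statistics
        results.append(abs(pearson_r(xs, ys)))
    return results


def beta_method_of_moments(values):
    """Method of Moments estimates (alpha, beta) for data on (0, 1)."""
    k = len(values)
    m = sum(values) / k
    v = sum((u - m) ** 2 for u in values) / (k - 1)
    if not 0.0 < v < m * (1.0 - m):
        raise ValueError("moments are not compatible with a Beta distribution")
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common


# One illustrative combination of (n, p, rho); the values are arbitrary.
rng = random.Random(0)
abs_r = simulate_abs_r(n=30, p=20, rho=0.5, reps=2000, rng=rng)
alpha_hat, beta_hat = beta_method_of_moments(abs_r)
print(alpha_hat, beta_hat)
```

Repeating this over a grid of (n, p, ρ) values would give the parameter estimates that the regression functions are fit to.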

Armed with these regression functions, a researcher would be able to estimate percentiles of the sampling distribution of |r| from a sample obtained in a clinical setting, using the sample correlation r in place of the unknown ρ. This, in turn, would allow the researcher to estimate |ρ| via a confidence interval, even though the actual sampling distribution of |r| is unknown.
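As a rough sketch of the percentile step, suppose the regression functions have already returned Beta parameters for a given sample. The values 4.0 and 6.0 below are made up for illustration, and beta_percentiles is a hypothetical helper. It approximates the percentiles by Monte Carlo with the standard library's random.betavariate rather than by an exact quantile function, and it does not show how the confidence interval for |ρ| is built from them.

```python
import random


def beta_percentiles(alpha, beta, probs, draws=50_000, seed=0):
    """Approximate percentiles of Beta(alpha, beta) from sorted random draws."""
    rng = random.Random(seed)
    sample = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    # Pick the draw whose rank matches each requested probability.
    return [sample[min(draws - 1, int(q * draws))] for q in probs]


# Hypothetical fitted parameters; in practice they would come from the
# regression functions evaluated at (n, p, r).
lower, upper = beta_percentiles(4.0, 6.0, [0.025, 0.975])
print(lower, upper)
```

An exact Beta quantile function from a statistics library could replace the Monte Carlo approximation where one is available.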