Sampling with replacement vs Poisson sampling: a comparative study in optimal subsampling

Kavli Affiliate: Jing Wang

| First 5 Authors: Jing Wang, Jiahui Zou, HaiYing Wang

| Summary:

Subsampling is a commonly used technique for improving computational
efficiency on massive data, and using nonuniform subsampling probabilities is
an effective approach to improve estimation efficiency. For computational
efficiency, subsampling is often implemented with replacement or through
Poisson subsampling. However, no rigorous investigation has examined how the
two subsampling procedures differ in estimation efficiency and computational
convenience. This paper performs a comparative study of these two sampling
procedures. In the context of
maximizing a general target function, we first derive asymptotic distributions
for estimators obtained from the two sampling procedures. The results show that
Poisson subsampling may have higher estimation efficiency. Based on the
asymptotic distributions for both subsampling with replacement and Poisson
subsampling, we derive optimal subsampling probabilities that minimize the
variance functions of the subsampling estimators. These subsampling
probabilities further reveal the similarities and differences between
subsampling with replacement and Poisson subsampling. The theoretical
characterizations and comparisons of the two subsampling procedures provide
guidance for selecting a more appropriate subsampling approach in practice.
Furthermore, practically implementable algorithms are proposed based on the
optimal structural results, which are evaluated through both theoretical and
empirical analyses.
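
To make the distinction between the two procedures concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a toy target (estimating a population mean with inverse-probability weights) and hypothetical nonuniform probabilities proportional to |x_i| + 1; the function names, the probability choice, and the sizes n and r are all illustrative assumptions. Sampling with replacement draws exactly r indices and may repeat them, while Poisson subsampling makes one independent inclusion decision per data point, so its subsample size is random and contains no repeats.

```python
import random


def sample_with_replacement(probs, r, rng):
    """Draw r indices i.i.d. from probs; an index may appear more than once."""
    idx = rng.choices(range(len(probs)), weights=probs, k=r)
    # Inverse-probability weight for each draw.
    weights = [1.0 / (r * probs[i]) for i in idx]
    return idx, weights


def poisson_sample(probs, r, rng):
    """Include index i independently with probability min(1, r * probs[i])."""
    idx, weights = [], []
    for i, p in enumerate(probs):
        incl = min(1.0, r * p)  # inclusion probability, capped at 1
        if rng.random() < incl:
            idx.append(i)
            weights.append(1.0 / incl)
    return idx, weights


def weighted_mean(values, idx, weights):
    """Inverse-probability-weighted estimate of the full-data mean."""
    return sum(w * values[i] for i, w in zip(idx, weights)) / len(values)


rng = random.Random(0)
n, r = 10_000, 500  # full data size and (expected) subsample size
values = [rng.gauss(2.0, 1.0) for _ in range(n)]

# Hypothetical nonuniform probabilities, chosen only for illustration.
raw = [abs(v) + 1.0 for v in values]
total = sum(raw)
probs = [v / total for v in raw]

idx_r, w_r = sample_with_replacement(probs, r, rng)
idx_p, w_p = poisson_sample(probs, r, rng)

est_r = weighted_mean(values, idx_r, w_r)
est_p = weighted_mean(values, idx_p, w_p)
true_mean = sum(values) / n
print(f"full-data mean:      {true_mean:.4f}")
print(f"with replacement:    {est_r:.4f} (size {len(idx_r)})")
print(f"Poisson subsampling: {est_p:.4f} (size {len(idx_p)})")
```

In this sketch the Poisson procedure needs only one pass over the data and no access to all probabilities at once, which is one reason it can be more convenient computationally; the efficiency comparison itself depends on the asymptotic results derived in the paper.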

| Search Query: ArXiv Query: search_query=au:"Jing Wang"&id_list=&start=0&max_results=10
