3.2.3 Sample Theory
Explains sampling theory and an approach for applying it with the h2o library.
Sampling Distributions
A sampling distribution describes how a specific statistic varies across repeated samples drawn from the same population. In practice, we draw subsets (samples) of the full data set and use them to explore and simulate statistics such as the mean, variance, and skewness.
Sampling distributions let us draw conclusions about a population from sample statistics. A sample is a statistical representation of the actual population. Before we dive into how to do sampling, let's go over a few key terms that help us calculate sample sizes.
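To make the idea concrete, here is a minimal sketch that simulates the sampling distribution of the mean. The population, the sample size of 50, and the helper name `sampling_distribution_of_mean` are assumptions chosen for illustration; they do not come from the original project.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical skewed population of 10,000 values (exponential, mean about 1).
population = [random.expovariate(1.0) for _ in range(10_000)]

def sampling_distribution_of_mean(data, sample_size, n_samples):
    """Draw n_samples random samples and return the mean of each one."""
    return [
        statistics.fmean(random.sample(data, sample_size))
        for _ in range(n_samples)
    ]

sample_means = sampling_distribution_of_mean(population, sample_size=50, n_samples=1_000)

# The sample means cluster around the population mean, and their spread
# is much smaller than the spread of the individual values.
print("population mean:", round(statistics.fmean(population), 3))
print("mean of sample means:", round(statistics.fmean(sample_means), 3))
print("sd of sample means:", round(statistics.stdev(sample_means), 3))
```

The list `sample_means` is the simulated sampling distribution: each entry is the statistic computed on one sample.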
I. Confidence Interval
Confidence intervals are intervals that we construct to estimate, with a stated level of confidence, where a parameter of interest lies.
Confidence Interval Width: The distance between the upper and lower bounds of the confidence interval.
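As an illustration of both terms, the sketch below computes a 95% confidence interval for a mean and its width. It assumes a normal approximation with z = 1.96; for a sample this small a t-based interval would usually be preferred. The data and the function name `mean_confidence_interval` are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(values, z=1.96):
    """Return (lower, upper) for the mean using a normal approximation.

    z=1.96 corresponds to a two-sided 95% confidence level.
    """
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    margin = z * standard_error
    return mean - margin, mean + margin

# Hypothetical measurements, used only for illustration.
values = [12.1, 11.8, 12.6, 12.0, 11.5, 12.3, 12.9, 11.7, 12.2, 12.4]
lower, upper = mean_confidence_interval(values)
width = upper - lower  # confidence interval width: upper bound minus lower bound
print(f"95% CI: ({lower:.2f}, {upper:.2f}), width = {width:.2f}")
```

A larger sample shrinks the standard error, which narrows the interval.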
II. Method & Formula for Sampling
In the real world, working with an entire population's data can be slow and heavy, but we can use sampling distributions to estimate what a population parameter most probably is. The general process is as follows:
Get your data.
Figure out what you want to estimate (e.g., binary classification or regression).
Bootstrap that parameter.
Create confidence intervals.
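The four steps above can be sketched with a percentile bootstrap. This is one possible way to do it, not necessarily the procedure used in the project: the synthetic binary target, the 2,000 resamples, and the function name `bootstrap_ci` are all assumptions.

```python
import random
import statistics

def bootstrap_ci(data, statistic=statistics.fmean, n_resamples=2_000,
                 confidence=0.95, seed=42):
    """Percentile bootstrap confidence interval for `statistic`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lower_idx], estimates[upper_idx]

# Step 1: hypothetical binary target (1 = positive class), about 30% positives.
rng = random.Random(0)
target = [1 if rng.random() < 0.3 else 0 for _ in range(500)]

# Steps 2-4: estimate the positive rate, bootstrap it, and build the interval.
low, high = bootstrap_ci(target)
print(f"positive rate: {statistics.fmean(target):.3f}, 95% CI: ({low:.3f}, {high:.3f})")
```

Passing a different `statistic` (for example `statistics.median`) reuses the same resampling loop for another parameter.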
The minimum sample size depends on the type of problem, that is, on the distribution of the target that the model is built to predict.
Binary:
For the other distributions, we use a rule of thumb for the minimum sample size:
Multivariate: n >= 100 + 50k
Regression: n >= 104 + k
Note: k is the number of independent variables.
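The two rules of thumb translate directly into code. The function name `min_sample_size` and the string labels are hypothetical; only the two formulas come from the text above.

```python
def min_sample_size(problem_type, k):
    """Rule-of-thumb minimum sample size for k independent variables."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if problem_type == "multivariate":
        return 100 + 50 * k
    if problem_type == "regression":
        return 104 + k
    raise ValueError(f"unknown problem type: {problem_type!r}")

print(min_sample_size("multivariate", k=10))  # 600
print(min_sample_size("regression", k=10))    # 114
```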
III. Implementation - Python function
We created our own function:
Assumptions:
Error% = 1%
Confidence Interval = 95%
z = 1.96
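The project's own function and the formula for the binary case are not reproduced on this page. As a stand-in, the sketch below uses Cochran's formula for a proportion, n0 = z^2 * p * (1 - p) / e^2, which is a common choice for binary targets but is an assumption here. It uses a 1% margin of error and z = 1.96 for a 95% confidence level; the worst-case proportion p = 0.5, the optional finite population correction, and the name `binary_sample_size` are also assumptions.

```python
import math

def binary_sample_size(error=0.01, z=1.96, p=0.5, population_size=None):
    """Sample size for estimating a proportion (Cochran's formula).

    error: margin of error (0.01 = 1%)
    z: z-score for the confidence level (1.96 for 95%)
    p: expected proportion; 0.5 gives the most conservative size
    population_size: if given, apply the finite population correction
    """
    n0 = (z ** 2) * p * (1 - p) / (error ** 2)
    if population_size is not None:
        n0 = n0 / (1 + (n0 - 1) / population_size)
    # Round first to guard against floating-point noise, then round up.
    return math.ceil(round(n0, 6))

print(binary_sample_size())                         # large population
print(binary_sample_size(population_size=100_000))  # finite population
```

With these defaults the uncorrected result is 9,604 rows; the correction lowers the requirement when the population itself is not much larger than that.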
IV. Sampling Theorem Script:
We used the h2o library to run and test the sampling, which supports the interpretability and evaluation of the interpretable models.