Sample size
FAQ Sample Size
COI refers to "Common Output Indicators", YEI to "Youth Employment Initiative" and LTRI to "Long Term Result Indicators".
"LTRI Simple" considers one indicator at a time, whereas "LTRI Integrated" takes into account the information needed for all the Long Term Result Indicators together. In other words, the user can either construct one sample for each LTRI using the sheet named "LTRI Simple" or construct a single sample for all the indicators using "LTRI Integrated".
The Population is the total number of units. A Stratum is the subset of units in the population that share certain characteristics. The units in the different strata must sum to the Population. The Sample, on the other hand, is the subset of units in a stratum or in the population that has to be interviewed. The tool takes the number of units in each stratum as input and returns the number of units that have to be sampled as output.
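The consistency requirement above (strata must add up to the population) can be checked before entering figures into the tool. The following is a minimal sketch, not part of the tool itself; the function name `check_strata_total` and all the figures are hypothetical.

```python
def check_strata_total(population_size: int, strata_sizes: dict[str, int]) -> None:
    """Raise ValueError if the strata do not add up to the population."""
    strata_total = sum(strata_sizes.values())
    if strata_total != population_size:
        raise ValueError(
            f"Strata sum to {strata_total}, but the population is {population_size}."
        )


# Hypothetical figures, for illustration only.
strata = {"MR": 1200, "TR": 800, "LR": 500}
check_strata_total(2500, strata)  # passes silently: 1200 + 800 + 500 == 2500
```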
There are many reasons to divide the population into strata according to predetermined characteristics. Among them: stratification improves the representativeness of the sample with respect to the entire population, and it makes it possible to fix a given precision for the estimate of the proportion within each stratum as well.
According to the guidance document, the variable "Region" can take three different values: "More Developed Region" (MR), "Transition Region" (TR) and "Less Developed Region" (LR).
The strata into which the population should be divided are set by the "Monitoring and Evaluation of European Cohesion Policy - European Social Fund" guidance document for the programming period 2014-2020. For example, paragraph 3.1.4 of the guidance document says "The only difference is that for the YEI the breakdown by category of region is not required for any indicator". The tool reflects these requirements.
CL is the acronym for "Confidence Level". It refers to the uncertainty associated with the estimate of the proportion. The user can choose among 0.90, 0.95 and 0.99.
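To give a sense of what the three confidence levels mean numerically, the sketch below converts each one into a two-sided critical value of the standard normal distribution. This assumes a normal approximation, which is the textbook approach for proportions; the page does not state which formula the tool uses, and `critical_value` is a hypothetical helper name.

```python
from statistics import NormalDist


def critical_value(confidence_level: float) -> float:
    """Two-sided standard normal critical value for a confidence level."""
    # Assumption: normal approximation; not confirmed as the tool's method.
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)


for cl in (0.90, 0.95, 0.99):
    print(cl, round(critical_value(cl), 3))
```

A higher confidence level gives a larger critical value, which in turn requires a larger sample for the same margin of error.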
The Confidence Level has to be set at 0.95. The tool also allows it to be set at 0.90 or 0.99, but only for illustrative purposes.
ME is the acronym for "Margin of Error", which is half the width of the confidence interval. For example, if the user sets the ME at 0.05 and estimates that the proportion is equal to 0.2, the confidence interval will include, at most, the values between 0.15 and 0.25.
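The arithmetic of the example above can be sketched as follows. The function name `confidence_interval` is hypothetical, and clipping the bounds to the range 0 to 1 is an added assumption (a proportion cannot fall outside it), not something stated on this page.

```python
def confidence_interval(estimate: float, margin_of_error: float) -> tuple[float, float]:
    """Interval of half-width margin_of_error centred on the estimate."""
    # Assumption: bounds are clipped to [0, 1] because the estimate is a proportion.
    lower = max(0.0, estimate - margin_of_error)
    upper = min(1.0, estimate + margin_of_error)
    return lower, upper


low, high = confidence_interval(0.2, 0.05)
print(f"{low:.2f} to {high:.2f}")  # prints "0.15 to 0.25"
```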
The estimates are considered reliable if the margin of error in the population is equal to or lower than 0.02 and the margin of error in each stratum is equal to or lower than 0.05. These are the minimum requirements that users should set, but the tool allows different margins of error to be set for informative purposes or if users want to achieve more precise estimates.
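The two thresholds can be expressed as a simple check. This is only an illustrative sketch: the constant and function names are hypothetical, and only the limits 0.02 and 0.05 come from the text above.

```python
POPULATION_ME_LIMIT = 0.02  # maximum margin of error for the whole population
STRATUM_ME_LIMIT = 0.05     # maximum margin of error within each stratum


def meets_minimum_requirements(population_me: float, strata_me: list[float]) -> bool:
    """True if the population and every stratum respect the limits."""
    return population_me <= POPULATION_ME_LIMIT and all(
        me <= STRATUM_ME_LIMIT for me in strata_me
    )


# Hypothetical settings, for illustration only.
print(meets_minimum_requirements(0.02, [0.05, 0.04, 0.05]))  # True
print(meets_minimum_requirements(0.03, [0.05, 0.04, 0.05]))  # False
```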
The two tutorials are intended to help the user become familiar with sample size construction. The first tutorial covers non-stratified sampling, in which the user sets the number of units in the population and the tool provides the number of units that have to be sampled. The second covers stratified populations, in which the user first chooses the number of strata and then provides the number of units in each stratum.
Without additional information, the tool computes the sample size under the assumption that the proportion could take any value between 0 and 1. Quite often, however, the range of the proportion can be narrowed a priori. With all other parameters unchanged, if the range of possible values does not include 0.5, the required sample size decreases.
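The effect of narrowing the range can be illustrated with the textbook sample size formula for a proportion (normal approximation with a finite population correction). This is a sketch under stated assumptions: the page does not document the formula the tool actually uses, so its results may differ, and the function names and figures below are hypothetical. The variance term p(1 - p) peaks at p = 0.5, so the worst case within a range is the value closest to 0.5.

```python
from math import ceil
from statistics import NormalDist


def worst_case_proportion(p_min: float = 0.0, p_max: float = 1.0) -> float:
    """Value in [p_min, p_max] closest to 0.5, where p * (1 - p) is largest."""
    return min(max(0.5, p_min), p_max)


def sample_size(population_size: int, margin_of_error: float,
                confidence_level: float = 0.95,
                p_min: float = 0.0, p_max: float = 1.0) -> int:
    """Textbook sample size for a proportion; NOT necessarily the tool's formula."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    p = worst_case_proportion(p_min, p_max)
    # Sample size for an infinite population.
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    # Finite population correction.
    n = n0 / (1 + (n0 - 1) / population_size)
    return min(population_size, ceil(n))


# Hypothetical population of 10,000 units with a margin of error of 0.02.
full_range = sample_size(10_000, 0.02)
narrow_range = sample_size(10_000, 0.02, p_min=0.1, p_max=0.3)
print(full_range, narrow_range)  # the second value is smaller
```

A range that still contains 0.5, such as 0.4 to 0.6, leaves the result unchanged under this formula, because the worst case is still 0.5.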
The range of values depends on prior information about the indicator of interest. For example, if the indicator is about youth unemployment, the user could know from past values of the indicator that a reasonable range is between 0.1 and 0.3.
No. The number of units reported on the webpage is the number of units that must actually be interviewed to achieve the intended margin of error at the specified confidence level.
The sample has to be validated by checking ex post its representativeness in terms of gender, labour market (LM) status, educational attainment and age. The results of the comparison between the sample and the population on these characteristics have to be documented. If they diverge significantly, sampling of the under-represented groups has to continue, or statistical methods have to be used to address the possible bias. This may also be presented in the AIR.
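One simple way to carry out such a comparison is to contrast the share of each group in the sample with its share in the population. The sketch below is illustrative only: the page does not define what counts as a significant divergence, so the `tolerance` value, the function names and all the figures are hypothetical assumptions.

```python
def share_gaps(population_counts: dict[str, int],
               sample_counts: dict[str, int]) -> dict[str, float]:
    """Sample share minus population share for each group."""
    population_total = sum(population_counts.values())
    sample_total = sum(sample_counts.values())
    return {
        group: sample_counts.get(group, 0) / sample_total - count / population_total
        for group, count in population_counts.items()
    }


def under_represented(population_counts: dict[str, int],
                      sample_counts: dict[str, int],
                      tolerance: float = 0.05) -> list[str]:
    """Groups whose sample share falls short by more than the tolerance."""
    # Assumption: 0.05 is an arbitrary illustrative tolerance, not an official rule.
    gaps = share_gaps(population_counts, sample_counts)
    return [group for group, gap in gaps.items() if gap < -tolerance]


# Hypothetical figures, for illustration only.
population = {"women": 5200, "men": 4800}
sample = {"women": 400, "men": 600}
print(under_represented(population, sample))  # ['women']
```

The same comparison can be repeated for LM status, educational attainment and age, and the resulting gaps kept as part of the documentation.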
| Originally Published | 12 Oct 2020 |
| Last Updated | 03 Jan 2022 |
| Related project & activities | Centre for Research on Impact Evaluation (CRIE) |
| Related organisation(s) | JRC - Joint Research Centre |
| Knowledge service | Metadata | Microeconomic Evaluation |