Part 2 of 2
Thursday June 17, 2021

Complex survey samples in M&E and adapting sampling methods to meet field needs

  • Host
    Fay Candiliari
  • Panelist
    Alexander Bertram
About the webinar

This webinar is a one-hour session ideal for Monitoring and Evaluation professionals who are interested in learning more about complex survey samples in Monitoring and Evaluation and how sampling methods can be adapted to meet the needs and the reality of the field.

Some of the key points we cover are:

  • Cluster samples
  • Stratified samples
  • Multi-stage samples
  • Non-sampling errors: Selection bias, Convenience sampling, Refusals
  • Sampling in challenging environments: Without a well-known population, Techniques for randomization (random walks, systematic sampling, etc.)

To prepare for this webinar, you can find and watch the first part, “A guide to choosing sample sizes for M&E practitioners”.

Is this webinar for me?

  • Are you an M&E practitioner responsible for designing surveys and data collection tools for your programmes?
  • Do you wish to understand better how to work with complex samples for your surveys and what to keep in mind while doing that?
  • Do you want to ask questions regarding complex sample sizes?
  • Do you want to explore how sampling can be adjusted to meet the requirements of challenging environments?

Then watch our webinar!

About the Trainer

Mr. Alexander Bertram, Technical Director of BeDataDriven and founder of ActivityInfo, is a graduate of the American University's School of International Service. He started his career in international assistance fifteen years ago with IOM in Kunduz, Afghanistan, and then with Altai Consulting in Kabul, where he worked on the planning and analysis of large-scale nationwide surveys. He later worked as an Information Management officer with UNICEF in DR Congo. With UNICEF, frustrated with the time required to build data collection systems for each new programme, he worked on the team that developed ActivityInfo, a simplified platform for M&E data collection. In 2010, he left UNICEF to start BeDataDriven and develop ActivityInfo full time. Since then, he has worked with organizations in more than 50 countries to deploy ActivityInfo for monitoring & evaluation.

Transcript

00:00:00 Introduction

Fay: Hello and welcome to today's webinar. This webinar is a one-hour session ideal for Monitoring and Evaluation professionals who are interested in learning more about complex survey samples in Monitoring and Evaluation and how sampling methods can be adapted to meet the needs and the reality of the field.

Some of the key points we cover are cluster samples, stratified samples, multi-stage samples, and non-sampling errors. You can find and watch the first part of the webinar, "A guide to choosing sample sizes for M&E practitioners," in order to prepare better for the content of this webinar.

Our trainer today is Mr. Alexander Bertram, Technical Director of BeDataDriven and founder of ActivityInfo. He is a graduate of the American University's School of International Service and started his career in international assistance fifteen years ago working with IOM in Afghanistan, and then with Altai Consulting in Kabul where he worked on the planning and analysis of large-scale nationwide surveys. He later worked as an Information Management officer with UNICEF in DR Congo.

Alex: Thank you so much, Fay, for that introduction. I'm really excited to do this second follow-up to our previous webinar on sampling. It's something that's near and dear to my heart; I can talk forever about it. I find it very fascinating, and it is very neat to see so much interest. Today we're going to talk about complex surveys and, more generally, how do you adapt these ideal methods to the realities of the field.

00:02:30 Key concepts in sampling

As learning objectives, we have three goals that I hope you'll be able to come away from this hour with. First, I hope you'll be able to identify what a simple random sample is. Second, to identify—or at least be aware of—the names and possibilities for strategies to address common challenges in sampling. Third, to be aware of non-sampling errors that can affect your results.

This webinar is meant to be accessible for everybody working in M&E, so I'm not going to assume that you have any previous knowledge of survey sampling or statistics. However, there are a few key concepts and words that you will need to know to follow the presentation. I'm sorry if this is obvious for many of you, but I think it's worth quickly reviewing the most important concepts before we get started.

First up: population and sample. When we're talking about surveys and sampling, the word "population" has a special meaning. It refers to the entire group of people or things about whom we want to draw conclusions. In many cases, it's too expensive to collect data for a whole population—we can't go to every single individual and get information from them—so we choose a sample. The sample is a specific group that you will collect data from. Finally, the sampling method is the process of choosing which members of the population are included in this sample. Internalizing these three concepts will help you solve a lot of problems related to surveys in M&E.

To illustrate, say that you have a training program that has 160 participants. We want to measure something about them; for example, out of these 160, how many have increased their income after our intervention? For whatever reason, we can't talk to all 160 of them, so we take a sample of 10. The population is everybody; the sample is the small group that we are interviewing. The population doesn't have to be people. It could be specific kinds of people, institutions, buildings, households, or something more specific like victims of gender-based violence. The population is the first step in planning a survey: identifying which group we are interested in.

The next two terms are "estimate" and "error." The sample estimate is the result that we get from our survey—the information we get from those 10 people we selected. Error is the difference between our sample estimate and the population parameter, or the true value. If 46 people in our population increased their income (29%), but in our random selection of 10 people we find that four have increased their income (40%), the difference between these two numbers is the error. In this case, our sample was 11 percentage points different from the true value. We want to design a sample so that we are close, so that we get a good estimate that minimizes this error.

Finally, I want to discuss biased versus unbiased estimates. Bias is a word we use in common speech, but in sampling, it has a specific meaning. A biased estimate is one that is more likely to go in one direction—it is more likely to be lower or higher than the true value. An unbiased estimate is just as likely to be lower or higher; it's going to be a fair value. We will be focusing on looking for unbiased estimates from our samples.

00:08:15 Simple random sampling vs. convenience sampling

Before we get to complex samples, it is worth identifying what a simple random sample is. This is the best kind of sample—the ideal sample. Let's use a concrete example. Let's say you have a program to increase women's participation in the security sector in your country. It's a three-year program with commercials, trainings, and workshops trying to get meaningful participation of women in the police and army. You want to measure the impact. You might conduct a survey of an indicator, such as the percentage of security staff holding a positive perception of women's entry, advancement, and leadership in the security sector.

Our population is all 33,544 members of the police and the army. That is far too many to interview directly, so we need to take a sample. Let's contrast the simple random sample with convenience sampling. In a simple random sample, you have a well-defined population, and every single member of that population has an equal and independent chance of being selected. Nobody is more likely to be included than anybody else. If we do this, we have a whole field of math and probability theory to estimate the margin of error, calculate sample size, and be assured that the estimate is unbiased.

Contrast this with convenience sampling. Convenience sampling is widely used but has shortcomings. You are sampling at the interviewer's or the respondent's convenience. For example, if I live in the capital city, I might just walk to a couple of police stations in my neighborhood and see who is there. Or you might do a web survey and share a link on Twitter. The downside is that you have no way to know what kind of error you have. Are the people I found at the police station at 3:00 PM comparable to those who were not there? Are they different from people in other provinces? You have no tools to determine whether it's unbiased. If a donor asks where you got a number, you can't defend a convenience survey. If you use random sampling, you can defend your work by showing you followed steps to ensure everyone had an equal chance of selection.

A simple random sample requires two things: a complete list of the population (the sample frame) and ensuring every person has an equal chance of being selected. For example, you might get a list from the Ministry of Defense and use Excel's random number function to select your sample.
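
The Excel approach described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the webinar; the sample frame here (sequential staff IDs for all 33,544 members) is a hypothetical stand-in for the real ministry list.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw a simple random sample of size n from a sample frame.

    Every member of the frame has an equal and independent chance
    of being selected, mirroring the random-number approach above.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sample frame: one ID per member of the police and army.
frame = list(range(1, 33545))  # 33,544 members
sample = simple_random_sample(frame, 100, seed=42)
print(len(sample), len(set(sample)))  # 100 distinct IDs
```

The `seed` parameter is only there to make the draw reproducible for documentation purposes; in a real survey you would record the seed or the drawn list so the selection can be audited.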

00:14:45 Cluster sampling

In the real world, you face challenges. What if you don't have a full list of the population? What if the Ministry won't give you the list, or there is no central list due to decentralization? This is where we turn to complex samples. In a complex sample, members may not have an equal chance of being selected, but they still need to have a known chance. Your design might include clustering, stratification, and multiple stages.

Clustering is where we divide the population into natural groups, like police stations, villages, schools, or clinics. First, we choose the clusters randomly, and then we go into the clusters and choose people randomly within them. The biggest reason to do clustering is to reduce transportation costs. The downsides are that it might be less precise (more error than a simple random sample) and may require weighting the data for analysis.

If I just draw people randomly from my list for a nationwide survey, I might end up with one or two people per village across hundreds of villages. I would have to travel to each village just to interview one person, which is expensive. In clustering, instead of choosing people first, we choose clusters. We might choose four clusters and then interview individuals randomly within those clusters. This allows us to be much more efficient.
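
The two-stage logic described here—first draw clusters at random, then draw people within each selected cluster—can be sketched as follows. The village and household names are hypothetical placeholders for a real cluster frame.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly choose clusters with equal probability.
    Stage 2: randomly choose people within each selected cluster.

    `clusters` maps a cluster name to its list of members.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {c: rng.sample(clusters[c], n_per_cluster) for c in chosen}

# Hypothetical frame: 20 villages of 50 households each.
villages = {f"village_{i}": [f"v{i}_hh{j}" for j in range(50)]
            for i in range(20)}
plan = two_stage_cluster_sample(villages, n_clusters=4, n_per_cluster=10, seed=1)
print({c: len(hh) for c, hh in plan.items()})  # 4 clusters, 10 households each
```

With four clusters of ten interviews each, the field team only travels to four villages instead of potentially dozens, which is the cost saving the paragraph above describes.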

Does the choice between simple random sampling and complex sampling depend on population size? Not really. Cluster sampling is about reducing transportation costs. If you are doing a telephone survey or email survey, you don't need to worry about clustering because there are no travel costs.

A key concept in clustering is homogeneity, or the Intracluster Correlation (ICC). In the best-case scenario, each of your clusters looks like a mini-version of your population. For example, the age breakdown in one village is similar to the next. However, if you have a high ICC, cluster sampling can be dangerous. For example, if you want to measure access to an elementary school, clustering by village is risky because usually, either everyone in the village has access or nobody does. If you happen to select only villages with schools, your result will be 100% access, which is wrong. If you select only villages without schools, you get 0%. You need to be aware that not everything can be measured reliably with a clustered survey.

00:22:00 Weighting and probability of inclusion

In the real world, clusters are not always equally sized. Not every village is the same size. This means people in small villages might have a bigger chance of being selected, which leads to a biased estimate. To get an unbiased estimate, we need to weight the data.

Let's look at the math of "inclusion probability," which is the chance that you will be selected. Imagine four police stations with populations of 100, 200, 100, and 300 (Total 700). If we choose two clusters with equal probability, each station has a 50% chance of being selected. If we then interview 10 people in Station 1 (population 100), the individual chance is 10 out of 100, or 10%. In Station 4 (population 300), interviewing 10 people gives an individual chance of 10 out of 300, or 3%.

The total inclusion probability is the probability of the cluster being selected multiplied by the probability of the individual being selected. For Station 1, it is $0.5 \times 0.1 = 0.05$ (5%). For Station 4, it is $0.5 \times 0.033 \approx 0.017$ (about 1.7%). The person in the small station has a much higher chance of being included.

To compensate, we assign a weight to each respondent. The weight is the reciprocal of the probability of inclusion. For Station 1, the weight is $1 / 0.05 = 20$. For Station 4, the weight is $1 / 0.0167 \approx 60$. We use these weights to avoid bias.
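
The worked example above can be reproduced directly in code. This sketch just replays the arithmetic from the transcript: four stations, two clusters drawn with equal probability, ten interviews in each selected station.

```python
# Four stations, choose 2 of 4 clusters with equal probability,
# then interview 10 people in each selected station.
station_sizes = {"Station 1": 100, "Station 2": 200,
                 "Station 3": 100, "Station 4": 300}
p_cluster = 2 / len(station_sizes)  # 0.5 for every station
n_interviews = 10

weights = {}
for name, size in station_sizes.items():
    p_individual = n_interviews / size      # e.g. 10/100 = 10% in Station 1
    p_inclusion = p_cluster * p_individual  # cluster prob x individual prob
    weights[name] = 1 / p_inclusion         # weight = reciprocal of inclusion
    print(f"{name}: inclusion {p_inclusion:.3f}, weight {weights[name]:.0f}")
```

Station 1 comes out with a weight of 20 and Station 4 with a weight of 60, matching the hand calculation: each respondent in the large station "stands for" three times as many people.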

Alternatively, you can use Probability Proportionate to Size (PPS). Instead of selecting clusters with equal probability, you give bigger stations a larger chance of being selected. If you do the math correctly, the probabilities balance out so that the total inclusion probability for each person is the same. If everyone has the same weight, you don't need to use weights during the analysis, which simplifies things and can reduce error.
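
One common way to implement PPS is systematic PPS selection: lay the cluster sizes end to end on a line, pick a random starting point, and step through at a fixed interval, so bigger clusters cover more of the line and are proportionately more likely to be hit. This is a sketch of that technique, not the specific procedure used in the webinar.

```python
import random

def pps_systematic(sizes, n_clusters, seed=None):
    """Systematic PPS: cumulate the cluster sizes, pick a random start
    in [0, interval), then select the cluster under each step point."""
    rng = random.Random(seed)
    total = sum(sizes.values())
    interval = total / n_clusters
    start = rng.uniform(0, interval)
    points = [start + k * interval for k in range(n_clusters)]

    # Build cumulative boundaries: each cluster owns a segment of the line.
    bounds, cum = [], 0.0
    for name, size in sizes.items():
        bounds.append((cum, cum + size, name))
        cum += size

    chosen = []
    for p in points:
        for lo, hi, name in bounds:
            if lo <= p < hi:
                chosen.append(name)
                break
    return chosen

sizes = {"Station 1": 100, "Station 2": 200,
         "Station 3": 100, "Station 4": 300}
print(pps_systematic(sizes, n_clusters=2, seed=7))
```

With PPS and a fixed number of interviews per selected cluster, each person's total inclusion probability works out the same (here $2 \times \frac{\text{size}}{700} \times \frac{10}{\text{size}} = \frac{20}{700}$ for everyone), which is exactly why the weights cancel out.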

00:31:30 Stratified sampling

The next tool is stratification. In clustering, we randomly choose certain groups to sample. In stratification, we divide the population into groups (strata) and sample in all of the groups. Why do this? Sometimes it is logistically simpler, but primarily it allows you to oversample certain groups of interest. The cons are that oversampling results in more error for your total sample than a simple random sample, and it requires weighting.

For example, if we are interested in women in the security services, and they make up only 5% of the population, a simple random sample of 100 people might only yield 5 women. That is not enough data to draw conclusions about women specifically. To fix this, we stratify. We divide the population into men and women and choose to interview more women (oversampling) to ensure we have enough data points.
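
Analysis of an oversampled design again relies on weights, this time per stratum: each stratum's weight is its population count divided by its sample count ($N_h / n_h$). The counts below are illustrative, chosen so that women make up roughly 5% of the 33,544-strong population from the example.

```python
# Hypothetical strata counts: women are ~5% of the security services.
population = {"men": 31867, "women": 1677}   # sums to 33,544
sample     = {"men": 50, "women": 50}        # equal take = strong oversample

# Design weight per stratum: N_h / n_h.
weights = {h: population[h] / sample[h] for h in population}
print(weights)
```

Each interviewed man "stands for" far more people than each interviewed woman, so combined estimates for the whole service must apply these weights; unweighted totals would overstate whatever is distinctive about the oversampled stratum.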

Whether you oversample depends on your research question. If you only care about the security service as a whole, you don't need to oversample. If you need details about a specific subgroup, you stratify.

00:36:00 Case study: Sampling in Afghanistan

I want to illustrate these ideas with a case study from work I did in Afghanistan over a decade ago. We faced many challenges: no recent census, no list of the population, difficult transportation, high costs, and security constraints. Additionally, to respect local culture, we needed separate interviewers for men and women.

This led to a complex, multi-stage sample design. We started by stratifying by province. We did 100 interviews per province because having provincial breakdowns was important for the research. We then divided each province into rural and urban strata because the logistics were different. In rural areas, we had lists of villages and could cluster by village using Probability Proportionate to Size.

Within each village, we stratified by gender. We had a male interviewer and a female interviewer. They traveled together for cost reasons, but they conducted separate samples. We clustered by household, choosing 10 households from each primary sampling unit, and then selected one respondent in each household.

This multi-stage sampling helps when you don't have a sampling frame. We used the best data available—estimates of population breakdowns—to calculate weights. We selected villages based on available lists from NGOs or census preparations.

00:44:30 Techniques for household selection

Once you get to the village, how do you select households? You want to give every household an equal chance.

The gold standard is a household listing. You arrive at the village, make a list of every single household, and then use a random number generator to draw your sample. This gives you high confidence but is very expensive and time-consuming.

Another method is the random walk. This is a protocol given to interviewing teams. For example, start at a mosque, walk towards the sun, and select every fifth household. The goal is to take the choice away from the interviewer to avoid convenience sampling bias (e.g., avoiding houses that look unfriendly). Research has shown random walks can introduce bias, such as centering on the middle of the village or underestimating poverty.
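
The "every fifth household" rule in a random walk is a form of systematic sampling, and the selection step itself is trivial to express. This is a sketch of the counting rule only; the walk direction, starting landmark, and call-back protocol are what the field instructions would actually have to specify.

```python
def every_kth_household(n_households, k, start):
    """Systematic selection along a walk route: beginning at household
    `start` (1-based), take every k-th household encountered."""
    return list(range(start, n_households + 1, k))

# e.g. 40 households along the route, every 5th starting at house 3
print(every_kth_household(40, 5, 3))  # -> [3, 8, 13, 18, 23, 28, 33, 38]
```

Randomizing the start point (here, house 3) is what removes the interviewer's discretion; if the interviewer picks the start, the selection drifts back toward convenience sampling.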

A third method is grid or area-based sampling. You can use Google Earth to outline structures and create a grid, then tell interviewers which grids to interview. This is more robust but has higher costs. In Afghanistan, we mostly used a variation of the random walk protocol due to budget constraints.

00:49:00 Q&A: Sample size and non-response

Alex: I see a question about response rates. "You carry out a web survey sent to all participants but only 33% respond."

Response rates are problematic because it means your sample could be biased towards the people who chose to respond. Your survey effectively becomes a convenience sample. The key to dealing with non-response is follow-up. In a household survey, if no one is home, you must do call-backs. The reasons someone is not home (e.g., they have a job) are likely correlated with the things you are measuring.

If you send a survey to 500 people and get a 33% response rate, you have a convenience sample. It is better to choose a random sample of 50 people and spend the time and resources to follow up relentlessly to get a response from those specific 50 people. That will yield a better, unbiased survey.

Question: Is it better to use oversampling?

Alex: It depends on your research questions. If you need to know about a small group (like women in a male-dominated sector), you must oversample to get enough data. If you are only interested in the population as a whole, you do not need to oversample.

00:56:00 Q&A: Convenience sampling vs. random sampling

Matteo (Question): Convenience sampling is often easier and quicker, meaning we can interview more people. A bigger sample could be more representative than a smaller one. So in some situations, isn't convenience sampling preferable?

Alex: I really appreciate this question because it addresses a common misconception. I have a problem with the word "representative" because it is hard to define mathematically. It is better to evaluate samples based on error and bias.

Let's compare the two. With the random sample, we can calculate the margin of error (e.g., +/- 10%). With the convenience sample, we cannot calculate the error because the math requires random selection. We have no idea what the error is.

More importantly, let's talk about bias. With a properly conducted random sample, we have reason to believe the estimate is unbiased—it is not likely to be consistently high or low. With a convenience sample, we have no idea what the bias is. Did we only talk to people on their way to work? Did we only talk to people in the capital city? Those groups are likely very different from the general population.

If you give me a choice, I will take the random sample every single time. At least I know what I am getting and can plan for it. With a convenience sample, the information might be worse than useless.

01:05:00 Conclusion

As a responsible M&E practitioner, if you don't have the budget to do a random sample responsibly, consider alternatives. Maybe quantitative research isn't right for your project at this stage. Consider key informant interviews or focus groups. You won't get a percentage, but you will learn something valid.

Please don't do a convenience sample and put the numbers in a report. As soon as you put numbers in a report, people take them seriously and make decisions based on them. Be careful with quantitative research.

Thank you all for joining. Please take a look at ActivityInfo for your M&E data collection needs, including surveys, beneficiary tracking, and case management. We will share the recording and slides. Have a great evening.
