Thursday November 16, 2023

Identifying and managing bias in data analysis and interpretation

  • Host
    Eliza Avgeropoulou
  • Panelist
    Victoria Manya
About this webinar

There are different types of bias that can affect the work of a MEAL/IM professional working with data. Being aware and alert can help us overcome them and avoid serious flaws in our data analysis and interpretation.

During this webinar, we go over the concept of bias and how it is linked to our work in M&E. We also discuss ways to manage cases of bias so as to mitigate the risks.

In summary, we explore:

  • What is bias?
  • What are the different types of biases?
  • Why is the identification of bias important?
  • How does bias affect data analysis and interpretation?
  • How is bias associated with monitoring, evaluation and learning?
  • How can we manage bias and mitigate the risks?

View the presentation slides of the webinar.

Is this Webinar for me?

  • Do you wish to understand the concept of bias and how bias can affect your work?
  • Do you wish to explore ways to manage bias during your team's work on data analysis and interpretation?
  • Do you wish to ask questions about bias?

Then, watch our webinar!

About the Speakers

Eliza Avgeropoulou earned her BSc from Athens University of Economics and Business, and her MSc degree in Economic Development and Growth from Lund University and Carlos III University, Madrid. She brings eight years of experience in M&E in international NGOs, including CARE, Innovations for Poverty Action, and Catholic Relief Services (CRS). For the past five years, she has led MEAL system design for various multi-stakeholder projects focusing on education, livelihoods, protection, and cash. She believes that evidence-based decision making is at the core of high-quality program implementation. She now joins us as our M&E Implementation Specialist, bringing together her experience on the ground and passion for data-driven decision making to help our customers achieve success with ActivityInfo.

Victoria Manya has a diverse background and extensive expertise in data-driven impact, project evaluation, and organizational learning. She holds a Master's degree in local development strategies from Erasmus University in the Netherlands and is currently pursuing a Ph.D. at the African Studies Center at Leiden University. With over ten years of experience, Victoria has collaborated with NGOs, law firms, SaaS companies, tech-enabled startups, higher education institutions, and governments across three continents, specializing in research, policy, strategy, knowledge valorization, evaluation, customer education, and learning for development. Her previous roles as a knowledge valorization manager at the INCLUDE platform and as an Organizational Learning Advisor at Sthrive B.V. involved delivering high-quality M&E reports and trainings, ensuring practical knowledge management, and moderating learning platforms. Today, as a Customer Education Specialist at ActivityInfo, Victoria leverages her experience and understanding of data to assist customers in successfully deploying ActivityInfo.

Transcript

00:00:00 Introduction and poll

Thank you, Faith, for the introductions. Welcome once again to the session. Today we will be focusing on the identification and management of bias in data analysis and interpretation. We are eager to hear from each of you about your experiences with bias in monitoring and evaluation.

We have a poll intended to help us hear from you about some of the challenges that you may have encountered with bias in your practice. Your insights will provide us with valuable context for this discussion and for conversations after the session. We see that the majority of the participants today have encountered or dealt with issues relating to bias in M&E practices. If you have not, when we delve into the context around bias, you may recognize that you have seen this in your practice as an M&E practitioner.

00:03:46 Scope of the discussion

It is important to know that we have identified some of the most common biases. However, our conversation is not exhaustive, considering the broad spectrum of biases that exist in M&E and the time that we have today. Our agenda includes exploring the understanding of bias in M&E, delving into various types of biases, addressing the strategies for managing and mitigating bias, and analyzing a report that sheds light on practical mitigation strategies in real-life scenarios as it relates to ICT for Development. Finally, we will complete the session with a Q&A.

00:04:43 Understanding bias in M&E

"The cause is hidden, the result is known." This is a quote that captures our innate human curiosity and the ongoing efforts of human beings to comprehend the reasons behind events. An example of this curiosity manifests in impact evaluation, or monitoring and evaluation, as a method aimed at discerning what is effective, what isn't, and the reasons behind it. While impact is valuable for determining if our program is genuinely making a difference, it is also important for us to know that evaluations are susceptible to various biases.

In the realm of M&E, bias is a distortion that creeps into our findings or evaluations and creates systematic errors. This distortion can skew the results, causing an overestimation or an underestimation of certain characteristics or trends in our data. The origins of bias often trace back to incomplete information or the use of flawed data collection methods. Bias can be either intentional, involving a deliberate manipulation of information, or unintentional, stemming from inadvertent errors or oversights. Ultimately, bias interferes with the accuracy of your assessment and undermines the reliability of M&E outcomes.

In psychology, the concept of optical illusions mirrors how bias operates. The angle, the context, and the positionality of those involved in the evaluation are crucial in determining your interpretations and your outcomes. Essentially, bias becomes a nearly inherent aspect of your M&E endeavors, influenced by the perspectives and circumstances of everyone involved, from data collection to data analysis, visualization, and decision-making.

When we evaluate programs, we usually use two types of information: numbers (quantitative) and description (qualitative). These types of information can be influenced by biases, which are almost like a built-in error. Historically, some evaluators think that Randomized Controlled Trials (RCTs) are the best way to avoid biases by randomly assigning participants to different groups. However, even though RCTs are considered really good, they are not perfect. While we have methods to reduce errors in evaluation, they aren't flawless and might not fit every situation perfectly.

Most types of evaluations face challenges because of biases and external factors. Cognitive and behavioral biases mean the way you think as an evaluator and the way you act can affect how you see and understand information. Even for data collection staff, the way you view a group of people or an area might affect the way you interpret the data. These viewpoints are generally called positionality. It is like wearing different glasses; when looking at the same thing, the glasses represent unique perspectives. Additionally, external pressures like politics and social issues can impact evaluation. Qualitative evaluations, which are more about exploring and understanding, are particularly sensitive to these outside influences.

00:16:10 Types of biases

We may encounter empirical bias, like sensitivity to patterns, attribution errors, self-importance, the halo effect, selection, placement, and statistical biases. We also have researcher bias, which includes the evaluator's allegiance either to a group of people or to a method. We have conservative biases, standpoint or positionality, and similar-person bias. Then we have methodological bias, involving availability bias, diplomatic bias, and courtesy bias. Lastly, we have contextual bias, including friendship bias and pro-project bias.

00:17:48 Empirical bias

Empirical bias includes pattern recognition bias, availability bias, and attribution bias. Pattern recognition bias often leads organizations to perceive short-term positive changes as lasting patterns, potentially overlooking underlying issues. For instance, observing a temporary increase in community well-being after a single aid distribution and assuming a sustained positive trend without addressing systemic challenges.

Availability bias can result in an exaggerated focus on the immediate response, influenced by an overestimation of the likelihood of high-profile events. For example, over-prioritizing immediate disaster response funding after a high-profile hurricane while underestimating the importance of long-term community resilience building.

Attribution bias poses a risk in assessment as attributing positive changes solely to internal efforts without considering external factors. For example, crediting a health organization alone for improved community health without acknowledging external factors like increased access to government healthcare facilities.

Selection bias happens when people choose whether to be part of a program, which can mess up how we see the program's effect. If only certain types of people join a health program, we might not get an accurate picture of how well it works for everyone. Program placement bias complements this; it occurs when we compare places with a program to those without. Because programs are often put in places with specific needs, it is hard to tell if the program or just the different needs are causing any difference.

Attrition bias is a problem when people drop out of a program, making it tough to know if the program would work for everyone. Lastly, social desirability bias means people might give answers in surveys that make them look good even if it is not true.

00:22:13 Researcher bias

Bias can come from the evaluator. Allegiance bias might manifest when researchers are strongly attached to a specific humanitarian approach and dismiss alternative methods. Conservative bias is evident when organizations resist adopting new methods despite evidence supporting their efficiency.

Perspective or positionality bias becomes apparent when your background shapes how you interpret data. For instance, an evaluator from an urban background might overlook unique challenges in rural areas. Response bias might emerge if bias in how disaster survivors respond to assessments is not considered, leading to incomplete or skewed data.

00:25:04 Methodological and project bias

Methodological bias includes courtesy bias, where individuals provide positive feedback to evaluators, especially if affiliated with an NGO, resulting in an overly optimistic portrayal. Diplomatic bias emerges when evaluators, aiming to maintain positive relationships with local authorities, avoid delving deeper into data inconsistencies. Exposure bias occurs when evaluators are disproportionately influenced by their exposure to certain aspects of a project, neglecting less visible components. Friendship bias arises when evaluators have personal connections within the community, impacting impartiality.

Project bias occurs when evaluators linked to specific initiatives intentionally emphasize positive aspects while overlooking challenges to avoid looking like a failure. In qualitative research, interpretation and translation biases can influence data accuracy. Note-taking bias may arise due to variations in the quality of notes during interviews.

00:29:09 How bias affects data analysis and interpretation

The possibility of bias permeates the entire data lifecycle, becoming especially pronounced during data collection, analysis, and interpretation. When biases are present from the outset, they create a faulty foundation that ripples through every stage.

In evaluation, the focus is often on determining whether interventions yield measurable positive impact. This involves formulating hypotheses. A null hypothesis assumes no effect, while an alternate hypothesis proposes a beneficial impact. Bias can result in Type I errors (mistakenly rejecting a true null hypothesis/false positive) or Type II errors (failing to reject a false null hypothesis/false negative). A prevailing pro-action bias in statistical practices introduces a tendency to lean towards finding positive results, often overshadowing a more objective discussion of limitations.
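The relationship between bias and the two error types can be made concrete with a small simulation. The sketch below (illustrative only, using Python's standard library; the sample sizes, effect size, and number of trials are arbitrary assumptions) runs repeated two-group comparisons with a simple two-sample z-test: when there is no true effect, the rejection rate approximates the Type I error rate; when there is a real effect, the shortfall from 100% rejection reflects the Type II error rate.

```python
import random
import statistics

def rejection_rate(true_effect, n=100, trials=2000, alpha=0.05, seed=1):
    """Simulate repeated two-group evaluations and return the share of
    trials in which the null hypothesis of 'no effect' is rejected.

    Outcomes are drawn from normal distributions; the treatment group's
    mean is shifted by `true_effect`. Decisions use a two-sided
    two-sample z-test at significance level `alpha`.
    """
    random.seed(seed)
    crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)  # e.g. 1.96 for alpha=0.05
    rejections = 0
    for _ in range(trials):
        control = [random.gauss(0, 1) for _ in range(n)]
        treated = [random.gauss(true_effect, 1) for _ in range(n)]
        diff = statistics.mean(treated) - statistics.mean(control)
        se = (statistics.pvariance(control) / n
              + statistics.pvariance(treated) / n) ** 0.5
        if abs(diff / se) > crit:
            rejections += 1
    return rejections / trials

# With no true effect, rejections are false positives: the Type I error rate.
type_i = rejection_rate(true_effect=0.0)
# With a real effect, 1 - rejection rate is the Type II error rate.
power = rejection_rate(true_effect=0.3)
```

A pro-positive-results bias in practice amounts to tolerating a higher Type I error rate than the nominal alpha suggests, which is why pre-registered analysis plans matter.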

Non-representative samples pose a risk of biased estimation. Undercoverage bias creates limitations in understanding impact on marginalized populations. Selection bias introduces skewness, and bias due to self-selection arises when voluntary participation affects the conclusion.

00:33:44 Strategies to mitigate bias

The first step to mitigate bias is to acknowledge its existence. We need to promote triangulation of interests to encompass different perspectives. Systemic pressure from political or economic reasons can lead to bias, which is often unconscious. While intentional bias is observable, unintentional bias stems from beliefs and is harder to mitigate.

We need to be systematic, transparent, and reflexive. This means being clear on the research and analysis plans, outlining data sources, specifying collection instruments, and documenting protocols to enable reproducibility. While quantitative data can be less prone to cultural biases, qualitative research requires building rapport to enable less biased responses. We must always document the full methodological account and keep an archive of data.

00:37:58 Specific mitigation strategies

For program placement bias, we should randomize where possible. If randomization is not possible, we can model program placement or use quasi-experimental approaches.

For selection bias, randomization into control and treatment groups is ideal. If not possible, we can model the selection process using statistical methodologies like propensity score matching and difference-in-differences. It is challenging to find a comparable non-participant group; the characteristics must be quite similar. The absence of a valid control group often leads to overestimation or underestimation of impact.

For attrition bias, we need to track dropouts early. We must report the level of attrition and analyze if it has resulted in impact. The challenge is greater when dropouts are not random, meaning specific characteristics led to the dropout.

The presentation slides also include a decision tree for working through these options when addressing selection bias.

For social desirability bias, emphasize the discrepancy between actions and survey responses. Train data collectors to observe reactions, especially in face-to-face interactions. Be mindful that social desirability bias varies by context and social norms.

For cognitive bias, structure sessions to moderate dominating voices and ensure a range of perspectives are heard.

00:47:35 Using technology to mitigate bias

Technology like ActivityInfo enables access to real-time information. We can use data visualization and analysis to increase transparency by creating timely reports shared with stakeholders for corrective action. Real-time reports help identify similarities and differences across participants and non-participants, facilitate identification of selection sources, and monitor dropouts for early indication of attrition.

00:48:48 Practical scenario

Consider an unconditional cash transfer project with a baseline and endline. The project identified challenges in targeting, acknowledging potential selection bias. The survey design builds on the project design to identify control and treatment groups. To mitigate the risk, a table report is created that includes endline surveys as they come in.

By tracking indicators and comparing baseline and endline results between participants and non-participants, we can check for bias. If we see fewer surveys in the endline or cannot find non-project participants, we must ask if this impacts results and if replacements are possible. Observing characteristics like employment status or education level allows us to compare groups and determine if differences are due to self-selection or the project. Identifying these issues early via real-time data allows for correction where possible.
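The group comparison described above can be sketched as a simple profile of observable characteristics per group. The records and field names below are hypothetical, standing in for survey data exported from a tool such as ActivityInfo.

```python
import statistics

# Hypothetical survey records: (group, employed 0/1, years of education)
records = [
    ("participant", 1, 8), ("participant", 0, 6), ("participant", 1, 9),
    ("non-participant", 1, 11), ("non-participant", 1, 12),
    ("non-participant", 0, 10),
]

def group_profile(rows, group):
    """Summarize observable characteristics for one group."""
    subset = [r for r in rows if r[0] == group]
    return {
        "n": len(subset),
        "employment_rate": statistics.mean(r[1] for r in subset),
        "mean_education": statistics.mean(r[2] for r in subset),
    }

participants = group_profile(records, "participant")
comparison = group_profile(records, "non-participant")
# A large gap between the two profiles (here, roughly 3 years of
# education) suggests self-selection, not the project, may explain
# differences in outcomes, and flags the need for corrective action.
```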

00:52:44 Q&A session

How can we detect outliers and extremes in a data series? Detecting outliers is crucial. Common methods include visual inspections to see the overall distribution and identify tails or spikes. Mathematical tools like Z-scores can tell you how far a data point is from the average. You can also calculate interquartile ranges; if a point is significantly outside this range, it could be an outlier. Time series analysis, as available in ActivityInfo, helps identify patterns that don't fit usual trends.
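The Z-score and interquartile-range methods from the answer above can be sketched as follows (a minimal standard-library illustration; the `readings` values are made up, and the thresholds of 3 standard deviations and 1.5×IQR are common conventions, not fixed rules):

```python
import statistics

def zscore_outliers(data, threshold=3.0):
    """Flag points more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(data)
    stdev = statistics.stdev(data)
    return [x for x in data if abs(x - mean) / stdev > threshold]

def iqr_outliers(data, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lo or x > hi]

readings = [10, 11, 9, 10, 12, 11, 10, 95]  # 95 is a likely data-entry error
print(iqr_outliers(readings))  # -> [95]
```

Note that an extreme value inflates the mean and standard deviation it is judged against, so the Z-score method can miss outliers in small samples; the IQR method is more robust in that situation.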

Is it possible to have bias in RCTs? Yes, bias can occur in RCTs. Sources include participants not adhering to assigned treatment, crossover between groups (contamination), and attrition. If the randomization process is not properly implemented, imbalances between groups can introduce bias. No method is infallible.

What about the margin of error in data collection? The margin of error represents the degree of uncertainty. It can increase as a result of biases. A larger margin of error means results are likely farther from true values for the whole population. Randomization and having a large enough sample can help mitigate this and lower the margin of error.
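For an estimated proportion from a simple random sample, the margin of error is z·√(p(1−p)/n), where z is the critical value for the chosen confidence level. A small sketch (the 60%-of-400 scenario is hypothetical) showing how a larger sample shrinks the margin:

```python
import math
import statistics

def margin_of_error(p, n, confidence=0.95):
    """Margin of error for a proportion from a simple random sample."""
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # ~1.96 at 95%
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical: 60% of 400 respondents report an improved outcome.
moe = margin_of_error(p=0.60, n=400)        # ~0.048, i.e. +/- 4.8 points
moe_big = margin_of_error(p=0.60, n=1600)   # ~0.024: 4x the sample halves it
```

This formula assumes random sampling; it does not correct for bias, which shifts the estimate itself rather than widening the interval.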

What kind of bias is it when people select a participant who can speak fluently versus one who cannot? This relates to interpretation or translation bias, or potentially selection bias/researcher bias. Choosing a participant based on fluency rather than their status as a project participant skews the perspective. It is important to include different people and perspectives, perhaps using sampling to mitigate this.

How do you handle situations where a target area is hard to reach or participants refuse to respond? If a village is unreachable, consider if it can be approached by other means (technology). If impossible, try to include another village with similar observable characteristics. Regarding refusal, try to understand why they refuse, as this might imply something about the project. If replacing participants, ensure the replacement has similar observable characteristics to the original.

How to resolve bias issues when managing a database that already contains bias? If the database already exists, you cannot change the data collection or design. You can only influence the analysis and interpretation. You must acknowledge the limitations—stating that certain results cannot be considered valid due to specific reasons—and place this in the limitations section of your report. You cannot "solve" the bias in the data itself at that stage; you can only be transparent about it.
