Part 2 of 3

Wednesday August 2, 2023

Discover the power of Calculated measures in ActivityInfo

Host

Jeric Kison
Panelist

Victoria Manya

About this session

In this session, we explore the power of calculated measures and how you can use them for advanced analysis with pivot tables in ActivityInfo. It is highly recommended to join or watch the introductory session of this series to get a better understanding of pivot tables and formulas. Following this session, you can raise your questions to our team in the third and final part.

In summary, we explore:

Combine data from different forms into a single measure
Utilize different types of aggregations in the same measure
Aggregate data at multiple levels for comprehensive insights

View the presentation slides of the Webinar.

Use the following practice demo database to try out the activities mentioned in the webinar.

In order to best prepare for this session, we recommend studying the following documentation:

Is this Webinar for me?

Are you struggling with advanced data analysis for your project reporting?
Do you need to combine data from different forms effectively using ActivityInfo?
Are you seeking innovative solutions for optimized data analysis?

Then, watch our Webinar!

Other parts of the series

The “Pivot tables and Calculated measures for data analysis with ActivityInfo” webinars is a series of three sessions addressed to new and older users of ActivityInfo who wish to get introduced to the built in analysis tools of the platform or dive into more advanced features such as the calculated measures.

About the trainers

Mr. Jeric Kison earned his Bachelor's Degree from York University in Canada and his MBA from the University of Oxford in the United Kingdom. He has worked with NGOs and governments across four continents on strategy and evaluation for nine years. Before joining ActivityInfo he worked as a Monitoring & Evaluation Officer at Pilipinas Shell Foundation, Inc., where he led a project to develop an organizational M&E System which included the roll-out of ActivityInfo as the organization’s new information management system. Today, Jeric is working as a Customer Success Director in the ActivityInfo team bringing together his experience on the ground and passion for data to help our customers achieve success with ActivityInfo.

Victoria Manya has a diverse background and extensive expertise in data-driven impact, project evaluation, and organizational learning. She holds a Master's degree in local development strategies from Erasmus University in the Netherlands and is currently pursuing a Ph.D. at the African Studies Center at Leiden University. With over ten years of experience, Victoria has collaborated with NGOs, law firms, SaaS companies, tech-enabled startups, higher institutions, and governments across three continents, specializing in research, policy, strategy, knowledge valorization, evaluation, customer education, and learning for development. Her previous roles as a knowledge valorization manager at the INCLUDE platform and as an Organizational Learning Advisor at Sthrive B.V. involved delivering high- quality M&E reports, trainings, ensuring practical knowledge management, and moderating learning platforms, respectively. Today, as a Customer Education Specialist at ActivityInfo, Victoria leverages her experience and understanding of data leverage to assist customers in successfully deploying ActivityInfo.

Transcript

00:00:00 Introduction

Hi, everyone. Good afternoon, good morning, or good evening, depending on where you're coming from. Victoria and I are very excited to be with you today, once again as part of our calculated measures webinar series. We are running this three-part webinar series to help our users take advantage of the enhancements to ActivityInfo's analysis capabilities that were part of our latest release.

If you were with us last week, we had our foundational session where we walked you through the foundational features that help us to take advantage of calculated measures. We talked through how to work with formulas and pivot tables as part of that first session. Now we're in the second session where we will be looking at how we can actually master everything that calculated measures has to offer.

Today's session will be quite an interactive session. We did share the link to a demo database, and we encourage you to use that database to follow along. We'll go through some practical examples of how you can use calculated measures, and we'll refresh as well how they actually work in principle. Following today, next week, we'll be back with an office hours session. You can come to that session prepared with any questions that you have based on the work that you've been doing within your own organizations and how you've applied calculated measures.

00:03:48 Understanding calculated measures

We focus on pivot tables as the host that accommodates calculated measures. In the context of expanding the analytical capabilities of pivot tables, we simply refer to our ability to enhance the functionality of pivot tables by introducing calculated measures. A calculated measure in this context is a user-defined calculation that is added to a pivot table to perform advanced computations based on existing data. As we will see, the calculated measure feature opens a world of possibilities in pivot tables.

How do formulas fit into this picture? Previously in our webinar from last week, we took time to explain how formulas aid the design of pivot tables and the creation of calculated measures. A calculated measure is created using a formula or an expression within a data model and pivot tables. When you add a calculated measure, you are simply introducing a new data point that isn't directly present in the original data set but is derived from the existing data, and you do this with the aid of formulas. Essentially, formulas allow you to create your own custom calculations to fit your specific requirements.

When you create a calculated measure in the context of data analysis and pivot tables, it means that you are adding this new measure or this metric to your data model that is not explicitly present as a field in the original data set. Calculated measures allow you to perform this custom calculation and introduce new insights based on this existing data within your form in ActivityInfo.

In a typical data model, you have different fields that represent various aspects of your data. Some of the fields could be, for instance, gender fields referring to a number of female beneficiaries in an intervention, or the number of males, or the number of IDPs. These fields are known as measures or dimensions. Generally, when you create a pivot table, you can use these measures to summarize and analyze the data based on different criteria, like across dates, regions, or partners. However, sometimes the standard measure provided by the dataset may not be sufficient to answer specific analytical questions. This is where calculated measures come into play.

One crucial advantage of calculated measures is that they are not bound by row context. In pivot tables, the data is often organized into rows and columns, and calculations are performed by row. However, calculated measures allow you to perform calculations that consider the entire data set or a specific group of data, disregarding this row-level context. This freedom opens up a wide range of analytical possibilities.

00:09:48 Use cases for calculated measures

When you create a calculated measure, you can do a number of things with it. First, it allows you to combine data from different forms into one measure. Imagine that your data is distributed across different forms in ActivityInfo. With a calculated measure, you can bring all that data together into one single measure. For example, let's say you have data on the total number of workshops and training sessions, and you want to find the total number of capacity-building activities. With a simple formula, you can add them up to get the total number of capacity-building activities implemented across your entire project, even though this information was initially stored in different forms.

Secondly, with calculated measures, we can use different types of aggregations together in the same measure. It is like putting puzzle pieces together to see the bigger picture. For instance, you can divide the sum of teachers by the sum of students to find the student-to-teacher ratio, or the sum of male beneficiaries to female beneficiaries to find the gender ratio.

Third, calculated measures allow you to aggregate data multiple times at different levels. For instance, you might want to know the total number of beneficiaries at the country level, but you have data at the district level. You want to ensure that you can create a formula that adds up the beneficiaries in each district and then find the maximum for each activity within the district. Considering the possibility for double counting, you can create a formula that adds up the beneficiaries in the district, finds the maximum for each activity within the district, and finally adds up those maximums to get the total at the country level. You can see this as building a data pyramid one step at a time.

Finally, you can apply an explicit filter to run a calculation on a subset of data. Here with calculated measures, you can zoom in on specific details. For example, you might want to summarize results for only a particular sector or a particular sub-sector. All you need to do is set up the formula that helps you filter, and you get the exact information that you need. It is like focusing a camera lens on the most critical aspects of your data.

00:14:28 Supported functions

Calculated measures in ActivityInfo come with several real functions designed to work with relational data. These new functions are meant to complement the existing functions that we already support.

The parameters for the syntax are "Table," which represents the form containing the data to be iterated over, and "Expression," which represents the numeric expression to be evaluated for each row in the table.

In addition to these aggregation functions, we will introduce another set of functions which support table manipulation. These functions allow you to create what we call calculated tables. Calculated tables are custom tables that you can create to organize your data in a way that suits your analysis better. Sometimes the existing forms or tables in your database might not be structured exactly how you need them for your analysis. Calculated tables enable you to group together values based on specific dimensions or to combine data from multiple forms.

00:25:40 Demo background and first quiz

We have for our use the "Who is doing What, Where" (3W) database template. The purpose of this database is to identify actors responding to a humanitarian situation, the sectors that they are working in, and the locations. It supports response coordination, partnership identification, assessing progress in an intervention, and identifying responses that overlap or gaps in responses. In our example today, we have two clusters working within two sectors, and the information is contained in the Health form and the WASH cluster form.

Let's delve into our quiz for the day. Assuming we want to aggregate the total number of beneficiaries that are reached in the WASH and Health cluster of this sample database, what would be the correct formula to use?

For the majority of the people who chose option A, that is indeed the correct response. Option A represents the correct formula because it uses the SUMX function to sum the total beneficiaries reached from the two different tables (WASH and Health) and then aggregates the total beneficiaries reached to get a total sum. Option B is not correct because it uses a COUNTX function. Option C uses AVERAGEX, and Option D uses MINX, which are not appropriate for calculating the total beneficiaries reached.

00:30:42 Demo: Using different aggregations in the same measure

Assuming we want to use different types of aggregations together in the same measure. In the WASH cluster 3W form, we have information on the pane containing the fields we will be working with. If I want to calculate the ratio of beneficiaries reached compared to the total beneficiaries targeted so far, here is how you can do it.

We want to use a SUMX. We select the WASH cluster form and go on to select the "Total Beneficiaries Reached". We close the parenthesis. Remember that what we want to do is to compare it with the total beneficiaries targeted. So, we will continue by dividing this by SUMX again, choosing the same form, and then selecting the "Total Beneficiaries Targeted".

Once we click okay, the value returns as 2. What we want is to show the ratio of total beneficiaries reached to those that were targeted. So you come here and choose the correct number format, and the ratio shows 1:8. With this, you get enough insight into how your project is faring.

00:33:55 Demo: Combining data from different forms

The second possibility is to combine the data from different forms into one measure. Here you can also add an explicit filter. Assuming we want to combine the total beneficiaries reached from the WASH and the Health sector for a specific sector named "Infrastructure".

We add a calculated measure, and our formula will be SUMX. We choose the form where we want to start pulling data from; the first is the WASH form. The expression is: If(Sector = "Infrastructure", Total Beneficiaries Reached). Now we want to do the same for the other form as well. So we add a plus sign and do the same SUMX, adding the Health cluster form. The expression is: If(Sector = "Infrastructure", Total Beneficiaries Reached). This will return a value that combines the total beneficiaries reached from the WASH and the Health cluster for the specific sector named Infrastructure.

00:38:14 Demo: Aggregating data at multiple levels

In this third demonstration, we need to calculate the total number of beneficiaries that have been reached by the Health cluster. If you take a look at the Health cluster form, we have the reported number of beneficiaries reached by a number of different implementing partners who are implementing different kinds of activities. For each month, they report on those numbers.

It is entirely possible that these implementing partner organizations are reaching the same groups of people in the course of their activities, especially if they're operating in the same districts. So we'll need to be mindful of the possibility that we're double counting the same beneficiaries. You might think that you might be able to create a basic pivot table to get to that figure, taking the maximum of figures across the districts. However, we can safely assume that there is limited likelihood that beneficiaries will be transferring into different districts within the same reporting month. So rather than taking the maximum across these districts, we might want to simply take the sum to get a more accurate picture.

If you use a basic pivot table and select "Summarize by Max," it takes the maximum value across the districts. But what if you wanted to apply different levels of aggregation and use a different aggregation function at each level? At the first level, we take the maximum within a district, but at the second level, you want to take the sum across the districts. Basic pivot tables don't enable that calculation.

To do this with calculated measures, the first step is to use the SUMMARIZE function, which is one of the new table functions. This will allow us to create a custom table from our Health cluster's form such that we're grouping together the dimensions that we need for analysis. We select the form, then the dimensions we want to group by (Reporting Month and District). Next, we give this new column a name, "Total Beneficiaries". Finally, we provide an expression to evaluate the value: MAX(Total Beneficiaries Reached).

This creates a custom table grouped by reporting month and district, taking the maximum beneficiaries. The next step is to take the sum across the rows that exist in this new table. To do that, we wrap the previous formula in a SUMX function. The first argument is the table expression we just created. The second argument is the field we want to sum: [Total Beneficiaries]. This formula first takes the maximum within a district and then takes the sum across those districts.

00:48:40 Defining context and second quiz

In this quiz, we wanted to ask you this question to refresh our understanding of the concept of a context within calculated measures. For calculated measures, how would you define the context for your data analysis?

Most of you responded with option C: "Always relying on measures to conduct initial analysis of the data set in order to define context." This may seem like a correct response, but the key thing to remember for calculated measures is that when you're defining the context for your measure, you need to select a specific table that contains the data that you want to analyze. That table can exist in the form of an existing form in your database, or you can create a custom table using one of the table functions. So actually, the correct response to this quiz is option D: "Selecting a form from the list or utilizing one of the table functions."

00:51:47 Practice exercise: Ratio of girls to boys

We have some prompts for you to try out in your own demo database. The first one is to find the ratio of girls to boys within the WASH form in our database. Within this form, we have columns that show the disaggregated figures for the number of girls that were reached and the number of boys that were reached.

To do this calculation, we'll need to select the fields that are the inputs of this formula. We create a new pivot table report and add a calculated measure. We need the total of girls and the total of boys. To get the total figure, you want to use the SUMX aggregation function. First, define the context, which is the WASH cluster form. Then define the expression: Total Girls.

We need to get the ratio of the girls to boys, so we'll need to divide that. We use a division operator. For the second part, we use a similarly structured SUMX function taking the WASH form as the context and selecting the field for Total Boys. We might want to multiply this by 100 to get the final number. The result is 670, or 6.7 if divided properly. This tells us there are almost seven girls to every boy in this data set.

00:59:41 Final demo recap

I will quickly show the Infrastructure calculation again. We go to the calculated measure and use SUMX. We choose the form of interest, which is the WASH cluster. The expression is: If(Sector = "Infrastructure", Total Beneficiaries Reached). In addition to that, we add a plus sign to add that to the next form. We do the same SUMX, choose the Health cluster form, and use the expression: If(Sector = "Infrastructure", Total Beneficiaries Reached). This filters by infrastructure for both forms and calculates the total beneficiaries reached, giving us an aggregate from both forms.

Sign up for our newsletter

Sign up for our newsletter and get notified about new resources on M&E and other interesting articles and ActivityInfo news.