Introduction to Calculated tables for data analysis
HostJeric Kison
PanelistVictoria Manya
About this webinar
About this webinar
Join our upcoming webinar to delve into the significance of Calculated tables in the realm of data analysis. Gain insights into how Calculated tables harmonize data sets, allowing for data transformation and in-depth analysis.
The “Introduction to Calculated tables for data analysis in ActivityInfo” webinars is a series of two sessions addressed to new and older users of ActivityInfo who wish to get introduced to the Calculated tables feature.
During the first session we explore a variety of table functions and practical examples that illustrate the real-world applications of Calculated tables within the ActivityInfo platform. The second session is an Office Hour where you can address more of your questions.
In summary, we explore:
- Gain an understanding of tables as the building blocks of data analysis in ActivityInfo.
- Discover why Calculated tables are indispensable for transforming your data into a more manageable structure.
- Learn how Calculated tables allow you to combine, transform, and manipulate data tables for meaningful insights.
- Explore table functions available in ActivityInfo that help you craft Calculated tables.
- Walk through practical examples that illustrate how Calculated tables can be applied to real-world scenarios.
View the presentation slides of the Webinar.
Use the database template used in the Webinar.
Is this Webinar for me?
- Are you responsible for data analysis in ActivityInfo or is this a role you would like to take on?
- Do you wish to make the most of the latest data analysis features in the platform?
- Would you like to address your questions to our team?
Then, watch our Webinar!
About the trainers
About the trainers
Mr. Jeric Kison earned his Bachelor's Degree from York University in Canada and his MBA from the University of Oxford in the United Kingdom. He has worked with NGOs and governments across four continents on strategy and evaluation for nine years. Before joining ActivityInfo he worked as a Monitoring & Evaluation Officer at Pilipinas Shell Foundation, Inc., where he led a project to develop an organizational M&E System which included the roll-out of ActivityInfo as the organization’s new information management system. Today, Jeric is working as a Customer Success Director in the ActivityInfo team bringing together his experience on the ground and passion for data to help our customers achieve success with ActivityInfo.
Victoria Manya has a diverse background and extensive expertise in data-driven impact, project evaluation, and organizational learning. She holds a Master's degree in local development strategies from Erasmus University in the Netherlands and is currently pursuing a Ph.D. at the African Studies Center at Leiden University. With over ten years of experience, Victoria has collaborated with NGOs, law firms, SaaS companies, tech-enabled startups, higher institutions, and governments across three continents, specializing in research, policy, strategy, knowledge valorization, evaluation, customer education, and learning for development. Her previous roles as a knowledge valorization manager at the INCLUDE platform and as an Organizational Learning Advisor at Sthrive B.V. involved delivering high- quality M&E reports, trainings, ensuring practical knowledge management, and moderating learning platforms, respectively. Today, as a Customer Education Specialist at ActivityInfo, Victoria leverages her experience and understanding of data leverage to assist customers in successfully deploying ActivityInfo.
Transcript
Transcript
00:00:00
Introduction
Good afternoon, everyone. Thanks so much for joining us today. Victoria and I are excited to be doing this webinar series on Calculated tables. Calculated tables are a new feature that we've recently released to enhance our analysis functionalities on ActivityInfo. We are going to be doing two sessions about this topic, starting off with today's session where we'll be doing an introduction to Calculated tables and how you can use them for analyzing your data in ActivityInfo. Following that, we'll do our second session next week, which will be an Office Hour session. You can come to that session with any questions that you have following your use of Calculated tables on your own projects.
Today we're going to be covering a few key areas. First of all, we'll be explaining a bit about how Calculated tables work, what they are, and why you would use them in the first place. We will go over some key concepts to help us all understand what Calculated tables are. Then we'll go into the practical matters of how you would actually go about creating Calculated tables in ActivityInfo, and within that, we'll go over some of the common functions that you would use to create those tables. Finally, we'll go into some worked examples where we'll go over real-world scenarios, look at example data sets, and walk through the practical steps of constructing a formula.
00:02:15
Key concepts: Rows, columns, and tables
Before we move into the practice of how Calculated tables work, we would like to go through some concepts that would allow us to understand the feature better. Let's begin with the concepts of rows and columns within a table. Rows are the horizontal divisions or lines within a table, typically running from left to right. They represent the individual records or entries within the table, and each row contains specific information related to a single record item or an observation. In a data set, each row may contain data about a particular entity, such as a person, a transaction, or an event.
Columns are the vertical divisions or lines within a table, typically running from the top to the bottom. They represent specific categories or types of information related to the records. Each column contains a particular type of data. For example, in a spreadsheet, columns might include categories like name, age, and gender. In ActivityInfo, you might see columns for gender (male, female) or dates and university degrees.
Tables are a combination of rows and columns. They create a structured format for organizing and presenting your data. Each cell in the table, formed by the intersection between the row and the column, holds specific information, forming a cohesive or organized data set. It is important to note that in ActivityInfo, the data is structured within forms and organized in records and fields. When you're conducting analysis, these forms are interpreted as tables. Each row in this table represents a single record, while each column signifies a particular field within the data structure.
00:06:46
Data types and transformation
Understanding data types is crucial when working with functions and manipulating or transforming data within various systems. In ActivityInfo, different data types like string, number, dates, and specialized types like phone numbers carry specific formats and characteristics that affect how they are handled and processed by functions. Understanding your column types in data analysis is vital because it profoundly influences the accuracy, reliability, and variety of analytical methods that can be employed.
The primary goal of Calculated tables in ActivityInfo is to transform data and ensure that it is more convenient and readily usable for analytical purposes. By utilizing Calculated tables, users are able to merge multiple forms into a single comprehensive table, alter the data structure, and even transform columns into rows and vice versa. This transformation ensures that data is optimally organized for more effective and insightful analysis.
Calculated tables are beneficial in various scenarios. First, they are useful for merging multiple forms or sources of data into a cohesive, unified table. For instance, imagine a humanitarian organization collecting data from various sources like surveys, field reports, and external databases. Calculated tables can be used to merge this diverse data into a single, comprehensive view. Second, they assist in creating new tables that selectively include only the pertinent values required for in-depth analysis, aiding focus on specific data elements without the clutter of unnecessary information. Lastly, Calculated tables can facilitate reorienting your fields or columns, transforming data by shifting them between rows and columns.
00:15:46
Creating calculated tables
Calculated tables are created through the use of table functions, which are designed to generate tables based on specific criteria. Employing a Calculated table involves two primary steps. First, you need to specify the table containing the data that you need. The source table can be an existing data set, a combination of previous data sets, or any structured data source. Next, you need to define the transformation of the data. The transformation could involve filtering rows, selecting specific columns, aggregating data, merging data sets, or performing any other manipulation needed to generate the desired output.
It is important to understand the difference between scalar and table values. Scalar values are individual, single values associated with a specific record or entity (e.g., a specific name like "Alice" or age like "42"). Table values encompass multiple related scalar values arranged in a tabular format. While a scalar value refers to a single piece of information, a table value involves a collection of related scalar values organized into a structured table.
00:19:52
Supported functions
ActivityInfo supports specific functions for Calculated tables:
In the ActivityInfo user interface, the creation of Calculated tables happens within pivot tables. These tables are continuously updated, so any changes made to existing records are reflected in the Calculated tables.
00:27:12
Practical examples
We will now go through three real-world scenarios to bring these concepts to life. We will walk through the requirements, the approach, and how to construct the formula in ActivityInfo.
00:28:25
Example 1: Merging columns
In this scenario, we are implementing a program where we need to report on the number of beneficiaries. We are reporting beneficiaries for each region, and our implementing partners are reporting numbers disaggregated by sex and age group (e.g., Adult Male, Elderly Male, Adult Female, Elderly Female). However, our donor requires a report showing the total number of beneficiaries aggregated by sex only (Total Male, Total Female).
To achieve this, we can use the SELECTCOLUMNS function. This allows us to specify the columns we need from our existing data set and define an expression to combine the columns associated with the same sex.
In ActivityInfo, we navigate to the pivot table report design interface and click "Add calculated table." We name the table "Aggregated by Sex." The formula uses SELECTCOLUMNS to select the region field and then generate new columns for "Males" and "Females" by summing the respective adult and elderly fields (e.g., Male + Elderly Male). This creates a new table with the structure we need to report back to the donor.
00:35:55
Example 2: Combining multiple tables
In this scenario, we have information collected in separate tables for data collection. We have different team members submitting records depending on the type of activity they are implementing (e.g., Food Distribution and Medical Outreach). Beneficiaries could participate in more than one activity, so names might appear in multiple forms. Our HQ requires a report showing the total unique number of beneficiaries served across all activities.
Our approach is to use the UNION function to combine the two forms into a single list of all beneficiaries served. Once we have that single list, we can use the COUNTDISTINCTX function to calculate the unique number of beneficiaries.
In ActivityInfo, we select both the Medical Outreach and Food Distribution forms. We create a calculated table called "Combined Beneficiaries" using the UNION function, specifying the two source tables. This generates a new table containing all rows from both sources. To get the unique count, we create a calculated measure using COUNTDISTINCTX, specifying the new combined table and the beneficiary column.
00:45:00
Example 3: Calculating values based on existing data
In this scenario, we are implementing a farm assistance program and running an annual survey to evaluate effectiveness. We use a single survey form to collect data across years (2021 and 2022). We need to produce a report showing the year-to-year change in yield at the individual beneficiary level.
Our approach is to use the SUMMARIZE function to group the survey responses by beneficiary, ensuring one record per farmer. Then, we use the SUMX function to derive the values for the indicator for each specific year. Finally, we calculate the difference between the values for each year.
In the formula editor, we create a table called "Year on Year Change." We use SUMMARIZE to group by the "Farm" field. We then create columns for "2021" and "2022." For the 2021 column, we use SUMX combined with an IF statement to sum the yield only where the year equals 2021. We do the same for 2022. This restructures the data so we have a column for each year. We can then create a calculated measure to subtract the 2021 value from the 2022 value to show the difference or impact.
00:52:10
Q&A session
Q: Regarding the Union function, do the keys have to match for it to work? A: Yes, the Union function requires that you have the same number of columns and that the columns are ordered in the same way across all the tables you are trying to combine. If your tables are not in the same structure, you can use ADDCOLUMNS or SELECTCOLUMNS to transform them into a uniform structure before applying the Union function.
Q: Is it possible to do analysis by fiscal year (October to September)? A: Yes. If your data source only has a year field (e.g., 2021, 2022), you are limited. However, if you collect the specific date of the survey response, you can create a calculated field to categorize that date into a fiscal year. You can then use that fiscal year field as your grouping dimension in the SUMMARIZE function.
Q: In the last example, what happens if one of the values returned for 2021 is empty? A: If there is no entry for 2021 but there is one for 2022, the table will return an empty or null value for the 2021 column, but the row for that farm will still exist. If there are no entries for either year, the row would not be populated.
Q: Can we move from a Union table to Summarize, or do we need to Summarize first? A: It depends on your data structure and required output. You can Union raw data tables first and then Summarize, or you can Summarize individual tables to get a common structure and then Union them. Both approaches are valid depending on the specific scenario.
Q: Can I auto-fill data in one form from another form using a formula? A: You can use the ADDCOLUMNS function in combination with a lookup function to pull data from another form into your table for analysis purposes. For example, you could pull in the sex of a beneficiary from a registration form into your analysis table.
Q: Typically we do a random sample of farmers each year, so they are often different individuals. How does this affect the year-on-year analysis? A: The example assumes you are tracking the same entities. If you need to link individuals to a specific farm across years, your database should be modeled to link individual farmers to a "Farm" entity. You can then summarize based on the Farm entity rather than the individual if the individual respondent changes.
Thank you for joining us. Please take some time to practice using Calculated tables with your own data, and feel free to join our Office Hours next week for any further questions.
Sign up for our newsletter
Sign up for our newsletter and get notified about new resources on M&E and other interesting articles and ActivityInfo news.