SUMMARIZES function summarizes its input table, grouping by one or more columns.
SUMMARIZE (<Table Expression>, <groupBy_columnName>[, <groupBy_columnName>]…[, <name>, <expression>]…)
The summarize function evaluates to a new table. The new table will include any columns that are grouped upon, as well as the new expressions.
If you have a programme with the goal of promoting certain nutritional practices for mothers of children five and under, and have partners reporting aggregated output indicators to you such as:
- Number of mothers reached through door-to-door outreach (
- Number of estimated mothers reached through radio broadcasts (
You may want to estimate your overall reach by combining these two indicators. However, if you have one partner conducting door-to-door outreach and another partner running radio broadcasts in the same geographic area, you don't want to double count those indicators.
In this case, if you added up all of these indicators, you would risk double counting the outputs in the Nothern district where 500 mothers were reached in person, but 2,000 were reached via the radio. It's likely that many mothers received both an in-person visit and heard the message on the radio.
One solution to this is to first add up the results at the district level, and take either the in-person numbers, or the radio numbers, which ever is greater.
We'll first use the SUMMARIZE function to compute a new table with SUMS of the two fields by district.
SUMMARIZE([Partner reports], Province, "TOTAL_INPERSON", SUM(INPERSON), "TOTAL_RADIO", SUM(RADIO))
This gives us a new table that looks like:
Now to find the total reach, we can use the SUMX function to calculate the MAX of each row, and then sum the maxes:
SUMMARIZE([Partner reports], Province, "TOTAL_INPERSON", SUM(INPERSON), "TOTAL_RADIO", SUM(RADIO)) |> SUMX(MAX(TOTAL_INPERSON, TOTAL_RADIO))
This gives us a final result of 2,000 + 1,000 + 5,000 + 300 or 8,300.