UNION

Description

The UNION function calculates a new table that contains all the rows from each of two or more table expressions.

Usage

UNION(<Table Expression>, <Table Expression2>, <Table Expression3>)

UNION accepts two or more table expressions and returns a single table that contains all the rows from each of them. The following rules apply:

  • Each table must have the same number of columns.
  • Columns are combined by position in their respective tables.
  • The column names in the returned table match the column names in the first table argument.
  • Duplicate rows are retained.
  • The returned table has lineage where possible. For example, if the first column of each table expression has lineage to the same base column C1 in the model, the first column in the
    UNION result will have lineage to C1. However, if the combined columns have lineage to different base columns, or if
    one of them is an extension column, the resulting column in the UNION result will have no lineage.
  • When data types differ, the resulting data type is determined based on the rules for data type coercion.
  • The returned table will not contain columns from related tables.
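
The positional rules above can be modelled in a few lines of Python. This is only a sketch of the semantics described in the list, not how UNION is implemented; the union helper, the (column names, rows) table representation, and the sample data are all assumptions made for illustration.

```python
def union(*tables):
    """Illustrative model of UNION. Each table is a (column_names, rows) pair."""
    if len(tables) < 2:
        raise ValueError("UNION needs at least two table expressions")
    first_names, _ = tables[0]
    width = len(first_names)
    combined = []
    for names, rows in tables:
        # Each table must have the same number of columns.
        if len(names) != width:
            raise ValueError("every table must have the same number of columns")
        # Columns are combined by position, so rows are appended unchanged.
        # Duplicate rows are retained.
        combined.extend(rows)
    # Column names come from the first table argument.
    return list(first_names), combined


# Hypothetical sample data: the second table uses different column names.
trainings = (["date", "beneficiary"], [("2024-01-10", "B1"), ("2024-01-10", "B2")])
loans = (["disbursed", "recipient"], [("2024-02-01", "B2"), ("2024-01-10", "B1")])

names, rows = union(trainings, loans)
```

Here the second table's column names are discarded, and the row ("2024-01-10", "B1") appears twice in the result because duplicates are kept.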

Examples

Counting unique beneficiaries across multiple activities

The UNION function is useful for combining similar information that is stored in multiple forms. For example, if you are managing a program that provides training and loans to the same group of people, you might be interested in knowing how many unique individuals you have supported each quarter.

If the details about the training and the loans are stored in different forms, then you will need the UNION function to first combine the list of recipients, and then find the number of distinct beneficiaries.

Database layout containing trainings, participants, loans and beneficiaries forms

Next, combine the date and beneficiary ID from the two forms using UNION. For example:

UNION(
    SELECTCOLUMNS(participants,
        "date", @parent.date,
        "beneficiary", participant),
    SELECTCOLUMNS(loans,
        "date", disbursement_date,
        "beneficiary", recipient)) |>
COUNTDISTINCTX(beneficiary)

In the example above, we first reshape the two forms, Participants and Loans, so that they have the same fields in the same order. For the training participants, the relevant date comes from the parent form, because it belongs to the training rather than to the individual participant. For the Loans form, we choose to use the disbursement date.

The UNION function then returns a single long list of dates and beneficiary IDs, and COUNTDISTINCTX counts the number of unique beneficiary IDs in that list.
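
To see why the distinct count matters, the same two steps can be sketched in Python. This is an illustration of the logic only, with made-up beneficiary IDs and dates; it is not the formula engine itself.

```python
# Hypothetical (date, beneficiary) rows after reshaping each form.
participants = [("2024-01-10", "B1"), ("2024-01-10", "B2")]
loans = [("2024-02-01", "B2"), ("2024-03-05", "B3")]

# Step 1: UNION keeps every row from both lists, duplicates included.
combined = participants + loans

# Step 2: COUNTDISTINCTX counts each beneficiary ID once.
distinct_beneficiaries = len({beneficiary for _, beneficiary in combined})
```

In this sample, B2 attended a training and also received a loan, so the combined list has four rows but only three distinct beneficiaries.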
