- Excel Version
- 365
As long as there are matching fields in the source tables, we can join the associated queries (one-to-one, one-to-many, etc.) by using the Merge feature.
We are going to use the following sample source tables to demonstrate parent & child merging. These tables have a parent-child relationship between each other.
We can see the foreign IDs in the Orders (Customer ID) and the OrderItems (Order ID, Product ID) tables which establish the relationship between tables. This is how the data is sourced in a normalized manner. By simply merging the necessary tables on these key columns, we can easily generate the required reports.
Load all tables as queries into Power Query. M codes for the base queries are provided below. Make sure using the given query names since these queries will be referred to in the other queries. Note: Binary data source used for the corresponding data tables/sources in the codes below to make it easy to test the samples below.
Products
Customers:
Orders:
OrderItems:
There is nothing fancy so far, simply loaded each table as separate queries into the Power Query environment, and made sure these are saved as Connection Only queries.
The first report we need is the ReportOrders which simply lists individual orders showing the customer and total order value for each order. In order to build this report, first we need a helper query that will get the Customer Name from the Customers table (child query) by using the Customer ID field in the Orders table (parent query).
Select the Orders query and click on Combine->Merge Queries->Merge Queries as New.
In the Merge dialog that pops up, the first query is selected as default since you selected the Query before executing the merge command. Select the Customers query in the second query drop-down control. Once the sample data is loaded, click on the Customer ID column headers in both tables to define the key between the two queries. Note: There will be times that you'll need multiple key matching. In this case, you can select multiple columns by holding the Ctrl key down and making sure the key numbers that appear in the header are matching the second table key numbers (there is no key number displayed if there is only one key column).
Left Outer join is the join type to be used for this sample project. Just click on OK and you get the result table with a new column consisting of cells with the Customer table value embedded. Preview the data in these table values by simply clicking on the white space in the cell. Note: Clicking on the Table keyword will extract that table value (Drill-Down) as a new step, that's why you need to click on the white space.
It is time to extract the Customers' table values. Click on the little "two-ways" button on the right side in the Customers header. Select the Customer Name only to extract the name data.
Since this is a one-to-one relationship (One customer for each Order), there will be only one row extracted for each order record. Rename the query as OrdersWithCustomer and load it as a connection-only query as well.
Note: I often work with M code and personally don't prefer to work with step names (identifiers) including spaces. Therefore, I also changed the step name to something more meaningful and without space.
The M code for the OrdersWithCustomer query:
Secondly, we need another helper query that will get the order item totals that are calculated as the multiplication of the Quantity and Product Price values. It means we will join the OrderItems query (parent with Quantity field) with the Products query (child with Price field). Select the OrderItems query and follow the same steps except selecting the Products query as the second query and the key column is "Product ID" this time. Extract the Product Name and Price columns, rename the new query as OrderItemsWithProduct, and load it as a connection-only query. With this new query, you have the Product Name and Product Price from the Products table for each order item.
The M code for the OrderItemsWithProduct query:
Finally, we can create the ReportOrder query. Merge the OrdersWithCustomer and the OrderItemsWithProduct queries on the Order ID field as the key.
Expand the newly added OrderItemsWithProduct column. Only select the Quantity and Product.Price columns to extract. You will notice that Quantity is coming from the OrderItems query and the Product.Price is coming from the joined Product table.
It is good so far, but we have duplicate orders in the table. That's because an order might have more than one order item. A new column is necessary that will be the result of the total amount for that order item, which means Price x Quantity. Select the query and add a new column by using the Custom Column button in the Add Column tab of the ribbon. The formula is the multiplication of the Quantity and the Price columns. Name the column as OrderLineTotal (I would also change the step name to make it more descriptive and space free).
Now there is a total column that could be aggregated as the grouped sum on Order ID. Select the Order ID, Date, and Name columns (leave the Customer ID column since it is not necessary in this report) and click on the Group By button in the Home tab of the ribbon. Note: Use the Ctrl key to select multiple columns in the table.
Change the new column name to "OrderTotal", select Sum as the operation, and finally select the OrderLineTotal as the column to be aggregated.
Click OK and you get the desired result.
The following is the M code for the ReportOrders query.
Additionally, we can create reports for customer total sales and product total sales by using the same method and grouping rows on the parent query. The M codes for those reports are provided below.
ReportCustomers (Please note that the previous report query is used in this query. Remember Orders vs Customer is one-to-one relationship.)
ReportProducts
Finally, we can see the dependencies by using the Query Dependencies button in the View tab of the ribbon.
We are going to use the following sample source tables to demonstrate parent & child merging. These tables have a parent-child relationship between each other.
Products Table | Orders Table | ||||||
Product ID | Name | Price | Order ID | Date | Customer ID | ||
1 | Product A | 10 | 200 | 7/1/23 | 100 | ||
2 | Product B | 9 | 201 | 7/4/23 | 100 | ||
3 | Product C | 13 | 202 | 7/5/23 | 101 | ||
4 | Product D | 20 | 203 | 7/9/23 | 102 | ||
204 | 7/11/23 | 100 | |||||
Customer Table | 205 | 7/11/23 | 103 | ||||
Customer ID | Name | 206 | 7/13/23 | 102 | |||
100 | Customer A | ||||||
101 | Customer B | ||||||
102 | Customer C | OrderItems Table | |||||
103 | Customer D | OrderItem ID | Order ID | Product ID | Quantity | ||
301 | 200 | 1 | 1 | ||||
302 | 200 | 2 | 3 | ||||
303 | 201 | 3 | 2 | ||||
304 | 201 | 1 | 1 | ||||
305 | 201 | 2 | 1 | ||||
306 | 202 | 4 | 2 | ||||
307 | 203 | 1 | 3 | ||||
308 | 203 | 3 | 2 | ||||
309 | 203 | 4 | 1 | ||||
310 | 204 | 2 | 1 | ||||
311 | 204 | 3 | 2 | ||||
312 | 205 | 4 | 2 | ||||
313 | 206 | 2 | 1 | ||||
314 | 206 | 3 | 2 | ||||
315 | 206 | 4 | 3 | ||||
We can see the foreign IDs in the Orders (Customer ID) and the OrderItems (Order ID, Product ID) tables which establish the relationship between tables. This is how the data is sourced in a normalized manner. By simply merging the necessary tables on these key columns, we can easily generate the required reports.
Load all tables as queries into Power Query. M codes for the base queries are provided below. Make sure using the given query names since these queries will be referred to in the other queries. Note: Binary data source used for the corresponding data tables/sources in the codes below to make it easy to test the samples below.
Products
Power Query:
let
// Sample data used as a binary source to make it easier to follow the code
// The following line could be used instead to load data from a worksheet table
// Source = Excel.CurrentWorkbook(){[Name="Products"]}[Content],
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUQooyk8pTS5RcASyDQ2UYnWilYyQhJ2AbEuwqDGSqDNIsTFY2ARJ2AXINgKaEQsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Product ID" = _t, Name = _t, Price = _t]),
ChangeType = Table.TransformColumnTypes(Source,{{"Product ID", Int64.Type}, {"Name", type text}, {"Price", type number}})
in
ChangeType
Power Query:
let
// Source = Excel.CurrentWorkbook(){[Name="Customers"]}[Content],
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQwUNJRci4tLsnPTS1ScFSK1QEJGiILOkEFjZAFnaGCxsiCLkqxsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Customer ID" = _t, Name = _t]),
ChangeType = Table.TransformColumnTypes(Source,{{"Customer ID", Int64.Type}, {"Name", type text}})
in
ChangeType
Power Query:
let
// Source = Excel.CurrentWorkbook(){[Name="Orders"]}[Content],
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjIwUNJRMtc31DcyMDIGMg2BArE6IAlDsIQJpoQRWMIUIWEIlTAGS1giJIygEiYQO7BYYoohYwyVMYPIGKOYFgsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Order ID" = _t, Date = _t, #"Customer ID" = _t]),
ChangeType = Table.TransformColumnTypes(Source,{{"Order ID", Int64.Type}, {"Date", type date}, {"Customer ID", Int64.Type}})
in
ChangeType
Power Query:
let
// Source = Excel.CurrentWorkbook(){[Name="OrderItems"]}[Content],
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("Xc/BDYAgDIXhXThzoLSgzkLYfw1pi8/qAUI+kp8yRuJCKadaytrJ1syqFaon3sqmZKI3rgKNhQatQbupioTCYcq78Lx2QuNrF1TeLtmkPklQgoYC+d/adwbyv/VfQaCx0KDenvMG", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"OrderItem ID" = _t, #"Order ID" = _t, #"Product ID" = _t, Quantity = _t]),
ChangeType = Table.TransformColumnTypes(Source,{{"OrderItem ID", Int64.Type}, {"Order ID", Int64.Type}, {"Product ID", Int64.Type}, {"Quantity", type number}})
in
ChangeType
There is nothing fancy so far, simply loaded each table as separate queries into the Power Query environment, and made sure these are saved as Connection Only queries.
The first report we need is the ReportOrders which simply lists individual orders showing the customer and total order value for each order. In order to build this report, first we need a helper query that will get the Customer Name from the Customers table (child query) by using the Customer ID field in the Orders table (parent query).
Select the Orders query and click on Combine->Merge Queries->Merge Queries as New.
Merge queries
In the Merge dialog that pops up, the first query is selected as default since you selected the Query before executing the merge command. Select the Customers query in the second query drop-down control. Once the sample data is loaded, click on the Customer ID column headers in both tables to define the key between the two queries. Note: There will be times that you'll need multiple key matching. In this case, you can select multiple columns by holding the Ctrl key down and making sure the key numbers that appear in the header are matching the second table key numbers (there is no key number displayed if there is only one key column).
Select key columns
Left Outer join is the join type to be used for this sample project. Just click on OK and you get the result table with a new column consisting of cells with the Customer table value embedded. Preview the data in these table values by simply clicking on the white space in the cell. Note: Clicking on the Table keyword will extract that table value (Drill-Down) as a new step, that's why you need to click on the white space.
Merge result
It is time to extract the Customers' table values. Click on the little "two-ways" button on the right side in the Customers header. Select the Customer Name only to extract the name data.
Extract merged table value
Since this is a one-to-one relationship (One customer for each Order), there will be only one row extracted for each order record. Rename the query as OrdersWithCustomer and load it as a connection-only query as well.
Note: I often work with M code and personally don't prefer to work with step names (identifiers) including spaces. Therefore, I also changed the step name to something more meaningful and without space.
Orders with customer name
The M code for the OrdersWithCustomer query:
Power Query:
let
Source = Table.NestedJoin(Orders, {"Customer ID"}, Customers, {"Customer ID"}, "Customers", JoinKind.LeftOuter),
ExpandCustomers = Table.ExpandTableColumn(Source, "Customers", {"Name"})
in
ExpandCustomers
Secondly, we need another helper query that will get the order item totals that are calculated as the multiplication of the Quantity and Product Price values. It means we will join the OrderItems query (parent with Quantity field) with the Products query (child with Price field). Select the OrderItems query and follow the same steps except selecting the Products query as the second query and the key column is "Product ID" this time. Extract the Product Name and Price columns, rename the new query as OrderItemsWithProduct, and load it as a connection-only query. With this new query, you have the Product Name and Product Price from the Products table for each order item.
Orders with product name
The M code for the OrderItemsWithProduct query:
Power Query:
let
Source = Table.NestedJoin(OrderItems, {"Product ID"}, Products, {"Product ID"}, "Products", JoinKind.LeftOuter),
ExpandProducts = Table.ExpandTableColumn(Source, "Products", {"Name", "Price"}, {"Products.Name", "Products.Price"})
in
ExpandProducts
Finally, we can create the ReportOrder query. Merge the OrdersWithCustomer and the OrderItemsWithProduct queries on the Order ID field as the key.
Merge queries
Expand the newly added OrderItemsWithProduct column. Only select the Quantity and Product.Price columns to extract. You will notice that Quantity is coming from the OrderItems query and the Product.Price is coming from the joined Product table.
It is good so far, but we have duplicate orders in the table. That's because an order might have more than one order item. A new column is necessary that will be the result of the total amount for that order item, which means Price x Quantity. Select the query and add a new column by using the Custom Column button in the Add Column tab of the ribbon. The formula is the multiplication of the Quantity and the Price columns. Name the column as OrderLineTotal (I would also change the step name to make it more descriptive and space free).
Add the order line total column
Now there is a total column that could be aggregated as the grouped sum on Order ID. Select the Order ID, Date, and Name columns (leave the Customer ID column since it is not necessary in this report) and click on the Group By button in the Home tab of the ribbon. Note: Use the Ctrl key to select multiple columns in the table.
Group by Order ID
Change the new column name to "OrderTotal", select Sum as the operation, and finally select the OrderLineTotal as the column to be aggregated.
Order Line Total
Click OK and you get the desired result.
Orders Report
The following is the M code for the ReportOrders query.
Power Query:
let
Source = Table.NestedJoin(OrdersWithCustomer, {"Order ID"}, OrderItemsWithProduct, {"Order ID"}, "OrderItemsWithProduct", JoinKind.LeftOuter),
ExpandOrderItemsWithProduct = Table.ExpandTableColumn(Source, "OrderItemsWithProduct", {"Quantity", "Products.Price"}, {"OrderItemsWithProduct.Quantity", "OrderItemsWithProduct.Products.Price"}),
OrderLineTotal = Table.AddColumn(ExpandOrderItemsWithProduct, "OrderLineTotal", each [OrderItemsWithProduct.Quantity] * [OrderItemsWithProduct.Products.Price]),
OrderTotal = Table.Group(OrderLineTotal, {"Order ID", "Date", "Name"}, {{"OrderTotal", each List.Sum([OrderLineTotal]), type number}})
in
OrderTotal
Additionally, we can create reports for customer total sales and product total sales by using the same method and grouping rows on the parent query. The M codes for those reports are provided below.
ReportCustomers (Please note that the previous report query is used in this query. Remember Orders vs Customer is one-to-one relationship.)
Power Query:
let
Source = ReportOrders,
CustomerTotals = Table.Group(Source, {"Name"}, {{"OrderTotal", each List.Sum([OrderTotal]), type number}})
in
CustomerTotals
ReportProducts
Power Query:
let
Source = OrderItemsWithProduct,
OrderLineTotal = Table.AddColumn(Source, "OrderLineTotal", each [Quantity] * [Products.Price]),
ProductTotals = Table.Group(OrderLineTotal, {"Products.Name"}, {{"OrderQuantity", each List.Sum([Quantity]), type nullable number}, {"OrderTotal", each List.Sum([OrderLineTotal]), type number}})
in
ProductTotals
Finally, we can see the dependencies by using the Query Dependencies button in the View tab of the ribbon.
Query Dependencies