I analyzed a slow query created from the B/E app and found a costly ShuffleGroupBy operation in its query’s profile.
Then I try reduce the ShuffleGroupBy time by adding the WITH(LEAF_PUSHDOWN=TRUE) hint, however it does not help much. When checking the operation again in the profile (with the leaf_pushdown hint), it shows the following 2 messages:
encoded_group_by_unsupported: “at least one join does not meet requirements for push down for columnstore scan”
encoded_join_unsupported: “join key is not single integer column at join depth 1”
I have some questions about the query performance as below:
- What are the requirements for push down for columnsotre scan?
- Does the table “x_tmp_hz_cv_r_sales_inv_store_by_channel_all” has the proper shard key for this query?
- What is the root cause for the slowness and any suggestions to improve?
Thank you in advance for your time and reply!
Query:
profile select WITH(LEAF_PUSHDOWN=TRUE)
CATALOG_4.parent_id ascatalog_4,
period_id,
sum(sales_inv_hist.purchase_spent) aspurchase_spent,
sum(sales_inv_hist.purchase_discount) aspurchase_discount,
sum(sales_inv_hist.purchase_units) aspurchase_units,
sum(sales_inv_hist.cost_amt) ascost_amt,
sum(sales_inv_hist.gross_margin) asmargin,
sum(sales_inv_hist.purchase_spent) assales_amount,
– sum(case
– when sales_inv_hist.cal_dt = to_date then sales_inv_hist.inventory_on_hand_qty
– end) aseop_inventory_on_hand_qty,
– sum(case
– when sales_inv_hist.cal_dt = to_date then sales_inv_hist.inventory_on_order_qty
– end) asinventory_on_order_eop_qty,
– sum(sales_inv_hist.oos_flg) asoos_flag,
– sum(case
– when sales_inv_hist.cal_dt = to_date then sales_inv_hist.oos_flg
– end) aseop_oos_flag,
– sum(sales_inv_hist.over_stock_flg) asover_stock_flag,
– sum(sales_inv_hist.current_oos_impact) asoos_impact,
– sum(sales_inv_hist.current_low_stock_impact) aslow_stock_impact,
– sum(case
– when sales_inv_hist.cal_dt = to_date then sales_inv_hist.inventory_on_hand_cost_amt
– end) asoh_eop_cost,
– avg(sales_inv_hist.inventory_on_hand_cost_amt) asavg_oh_cost,
– sum(case
– when sales_inv_hist.cal_dt = to_date then sales_inv_hist.inventory_on_hand_amt
– end) astotal_oh_eop,
– sum(price_amt) asprice_amt,
– avg(sales_inv_hist.inventory_on_hand_amt) asavg_oh_retail,
– sum(sales_inv_hist.inventory_on_hand_qty) asinventory_on_hand_qty,
– sum(sales_inv_hist.avg_sales_qty) asavg_sales_qty,
– sum(inventory_on_order_qty) asinventory_on_order_qty,
– avg(sales_inv_hist.inventory_on_hand_qty) asavg_inventory_oh_qty,
count(distinct case
when sales_inv_hist.days_on_hand > 0 then sales_inv_hist.catalog_internal_id
end) asnum_items,
sum(case
when sales_inv_hist.cal_dt = to_date then sales_inv_hist.inventory_on_order_cost_amt
end) aseop_inventory_on_order_cost,
sum(sales_inv_hist.damage_amt) asdamage_amount,
sum(sales_inv_hist.waste_amt) aswaste_amount
from x_tmp_hz_cv_r_sales_inv_store_by_channel_all assales_inv_hist
join x_tmp_hz_dt asperiods
oncal_dt= periods.cal_dt1
straight_join (
select child_id
from (
select distinct child_id
from (
select child_id
from cv_r_all_parents_full_view
where (
hier = 1
and parent_id in (
2, 7)
and parent_level = 4
and child_level = 0
)
) asall_parents1
) asunion1
) asrelevant_products
on sales_inv_hist.catalog_internal_id = relevant_products.child_id
straight_join cv_r_all_parents_full_view asCATALOG_4
on (
CATALOG_4.child_id = sales_inv_hist.catalog_internal_id
and CATALOG_4.child_level = 0
and CATALOG_4.hier = 1
and CATALOG_4.parent_level = 4
)
– where (
– sales_inv_hist.cal_dt between date(‘2022-06-27’) and date(‘2022-11-13’)
– or sales_inv_hist.cal_dt between date(‘2023-06-26’) and date(‘2023-11-13’)
– )
group by
catalog_4,
period_id
;
Schema:
– x_tmp_hz_cv_r_sales_inv_store_by_channel_all definition
CREATE TABLE
x_tmp_hz_cv_r_sales_inv_store_by_channel_all(
cal_dtdate DEFAULT NULL,
cv_pos_idbigint(20) DEFAULT NULL,
catalog_internal_idbigint(20) DEFAULT NULL,
dc_internal_idbigint(20) DEFAULT NULL,
purchase_spentdecimal(19,3) DEFAULT NULL,
purchase_unitsdecimal(19,3) DEFAULT NULL,
purchase_discountdecimal(19,3) DEFAULT NULL,
price_amtdecimal(19,3) DEFAULT NULL,
cost_amtdecimal(19,3) DEFAULT NULL,
gross_margindecimal(19,3) DEFAULT NULL,
net_margindecimal(19,3) DEFAULT NULL,
inventory_on_hand_qtydecimal(19,3) DEFAULT NULL,
inventory_on_hand_amtdecimal(19,3) DEFAULT NULL,
inventory_on_hand_cost_amtdecimal(19,3) DEFAULT NULL,
inventory_on_order_qtydecimal(19,3) DEFAULT NULL,
inventory_on_order_amtdecimal(19,3) DEFAULT NULL,
inventory_on_order_cost_amtdecimal(19,3) DEFAULT NULL,
…
rec_insert_tmspvarchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,
SORT KEYcal_cv_r_sales_inv_store_by_channel_all_sort_key(cal_dtDESC),
SHARD KEYcal_cv_r_sales_inv_store__by_channel_all_shard_key(cal_dt,catalog_internal_id),
UNIQUE KEYcal_dt(cal_dt,catalog_internal_id,cv_pos_id,dc_internal_id) USING HASH,
KEYhz__catalog_idx(catalog_internal_id) USING HASH,
KEYhz__pos_idx(cv_pos_id) USING HASH
) AUTOSTATS_CARDINALITY_MODE=INCREMENTAL AUTOSTATS_HISTOGRAM_MODE=CREATE AUTOSTATS_SAMPLING=ON SQL_MODE=‘STRICT_ALL_TABLES’;
– x_tmp_hz_dt definition
CREATE TABLE
x_tmp_hz_dt(
cal_dt1date DEFAULT NULL,
period_idbigint(20) DEFAULT NULL,
from_datedate DEFAULT NULL,
to_datedate DEFAULT NULL,
SORT KEY__UNORDERED(),
KEYhz__cal_dt1(cal_dt1) USING HASH,
KEYhz__pcal_dt1(period_id,cal_dt1) USING HASH
, SHARD KEY ()
) AUTOSTATS_CARDINALITY_MODE=INCREMENTAL AUTOSTATS_HISTOGRAM_MODE=CREATE AUTOSTATS_SAMPLING=ON SQL_MODE=‘STRICT_ALL_TABLES’;
Profile:
cannot paste the profile here because of the post size limit. please let me know if there is a way to share.
Here is part of profile with the ShuffleGroupBy operation:
