CREATE FUNCTION BIR_GET_DIM_PROD_ID (p_prod_cde VARCHAR(15), p_date DATE, p_ctry_code CHAR(2))
RETURNS BIGINT
LANGUAGE SQL
BEGIN ATOMIC
DECLARE v_id BIGINT;
SET v_id = (SELECT PROD_ID
FROM HEW_PROD_DIM
WHERE REC_RMOVE_DT_TM IS NULL
AND p_date BETWEEN START_DT AND END_DT
AND PROD_CDE = p_prod_cde
AND CTRY_CDE = p_ctry_code);
IF v_id IS NULL
THEN
SET v_id = -1;
END IF;
RETURN v_id;
END @
Provided that I have already built necessary index on CAMP_CDE and PROD_CDE for the working tables. I need to transform 3 million records to get the necessary surrogate keys for each record. Hence, I wonder which of the following approach has better performance:
Approach 1
Insert all records from one working table to another by once together with 10 functions as follows:
INSERT INTO BIR_WK2_TABLE (REC_ID, CAMP_CDE, CAMP_ID, PROD_CDE, PROD_ID, ... REC_UPDT_DT_TM)
SELECT REC_ID,
CAMP_CDE,
BIR_GET_DIM_CAMP_ID(CAMP_CDE),
PROD_CDE,
BIR_GET_DIM_PROD_ID(PROD_CDE),
....
REC_UPDT_DT_TM
FROM BIR_WK1_TABLE;
Approach 2
Insert all records from one working table to another for 10 times. Each time I only join one dimension table to the working table as follows:
BIR_CAMP_DIM table will be joined for CAMP_ID in the 1st round.
INSERT INTO BIR_WK2_TABLE
(REC_ID, CAMP_CDE, CAMP_ID, PROD_CDE, PROD_ID, ... REC_UPDT_DT_TM)
SELECT a.REC_ID,
a.CAMP_CDE,
b.CAMP_ID,
a.PROD_CDE,
a.PROD_ID,
...
REC_UPDT_DT_TM
FROM BIR_WK1_TABLE A, BIR_CAMP_DIM b
WHERE a.CAMP_CDE = b.CAMP_CDE;
BIR_PROD_DIM table will be joined for PROD_ID in the 2nd round.
INSERT INTO BIR_WK3_TABLE
(REC_ID, CAMP_CDE, CAMP_ID, PROD_CDE, PROD_ID, ... REC_UPDT_DT_TM)
SELECT a.REC_ID,
a.CAMP_CDE,
a.CAMP_ID,
a.PROD_CDE,
b.PROD_ID,
...
REC_UPDT_DT_TM
FROM BIR_WK2_TABLE A, BIR_PROD_DIM b
WHERE a.PROD_CDE = b.PROD_CDE;
Repeat the insert action to join other dimension tables until all surrogate keys are retrieved.