I have two tables table1
and table2
, both with columns group
, element_id1
, element_id2
, date
.For every date
and every group
, there are multiple elements in the group. Each element has 2 ids (every element_id1
is mapped to exactly one element_id2
as vice versa, but sometimes one of them can be NULL
[exists but not recorded in our table for that row]).I want to find elements that are in one table per each group
and date
but not in the other. For checking if two elements are equal, I match them either on element_id1
or element_id2
(depending on which one is non-NULL
).For example let the tables be:
- table1:
group | element_id1 | element_id2 | date |
---|---|---|---|
G1 | 1 | NULL | 1/31/2024 |
G1 | 2 | B | 1/31/2024 |
G1 | 3 | C | 1/31/2024 |
G1 | NULL | A | 2/2/2024 |
G1 | 4 | D | 2/2/2024 |
G2 | 1 | A | 1/31/2024 |
G2 | 5 | NULL | 2/2/2024 |
- table2:
group | element_id1 | element_id2 | date |
---|---|---|---|
G1 | 1 | A | 1/31/2024 |
G1 | NULL | B | 1/31/2024 |
G1 | NULL | A | 2/2/2024 |
G1 | 4 | NULL | 2/2/2024 |
G2 | 1 | A | 1/31/2024 |
G2 | NULL | F | 2/2/2024 |
I want the outcome to be
group | element_id1 | element_id2 | date | Status |
---|---|---|---|---|
G1 | 3 | C | 1/31/2024 | Missing from table2 |
G2 | 5 | NULL | 2/2/2024 | Missing from table2 |
G2 | NULL | F | 2/2/2024 | Missing from table1 |
I want to set INDEX in the tables. What are the good indices here? Should I index on the combination of (group
, date
, element_id1
, element_id2
)? But the problem is sometimes one of the elemet_id
s is NULL
. Sould I index only on group and date in both tables?
The SQL that I use is as follows but is VERY slow:
SELECT COALESCE(t1.group, t2.group) as [Group], COALESCE(t1.element_id1, t2.element_id1) as element_id1, COALESCE(t1.element_id2, t2.element_id2) as element_id2, COALESCE(t1.[date], t2.[date]) as [date], CASE WHEN t1.[date] IS NULL THEN 'Missing from table1' ELSE 'Missing from table2' END AS [Status]FROM table1 t1FULL JOIN table2 t2ON t1.[group] = t2.[group] AND t1.[date] = t2.[date] AND (t1.element_id1 = t2.element_id1 OR t1.element_id2 = t2.element_id2)WHERE t1.[date] IS NULL OR t2.[date] IS NULL