Oracle 12c appears to take the following steps in order to compute the inflection point for a join:

- Identifies the number of rows coming from the "left side" table. Zero will be the lower bound for the inflection point, and the left side row count (minus the number of NULLs in the join field) will be the upper bound.
- Computes the sub-query cost at a left side cardinality of 1,
- for a nested loop join, and
- for a hash join

- Computes the sub-query cost at a left side cardinality of the left side table row count (minus number of NULLs),
- for a nested loop join, and
- for a hash join

- Then picks a row source cardinality value half way between one and the left table row count, and computes the cost
- for a nested loop join, and
- for a hash join

- If the nested loop join cost is less than the hash join cost in 4., then the cardinality in 4. will be used as the lower bound for the inflection point, and a cardinality half way between the new lower bound and the left table row count will be used next. Otherwise, when the hash join cost is the lower, the cardinality in 4. will be used as the upper bound for the inflection point, and a cardinality half way between one and the new upper count will be used next.
- In a similar manner, a binary split approach is used to iterate onto the left row source cardinality which gives the same cost for a nested loop join as for a hash join. This row source cardinality is then the inflection point for the join.

After the first three cardinality cost pairs have been identified, I might have expected the optimiser to fit a curve through the three points to come up with a much better guess at the inflection point. This might reduce the number of iterations, and associated computation effort required to converge on the inflection point.

Consider the join

SELECT count(*) FROM infl_double_twos a JOIN infl_triple_threes b USING (id1) JOIN infl_five_fives c USING (id1);

The statistics which will be of primary interest to the optimiser will be the number of rows in each table, and the number of distinct "ID1"s in each table:

SQL> COLUMN table_name FORMAT a30; SQL> COLUMN column_name FORMAT a30; SQL> SQL> SELECT table_name, num_rows, blocks 2 FROM tabs 3 WHERE table_name IN ('INFL_DOUBLE_TWOS', 'INFL_TRIPLE_THREES', 'INFL_FIVE_FIVES'); TABLE_NAME NUM_ROWS BLOCKS ------------------------------ ---------- ---------- INFL_TRIPLE_THREES 253962 9077 INFL_FIVE_FIVES 169308 5788 INFL_DOUBLE_TWOS 338616 12137 SQL> SQL> SELECT table_name, column_name, num_distinct, num_nulls, density 2 FROM user_tab_columns 3 WHERE table_name IN ('INFL_DOUBLE_TWOS', 'INFL_TRIPLE_THREES', 'INFL_FIVE_FIVES') 4 AND column_name = 'ID1'; TABLE_NAME COLUMN_NAME NUM_DISTINCT NUM_NULLS DENSITY ------------------------------ ------------------------------ ------------ ---------- ---------- INFL_DOUBLE_TWOS ID1 173440 0 5.7657E-06 INFL_FIVE_FIVES ID1 33756 12345 .000029624 INFL_TRIPLE_THREES ID1 88592 0 .000011288

The plan (taken from the 10053 trace) is:

----------------------------------------------------------+-----------------------------------+ | Id | Operation | Name | Rows | Bytes | Cost | Time | ----------------------------------------------------------+-----------------------------------+ | 0 | SELECT STATEMENT | | | | 3398 | | | 1 | SORT AGGREGATE | | 1 | 18 | | | | 2 | HASH JOIN | | 858K | 15M | 3398 | 00:00:41 | | 3 | INDEX FAST FULL SCAN | INFL_DOUBLE_TWOS_PK | 331K | 1984K | 385 | 00:00:05 | | 4 | HASH JOIN | | 439K | 5273K | 2208 | 00:00:27 | | 5 | NESTED LOOPS | | 439K | 5273K | 2208 | 00:00:27 | | 6 | STATISTICS COLLECTOR | | | | | | | 7 | TABLE ACCESS FULL | INFL_FIVE_FIVES | 153K | 920K | 1571 | 00:00:19 | | 8 | INDEX RANGE SCAN | INFL_TRIPLE_THREES_PK| 3 | 18 | 284 | 00:00:04 | | 9 | INDEX FAST FULL SCAN | INFL_TRIPLE_THREES_PK| 248K | 1488K | 284 | 00:00:04 | ----------------------------------------------------------+-----------------------------------+ Predicate Information: ---------------------- 2 - access("A"."ID1"="B"."ID1") 4 - access("B"."ID1"="C"."ID1") 7 - filter("C"."ID1" IS NOT NULL) 8 - access("B"."ID1"="C"."ID1")

Which implies that for this query, Oracle is not considering anything other than a hash join for INFL_DOUBLE_TWOS, but will decide between nested loops and a hash join for the INFL_FIVE_FIVES to INFL_TRIPLE_THREES join at runtime.

The initial bounds for the inflection point are 1 and 156963. 156963 = 169308 - 12345 (number of rows in INFL_FIVE_FIVES minus number of NULLs in the ID1 field in INFL_FIVE_FIVES).

The nested loop cost at a cardinality of 1 is computed as 1572.78, and the hash join cost is computed as 1856.01. The nested loop cost at a cardinality of 156963 is computed as 315559.34, and the hash join cost is computed as 2208.35. At this point we might suspect that the inflection point is rather closer to 1 than to 156963, but Oracle will perform binary split, and try a cardinality of 78481.5 next.

In this example, the iteration then proceeded as follows

Iteration | Lower Bound | Upper Bound | Lower Bound Costs | Upper Bound Costs | ||
---|---|---|---|---|---|---|

Nested Loops | Hash Join | Nested Loops | Hash Join | |||

1 | 1.00 | 156963.00 | 1572.78 | 1856.01 | 315559.34 | 2208.35 |

2 | 1.00 | 78481.50 | 1572.78 | 1856.01 | 158566.06 | 2141.03 |

3 | 1.00 | 39240.75 | 1572.78 | 1856.01 | 80068.42 | 1856.16 |

4 | 1.00 | 19620.38 | 1572.78 | 1856.01 | 40818.60 | 1856.09 |

5 | 1.00 | 9810.19 | 1572.78 | 1856.01 | 21194.69 | 1856.05 |

6 | 1.00 | 4905.09 | 1572.78 | 1856.01 | 11382.73 | 1856.03 |

7 | 1.00 | 2452.55 | 1572.78 | 1856.01 | 6477.76 | 1856.02 |

8 | 1.00 | 1226.27 | 1572.78 | 1856.01 | 4023.27 | 1856.02 |

9 | 1.00 | 613.14 | 1572.78 | 1856.01 | 2797.02 | 1856.02 |

10 | 1.00 | 306.57 | 1572.78 | 1856.01 | 2184.90 | 1856.02 |

11 | 1.00 | 153.28 | 1572.78 | 1856.01 | 1876.84 | 1856.01 |

12 | 76.64 | 153.28 | 1724.81 | 1856.01 | 1876.84 | 1856.01 |

13 | 114.96 | 153.28 | 1800.82 | 1856.01 | 1876.84 | 1856.01 |

14 | 134.12 | 153.28 | 1838.83 | 1856.01 | 1876.84 | 1856.01 |

15 | 134.12 | 143.70 | 1838.83 | 1856.01 | 1858.84 | 1856.01 |

16 | 138.91 | 143.70 | 1848.83 | 1856.01 | 1858.84 | 1856.01 |

17 | 141.31 | 143.70 | 1852.83 | 1856.01 | 1858.84 | 1856.01 |

18 | 141.31 | 142.51 | 1852.83 | 1856.01 | 1856.84 | 1856.01 |

19 | 141.31 | 142.51 | 1852.83 | 1856.01 | 1854.83 | 1856.01 |