I have a nested query and I am not sure whether the index I've chosen fits its needs. Currently I am not happy with its performance. The table contains more than 2 million rows and counting. I am working with Oracle.
My query:
SELECT *
FROM mytable
WHERE groupId IN (
    SELECT groupId
    FROM mytable
    WHERE contractId IN (:contractIds:)
       OR predecessorContractId IN (:contractIds:)
);
contractId
: many distinct values exist
groupId
: connected rows (predecessorContractId matches contractId) share the same groupId so that contract chains can be selected more efficiently. Many distinct values exist here as well.
My index:
CREATE INDEX "MyIndex" ON "MyTable" ("contractId", "predecessorContractId", "groupId");
Is there a way to improve my index? Maybe also my query?
- Please include the RDBMS name, such as SQL Server or MySQL. – Learning_DBAdmin, Oct 15, 2021 at 8:48
- Please post complete table definitions, including the PK and FKs. "groupId: connected rows (predecessorContractId matches contractId) share the same groupId to be able to select contract chains more efficient": this is probably not true and is the root of your problem. – user212533, Oct 15, 2021 at 13:13
2 Answers
If I understand this correctly, you start with a set of contract IDs and collect the set of group IDs that correspond to them. Then you want to collect all the rows whose group ID is in that set.
With only your current index, this means the DBMS has to scan the table, because no index has groupId as its leading column. Your best chance is to create a second index in which groupId is the first column (e.g. groupId, contractId). The system then has the option to fetch "all the rows that have such a group ID" using that second index (after eliminating the duplicate group IDs that were found in step 1).
In order to serve the predicate on predecessorContractId, you'd also need a third index where predecessorContractId is the first attribute.
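A minimal sketch of the two additional indexes described above. The index names are made up for illustration, and the statements assume unquoted identifiers as in the query. If the table and columns were really created with quoted mixed-case names, as the CREATE INDEX statement in the question suggests, they would need the same quoting here.

```sql
-- Hypothetical index names; table and column names are taken from the question.

-- Second index: groupId leads, so the outer query can fetch
-- all rows that belong to a given group.
CREATE INDEX mytable_group_ix ON mytable (groupId, contractId);

-- Third index: predecessorContractId leads, so the subquery's
-- predicate on that column can be served by an index.
CREATE INDEX mytable_pred_ix ON mytable (predecessorContractId);
```

The existing index from the question already leads on contractId, so it can continue to serve the other half of the subquery's condition.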
- SELECT * is generally bad even if you really need all or most of the columns; it is even worse if you only need a small number of columns that could be covered by the index.
- Depending on the optimization, the subquery itself might be evaluated many times, once per main-table row. But it might also be optimized by joining or by materializing. The execution plan can show you.
- The subquery contains OR, and that cannot use indexes efficiently (it can do an index merge, but not with your index alone).
For an index merge, two indexes could work together:
- (contractId, groupId)
- (predecessorContractId, groupId) (or maybe their reverses, depending on the actual plan)
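As a rough sketch, the index pair listed above could be created and the resulting plan inspected as follows. The index names are hypothetical, the literals 101 and 102 are example values standing in for the :contractIds: placeholder, and unquoted identifiers are assumed. EXPLAIN PLAN and DBMS_XPLAN.DISPLAY are Oracle's standard tools for showing the plan the optimizer chooses.

```sql
-- Hypothetical index names; one index per branch of the OR.
CREATE INDEX mytable_contract_grp_ix ON mytable (contractId, groupId);
CREATE INDEX mytable_pred_grp_ix ON mytable (predecessorContractId, groupId);

-- Ask the optimizer for its plan without running the query.
-- 101 and 102 are example values, not real contract IDs.
EXPLAIN PLAN FOR
SELECT *
FROM mytable
WHERE groupId IN (
    SELECT groupId
    FROM mytable
    WHERE contractId IN (101, 102)
       OR predecessorContractId IN (101, 102)
);

-- Show the plan that was just stored in the plan table.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The plan output shows whether the subquery uses both indexes or falls back to a full scan, and whether the outer lookup on groupId is done by index or by table scan.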
Better, IMHO, would be to get rid of the OR: use two conditions with two separate subqueries instead, use EXISTS instead of IN, and switch the indexes to have groupId first. That way each subquery is given a specific groupId and only has to check the second column in the index, which is fast.
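One possible reading of this suggestion, as a sketch rather than a tested solution: the index names and the table aliases t, c and p are made up, the literals 101 and 102 stand in for the :contractIds: placeholder, and unquoted identifiers are assumed.

```sql
-- Hypothetical index names; groupId leads in both indexes.
-- Skip a statement if an index on the same column list already exists.
CREATE INDEX mytable_grp_contract_ix ON mytable (groupId, contractId);
CREATE INDEX mytable_grp_pred_ix ON mytable (groupId, predecessorContractId);

-- Each EXISTS is correlated on groupId, so it can probe one index
-- with a known groupId and then check only the second column.
SELECT t.*
FROM mytable t
WHERE EXISTS (
        SELECT 1
        FROM mytable c
        WHERE c.groupId = t.groupId
          AND c.contractId IN (101, 102)
      )
   OR EXISTS (
        SELECT 1
        FROM mytable p
        WHERE p.groupId = t.groupId
          AND p.predecessorContractId IN (101, 102)
      );
```

Whether this beats the original IN form depends on the plan Oracle actually picks, so it is worth comparing both execution plans on real data.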
Another way is to get rid of the subquery entirely: you can JOIN the table to itself (though the optimizer possibly does that for you already).
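A sketch of what such a join could look like, under the same assumptions as the question's query (unquoted identifiers, example literals 101 and 102 in place of the :contractIds: placeholder). The aliases g and t are made up. The matching group IDs are deduplicated first so that a row is not returned once per matching contract in its group.

```sql
-- Step 1 (inline view g): the distinct groups touched by the given contracts.
-- Step 2: join back to the table to fetch every row of those groups.
SELECT t.*
FROM (
    SELECT DISTINCT groupId
    FROM mytable
    WHERE contractId IN (101, 102)
       OR predecessorContractId IN (101, 102)
) g
JOIN mytable t ON t.groupId = g.groupId;
```

This returns the same rows as the IN version; the difference, if any, is in the plan the optimizer chooses.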