Environment
- OceanBase Community Edition 4.2.1, MySQL mode, 3 Availability Zones
I'm evaluating OceanBase for a project that requires complex joins. A query that executes in 200ms on MySQL 5.7 takes over 15 seconds in OceanBase with the same schema and data.
Schema
-- OceanBase/MySQL
CREATE TABLE users (
    id BIGINT PRIMARY KEY,
    name VARCHAR(100),
    created_at DATETIME
) PARTITION BY HASH(id) PARTITIONS 8;

CREATE TABLE orders (
    id BIGINT PRIMARY KEY,
    user_id BIGINT,
    amount DECIMAL(10,2),
    INDEX idx_user (user_id)
) PARTITION BY HASH(id) PARTITIONS 16;
Both tables have proper indexing:
- users.id is the primary key (as shown in the schema)
- orders.user_id has a secondary index (idx_user)
The problematic query
SELECT u.name, SUM(o.amount)
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2023-01-01'
GROUP BY u.id;
With the same data volume (500k users and 5M orders), MySQL 5.7 (InnoDB) completes the query in ~200ms, while OceanBase takes over 15 seconds; EXPLAIN shows different execution plans.
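For anyone reproducing the comparison, the two plans can be obtained by prefixing the same statement with EXPLAIN on each system. This is only a minimal sketch; the output format differs between MySQL 5.7 and OceanBase, so the plans have to be compared operator by operator rather than line by line.

```sql
-- Run the identical statement on both MySQL 5.7 and OceanBase (MySQL mode)
EXPLAIN
SELECT u.name, SUM(o.amount)
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2023-01-01'
GROUP BY u.id;
```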
I've tried optimizer hints:
SELECT /*+ LEADING(u o) USE_NL(o) */ u.name, SUM(o.amount)
FROM users u JOIN orders o ON u.id = o.user_id...;
and also checked the execution plan differences:
-- OceanBase EXPLAIN shows full partition scans
=============================================
| ID | OPERATOR          | NAME | EST. ROWS |
---------------------------------------------
| 0  | HASH GROUP BY     |      | 100000    |
| 1  | HASH JOIN         |      | 5000000   |
| 2  | TABLE SCAN (PART) | u    | 100000    |
| 3  | TABLE SCAN (PART) | o    | 5000000   |
=============================================
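One way to sanity-check the estimates in this plan is to compare the EST. ROWS values with the actual row counts at the same points. The queries below are a sketch only: the column aliases (matching_users, joined_rows) are made up for illustration, and the counts they return are not part of the measurements reported here.

```sql
-- Actual number of users passing the date filter
-- (the plan estimates 100000 rows for the scan of u)
SELECT COUNT(*) AS matching_users
FROM users
WHERE created_at > '2023-01-01';

-- Actual number of joined rows feeding the GROUP BY
-- (the plan estimates 5000000 rows out of the HASH JOIN)
SELECT COUNT(*) AS joined_rows
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2023-01-01';
```

If the actual counts differ greatly from the estimates, stale or missing optimizer statistics would be one thing to rule out before comparing join strategies.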
Question
Why does this simple JOIN query perform poorly in OceanBase?
(id, created_at) on the users table.