
I have a DolphinDB table with an array vector column. I need to remove duplicate rows based on subset relationships within that column.

Sample Input:

sym prices
a [3,4,5,6]
a [3,4,5]
a [2,4,5,6]
a [5,6]
a [7,9]
a [7,9]

Expected Output:

sym prices
a [3,4,5,6]
a [2,4,5,6]
a [7,9]

Deduplication Logic:

  1. Subset Removal: If a row's prices array is a subset (i.e., fully contained) of another row's prices array, remove the subset row. In the example, [3,4,5] is a subset of [3,4,5,6], so it is removed; similarly, [5,6] is also a subset of [3,4,5,6] and is removed.

  2. Full Duplicate Removal: If multiple rows have identical prices arrays, keep only one.
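To pin the two rules down, here is a minimal sketch of the intended logic in plain Python rather than DolphinDB script. It assumes that each prices array is treated as a set (element order and repeated values inside one array are ignored) and that rows are only compared within the same sym; the function name dedupe_by_subset is hypothetical.

```python
def dedupe_by_subset(rows):
    """Return rows whose prices set is not contained in another row's set.

    rows is a list of (sym, prices) pairs. Assumption: prices are compared
    as sets, and only rows with the same sym are compared with each other.
    """
    # Rule 2: drop exact duplicates, keeping the first occurrence.
    seen = set()
    unique = []
    for sym, prices in rows:
        key = (sym, frozenset(prices))
        if key in seen:
            continue
        seen.add(key)
        unique.append((sym, prices))

    # Rule 1: drop rows whose set is a proper subset of another remaining row.
    return [
        (sym, prices)
        for sym, prices in unique
        if not any(
            sym == other_sym and set(prices) < set(other_prices)
            for other_sym, other_prices in unique
        )
    ]


rows = [
    ("a", [3, 4, 5, 6]),
    ("a", [3, 4, 5]),
    ("a", [2, 4, 5, 6]),
    ("a", [5, 6]),
    ("a", [7, 9]),
    ("a", [7, 9]),
]
result = dedupe_by_subset(rows)
for sym, prices in result:
    print(sym, prices)
```

On the sample input this leaves the three rows shown in the expected output. The pairwise comparison is quadratic in the number of rows, which is fine for illustrating the rules but may matter on large tables.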

What I've Tried:

I considered using group by to remove exact duplicates, but this approach cannot handle subset relationships.

Core Question:
How can I perform this subset-based deduplication?

DarkBee
asked Nov 20, 2025 at 8:00

1 Answer

Disclaimer: I don't know DolphinDB.

You want to remove proper subsets from the table. According to the docs (https://docs.dolphindb.com/en/Programming/Operators/OperatorReferences/lt.html), the less-than operator tests whether one set is a proper subset of another, so you can use it for this:

delete from mytable subset
where exists
(
 select *
 from mytable superset
 where subset.prices < superset.prices
);

(If you only want to compare price vectors for the same sym, you must also add the condition subset.sym = superset.sym, joined with and, to the where clause of the subquery.)

You also want to remove duplicate sets and keep only one of each. For this you need an additional condition for equal sets (=), but then you also need some ID to tell one row from the other. Some DBMS have a built-in unique row ID. I don't know whether DolphinDB does, so you may need a custom ID column in your table. You can then extend the statement above as follows:

delete from mytable subset
where exists
(
 select *
 from mytable superset
 where subset.prices < superset.prices
 or (subset.prices = superset.prices and subset.id < superset.id)
);
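To check which rows the second delete statement above would remove, here is a small Python sketch that mirrors its where condition on the sample data. It is an illustration of the rule only, not DolphinDB code: the id values 1 to 6 are hypothetical, prices are compared as sets, and, like the statement, it does not restrict the comparison to the same sym.

```python
def surviving_ids(table):
    """Return the ids of rows that the delete rule would keep.

    table is a list of (id, sym, prices) tuples. A row is deleted if some
    other row holds a proper superset of its prices, or an equal set with
    a larger id.
    """
    def is_deleted(row):
        sub_id, _, sub_prices = row
        sub = set(sub_prices)
        for sup_id, _, sup_prices in table:
            sup = set(sup_prices)
            # Same test as the subquery: proper subset, or equal set with lower id.
            if sub < sup or (sub == sup and sub_id < sup_id):
                return True
        return False

    return [row[0] for row in table if not is_deleted(row)]


table = [
    (1, "a", [3, 4, 5, 6]),
    (2, "a", [3, 4, 5]),
    (3, "a", [2, 4, 5, 6]),
    (4, "a", [5, 6]),
    (5, "a", [7, 9]),
    (6, "a", [7, 9]),
]
kept = surviving_ids(table)
print(kept)
```

With these ids, rows 1, 3 and 6 remain. Note that among identical sets the condition subset.id < superset.id keeps the row with the largest id; flip the comparison to keep the smallest instead.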
answered Nov 20, 2025 at 22:01