-
Notifications
You must be signed in to change notification settings - Fork 615
[Improve]: clarify ConditionQuery.condition() semantics for missing, conflicting, and multi-value conditions #2992
Description
Background
While fixing #2933, we found that ConditionQuery.condition() currently mixes multiple different semantics into a single API.
The immediate bug in #2933 has been fixed separately, but this method still has unclear behavior boundaries and several callers implicitly rely on assumptions that are not obvious from the method contract.
Current behavior
ConditionQuery.condition(Object key) may currently produce different kinds of results depending on how conditions are stored:
-
Return
null- when there is no condition for the given key
-
Return
null- when conditions exist for the key, but the intersection becomes empty
-
Return a single value
- for example, a single
EQ
- for example, a single
-
Return the whole
INlist- when there is one
INcondition and noEQ
- when there is one
-
Throw
IllegalStateException- when the intersection still contains multiple candidates
That means the same method is currently used for:
- raw condition lookup
- unique-value extraction
- empty-intersection signaling
- multi-value conflict signaling
Why this is a problem
These states are semantically different, but the API does not distinguish them clearly.
For example:
ConditionQuery q1 = new ConditionQuery(HugeType.EDGE); // no LABEL condition ConditionQuery q2 = new ConditionQuery(HugeType.EDGE); q2.eq(HugeKeys.LABEL, label1); q2.eq(HugeKeys.LABEL, label2); // conflicting LABEL conditions, intersection is empty
Both of these may currently lead to:
q1.condition(HugeKeys.LABEL) == null q2.condition(HugeKeys.LABEL) == null
But they do not mean the same thing:
q1: no such condition existsq2: the condition exists, but it is contradictory
Also, for LABEL, the method can return a raw IN list in some cases, even though many callers use it as if it were always either a unique Id or null.
Evidence in current code
LABEL can be stored as IN
Multi-label edge queries are constructed like this:
query.query(Condition.in(HugeKeys.LABEL, edgeLabels));
See:
GraphTransaction.constructEdgesQuery(...)
many callers assume unique label or null
Examples:
GraphTransaction.matchPartialEdgeSortKeys(...)GraphIndexTransaction.queryByLabel(...)BinarySerializer.writeQueryEdgePrefix(...)HugeTraverser
These call sites typically do:
Id label = query.condition(HugeKeys.LABEL);
which means they implicitly expect condition(HugeKeys.LABEL) to behave like a "resolved unique label" API, not a raw condition accessor.
Suggested direction
This issue is not about changing behavior immediately in a bugfix PR. It is about clarifying and improving the API contract.
Possible follow-up directions:
-
Split responsibilities into different APIs
- one API for raw condition retrieval
- one API for unique-value resolution / intersection result
-
Introduce an explicit result type
- absent
- unique value
- empty intersection
- multiple values
-
Audit call sites of
condition(HugeKeys.LABEL)- identify which ones want raw values
- identify which ones want a unique resolved label
- remove reliance on ambiguous
nullsemantics
Scope
This issue is a follow-up design / refactor task exposed by the fix for #2933.
Related:
- [Bug] Unexpected "Illegal key 'LABEL'..." happens #2933
- PR fix(query): handle conflicting edge label conditions safely #2990
I’d be happy to follow up on this as a separate improvement issue and work on a later refinement PR.