-
Notifications
You must be signed in to change notification settings - Fork 5.5k
feat(optimizer): Unwrap casts in comparison expressions #26326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Reviewer's GuideThis PR implements a new optimizer rule that rewrites comparison expressions by unwrapping safe implicit casts, uses new per-type value ranges to detect out-of-range literals, and wires this feature through configuration, system properties, and supporting utility methods, alongside comprehensive unit and integration tests. Sequence diagram for optimizer rule application in PlanOptimizerssequenceDiagram
participant PlanOptimizers
participant SimplifyRowExpressions
participant UnwrapCastInComparison
participant ExpressionOptimizerManager
PlanOptimizers->>SimplifyRowExpressions: rules()
PlanOptimizers->>UnwrapCastInComparison: rules()
UnwrapCastInComparison->>ExpressionOptimizerManager: optimize(rewrittenExpr, SERIALIZABLE, session)
ExpressionOptimizerManager-->>UnwrapCastInComparison: optimizedExpr
UnwrapCastInComparison-->>PlanOptimizers: rule set
Sequence diagram for UnwrapCastInComparison rule rewriting a comparison expressionsequenceDiagram
participant Optimizer
participant UnwrapCastInComparison
participant Visitor
participant FunctionAndTypeManager
participant FunctionResolution
participant InterpretedFunctionInvoker
Optimizer->>UnwrapCastInComparison: rewrite(expression, session, metadata, exprOptMgr)
UnwrapCastInComparison->>Visitor: rewriteCall(CallExpression, context, treeRewriter)
Visitor->>FunctionResolution: isCastFunction(handle)
Visitor->>FunctionAndTypeManager: lookupCast(CAST, sourceType, targetType)
Visitor->>FunctionAndTypeManager: lookupCast(CAST, targetType, sourceType)
Visitor->>InterpretedFunctionInvoker: invoke(coercion, properties, value)
Visitor-->>UnwrapCastInComparison: rewrittenExpr
UnwrapCastInComparison-->>Optimizer: rewrittenExpr
Class diagram for new and updated Type classes with Range supportclassDiagram
class Type {
+Optional<Range> getRange()
<<interface>>
}
class Type~Range~ {
+Object min
+Object max
+Object getMin()
+Object getMax()
}
Type o-- "1" Range : uses
class BigintType {
+Optional<Range> getRange()
}
class IntegerType {
+Optional<Range> getRange()
}
class SmallintType {
+Optional<Range> getRange()
}
class TinyintType {
+Optional<Range> getRange()
}
class DoubleType {
+Optional<Range> getRange()
}
class RealType {
+Optional<Range> getRange()
}
class AbstractVarcharType {
+Optional<Range> getRange()
}
Type <|.. BigintType
Type <|.. IntegerType
Type <|.. SmallintType
Type <|.. TinyintType
Type <|.. DoubleType
Type <|.. RealType
Type <|.. AbstractVarcharType
Class diagram for UnwrapCastInComparison optimizer ruleclassDiagram
class UnwrapCastInComparison {
+UnwrapCastInComparison(Metadata, ExpressionOptimizerManager)
+PlanRowExpressionRewriter createRewriter(Metadata, ExpressionOptimizerManager)
}
class UnWrapCastInComparisonRewriter {
+static RowExpression rewrite(RowExpression, Session, Metadata, ExpressionOptimizerManager)
}
class Visitor {
+RowExpression rewriteCall(CallExpression, Void, RowExpressionTreeRewriter<Void>)
+RowExpression unwrapCast(RowExpression)
+boolean hasInjectiveImplicitCoercion(Type, Type)
+Object coerce(Object, FunctionHandle)
+int compare(Type, Object, Object)
+RowExpression falseIfNotNull(RowExpression)
+RowExpression trueIfNotNull(RowExpression)
}
UnwrapCastInComparison o-- UnWrapCastInComparisonRewriter : uses
UnWrapCastInComparisonRewriter o-- Visitor : uses
Class diagram for FeaturesConfig and SystemSessionProperties with unwrapCastsclassDiagram
class FeaturesConfig {
-boolean unwrapCasts
+boolean isUnwrapCasts()
+FeaturesConfig setUnwrapCasts(boolean)
}
class SystemSessionProperties {
+static boolean isUnwrapCasts(Session)
}
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
e8c4ab5
to
7cf221d
Compare
7cf221d
to
dc34938
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually wonder if this is true. NaN should be treated as the highest value by convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tdcmeehan Thanks for the review.
You're right that Nan is treated as the largest value in Presto comparisons and ordering. However, returning NaN as the maximum value of a range would not make sense here. In this specific rule, we unwrap a CAST when it's compared to a constant, and the comparison can be further optimized if the constant exceeds the maximum value of the source type. Since NaN is the maximum value when the source type is real or double, no constant can be greater than it, so empty is directly returned in getRange to skip this check.
Changes adapted from trino/PR#731.
Description
Motivation and Context
Impact
Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.
If release note is NOT required, use: