- 
 
- 
  Notifications
 You must be signed in to change notification settings 
- Fork 81
Fix omniv2 custom FileFormat extensibility problem #66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Currently if a custom `FileFormat` is given in `HandlerParams` to omniv2 handler creation call, this `FileFormat` becomes the only FileFormat for the omni handler. This behavior would make it difficult to deal with situation where we have to ingest many types of inputs using omni schema. (Saying it being difficult not impossible is because we can theoretically use 2 Extensions, each of which uses the same omni handler, but the first one with custom `FileFormat` and second with nil, thus the builtin `FileFormat`s will kick in. Possible, but incredibly ugly.) Instead, omni handler's `HandlerParams` should allow a list of custom `FileFormat`s. During schema probing, we will combine them with the builtin `FileFormat`s, with priority given the custom ones. If one `FileFormat` accepts the input then it will be used during ingestion; if not, omni handler will keep on probing the rest of the `FileFormat`s.
| Codecov Report
 @@ Coverage Diff @@ ## master #66 +/- ## ========================================= Coverage 100.00% 100.00% ========================================= Files 29 29 Lines 1219 1225 +6 ========================================= + Hits 1219 1225 +6 
 Continue to review full report at Codecov. 
 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Currently if a custom
FileFormatis given inHandlerParamsto omniv2 handler creation call, thisFileFormatbecomes the only FileFormat for the omni handler. This behavior would make it difficult to deal with situation where we have to ingest many types of inputs using omni schema. (Saying it being difficult not impossible is because we can theoretically use 2 Extensions, each of which uses the same omni handler, but the first one with customFileFormatand second with nil, thus the builtinFileFormats will kick in. Possible, but incredibly ugly.)Instead, omni handler's
HandlerParamsshould allow a list of customFileFormats. During schema probing, we will combine them with the builtinFileFormats, with priority given the custom ones. If oneFileFormataccepts the input then it will be used during ingestion; if not, omni handler will keep on probing the rest of theFileFormats.