Sharing an improvement: a Highly Customizable Text Extractor. #607
sbihaiko started this conversation in Show and tell
Hey Guys!
Below is an attached file that makes it easier to override the text extraction method when customizing a new pipeline. I originally wrote it for my own use, but I think it may be useful to you as well. Here is an example:
var mbuilder = new MemoryClientBuilder();
var memory = mbuilder.Build();
var orchestrator = mbuilder.GetOrchestrator();

// Replacing the default MsWordDecoder
var textExtractor = new TextExtractionHandler("extraction", orchestrator);
textExtractor.AddExtractor(
    (pipeline, file, content, ctoken) => {
        // return new MsWordDecoder().DocToText(content);
        return new MyDecoder().DocToText(content);
    },
    MimeTypes.MsWord
);
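
For anyone curious how this kind of per-MIME-type override can work, here is a minimal, self-contained sketch of the idea. It is not the code from the attached file and does not use the Kernel Memory types: `Extractor`, `ExtractorRegistry`, `TryExtract` and `Demo` are hypothetical names made up for illustration, and the pipeline and file arguments are simplified to plain strings. The only thing it mirrors is the shape of the `AddExtractor(lambda, mimeType)` call above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;

// Hypothetical delegate: same argument order as the lambda in the post,
// but with simplified stand-in types (assumption, not the real signature).
public delegate string Extractor(
    string pipelineName, string fileName, byte[] content, CancellationToken ctoken);

// Hypothetical registry: maps a MIME type to the extractor that handles it.
public sealed class ExtractorRegistry
{
    // MIME types are compared case-insensitively.
    private readonly Dictionary<string, Extractor> _extractors =
        new Dictionary<string, Extractor>(StringComparer.OrdinalIgnoreCase);

    // Registering a MIME type again replaces the previous extractor,
    // which is what makes "overriding the default decoder" possible.
    public void AddExtractor(Extractor extractor, string mimeType)
    {
        _extractors[mimeType] = extractor ?? throw new ArgumentNullException(nameof(extractor));
    }

    // Returns false when no extractor is registered for the MIME type.
    public bool TryExtract(
        string mimeType, string pipelineName, string fileName,
        byte[] content, CancellationToken ctoken, out string text)
    {
        if (_extractors.TryGetValue(mimeType, out var extractor))
        {
            text = extractor(pipelineName, fileName, content, ctoken);
            return true;
        }

        text = string.Empty;
        return false;
    }
}

public static class Demo
{
    public static string Run()
    {
        var registry = new ExtractorRegistry();

        // "Default" extractor: decode the bytes as UTF-8.
        registry.AddExtractor(
            (pipeline, file, content, ctoken) => Encoding.UTF8.GetString(content),
            "text/plain");

        // Override: the second registration for the same MIME type wins.
        registry.AddExtractor(
            (pipeline, file, content, ctoken) =>
                Encoding.UTF8.GetString(content).ToUpperInvariant(),
            "text/plain");

        var bytes = Encoding.UTF8.GetBytes("hello");
        registry.TryExtract("TEXT/PLAIN", "extraction", "note.txt",
            bytes, CancellationToken.None, out var text);
        return text;
    }
}
```

In this sketch, `Demo.Run()` goes through the overriding extractor, so the later registration for a MIME type is the one that gets used. Keeping the lookup in a dictionary keyed by MIME type means a custom decoder can be swapped in without touching the rest of the handler.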
Best Regards,
Sandro Bihaiko.