ENH: Add option to infer CIFTI-2 intent codes #932

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Open

mgxd wants to merge 9 commits into nipy:master

from mgxd:enh/cifti-save

Open

ENH: Add option to infer CIFTI-2 intent codes #932

mgxd wants to merge 9 commits into nipy:master from mgxd:enh/cifti-save

Conversation

mgxd

Copy link

Member

@mgxd mgxd commented Jul 6, 2020 •

edited

Loading

Adds support for FileBasedImage methods to_filename and instance_to_filename to support keyword args.

Adds keyword arg validate to CIFTI's to_filename and instance_to_filename methods.

If False (default), the intent code will not be set (except when the intent code is "none")
If True, the first suffix of the output filename will be compared to valid suffixes from within the CIFTI-2 specification. If there is a match, the corresponding intent code will be set, and each IndexMap within the header matrix is checked.

@codecov

Copy link

codecov bot commented Jul 6, 2020 •

edited

Loading

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.84%. Comparing base (917afab) to head (af7265c).
Report is 1255 commits behind head on master.

Additional details and impacted files

@@ Coverage Diff @@
## master #932 +/- ##
==========================================
- Coverage 91.91% 91.84% -0.07% 
==========================================
 Files 99 99 
 Lines 12514 12536 +22 
 Branches 2204 2208 +4 
==========================================
+ Hits 11502 11514 +12 
- Misses 678 684 +6 
- Partials 334 338 +4

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

effigies

effigies reviewed

Jul 6, 2020

View reviewed changes

Copy link

Member

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a quick look and overall looks reasonable. I'll find a little time to think through the API more thoroughly later.

nibabel/cifti2/cifti2.py Outdated Show resolved Hide resolved

@mgxd mgxd force-pushed the enh/cifti-save branch 2 times, most recently from 71f6526 to b55a363 Compare

October 2, 2020 20:26

@mgxd

Copy link

Member Author

@effigies Totally forgot about this PR - I liked your suggestion in #932 (comment) so I went ahead and added that. Since that now includes IndexMap information, I reworked the original infer_intent kwarg into validate, which checks for the presence of the expected IndexMaps (and also the correct order) on top of setting the intent code.

effigies

effigies reviewed

Oct 2, 2020

View reviewed changes

nibabel/cifti2/tests/test_cifti2.py Outdated Show resolved Hide resolved

@mgxd mgxd force-pushed the enh/cifti-save branch 3 times, most recently from a6c3049 to 021e5de Compare

October 5, 2020 15:11

effigies

effigies reviewed

Oct 5, 2020

View reviewed changes

nibabel/loadsave.py Show resolved Hide resolved

@mgxd mgxd force-pushed the enh/cifti-save branch from ae835d2 to 743ebd3 Compare

October 5, 2020 16:53

@mgxd

Copy link

Member Author

mgxd commented Oct 5, 2020

I believe the failing tests are unrelated - this is ready for review.

@effigies

Copy link

Member

effigies commented Oct 5, 2020

Yup, opened pydicom/pydicom#1214 for those failures.

@effigies

Copy link

Member

effigies commented Oct 16, 2020

Overall, I'm not sure I like this UX. If I try to save a generic_cifti.nii without validate=False, this is going to give me a KeyError.

My inclination is that we want warnings, not errors, by default. And inferring intent codes and validating axes feel like they shouldn't be controlled by the same parameter. @MichielCottaar @satra Would be interested in your opinions (or anyone else who uses CIFTI a lot) on how we can be helpful without being burdensome.

I haven't checked this, but I suspect that the logic will fail if someone tries to save a compressed .nii.gz, which we've previously allowed, even though CIFTI strongly discourages it.

@effigies effigies mentioned this pull request

Oct 16, 2020

REL: 3.2.0 #960

Merged

12 tasks

@MichielCottaar

Copy link

Contributor

MichielCottaar commented Oct 17, 2020

Thanks for looking into these intent codes and filename extensions. I would suggest to turn around the logic to use the map types as a ground truth of what the user intends rather than the file extension. At least for the current CIFTI file types, it is possible to determine what the intent code should be for any set of map types. I think such an inference could just added in without most users noticing (ideally it would not overwrite any intent code explicitly set by the user).

I do like the idea of also checking that the filename extension matches what is being stored. However, I agree with effigies that it is probably better to keep this as a warning.

@effigies

Copy link

Member

effigies commented Oct 18, 2020

I think using the axis types as a more authoritative indicator of intent than the extension makes a lot of sense. So something like:

try:
 expected_intent = CIFTI_CODES.niistring[axis_types]
 expected_ext = CIFTI_CODES.extension[axis_types]
except KeyError:
 expected_intent = "NIFTI_INTENT_CONNECTIVITY_UNKNOWN"
 expected_ext = ".nii"
if validate and expected_intent != found_intent:
 warn
 set intent
if validate and not fname.endswith(expected_ext):
 warn

This logic is simplified, since axis type tuples are not unique determinants. (NIFTI_INTENT_CONNECTIVITY_DENSE_SERIES applies to dtseries.nii and dfan.nii, (CIFTI_INDEX_TYPE_SCALARS, CIFTI_INDEX_TYPE_BRAIN_MODELS) applies to dscalar.nii and dfan.nii, and dfibersamp.nii and dfansamp.nii share both intents and axes.)

This would also treat .nii.gz the same as a mismatched .<subext>.nii, and warn, but not fail, which feels a bit kinder to users.

With this mode, I think I'm okay with controlling both behaviors with a single parameter. validate seems fine to me...

@satra

Copy link

Member

satra commented Oct 18, 2020

@effigies - your plan looks reasonable. the one thing perhaps to think a bit more about is exception vs warning. warnings in python are mostly useless, i think, in dataflows and large scale processing, which is where such mismatches are likely to show up. the question is whether a reasonable assumption is being made by the tool when processing when ignoring a warning (since warnings are never trapped by downstream code only filtered out).

@effigies

Copy link

Member

effigies commented Oct 18, 2020 •

edited

Loading

That's a fair point. What would be a case in which you'd want to see an error? Trying to think through what the default behaviors should be. I think there should be something between "error" and "no helping", so if we do want to promote these checks to exceptions, then I'm back to wanting separate switches for inferring the intent code and performing validation.

@satra

Copy link

Member

satra commented Oct 18, 2020

this is probably going to take this to a different PR/problem space, so first i will say that true, false, warn/except is a good starting way of handling this.

so the simple use cases of matrix dimension flips and others should be choices for exception (e.g., timeseries by vertices vs vertices by timeseries). although i'm not sure as i write this if cifti actually enforces this or hcp-flavored cifti uses this as convention (i think the latter).

more general approach, let validate take a callable, and one could build up an hcp-flavored validator (this can check on extension to internal mismatches, dimension mismatches, even have heuristics for certain types of information depending on intent codes).

most algorithms assume that the file given to it is the correct file instead of checking to see if it is. for example, i load a dtseries and i assume that the dimensions are vertices/voxels by time. but it may be impossible for nibabel in general to know what they intent of the algorithm/tool designer was. so let the tool designer decide what is valid for their application and nibabel, if it can validate basic things like dimensions, availability of appropriate metadata (e.g., a dtseries should have a samplingstep inside the header that is reasonable).

@mgxd

Copy link

Member Author

mgxd commented Oct 19, 2020 •

edited

Loading

@MichielCottaar @satra thanks for the reviews 😄

I would suggest to turn around the logic to use the map types as a ground truth of what the user intends rather than the file extension

I like this!

warnings in python are mostly useless, i think, in dataflows and large scale processing, which is where such mismatches are likely to show up

This was my first thought when creating this, but looking back, it may be too strict of an enforcement - I'm fine with reducing the severity of "incorrect" CIFTIs to a warning. I think warning handling is important, but that can (should?) be handled by the specific workflow (i.e. using warnings.simplefilter to treat them as errors).

mgxd added 9 commits

October 19, 2020 13:36

@mgxd


 RF: Allow kwargs within to_filename and instance_to_filename ...

dff774e

...methods

@mgxd


 ENH: Add option to allow inferring of intent code based on output fil...

f78d7f1

...e suffixes

@mgxd


 ENH: Add validation when saving CIFTI2 images

2b6b328

- Enabled by default, validation will parse the output filename for a valid CIFTI2 extension.
- If found, the intent code of the image will be set. Also, the CIFTI2Header will be check for compliant index maps for the intent code

@mgxd


 RF: Avoid validation when testing CIFTI saving

ad6adfa

@mgxd


 RF: Extract nifti header update from to_filename

2a20433

@mgxd


 FIX: remove validation for remaining tests

619295e

@mgxd


 ENH: Allow kwargs in generic save method

62be507

@mgxd


 FIX: Lower severity of failed validation checks

03ef43d

@mgxd


 TST: Clean up test

af7265c

@mgxd mgxd force-pushed the enh/cifti-save branch from 743ebd3 to af7265c Compare

October 19, 2020 17:36

@effigies

Copy link

Member

effigies commented Oct 19, 2020

@mgxd Are you shooting to get this in for the release?

@mgxd

Copy link

Member Author

mgxd commented Oct 19, 2020

yes, I was hoping to - but this failing test reveals a problem with the approach of looking up based on map configuration.

(CIFTI_INDEX_TYPE_SCALARS, CIFTI_INDEX_TYPE_BRAIN_MODELS) applies to dscalar.nii and dfan.nii

when trying to validate the configuration for a *dscalar.nii file, the information for *dfan.nii is fetched. How can we avoid these collisions?

@effigies

Copy link

Member

effigies commented Oct 19, 2020

We might need to modify recorders to return multiple options if there are multiple matches.

@effigies

Copy link

Member

effigies commented Oct 20, 2020

I think we'll need to push this off to another release. I've held up 3.2.0 for too long and almost certainly won't have time to review more this week.