compiler: Fix "power alignment" problems on AIX #142310
r? @wesleywiser
rustbot has assigned @wesleywiser.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.
Use r? to explicitly pick a reviewer
These commits modify compiler targets.
(See the Target Tier Policy.)
This lint was based on a false premise: LLVM lacks a correct datalayout, but rustc assumed that the AIX datalayout was correct.
78009f5 to 6abc782
It does not fix upstream
I assume you are referring to llvm/llvm-project#133599 here? Always good to leave some cross-references. :)
Please also add comments in the code referencing that (both in the aix target triple, and the special exception for the data layout consistency check). Right now, just looking at the code after applying this diff, one would have a very hard time figuring out what happens.
aye aye capitan
As discussed on IRLO, as-is this patch would make some types more correct and others less correct. Hard to say whether that's overall a net positive...
It does fix every type that doesn't have f64 as its recursive-first-element, so I'm feeling pretty good about it alone.
Not all types which start with f64 would be affected, just those that also need at least 4 bytes of padding at the end of the struct, all of which could easily be fixed by adding _padding: MaybeUninit<u32> at the end of the relevant struct (a lint could be added for this case with a suggestion if desired). Compared to the #112480-like issues that giving f64 an alignment of 8 causes, I think this PR is a definite improvement.
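A minimal sketch of the padding workaround described above. The struct and field names are hypothetical and not taken from the patch. The idea is that a repr(C) struct starting with f64 and ending in a 4-byte field gets an explicit trailing MaybeUninit<u32>, so its size comes out the same whether the compiler treats f64 as 4-byte or 8-byte aligned:

```rust
use std::mem::{size_of, MaybeUninit};

/// Hypothetical FFI struct whose first field is an f64.
/// With f64 aligned to 8 this is 16 bytes; with f64 aligned to 4 it is 12.
#[repr(C)]
pub struct Sample {
    pub value: f64,
    pub count: u32,
}

/// The same struct with explicit tail padding (assumed fix, per the comment above).
/// 8 + 4 + 4 = 16 bytes under either alignment of f64.
#[repr(C)]
pub struct SamplePadded {
    pub value: f64,
    pub count: u32,
    _padding: MaybeUninit<u32>,
}

fn main() {
    println!("Sample:       {} bytes", size_of::<Sample>());
    println!("SamplePadded: {} bytes", size_of::<SamplePadded>());
}
```

Using MaybeUninit<u32> rather than a plain u32 means the padding bytes never have to be initialized, which matches how C treats tail padding.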
Are there really more types with f64 in some later field vs the first field?
I think so, when we're considering that it includes all nested aggregates?
Yes, the problem after this patch is now "possible-to-fix-in-bindgen"-tier.
all of which could easily be fixed by adding _padding: MaybeUninit<u32> at the end of the relevant struct
Ah, that's a good point.
Hi! Thanks for this patch. I wanted to test this on an AIX machine, but if I try to build it, I get an assertion on the LLVM side:
. . .
Assertion failed: Target.isCompatibleDataLayout(getDataLayout()) && "Can't create a MachineFunction using a Module with a " "Target-incompatible DataLayout attached\n", file rust/src/llvm-project/llvm/lib/CodeGen/MachineFunction.cpp, line 248, void llvm::MachineFunction::init()()
rustc exited with signal: 6 (SIGABRT)
Did not run successfully: signal: 6 (SIGABRT)
error: could not compile `compiler_builtins` (lib)
. . .
This happens because the datalayout string in LLVM does not match the datalayout string in rustc. If I make the strings match to unblock the build, I see the internal compiler bug message that was added in the patch:
error: internal compiler error: compiler/rustc_codegen_llvm/src/context.rs:223:17: LLVM got fixed, please remove this exception in cg_llvm!
In any case, I tried to work around these issues to test the patch a bit. There are a few concerns from our end:
- This change is not buildable on AIX as-is without the backend change but also not buildable due to the rustc exception.
- As mentioned in the previous comments, some structs that start with f64 would require the extra 4 bytes of padding at the end of the struct so that the size is correct.
- It seems the struct sizing is also affected when we're not using repr(C). For something like the following, this should have a size of 24, but now we would report 20 for this struct:
pub struct Floats {
    a: f64,
    b: u32,
    c: f64,
}
Ideally, we would only want these changes for repr(C) structs and do not want to affect normal Rust structs at all.
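To make the 20-versus-24 arithmetic concrete, here is a rough sketch of the usual declaration-order layout rule (round each field up to its alignment, then round the total up to the largest alignment). The helper names are hypothetical, and this deliberately does not model AIX's special treatment of a leading double. It only shows how the assumed alignment of f64 changes the size of Floats:

```rust
/// Round `n` up to the next multiple of `align` (align must be non-zero).
fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) / align * align
}

/// Size of a struct laid out in declaration order, given (size, align) per field.
fn c_struct_size(fields: &[(usize, usize)]) -> usize {
    let mut offset = 0;
    let mut max_align = 1;
    for &(size, align) in fields {
        // Place the field at the next offset that satisfies its alignment.
        offset = round_up(offset, align) + size;
        max_align = max_align.max(align);
    }
    // Tail padding: the total size is a multiple of the struct's alignment.
    round_up(offset, max_align)
}

fn main() {
    // Floats { a: f64, b: u32, c: f64 } with f64 aligned to 4, then to 8.
    let f64_align_4 = [(8, 4), (4, 4), (8, 4)];
    let f64_align_8 = [(8, 8), (4, 4), (8, 8)];
    println!("f64 align 4: {} bytes", c_struct_size(&f64_align_4));
    println!("f64 align 8: {} bytes", c_struct_size(&f64_align_8));
}
```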
FYI @daltenty
For something like the following, this should have a size of 24, but now we would report 20 for this struct:
That's a Rust layout struct. Why should it have AIX-specific layout? We use the same (undocumented, can-change-any-day) algorithm for repr(Rust) on all targets and we really want to keep it that way for the sake of everyone's sanity.
This change is not buildable on AIX as-is without the backend change but also not buildable due to the rustc exception.
Ah, sorry. I'll just remove the bug! then.
@amy-kwan New patch to try out, should be less comical to test.
Ideally, we would only want these changes for repr(C) structs and do not want to affect normal Rust structs at all.
While I may make further changes to the layout algorithm that may overalign in some cases for the performance reasons you note, the fundamental detail is that it doesn't matter: once any f64 may be underaligned (read: 4, the ABI alignment on AIX), our codegen will notice they all are at that alignment and our reads and writes of that type will be generated with such an alignment annotation for LLVM. LLVM will only be able to upgrade the alignment as an optimization.
That is not an optimization I believe we should ourselves perform on our LLVM IR. This is because it would be extremely fragile, as repr(Rust) structs and repr(C) structs can be nested within each other and everything must still make sense when we do that. Because of this, we should not use the knowledge we have entered a struct with a certain repr to modify the way we do accesses to types.
This is also partly because any reasoning that we overaligned things depends on specifics of the layout algorithm that we are allowed to undermine. It would be globally embedding a local assumption from another part of the code. So for this:
pub struct Floats {
    a: f64,
    b: u32,
    c: f64,
}
If the true alignment of f64 is 4, then you can neither rely on us choosing this layout:
#[repr(linear)]
pub struct Floats {
    a: f64,
    c: f64,
    b: u32,
}
nor can you rely on us choosing this layout:
#[repr(linear)]
pub struct Floats {
    b: u32,
    a: f64,
    c: f64,
}
We are allowed to pick either actual effective layout. You do not have a "should" you can rely on. And in general neither should we: if we make our code break with our own rules, then we make it harder to update.
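As a small illustration of the point above, the sketch below checks only properties of the default-repr Floats that hold regardless of which field order the compiler picks. The function name is hypothetical. The exact size (20 or 24, or anything else the layout algorithm chooses) is deliberately not asserted, since nothing guarantees it:

```rust
use std::mem::{align_of, size_of};

/// Default (repr(Rust)) layout: field order and padding are unspecified.
pub struct Floats {
    pub a: f64,
    pub b: u32,
    pub c: f64,
}

/// Returns (size, align) of Floats as chosen by the current compiler and target.
fn floats_layout() -> (usize, usize) {
    (size_of::<Floats>(), align_of::<Floats>())
}

fn main() {
    let (size, align) = floats_layout();
    // These hold for any valid layout of the three fields:
    assert!(size >= 8 + 4 + 8); // the fields must all fit
    assert_eq!(size % align, 0); // size is a multiple of alignment
    assert!(align >= align_of::<f64>()); // at least as aligned as its fields
    println!("Floats: size {size}, align {align}");
}
```

Code that needs a specific size or field order has to ask for it with repr(C); the checks above are about as much as can be relied on otherwise.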
Also you could probably have built the previous commit by disabling assertions for LLVM but I can understand not wanting to, for, you know, testing-the-patch purposes. :^)
Co-authored-by: beetrees <b@beetr.ee>
@nikic as usual is off being a gentleman and a scholar and has opened llvm/llvm-project#144673
☔ The latest upstream changes (presumably #144044) made this pull request unmergeable. Please resolve the merge conflicts.
@nikic as usual is off being a gentleman and a scholar and has opened llvm/llvm-project#144673
can the review of this patch proceed in parallel or do you prefer the LLVM to be first merged?
Hi, my apologies on the delay with respect to this patch.
We were reviewing and testing llvm/llvm-project#144673, but found some issues internally when testing it. Currently, we are actively trying to investigate and see if these issues can be fixed, but ideally, we would like to review/test the LLVM patch first prior to proceeding with the thorough review/test of this patch.
Hello, this is a localized rustc-focused fix for the AIX "power alignment" issue. It does not fix upstream because I expect that to be a more annoying experience and would take some time to propagate into the release. I mostly wish to remove the "power alignment" lint so we do not have to work it into updates to the "improper-ctypes" lint, but it feels wrong to do so without actually fixing the codegen issue, especially since it's such a small change.
cc @daltenty @gilamn5tr @mustartt @amy-kwan Can you confirm whether this change allows rustc to do FFI correctly with C code compiled using the default AIX ABI?