dtype system and integrating record types #321

aldanor started this conversation in Ideas

I've been looking at how record types can be integrated in rust-numpy and here's an unsorted collection of thoughts for discussion.

Let's look at Element:

pub unsafe trait Element: Clone + Send {
    const DATA_TYPE: DataType;
    fn is_same_type(dtype: &PyArrayDescr) -> bool;
    fn npy_type() -> NPY_TYPES { ... }
    fn get_dtype(py: Python) -> &PyArrayDescr { ... }
}
  • npy_type() is used in PyArray::new() and the like. Instead, one should use PyArray_NewFromDescr() to make use of the custom descriptor. Should all places where npy_type() is used be split between "simple type, use New" and "user type, use NewFromDescr"? Or, alternatively, should arrays always be constructed from a descriptor? (in which case, npy_type() becomes redundant and should be removed)
  • Why is same_type() needed at all? It is only used in FromPyObject::extract where one could simply use PyArray_EquivTypes (like it's done in pybind11). Isn't it largely redundant? (or does it exist for optimization purposes? In which case, is it even noticeable performance-wise?)
  • DATA_TYPE constant is really only used to check if it's an object or not in 2 places, like this:
    if T::DATA_TYPE != DataType::Object
    Isn't this redundant as well? Given that one can always do
    T::get_dtype().get_datatype() != Some(DataType::Object)
    // or, can add something like: T::get_dtype().is_object()
  • With all the notes above, Element essentially is just
     pub unsafe trait Element: Clone + Send {
         fn get_dtype(py: Python) -> &PyArrayDescr;
     }
  • For structured types, do we want to stick the type descriptor into DataType? E.g.:
    enum DataType { ..., Record(RecordType) }
    Or, alternatively, just keep it as DataType::Void? In which case, how does one recover the record type descriptor? (it can always be done through the numpy C API of course, via PyArrayDescr).
  • In order to enable user-defined record dtypes, having to return &PyArrayDescr would probably require:
    • Maintaining a global static thread-safe registry of registered dtypes (kind of like it's done in pybind11)
    • Initializing this registry somewhere
    • Any other options?
  • Element should probably be implemented for tuples and fixed-size arrays.
  • In order to implement structured dtypes, we'll inevitably have to resort to proc-macros. A few random thoughts and examples of how it can be done (any suggestions?):
    • #[numpy(record)]
      #[derive(Clone, Copy)]
      #[repr(packed)]
      struct Foo { x: i32, u: Bar } // where Bar is a registered numpy dtype as well
      // dtype = [('x', '<i4'), ('u', ...)]
    • We probably have to require either of #[repr(C)], #[repr(packed)] or #[repr(transparent)]
    • If repr is required, it can be an argument of the macro, e.g. #[numpy(record, repr = "C")]. (or not)
    • We also have to require Copy? (or not? technically, you could have object-type fields inside)
  • For wrapper types, we can allow something like this (a hand-written equivalent is sketched after this list):
    • #[numpy(transparent)]
      #[repr(transparent)]
      struct Wrapper(pub i32);
      // dtype = '<i4'
    • For object types, the current suggestion in the docs is to implement a wrapper type and then impl Element for it manually. This seems largely redundant, given that the DATA_TYPE will always be Object. It would be nice if any #[pyclass]-wrapped types could automatically implement Element, but that is impossible due to the orphan rule. An alternative would be something like this:
      #[pyclass]
      #[numpy] // i.e., #[numpy(object)]
      struct Foo {}
    • How does one register dtypes for foreign (remote) types, e.g. OrderedFloat<f32> or Wrapping<u64> or some PyClassFromOtherCrate? We can try doing something like what serde does for remote types.
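For illustration, a hand-written equivalent of the transparent-wrapper case above might look roughly like this, assuming the simplified Element trait with just get_dtype (a sketch, not the literal macro expansion):

#[repr(transparent)]
#[derive(Clone)]
struct Wrapper(pub i32);

unsafe impl Element for Wrapper {
    fn get_dtype(py: Python) -> &PyArrayDescr {
        // transparent over i32, so simply reuse its descriptor
        <i32 as Element>::get_dtype(py)
    }
}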

Comment options

Related: #234, #186.

Comment options

W.r.t. the question of why Element looks the way it does, I can only answer

DATA_TYPE constant is really only used to check if it's an object or not in 2 places, like this:

for lack of involvement in the other design decisions. Here, we discussed that a simple const TRIVIALLY_COPYABLE: bool; could serve the same purpose but having the enum does not really hurt either. In any case, this being a compile-time constant is certainly nice as it will completely remove one of the two branches from the generated code where this is checked.

Maintaining a global static thread-safe registry of registered dtypes (kind of like it's done in pybind11)

I think having a GIL-protected lazily initialized static like in

fn type_object_raw(py: pyo3::Python) -> *mut ffi::PyTypeObject {
    static TYPE_OBJECT: LazyStaticType = LazyStaticType::new();
    TYPE_OBJECT.get_or_init::<Self>(py)
}

might be preferable to a single global data structure?
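For dtypes, a rough sketch of the analogous pattern, assuming the simplified Element trait and pyo3's GILOnceCell, applied to some record type Foo (build_foo_descr stands in for a hypothetical constructor that would go through the NumPy C API):

unsafe impl Element for Foo {
    fn get_dtype(py: Python) -> &PyArrayDescr {
        // one GIL-protected, lazily initialized descriptor per implementing type
        static DTYPE: GILOnceCell<Py<PyArrayDescr>> = GILOnceCell::new();
        DTYPE
            .get_or_init(py, || build_foo_descr(py)) // hypothetical constructor
            .as_ref(py)
    }
}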

Should all places where npy_type() is used split between "simple type, use New" and "user type, use NewFromDescr"?

While the overhead of producing a PyArrayDescr from a suitably synchronized data structure is probably small, I suspect that not having to produce it is still faster. Or does NumPy resort to PyArray_DescrFromType internally in any case?

Comment options

Or does NumPy resort to PyArray_DescrFromType internally in any case?

That seems to be the case indeed:

https://github.com/numpy/numpy/blob/1798a7d463590de6dc0dfdad69735d92288313f3/numpy/core/src/multiarray/ctors.c#L1105

So we can probably replace our calls to PyArray_New. The only remaining difference I see is that we have to go through PY_ARRAY_API which implies an additional acquire load.

In order to implement structured dtypes, we'll inevitably have to resort to proc-macros.

I think it would actually be helpful to ignore this part completely for now and design the necessary types without any macro support first. The proc-macros should essentially only relieve programmers from writing repetitive and error prone code, shouldn't they?

Comment options

we discussed that a simple const TRIVIALLY_COPYABLE: bool; could serve the same purpose but having the enum does not really hurt either. In any case, this being a compile-time constant is certainly nice as it will completely remove one of the two branches from the generated code where this is checked

Then my initial guess was correct. I wonder though whether it's actually noticeable, i.e. has anyone tried to measure it? With all the other Python cruft on top, I would strongly suspect there'd be no visible difference at all. I understand where it comes from; it's just that there's another implicit logical requirement in the Element trait - that Self::get_dtype(py).get_datatype() == Some(Self::DATA_TYPE) - and this requirement is imposed only (?) to have DATA_TYPE as a compile-time thing instead of a runtime one (where at runtime it's a single load of dtype_ptr->type_num).

I think it would actually be helpful to ignore this part completely for now and design the necessary types without any macro support first. The proc-macros should essentially only relieve programmers from writing repetitive and error prone code, shouldn't they?

With record types, it's pretty much a requirement; anything else would be extremely boilerplate-ish and unsafe (with the user having to provide all offsets and descriptors for all fields manually). It might be nice to leave an option to construct descriptors manually at runtime, but I can't see a single real use case for it off the top of my head. So, the necessary types and the infrastructure have to be designed in a way that would make it trivial to integrate them into proc macros, from the get-go.

Or does NumPy resort to PyArray_DescrFromType internally in any case?

Yes, I think it does (and so does pybind11).

I think having a GIL-protected lazily initialized static like in ... might be preferable to a single global data structure?

Yep - thanks for pointing that out, something like that should be sufficient.

Comment options

Just checked it out: if one replaces

if T::DATA_TYPE != DataType::Object {

with something runtime like

 if !T::get_dtype(py).is_object() {

the difference in from_slice runtime for an array of 5 elements is 1ns (20ns vs 21ns).

For comparison, creating arange(1, 5, 1) is 31ns.

So yea... you could say it's noticeable (but only if you create hundreds of millions of arrays).

Comment options

and this requirement is imposed only (?) to have DATA_TYPE as a compile time thing instead of runtime

ATM, the DATA_TYPE member is used to generate default implementations for the other methods, i.e. it is a utility to simplify the implementation. If same_type and npy_type are dropped in favour of using get_dtype, we could certainly remove this member. But I also do not really see a significant cost to having at least a const TRIVIAL: bool; member.

(where at runtime it's a single load dtype_ptr->type_num).

I think if anything has measurable overhead, then it is acquiring the reference to a PyArrayDescr itself, either by accessing PY_ARRAY_API.PyArray_DescrFromType or by accessing a lazily initialized static.

It might be nice to leave an option to construct descriptors manually at runtime, but I can't see a single real use case for it off the top of my head. So, the necessary types and the infrastructure have to be designed in a way that would make it trivial to integrate them into proc macros, from the get-go.

proc-macros generate code that is injected into the dependent crate, i.e. they have to use this crate's public API, and all the code they produce must be possible to write by hand. The point is not that I expect user code to manually produce those trait implementations (although avoiding the build-time overhead of proc-macros would be one reason to do so), but that we can focus the discussion on the mechanisms and leave the proc-macro UI for later.

Comment options

So yea... you could say it's noticeable (but only if you create hundreds of millions of arrays).

One possible caveat with a targeted benchmark here is that deciding this at compile time is about removing code and thereby reducing instruction cache pressure which could only be noticeable when other code shares the address space.

Comment options

As a first actionable result of your findings, why not start with a PR to simplify the Element trait to avoid having same_type and npy_type and directly producing a descriptor via PyArrayDescr::from_npy_type? I think that would certainly be a nice simplification. Personally, I would like this combined with a const TRIVIAL: bool flag, but I understand that I am most likely not the only one with an opinion on this.

Comment options

For object types, the current suggestion in the docs is to implement a wrapper type and then impl Element for it manually. This seems largely redundant, given that the DATA_TYPE will always be Object.

I fear that it might actually be unsound: During extraction, we only check the data type to be object, but not whether all the elements are of type Py<T> for the given T: PyClass. So e.g. .readonly().as_array() would enable safe code to call methods on T even though completely different - possibly heterogeneously typed - objects are stored in the array.

I fear we need to remove this suggestion and limit ourselves to Py<PyAny> as the only reasonable Element implementation for now. I also wonder how checking for homogeneity during extraction is best handled if we do want to properly support PyArray<Py<T>, D>.

Comment options

One possible caveat with a targeted benchmark here is that deciding this at compile time is about removing code and thereby reducing instruction cache pressure which could only be noticeable when other code shares the address space.

While I'm often guilty of micro-optimizing Rust code down to every single branch hint myself, I just don't think it matters here - the last thing you're going to want to be doing in numpy bindings is instantiating millions of array objects in a loop. That was kind of my point.

Here's a few more thoughts on what would need to happen if const DATA_TYPE remains in place:

  • NB: DATA_TYPE != DataType::Object would no longer be sufficient - record dtypes can be either trivially copyable (which requires all fields to be trivially copyable) or not.
  • If we want to keep the data type const, this would imply DataType::Record(<...>) would need to be const-constructible as well. This implies that e.g. all field names need to be &'static, all offsets are const and so on. It's probably doable but will be tough (see below).
  • Note also that an MSRV of 1.48.0 doesn't give us access to min const generics (stabilized in 1.51.0), so messing with consts is going to be even harder. Also, std::ptr::addr_of!() was only stabilized in that same release (1.51.0), if I remember correctly.

Oh yea, and also, record types can be nested - that would be fun to implement as const as well (at runtime we can just box it all on the heap).

Comment options

The note above only applies if it remains as DATA_TYPE, of course.

If it's changed to IS_POD: bool, then DataType can become a complex non-const type (to host the record dtype descriptor), etc.
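Concretely, the trait from the note above would then be just (a sketch of that variant only):

pub unsafe trait Element: Clone + Send {
    // true iff the element contains no object pointers and is trivially copyable
    const IS_POD: bool;
    fn get_dtype(py: Python) -> &PyArrayDescr;
}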

Comment options

all the elements are of type Py<T> for the given T: PyClass. So e.g. .readonly().as_array() would enable safe code to call methods on T even though completely different - possibly heterogeneously typed - objects are stored in the array.

I fear we need to remove this suggestion and limit ourselves to Py<PyAny> as the only reasonable Element implementation for now. I also wonder how checking for homogeneity during extraction is best handled if we do want to properly support PyArray<Py<T>, D>.

Yea, I was asking myself whether I was reading that code correctly - it only checks for Object dtype but not the type of objects themselves.

There are three options:

  • Always safe: Py<PyAny>, in which case it's always good and safe.
  • Py<T>, in which case, unfortunately, we need to isinstance-check every single element in order to keep it safe (roughly as sketched below).
  • Unchecked and unsafe version of Py<T> which would be nice to keep available.
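A rough sketch of the per-element check for the second option (the helper is hypothetical, not existing API, and it assumes the object dtype itself has already been verified):

use pyo3::prelude::*;

// Hypothetical helper: only expose the array as PyArray<Py<T>, D> if every
// element can actually be extracted as Py<T>, i.e. passes the isinstance check.
fn all_elements_are<T: PyClass>(py: Python<'_>, elements: &[Py<PyAny>]) -> bool {
    elements.iter().all(|obj| obj.extract::<Py<T>>(py).is_ok())
}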
Comment options

As a first actionable result of your findings, why not start with a PR to simplify the Element trait to avoid having same_type and npy_type and directly producing a descriptor via PyArrayDescr::from_npy_type? I think that would certainly be a nice simplification. Personally, I would like this combined with a const TRIVIAL: bool flag, but I understand that I am most likely not the only one with an opinion on this.

Yes, I've already started on this as a first step and will try to push something shortly. There's actually a few more subtle dtype-related bugs I've uncovered in the process that I'll fix as well (the tests are very sparse, so I guess no one noticed anything so far...).

Comment options

(This is a bit of a wall of text, so thanks in advance to whoever reads through it in its entirety. I tried to keep it as organized as possible.)

tagging @adamreichold @kngwyu @davidhewitt

Ok, now that step 1 (#256) is ready to be merged, let's think about next steps. I'll use pybind11 as a reference here since that's what I'm most familiar with, having implemented a big chunk of it myself in the past.

Descriptor (format) strings and PEP 3118

In pybind11, dtypes can be constructed from buffer descriptor strings (see PEP 3118 for details). It actually calls back into NumPy's Python API to do that (numpy.core._internal._dtype_from_pep3118), but it's possible to reimplement it manually if we want to - I would actually lean towards the latter, provided we borrow and verify all the related tests from numpy itself.

The benefit of using descriptor strings is that they're easy to hash if a registry is used, there's no nesting, etc - it's just a simple string that can cover any dtype possible, including scalar types, object types, multi-dimensional subarrays, recarrays, and any combination of the above. We can also generate them with proc-macros at compile time and make them &'static if need be.
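For illustration, the kind of compile-time constant a proc-macro could emit for a simple record might look like this (the string follows the T{...} notation shown in the example further below; it's a guess at the output, not an existing implementation):

use std::ffi::CStr;

#[repr(C)]
struct Point { x: f64, y: f64 }

// '=d' is a standard-size, native-byte-order double; each field name follows its format.
fn point_format() -> &'static CStr {
    CStr::from_bytes_with_nul(b"T{=d:x:=d:y:}\0").unwrap()
}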

Now, if we go this route, we'll have to delve into "format chars", "type kinds", "buffer descriptor strings" etc. There's obviously a big overlap with pyo3::buffer module as there's ElementType::from_format and all the stuff around it, but it won't be sufficient without some reworking. Which raises the question: do we want to copy/paste stuff in pyo3::buffer and reimplement it to suit our needs? Or do we want to make it pybind11-like so that numpy arrays and buffers are tightly bound together, sharing the same infrastructure? See the next section for more details.

Buffers, pyo3::buffer and rust-numpy

In pybind11, buffers and arrays are bound together (like they should be), i.e. you can instantiate one from another (both ways) while controlling who owns the data. There's no such thing in rust-numpy and I think it's a pretty big gap.

There's pyo3::buffer that we can try to integrate with, but, off the top of my head, there are some issues with it. But before we continue, consider an example:

>>> dt = np.dtype([('x', '<i8'), ('y', '<U1'), ('z', 'O', (1, 2))])
>>> arr = np.rec.fromarrays(
... [[1, 2], ['a', 'b'], [[[int, str]], [[float, list]]]], 
... dtype=dt,
... )
>>> arr
rec.array([(1, 'a', [[<class 'int'>, <class 'str'>]]),
 (2, 'b', [[<class 'float'>, <class 'list'>]])],
 dtype=[('x', '<i8'), ('y', '<U1'), ('z', 'O', (1, 2))])
>>> arr.data.format
'T{=q:x:@1w:y:(1,2)O:z:}'

Here, arr.data is the matching Python buffer (which you can also get via memoryview(arr)). Normally, you can do np.array(arr.data) and get the identical array back, but funnily enough, this exact example triggers the padding bug I reported in numpy upstream 6 years ago (numpy/numpy#7797), which was then patched around manually in pybind11.

Back to pyo3::buffer:

pub enum ElementType {
    SignedInteger { bytes: usize },
    UnsignedInteger { bytes: usize },
    Bool,
    Float { bytes: usize },
    Unknown,
}

Here's what's wrong with this enum:

  • It allows unsupported types, like Float(3)
  • It doesn't support complex types
  • It doesn't support struct types
  • It doesn't support subarrays
  • It doesn't support object types

It's basically a distant relative of DataType that we have just removed from rust-numpy in favor of a single dtype pointer. With these limitations, there's no way we can make proper use of pyo3::buffer and we'll end up duplicating lots of its logic in rust-numpy.

And here's the trait:

pub unsafe trait Element: Copy {
    fn is_compatible_format(format: &CStr) -> bool;
}

Note that this is similar to numpy::dtype::Element, and is_compatible_format is equivalent to same_type() which we have just cut out. And again, there are some issues with it:

  • Copy is unnecessary as it blocks object types
    • ... and if Copy bound is removed, then some methods like PyBuffer::copy_to_slice() are no longer valid and should be rewritten same way as in rust-numpy (taking IS_COPY into account)
  • It's impossible to recover the format descriptor from a type (e.g. u32) - kind of like we can do with numpy::dtype::<u32>(py).

One thing that could be done to improve it would be this:

pub unsafe trait Element: Clone + Send {
    const IS_COPY: bool;
    fn format() -> &'static CStr;
}

fn can_convert(from: &CStr, to: &CStr) -> bool { ... } // or whichever other way it's done

And then ElementType can be removed. Alternatively, it can be replaced with FormatString(&CStr) which would then be returned by Element::format().
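As a sketch of how the reworked trait could then be implemented for a primitive type (illustrative only, not existing pyo3 code):

use std::ffi::CStr;

unsafe impl Element for f64 {
    const IS_COPY: bool = true;

    fn format() -> &'static CStr {
        // '=d': standard-size, native-byte-order double per PEP 3118
        CStr::from_bytes_with_nul(b"=d\0").unwrap()
    }
}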

Note how close it is now to the newly reworked numpy::Element - the only difference being that a nul-terminated format string is returned instead of the dtype pointer.

Note also that any valid T: pyo3::buffer::Element is (logically speaking) also a valid numpy::dtype::Element since dtype ptr can always be built from a valid format string (the inverse is not always true, e.g. due to M and m types; but in most cases it can be worked around if need be).

What to do next

Should the buffers be touched up in pyo3 first, so we can then use that in rust-numpy to add support for record types, subarrays, etc? In which case, what exactly should be done?

One interesting point for discussion may be - could we share some of the functionality between buffers and numpy arrays? (In pybind11, array is derived from buffer and array_t is derived from array - which is quite nice and avoids lots of duplication.) Just some random thinking, e.g. maybe buffer-like types like the buffer itself or a numpy array could deref to some common base type which would provide shared functionality. Or maybe we could somehow try and connect the two Element traits together.

Comment options

Just my first thoughts after reading:

  • Personally, I am neither fond of stringly-typed mini languages nor of global registries. (I don't think hashing is an issue given derive(Hash), and while nesting is usually handled via Box, I think it could also be handled via references to produce static data.)
  • I do like the idea of integrating the Element trait here with the Element trait from pyo3::buffer, even though I would prefer to keep using a strongly-typed representation of the format like ElementType.
Comment options

Some more progress - since datetime64 and timedelta64 support has been requested before, and they are the only two types that diverge from the stdlib buffer format, I think it makes sense to add proper support for them. There are multiple ways to do it; I've sketched one of them here.

In a nutshell, you can just use Datetime64<units::Nanosecond> and it will translate to datetime64[ns] automatically. We can add a little bit of functionality later on, like basic arithmetic for datetime/timedelta within linear time units, maybe conversions to stdlib and/or time crate types, etc.
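A minimal sketch of how such a wrapper type could be laid out (marker unit types plus a transparent i64 newtype; this only loosely mirrors the linked sketch):

use std::marker::PhantomData;

pub mod units {
    #[derive(Clone, Copy)]
    pub struct Nanosecond;
    #[derive(Clone, Copy)]
    pub struct Microsecond;
}

// Transparent wrapper over the raw i64 count, tagged with its time unit.
#[repr(transparent)]
#[derive(Clone, Copy)]
pub struct Datetime64<U>(i64, PhantomData<U>);

impl<U> From<i64> for Datetime64<U> {
    fn from(value: i64) -> Self {
        Datetime64(value, PhantomData)
    }
}

impl<U> From<Datetime64<U>> for i64 {
    fn from(value: Datetime64<U>) -> i64 {
        value.0
    }
}

Datetime64<units::Nanosecond> would then map to the datetime64[ns] dtype via its Element implementation (not shown here).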

Comment options

Some more progress updates (this is kinda where all of this has been heading from the start) - here's one of the most recent tests.

This compiles and runs:

#[derive(Clone, numpy::Record)]
#[repr(C)]
struct Foo<T, const N: usize> {
    x: [[T; N]; 3],
    y: (PyObject, Timedelta64<units::Nanosecond>),
}

assert_eq!(
    Foo::<i64, 2>::type_descriptor(),
    td!({"x":0 => [(3, 2); <i64], "y":48 => {0 => O, 8 => m8[ns] [16, 8]} [64, 8]})
);

There's still lots of edge cases and bugs to be squashed and tests to be written etc, but on the surface, the core of it seems to be working quite nicely.

Comment options

So, I guess the question is, in the TypeDescriptor, should we just check whether the struct type is potentially an "aligned struct" and, if so, just set NPY_ALIGNED_STRUCT flag because we can?..

So far I went with the latter (leaving some todos around) - if a struct layout fits all three criteria, we will just mark it as aligned.

If the semantics of the same type definition with and without the flag are different on the Python side, maybe we should give the user access to it, e.g. make it a parameter to dtype_from_type_descriptor()? Or do we need it as part of TypeDescriptor when converting a dtype into it?

Do we want to implement Element<S> for [T; N] where T: Element<S>? That would be very logical and nice. Here's a few immediate issues though:

And we already have some implementations for const generics which automatically get enabled on new enough compilers

I think we should definitely add this and do it in the same way as std or PyO3: Add impls up to length 32 using macros on older compilers and use const generics on newer ones dropping the small array macros when our MSRV includes support for const generics.

like basic arithmetics for datetime/timedelta within linear time units, maybe conversions to stdlib and/or time crate types, etc.

Personally, I would very much like this crate to provide only a minimal and efficient conversion API without providing any operations on the wrapper types themselves. (I think this also applies to the arrays themselves and we should aim to focus the API on getting from/to ndarray and nalgebra types.)

Comment options

Personally, I would very much like this crate to provide only a minimal and efficient conversion API without providing any operations on the wrapper types themselves. (I think this also applies to the arrays themselves and we should aim to focus the API on getting from/to ndarray and nalgebra types.)

Yea, I had the same feeling. For now I've just added transparent wrapper types themselves that can only do two things (aside from being registered in the dtype system) - be constructed from i64 and be converted into i64; the rest is up to whoever's using them.

I think we should definitely add this and do it in the same way as std or PyO3: Add impls up to length 32 using macros on older compilers and use const generics on newer ones dropping the small array macros when our MSRV includes support for const generics.

I've added the min-const-generics part of it, but I can also add the macro-based impl for up-to-length-32 arrays for older compiler versions. Personally, I'm not a huge fan of min_const_generics-like features all over the place, as I think it's cleaner to add a temporary dependency (until MSRV is high enough) and do it like so:

#[rustversion::since(1.51)]
// impl via min const generics
#[rustversion::before(1.51)]
// impl via macros

The main benefit is that we won't break some dependent crates the day we kill the min_const_generics feature, as all of this logic stays completely internal to our crates and can be viewed as an implementation detail.

If the semantics of the same type definition with and without the flag are different on the Python side, maybe we should give the user access to it, e.g. make it a parameter to dtype_from_type_descriptor()? Or do we need it as part of TypeDescriptor when converting a dtype into it?

I think, first of all, we need to categorise how a type descriptor can be attached to a type. There will be three ways:

  1. #[derive(numpy::Record)] - this is probably what would be used 99% of the time.
  2. Same as above but for remote types (see serde docs for remote/foreign types as an example, we might copy that)
  3. unsafe impl manually. Not sure why you'd want to do that, but perhaps there's some reason (e.g. you're codegen-ing something).

There's an alignment: Option<usize> field in RecordDescriptor which we allow to be set to None - this should cover case 3 as you're in control of that yourself. In cases 1 and 2 the derive macro would automatically set it to Some(std::mem::align_of::<T>()). So perhaps, if you really want a numpy dtype with align = False, we can allow something like a #[record(align = false)] attribute in the derive macro - this would cover cases 1 and 2. All that's left then is to ensure NPY_ALIGNED_STRUCT is not set when alignment.is_none()?
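For case 2, the remote-type route could look roughly like serde's remote derives, e.g. (purely illustrative syntax - no such attribute exists yet, and other_crate::Complex32 is a stand-in):

// Hypothetical: mirror the layout of a foreign type and point the derive at it,
// analogous to serde's #[serde(remote = "...")].
#[derive(Clone, numpy::Record)]
#[record(remote = "other_crate::Complex32")]
#[repr(C)]
struct Complex32Def {
    re: f32,
    im: f32,
}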

Comment options

Personally, I'm not a huge fan of min_const_generics-like features all over the place, as I think it's cleaner to add a temporary dependency (until MSRV is high enough) and do it like so:

Yeah, I think using a crate like rustversion instead of a public feature is fine.

There's an alignment: Option<usize> field in RecordDescriptor which we allow to be set to None

I guess this is where I am not sure ATM. Reading the NumPy C API, I get the impression that the idea is that PyArray_Descr::alignment (the one stored in fields or parent) should always be set to the necessary alignment even if the field offset does not respect that. Which is why I am not sure whether tying alignment == None to NPY_ALIGNED_STRUCT makes sense (or actually whether alignment should be an Option at all).

I think we should always have an alignment (even if trivial, i.e. one) and we should set the NPY_ALIGNED_STRUCT flag whenever the contents of a RecordDescriptor are indeed aligned (by checking the alignment of the field type descriptors against the offsets of the fields).

My understanding of the align argument to the dtype constructor is at this moment basically like "here is a record type but I can't be bothered to work out the correct offsets so please do the padding for me" which would be a courtesy that TypeDescriptor just does not need to provide.

Comment options

I guess this is where I am not sure ATM. Reading the NumPy C API, I get the impression that the idea is that PyArray_Descr::alignment (the one stored in fields or parent) should always be set to the necessary alignment even if the field offset does not respect that. Which is why I am not sure whether tying alignment == None to NPY_ALIGNED_STRUCT makes sense (or actually whether alignment should be an Option at all).

The way it works in numpy itself (_convert_from_dict),

PyArray_Descr *new = PyArray_DescrNewFromType(NPY_VOID); // <-- new->alignment := 1
...
if (align) { // <-- align comes from either align=True or from being called recursively
    new->alignment = maxalign; // <-- maxalign is the computed C alignment if offsets are not provided
}
...
/* Structured arrays get a sticky aligned bit */
if (align) {
    new->flags |= NPY_ALIGNED_STRUCT;
}

IIUC, descr->alignment will be set to something different from 1 only if align=True is passed in explicitly, and in that case NPY_ALIGNED_STRUCT flag will always be set.

There are 4 cases in the numpy constructor itself:

  • You provide no offsets and align=False; it's treated as 'packed', alignment is 1, flag is not set
  • You provide no offsets and align=True; numpy simulates C and computes repr(C) offsets, flag is set
  • You provide offsets and align=False; nothing is verified, alignment is 1, flag is not set
  • You provide offsets and align=True; numpy checks that your offsets match alignment (but may be different from what it would have computed since there may be gaps etc), flag is set

So, the following must hold true: "if descr->alignment != 1, then NPY_ALIGNED_STRUCT is set". The reverse, obviously, does not need to hold true.

Back to our business - I guess we can do something like this?

struct RecordDescriptor<T> {
    ...
    alignment: usize, // always set
    is_aligned: Option<bool>,
}

Where is_aligned can be:

  • Some(true) = set descr->flag; set descr->alignment; when creating dtype, verify that all fields are indeed C-aligned
  • Some(false) = don't set the flag; don't check anything; (set descr->alignment or not? probably set it to 1?)
  • None = current behaviour and the default: if all fields are C-aligned, then set the flag and alignment field; otherwise, don't set the flag (and leave descr->alignment as 1?)
Comment options

Here's an illustration:

def check_alignment(*, offsets=None, align=False):
    print(f'offsets={offsets}, align={align} => ', end='')
    try:
        args = {
            'names': ['x', 'y'],
            'formats': ['u2', 'u8'],
        }
        if offsets is not None:
            args['offsets'] = offsets
        dtype = np.dtype(args, align=align)
        dtype_offsets = [dtype.fields[n][1] for n in args['names']]
        print(
            f'alignment={dtype.alignment}, '
            f'isalignedstruct={dtype.isalignedstruct}, '
            f'offsets={dtype_offsets}'
        )
    except BaseException as e:
        print(f'error: "{e}"')

which outputs

Comment options

Back to our business - I guess we can do something like this?

The semantics look reasonable to me. I do still wonder whether is_aligned should be part of RecordDescriptor instead of an argument to dtype_from_type_descriptor? For round-tripping from dtype to dtype via TypeDescriptor?

offsets=[0, 8], align=False => alignment=1, isalignedstruct=False, offsets=[0, 8]

So this means that NumPy treats this like a repr(packed) struct even if the fields are indeed aligned? Weird. (And does this mean the relevant fields will be read using the moral equivalent of read_unaligned, e.g. through a char* pointer?)

Comment options

The semantics look reasonable to me. I do still wonder whether is_aligned should be part of RecordDescriptor instead of an argument to dtype_from_type_descriptor? For round-tripping from dtype to dtype via TypeDescriptor?

That part I'm unsure about myself. On the one hand, stuffing an extra argument which would almost never be used into dtype_from_type_descriptor is not very nice. As for roundtripping...

  • We can fish out whether align=True or not from the isalignedstruct flag.
  • There's no way to figure out RecordDescriptor::alignment from a numpy dtype though. If descr->alignment == 1, it does not mean a single thing, it's simply an alignment of char.
  • If you want roundtripping, I think we're back to a single alignment: Option<usize> which is what I suggested initially 😄 If isalignedstruct is false, then descr->alignment is surely 1, which corresponds to None in our world. Otherwise, it's Some(descr->alignment), which is now a meaningful value. And vice versa, so it should roundtrip, no? (See the sketch below.)
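In code, that roundtrip rule would be roughly the following (hypothetical helper names, just to pin down the mapping):

// dtype -> TypeDescriptor direction
fn alignment_from_descr(isalignedstruct: bool, descr_alignment: usize) -> Option<usize> {
    if isalignedstruct { Some(descr_alignment) } else { None }
}

// TypeDescriptor -> dtype direction: (set NPY_ALIGNED_STRUCT?, descr->alignment)
fn descr_alignment_from(alignment: Option<usize>) -> (bool, usize) {
    match alignment {
        Some(a) => (true, a),
        None => (false, 1),
    }
}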

So this means that NumPy treats this like a repr(packed) struct even if the fields are indeed aligned?

Yes, it sort of does. But I don't think it makes a distinction between aligned and unaligned reads; all of that is left to the compiler. You can see that the usage of NPY_ALIGNED_STRUCT doesn't go too far beyond descriptor.c and creating dtypes.

Comment options

Ok, re: the last comment, maybe I wasn't entirely clear. In some cases, NumPy does distinguish between aligned and unaligned arrays.

array_method.c:

        /* TODO: We may need to distinguish aligned and itemsize-aligned */
        aligned &= PyArray_ISALIGNED(arrays[i]);
    }
    if (!aligned && !(self->method->flags & NPY_METH_SUPPORTS_UNALIGNED)) {
        PyErr_SetString(PyExc_ValueError,
                "method does not support unaligned input.");
        return NULL;
    }

Whether the array is aligned or not, e.g. if you select a field like arr['y'] from one of the examples above, does not depend on NPY_ALIGNED_STRUCT flag, but rather on the actual pointer offsets, strides, etc.

So, of the examples above, cases 2, 3 and 4 will have arr.flags.aligned = True for field 'y' (regardless of what descr->alignment is set to), whereas cases 1 and 5 will have arr.flags.aligned = False.

Comment options

If descr->alignment == 1, it does not mean a single thing, it's simply an alignment of char.

My impression is that indeed this means that NumPy will store these records unaligned? So the resulting type descriptor would match a Rust type with repr(packed) if the offsets are indeed packed, or something that cannot be written down as a struct at all (basically "repr(packed)" but with offsets as if without it). (Which seems to be true generally, i.e. there are a lot of NumPy dtypes and TypeDescriptor instances which cannot be produced via #[derive(numpy::record)] on any Rust struct.)

Comment options

(Sorry, didn't mention that: Personally, I don't think we need roundtripping. I am basically only interested in the TypeDescriptor to dtype direction which is why I am asking if we can simplify things by ignoring the reverse mapping.)

Comment options

Which seems to be true generally, i.e. there are a lot of NumPy dtypes and TypeDescriptor instances which cannot be produced via #[derive(numpy::record)] on any Rust struct.

Well, to start with... NumPy allows overlapping fields. As long as neither of them contains PyObject*.

if we can simplify things by ignoring the reverse mapping

I think we have to do the reverse mapping, which will be used in FromPyObject and the like. Instead of pre-existing "EquivTypes" logic in numpy API (which is overburdened, slow and extremely spaghetti), we can do whatever we want. So, it could work like this:

  • For the array object coming from outside, decode its dtype into a type descriptor. If we can't decode it, we can't use it.
  • Compare it to our type descriptor and return true/false (roughly as sketched below).
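Roughly like this (a sketch; TypeDescriptor::from_dtype is a hypothetical decoding helper, and type_descriptor() is the WIP Element method mentioned above):

fn is_compatible<T: Element>(dtype: &PyArrayDescr) -> bool {
    match TypeDescriptor::from_dtype(dtype) {
        // decoded successfully: accept only a structural match with our own descriptor
        Some(desc) => desc == T::type_descriptor(),
        // we can't decode it, so we can't use it
        None => false,
    }
}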

My impression is that indeed this means that NumPy will store these records unaligned

What exactly do you mean by that? You can pass offsets [0, 8] with align=False as in example above - and the resulting dtype will have descr->alignment=1, but, in reality, it will be aligned. Also, the array made of this dtype will have NPY_ARRAY_ALIGNED flag on. And both of the field slices will have this flag on.

It will only be 'packed' if you don't specify offsets. In this case align flag lets you choose either way of computing them (this is the part we really don't care about as we have our own offsets).

Comment options

Instead of pre-existing "EquivTypes" logic in numpy API (which is overburdened, slow and extremely spaghetti), we can do whatever we want.

I have to think about this. My first instinct is that this would rather be an argument for improving the NumPy code instead of rolling our own.

What exactly do you mean by that? You can pass offsets [0, 8] with align=False as in example above - and the resulting dtype will have descr->alignment=1, but, in reality, it will be aligned. Also, the array made of this dtype will have NPY_ARRAY_ALIGNED flag on. And both of the field slices will have this flag on.

From your example in #254 (comment), I got the impression that align=True is not necessarily recursive. So if you have a record type with actually aligned offsets but alignment=1 and put that into an outer record with align=True, this subrecord could be placed misaligned, thereby also misaligning its fields; whereas if the subrecord type itself had alignment > 1, padding would be added as necessary and the alignment of the subrecord field would imply alignment of its fields.

Comment options

Hey. Just wondering if there were any implementation results from this discussion. I'd like to use structured data types and I got stuck when trying to implement Element and figuring out how to describe the correct dtype. Essentially, I'd like to have a tuple of primitive types inside a PyArray. Is there any example where this has been achieved? Thanks!

Comment options

At the moment, record types are not supported, neither in a released version nor on the main branch. There is some preliminary implementation work available at https://github.com/aldanor/pyo3-type-desc and the plan was to extend PyO3's buffer module first and then to tie in rust-numpy's support for record types into that.

For now, if you really need this, using NumPy's C API via the npyffi module is the only option I am aware of.

This discussion was converted from issue #254 on April 12, 2022 07:04.
