Tensors

Dense Tensors

class Tensor

Subclassed by arrow::NumericTensor<TYPE>

Public Functions

Tensor(const std::shared_ptr<DataType>& type, const std::shared_ptr<Buffer>& data, const std::vector<int64_t>& shape): Constructor with no dimension names or strides; the data is assumed to be row-major.

Tensor(const std::shared_ptr<DataType>& type, const std::shared_ptr<Buffer>& data, const std::vector<int64_t>& shape, const std::vector<int64_t>& strides): Constructor with non-negative strides.

Tensor(const std::shared_ptr<DataType>& type, const std::shared_ptr<Buffer>& data, const std::vector<int64_t>& shape, const std::vector<int64_t>& strides, const std::vector<std::string>& dim_names): Constructor with non-negative strides and dimension names.

int64_t size() const: Total number of value cells in the tensor.

inline bool is_mutable() const: Return true if the underlying data buffer is mutable.

bool is_contiguous() const: Return true if the tensor is contiguous, that is, either row-major or column-major.

bool is_row_major() const: Return true if the tensor is row-major (also known as "C order").

bool is_column_major() const: Return true if the tensor is column-major (also known as "Fortran order").
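
The three layout predicates above can be illustrated without Arrow. The following is a minimal stand-alone sketch, not Arrow's implementation: the helper names (RowMajorStrides, ColumnMajorStrides, IsContiguous) are hypothetical, and it assumes that strides are expressed in bytes and that "contiguous" means exactly "row-major or column-major", as the description of is_contiguous() states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: byte strides of a row-major (C order) layout,
// where the last dimension varies fastest.
std::vector<int64_t> RowMajorStrides(const std::vector<int64_t>& shape,
                                     int64_t elem_size) {
  std::vector<int64_t> strides(shape.size());
  int64_t step = elem_size;
  for (std::size_t i = shape.size(); i-- > 0;) {
    strides[i] = step;
    step *= shape[i];
  }
  return strides;
}

// Hypothetical helper: byte strides of a column-major (Fortran order) layout,
// where the first dimension varies fastest.
std::vector<int64_t> ColumnMajorStrides(const std::vector<int64_t>& shape,
                                        int64_t elem_size) {
  std::vector<int64_t> strides(shape.size());
  int64_t step = elem_size;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    strides[i] = step;
    step *= shape[i];
  }
  return strides;
}

// Contiguous here means: the strides match one of the two layouts above.
bool IsContiguous(const std::vector<int64_t>& shape,
                  const std::vector<int64_t>& strides, int64_t elem_size) {
  assert(shape.size() == strides.size());
  return strides == RowMajorStrides(shape, elem_size) ||
         strides == ColumnMajorStrides(shape, elem_size);
}
```

For a 2 x 3 tensor of 8-byte values, the row-major strides come out as {24, 8} and the column-major strides as {8, 16}; any other stride vector, such as {48, 8}, is reported as non-contiguous by this sketch.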

Result<int64_t> CountNonZero() const: Compute the number of non-zero values in the tensor.

template <typename ValueType> inline const ValueType::c_type& Value(const std::vector<int64_t>& index) const: Return the value at the given index without data-type or bounds checks.

Public Static Functions

static inline Result<std::shared_ptr<Tensor>> Make(const std::shared_ptr<DataType>& type, const std::shared_ptr<Buffer>& data, const std::vector<int64_t>& shape, const std::vector<int64_t>& strides = {}, const std::vector<std::string>& dim_names = {})

Create a Tensor with full parameters.

This factory function returns Status::Invalid when the parameters are inconsistent.

Parameters:

type – [in] The data type of the tensor values
data – [in] The buffer of the tensor content
shape – [in] The shape of the tensor
strides – [in] The strides of the tensor (if this is empty, the data is assumed to be row-major)
dim_names – [in] The names of the tensor dimensions

static inline int64_t CalculateValueOffset(const std::vector<int64_t>& strides, const std::vector<int64_t>& index): Return the offset of the given index on the given strides.
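
As an illustration of what "the offset of the given index on the given strides" means, the sketch below computes the dot product of the index and the strides. It is a stand-alone approximation under that assumption, not a copy of Arrow's code, and the function name ValueOffset is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in: offset = sum over dimensions of index[i] * strides[i].
// If the strides are in bytes, the result is a byte offset into the buffer.
int64_t ValueOffset(const std::vector<int64_t>& strides,
                    const std::vector<int64_t>& index) {
  assert(strides.size() == index.size());
  int64_t offset = 0;
  for (std::size_t i = 0; i < index.size(); ++i) {
    offset += index[i] * strides[i];
  }
  return offset;
}
```

With strides {24, 8} (a row-major 2 x 3 tensor of 8-byte values), the index {1, 2} maps to offset 1 * 24 + 2 * 8 = 40.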

template <typename TYPE> class NumericTensor : public arrow::Tensor

Public Functions

inline NumericTensor(const std::shared_ptr<Buffer>& data, const std::vector<int64_t>& shape, const std::vector<int64_t>& strides, const std::vector<std::string>& dim_names): Constructor with non-negative strides and dimension names.

inline NumericTensor(const std::shared_ptr<Buffer>& data, const std::vector<int64_t>& shape): Constructor with no dimension names or strides; the data is assumed to be row-major.

inline NumericTensor(const std::shared_ptr<Buffer>& data, const std::vector<int64_t>& shape, const std::vector<int64_t>& strides): Constructor with non-negative strides.

Public Static Functions

static inline Result<std::shared_ptr<NumericTensor<TYPE>>> Make(const std::shared_ptr<Buffer>& data, const std::vector<int64_t>& shape, const std::vector<int64_t>& strides = {}, const std::vector<std::string>& dim_names = {})

Create a NumericTensor with full parameters.

This factory function returns Status::Invalid when the parameters are inconsistent.

Parameters:

data – [in] The buffer of the tensor content
shape – [in] The shape of the tensor
strides – [in] The strides of the tensor (if this is empty, the data is assumed to be row-major)
dim_names – [in] The names of the tensor dimensions

Sparse Tensors

enum arrow::SparseTensorFormat::type

EXPERIMENTAL: The index format type of SparseTensor.

Values:

enumerator COO: Coordinate list (COO) format.

enumerator CSR: Compressed sparse row (CSR) format.

enumerator CSC: Compressed sparse column (CSC) format.

enumerator CSF: Compressed sparse fiber (CSF) format.

class SparseIndex

EXPERIMENTAL: The base class for the index of a sparse tensor.

SparseIndex describes where the non-zero elements are within a SparseTensor.

There are several ways to represent this. The format_id is used to distinguish what kind of representation is used. Each possible value of format_id must have only one corresponding concrete subclass of SparseIndex.

Subclassed by arrow::internal::SparseIndexBase<SparseCOOIndex>, arrow::internal::SparseIndexBase<SparseCSCIndex>, arrow::internal::SparseIndexBase<SparseCSFIndex>, arrow::internal::SparseIndexBase<SparseCSRIndex>, arrow::internal::SparseIndexBase<SparseIndexType>

Public Functions

inline SparseTensorFormat::type format_id() const: Return the identifier of the format type.

virtual int64_t non_zero_length() const = 0: Return the number of non-zero values in the sparse tensor related to this sparse index.

virtual std::string ToString() const = 0: Return the string representation of the sparse index.

class SparseCOOIndex : public arrow::internal::SparseIndexBase<SparseCOOIndex>

EXPERIMENTAL: The index data for a COO sparse tensor.

A COO sparse index manages the location of its non-zero values by their coordinates.

Public Functions

explicit SparseCOOIndex(const std::shared_ptr<Tensor>& coords, bool is_canonical): Construct SparseCOOIndex from column-major NumericTensor.

inline const std::shared_ptr<Tensor>& indices() const

Return a tensor that has the coordinates of the non-zero values.

The returned tensor is an N x D tensor, where N is the number of non-zero values and D is the number of dimensions in the logical data. The row at index i is a D-tuple of coordinates indicating that the logical value at those coordinates is found at physical index i.
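
To make the relationship between the coordinates tensor and the value vector concrete, here is a stand-alone sketch that does not use Arrow. It assumes the N x D coordinates are stored flattened in row-major order and that values are doubles; the function name CooLookup is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// coords: N x D coordinates, flattened in row-major order (an assumption of
// this sketch). values: the N non-zero values; values[i] belongs to the
// D-tuple stored at coords[i * D] .. coords[i * D + D - 1].
// Returns the value stored at logical coordinate `at`, or 0 if none is stored.
double CooLookup(const std::vector<int64_t>& coords,
                 const std::vector<double>& values,
                 const std::vector<int64_t>& at) {
  const std::size_t d = at.size();
  assert(d > 0 && coords.size() == values.size() * d);
  for (std::size_t i = 0; i < values.size(); ++i) {
    const int64_t* tuple = coords.data() + i * d;
    if (std::equal(at.begin(), at.end(), tuple)) {
      return values[i];  // physical index i holds this logical cell
    }
  }
  return 0.0;  // coordinate not listed, so the logical value is zero
}
```

A linear scan keeps the sketch short; a canonical (sorted) index would allow a binary search instead.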

inline virtual int64_t non_zero_length() const override: Return the number of non-zero values in the sparse tensor related to this sparse index.

inline bool is_canonical() const

Return whether the sparse tensor index is canonical.

If a sparse tensor index is canonical, its coordinates are sorted in lexicographical order and the corresponding sparse tensor has no duplicated entries.
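
The canonicality condition can be checked in a single pass: each coordinate tuple must be strictly greater, in lexicographical order, than the one before it, which covers both the sorting and the no-duplicates requirements. The sketch below is a stand-alone illustration under the assumption that the N x D coordinates are flattened in row-major order; IsCanonical is a hypothetical name and this is not necessarily how Arrow's auto-detection works.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// coords: N x D coordinates flattened in row-major order (sketch assumption).
// Returns true if consecutive tuples are strictly increasing in
// lexicographical order, i.e. sorted and free of duplicates.
bool IsCanonical(const std::vector<int64_t>& coords, std::size_t ndim) {
  assert(ndim > 0 && coords.size() % ndim == 0);
  const std::size_t n = coords.size() / ndim;
  for (std::size_t i = 1; i < n; ++i) {
    const int64_t* prev = coords.data() + (i - 1) * ndim;
    const int64_t* cur = coords.data() + i * ndim;
    if (!std::lexicographical_compare(prev, prev + ndim, cur, cur + ndim)) {
      return false;  // out of order, or an exact duplicate
    }
  }
  return true;
}
```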

virtual std::string ToString() const override: Return a string representation of the sparse index.

inline bool Equals(const SparseCOOIndex& other) const: Return whether the COO indices are equal.

Public Static Functions

static Result<std::shared_ptr<SparseCOOIndex>> Make(const std::shared_ptr<Tensor>& coords, bool is_canonical): Make SparseCOOIndex from a coords tensor and canonicality.

static Result<std::shared_ptr<SparseCOOIndex>> Make(const std::shared_ptr<Tensor>& coords): Make SparseCOOIndex from a coords tensor with canonicality auto-detection.

static Result<std::shared_ptr<SparseCOOIndex>> Make(const std::shared_ptr<DataType>& indices_type, const std::vector<int64_t>& indices_shape, const std::vector<int64_t>& indices_strides, std::shared_ptr<Buffer> indices_data): Make SparseCOOIndex from raw properties with canonicality auto-detection.

static Result<std::shared_ptr<SparseCOOIndex>> Make(const std::shared_ptr<DataType>& indices_type, const std::vector<int64_t>& indices_shape, const std::vector<int64_t>& indices_strides, std::shared_ptr<Buffer> indices_data, bool is_canonical): Make SparseCOOIndex from raw properties.

static Result<std::shared_ptr<SparseCOOIndex>> Make(const std::shared_ptr<DataType>& indices_type, const std::vector<int64_t>& shape, int64_t non_zero_length, std::shared_ptr<Buffer> indices_data)

Make SparseCOOIndex from sparse tensor’s shape properties and data with canonicality auto-detection.

The indices_data should be in row-major (C-like) order. If not, use the raw properties constructor.

static Result<std::shared_ptr<SparseCOOIndex>> Make(const std::shared_ptr<DataType>& indices_type, const std::vector<int64_t>& shape, int64_t non_zero_length, std::shared_ptr<Buffer> indices_data, bool is_canonical)

Make SparseCOOIndex from sparse tensor’s shape properties and data.

The indices_data should be in row-major (C-like) order. If not, use the raw properties constructor.

class SparseCSRIndex : public arrow::internal::SparseCSXIndex<SparseCSRIndex, internal::SparseMatrixCompressedAxis::ROW>

EXPERIMENTAL: The index data for a CSR sparse matrix.

A CSR sparse index manages the location of its non-zero values by two vectors.

The first vector, called indptr, represents the range of the rows; the i-th row spans from indptr[i] to indptr[i+1] in the corresponding value vector. So the length of an indptr vector is the number of rows + 1.

The other vector, called indices, represents the column indices of the corresponding non-zero values. So the length of an indices vector is the same as the number of non-zero values.
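
The two-vector layout described above can be sketched by encoding a small dense matrix. This is a stand-alone illustration, not Arrow code: the CsrSketch struct and DenseToCsr function are hypothetical names, values are assumed to be doubles, and the range indptr[i] to indptr[i+1] is treated as half-open (the end is excluded).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the three CSR vectors.
struct CsrSketch {
  std::vector<int64_t> indptr;   // length: rows + 1
  std::vector<int64_t> indices;  // column index of each non-zero value
  std::vector<double> values;    // the non-zero values, row by row
};

// Encode a row-major dense matrix of size rows x cols as CSR.
CsrSketch DenseToCsr(const std::vector<double>& dense, int64_t rows,
                     int64_t cols) {
  assert(rows >= 0 && cols >= 0 &&
         dense.size() == static_cast<std::size_t>(rows * cols));
  CsrSketch csr;
  csr.indptr.push_back(0);
  for (int64_t r = 0; r < rows; ++r) {
    for (int64_t c = 0; c < cols; ++c) {
      const double v = dense[static_cast<std::size_t>(r * cols + c)];
      if (v != 0.0) {
        csr.indices.push_back(c);
        csr.values.push_back(v);
      }
    }
    // Row r occupies values[indptr[r]] .. values[indptr[r + 1] - 1].
    csr.indptr.push_back(static_cast<int64_t>(csr.values.size()));
  }
  return csr;
}
```

For the 3 x 3 matrix with rows (0, 5, 0), (0, 0, 0), (7, 0, 9), this yields indptr = {0, 1, 1, 3}, indices = {1, 0, 2} and values = {5, 7, 9}. The empty middle row shows up as two equal consecutive indptr entries.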

class SparseTensor

EXPERIMENTAL: The base class of sparse tensor container.

Subclassed by arrow::SparseTensorImpl<SparseIndexType>

Public Functions

inline std::shared_ptr<DataType> type() const: Return the value type of the sparse tensor.

inline std::shared_ptr<Buffer> data() const: Return a buffer that contains the value vector of the sparse tensor.

inline const uint8_t* raw_data() const: Return an immutable raw data pointer.

inline uint8_t* raw_mutable_data() const: Return a mutable raw data pointer.

inline const std::vector<int64_t>& shape() const: Return the shape vector of the sparse tensor.

inline const std::shared_ptr<SparseIndex>& sparse_index() const: Return the sparse index of the sparse tensor.

inline int ndim() const: Return the number of dimensions of the sparse tensor.

inline const std::vector<std::string>& dim_names() const: Return a vector of dimension names.

const std::string& dim_name(int i) const: Return the name of the i-th dimension.

int64_t size() const: Total number of value cells in the sparse tensor.

inline bool is_mutable() const: Return true if the underlying data buffer is mutable.

inline int64_t non_zero_length() const: Total number of non-zero cells in the sparse tensor.

bool Equals(const SparseTensor& other, const EqualOptions& = EqualOptions::Defaults()) const: Return whether sparse tensors are equal.

Result<std::shared_ptr<Tensor>> ToTensor(MemoryPool* pool) const

Return a dense representation of the sparse tensor as a Tensor.

The returned Tensor has row-major order (C-like).
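
Conceptually, densifying a COO sparse tensor means allocating a zero-filled row-major buffer and writing each stored value at the flat position derived from its coordinates. The sketch below shows that idea without Arrow. CooToDense is a hypothetical name, the coordinates are assumed to be an N x D table flattened in row-major order, and values are assumed to be doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// shape: logical shape with D dimensions.
// coords: N x D coordinates flattened in row-major order (sketch assumption).
// values: the N non-zero values.
// Returns the dense tensor contents in row-major (C-like) order.
std::vector<double> CooToDense(const std::vector<int64_t>& shape,
                               const std::vector<int64_t>& coords,
                               const std::vector<double>& values) {
  const std::size_t d = shape.size();
  assert(coords.size() == values.size() * d);
  int64_t total = 1;
  for (int64_t extent : shape) total *= extent;
  std::vector<double> dense(static_cast<std::size_t>(total), 0.0);
  for (std::size_t i = 0; i < values.size(); ++i) {
    // Row-major flat position: fold the coordinates from first to last.
    int64_t flat = 0;
    for (std::size_t k = 0; k < d; ++k) {
      flat = flat * shape[k] + coords[i * d + k];
    }
    dense[static_cast<std::size_t>(flat)] = values[i];
  }
  return dense;
}
```

For shape {2, 3}, coordinates (0, 1) and (1, 2) with values 5 and 9 land at flat positions 1 and 5 of the six-element buffer.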

template <typename SparseIndexType> class SparseTensorImpl : public arrow::SparseTensor

EXPERIMENTAL: Concrete sparse tensor implementation classes with sparse index type.

Public Functions

inline SparseTensorImpl(const std::shared_ptr<SparseIndexType>& sparse_index, const std::shared_ptr<DataType>& type, const std::shared_ptr<Buffer>& data, const std::vector<int64_t>& shape, const std::vector<std::string>& dim_names): Construct a sparse tensor from physical data buffer and logical index.

inline SparseTensorImpl(const std::shared_ptr<DataType>& type, const std::vector<int64_t>& shape, const std::vector<std::string>& dim_names = {}): Construct an empty sparse tensor.

Public Static Functions

static inline Result<std::shared_ptr<SparseTensorImpl<SparseIndexType>>> Make(const std::shared_ptr<SparseIndexType>& sparse_index, const std::shared_ptr<DataType>& type, const std::shared_ptr<Buffer>& data, const std::vector<int64_t>& shape, const std::vector<std::string>& dim_names): Create a SparseTensor with full parameters.

static inline Result<std::shared_ptr<SparseTensorImpl<SparseIndexType>>> Make(const Tensor& tensor, const std::shared_ptr<DataType>& index_value_type, MemoryPool* pool = default_memory_pool())

Create a sparse tensor from a dense tensor.

The dense tensor is re-encoded as a sparse index and a physical data buffer for the non-zero values.
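
The re-encoding step can be pictured for the COO case as a scan that records the coordinates and the value of every non-zero cell. The following stand-alone sketch is limited to 2-D row-major input for brevity; the CooSketch struct and DenseToCoo function are hypothetical names and do not reflect Arrow's internal converter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: an N x 2 coordinate table (flattened in row-major
// order) plus the N non-zero values, in matching order.
struct CooSketch {
  std::vector<int64_t> coords;
  std::vector<double> values;
};

// Encode a row-major dense matrix of size rows x cols as COO.
CooSketch DenseToCoo(const std::vector<double>& dense, int64_t rows,
                     int64_t cols) {
  assert(rows >= 0 && cols >= 0 &&
         dense.size() == static_cast<std::size_t>(rows * cols));
  CooSketch coo;
  for (int64_t r = 0; r < rows; ++r) {
    for (int64_t c = 0; c < cols; ++c) {
      const double v = dense[static_cast<std::size_t>(r * cols + c)];
      if (v != 0.0) {
        coo.coords.push_back(r);  // sparse index: where the value lives
        coo.coords.push_back(c);
        coo.values.push_back(v);  // physical data: the value itself
      }
    }
  }
  return coo;
}
```

Because the scan visits cells in row-major order, the coordinates it emits are already sorted lexicographically and contain no duplicates.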

using arrow::SparseCOOTensor = SparseTensorImpl<SparseCOOIndex>: EXPERIMENTAL: Type alias for COO sparse tensor.

using arrow::SparseCSCMatrix = SparseTensorImpl<SparseCSCIndex>: EXPERIMENTAL: Type alias for CSC sparse matrix.

using arrow::SparseCSFTensor = SparseTensorImpl<SparseCSFIndex>: EXPERIMENTAL: Type alias for CSF sparse tensor.

using arrow::SparseCSRMatrix = SparseTensorImpl<SparseCSRIndex>: EXPERIMENTAL: Type alias for CSR sparse matrix.