lenskit.data#
Data abstractions and data set access.
Submodules#
Data accumulation support
Legacy location of the Amazon import functions.
Classes for working with matrix data.
Legacy location of the MovieLens import functions.
Support for the MSWeb datasets.
Utility functions for implementing __str__ and __repr__ methods
Pydantic models for LensKit data schemas. These models define the data
Import code for various data sources.
Basic data types used in data representations.
Attributes#
A generic collection key with no bounds or type information. Key types must
Types that can be converted to a query by RecQuery.create().
Valid sources for query items.
Classes#
Base class for an attribute associated with an entity class. This class
Iterator over a range by batches.
Construct data sets from data and tables.
A collection of item lists. This protocol defines read access to the
Collect item lists with associated keys, as in ItemListCollection.
Mutable item list collection backed by a Python list.
Intersection type of ItemListCollection and
Key type for query IDs. This is used for item list collections
A general container for the data backing a dataset.
Representation of a data set for LensKit training, evaluation, etc. Data can
Representation of a set of entities from the dataset. Obtained from
Representation of a (usually ordered) list of items, possibly with scores
Representation of the data available for a recommendation query. This is
Two-entity relationships without duplicates, accessible in matrix form.
Representation for a set of relationship records. This is the class for
Vocabularies of entity identifiers for the LensKit data model.
Functions#
from_interactions_df(df, *[, user_col, item_col, ...])
Create a dataset from a data frame of ratings or other user-item
key_dict(kt)
flatten_dict(data)
unflatten_dict(data, *[, sep])
Package Contents#
- lenskit.data.from_interactions_df(df, *, user_col=None, item_col=None, rating_col=None, timestamp_col=None, users=None, items=None, class_name='rating')#
Create a dataset from a data frame of ratings or other user-item interactions.
- Stability:
- Caller (see Stability Levels).
- Parameters:
df (pandas.DataFrame) – The user-item interactions (e.g. ratings). The dataset code takes ownership of this data frame and may modify it.
user_col (str | None) – The name of the user ID column. By default, looks for columns named user, user_id, or userId, with several case variants.
item_col (str | None) – The name of the item ID column. By default, looks for columns named item, item_id, or itemId, with several case variants.
rating_col (str | None) – The name of the rating column.
timestamp_col (str | None) – The name of the timestamp column.
users (lenskit.data.types.IDSequence | pandas.Index | Iterable[lenskit.data.types.ID] | lenskit.data._vocab.Vocabulary | None) – A vocabulary of user IDs. The data frame is subset to this set of IDs.
items (lenskit.data.types.IDSequence | pandas.Index | Iterable[lenskit.data.types.ID] | lenskit.data._vocab.Vocabulary | None) – A vocabulary of item IDs. The data frame is subset to this set of IDs.
class_name (str) – The interaction class name.
- Returns:
The initialized data set.
- Return type:
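The default column lookup described for user_col and item_col can be pictured with a small standalone sketch. This is not LensKit code: `find_column` and the candidate tuples are hypothetical names, and plain case-insensitive matching is only an assumed stand-in for the "several case variants" the documentation mentions.

```python
from typing import Iterable, Optional

# Candidate names copied from the parameter descriptions above.
USER_CANDIDATES = ("user", "user_id", "userId")
ITEM_CANDIDATES = ("item", "item_id", "itemId")


def find_column(columns: Iterable[str], candidates: Iterable[str]) -> Optional[str]:
    """Return the first column matching a candidate name, ignoring case.

    Illustrative helper only (an assumption, not part of LensKit).
    """
    by_lower = {}
    for col in columns:
        # Keep the first column seen for each lowercased name.
        by_lower.setdefault(col.lower(), col)
    for cand in candidates:
        match = by_lower.get(cand.lower())
        if match is not None:
            return match
    return None


columns = ["UserID", "item", "rating", "timestamp"]
user_col = find_column(columns, USER_CANDIDATES)
item_col = find_column(columns, ITEM_CANDIDATES)
print(user_col, item_col)
```

Passing user_col and item_col explicitly avoids relying on any such lookup when the data frame uses other column names.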
- type lenskit.data.GenericKey = tuple[ID, ...]#
A generic collection key with no bounds or type information. Key types must also be named tuples (the Python type system does not allow us to express this).
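To make the named-tuple requirement concrete, here is a minimal sketch using only the standard library. `UserRunKey` and its fields are hypothetical; the point is that a named tuple satisfies the `tuple[ID, ...]` annotation while also carrying field names, which a bare tuple does not.

```python
from typing import NamedTuple


class UserRunKey(NamedTuple):
    """Hypothetical key type with two named fields."""

    run: str
    user_id: int


key = UserRunKey(run="als", user_id=42)

# A named tuple is still an ordinary tuple, so it fits tuple[ID, ...].
is_tuple = isinstance(key, tuple)

# Unlike a bare tuple, it also exposes its field names and a dict view.
fields = key._fields
as_dict = key._asdict()
```

A helper such as key_dict presumably depends on these field names, though that is an assumption since the source gives no description for it.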
- lenskit.data.key_dict(kt)#
- Parameters:
kt (tuple[lenskit.data.types.ID, ...])
- Return type:
- lenskit.data.unflatten_dict(data, *, sep='.')#
- type lenskit.data.QueryInput = RecQuery | ID | ItemList | None#
Types that can be converted to a query by
RecQuery.create().
- type lenskit.data.QueryItemSource = Literal['history', 'session', 'context']#
Valid sources for query items.
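A `Literal` alias like this one is enforced by static type checkers, not at runtime. The sketch below shows one way to check a value against it using the standard library; the alias is redeclared locally so the example runs without LensKit, and `check_source` is a hypothetical helper.

```python
from typing import Literal, get_args

# Local copy of the alias as given in the signature above.
QueryItemSource = Literal["history", "session", "context"]


def check_source(value: str) -> str:
    """Raise ValueError unless value is one of the literal options.

    Hypothetical helper for illustration; not part of LensKit.
    """
    allowed = get_args(QueryItemSource)
    if value not in allowed:
        raise ValueError(
            f"invalid query item source {value!r}; expected one of {allowed}"
        )
    return value


source = check_source("session")
```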
Exported Aliases#
- exception lenskit.data.FieldError#
Re-exported alias for
lenskit.diagnostics.FieldError.
- lenskit.data.load_ms_web()#
Re-exported alias for
lenskit.data.msweb.load_ms_web().
- lenskit.data.load_amazon_ratings()#
Re-exported alias for
lenskit.data.sources.amazon.load_amazon_ratings().
- lenskit.data.load_movielens()#
Re-exported alias for
lenskit.data.sources.movielens.load_movielens().
- lenskit.data.load_movielens_df()#
Re-exported alias for
lenskit.data.sources.movielens.load_movielens_df().
- lenskit.data.ID#
Re-exported alias for
lenskit.data.types.ID.
- lenskit.data.NPID#
Re-exported alias for
lenskit.data.types.NPID.
- lenskit.data.FeedbackType#
Re-exported alias for
lenskit.data.types.FeedbackType.