A framework built on top of Ploomber that allows code-first definition of pipelines. No YAML needed!
For a minimal installation with only the code needed to run pipelines, install the package from PyPI:
pip install code-first-pipelines
import pandas as pd
from sklearn import datasets
from cf_pipelines import Pipeline

iris_pipeline = Pipeline("My Cool Pipeline")

@iris_pipeline.step("Data ingestion")
def data_ingestion():
    d = datasets.load_iris()
    df = pd.DataFrame(d["data"])
    df.columns = d["feature_names"]
    df["target"] = d["target"]
    return {"raw_data.csv": df}

iris_pipeline.run()
See the tutorial notebook for a more comprehensive example.
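To give a feel for what "code-first" means here, the following is a toy sketch of decorator-based step registration in plain Python. It is not how `cf_pipelines` or Ploomber is implemented: `ToyPipeline`, its `_steps` list, and the way `run()` merges returned dictionaries are all hypothetical names and assumptions made up for illustration. The only thing taken from the example above is the visible shape of the API: a `step("...")` decorator, a function that returns a dictionary keyed by product name, and a `run()` call.

```python
from typing import Any, Callable, Dict, List, Tuple


class ToyPipeline:
    """Hypothetical stand-in; NOT the cf_pipelines implementation."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Steps are kept as (step name, function) pairs in registration order.
        self._steps: List[Tuple[str, Callable[[], Dict[str, Any]]]] = []

    def step(self, step_name: str):
        """Return a decorator that registers a function under step_name."""

        def decorator(func: Callable[[], Dict[str, Any]]):
            self._steps.append((step_name, func))
            return func  # the function stays callable on its own

        return decorator

    def run(self) -> Dict[str, Any]:
        """Call each registered step in order and collect what it returns."""
        products: Dict[str, Any] = {}
        for _step_name, func in self._steps:
            # Assumption: each step returns {product name: product}.
            products.update(func())
        return products


toy = ToyPipeline("My Cool Pipeline")


@toy.step("Data ingestion")
def data_ingestion() -> Dict[str, Any]:
    # A tiny in-memory stand-in for the iris DataFrame.
    rows = [{"sepal length (cm)": 5.1, "target": 0}]
    return {"raw_data.csv": rows}


products = toy.run()
print(sorted(products))
```

The sketch only shows the pattern: the pipeline is described by decorating ordinary Python functions instead of listing tasks in a YAML file. A real framework would also have to persist products and resolve dependencies between steps, which this toy version does not attempt.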
import pandas as pd
from sklearn import datasets
from cf_pipelines.ml import MLPipeline

iris_pipeline = MLPipeline("My Cool Pipeline")

@iris_pipeline.data_ingestion
def data_ingestion():
    d = datasets.load_iris()
    df = pd.DataFrame(d["data"])
    df.columns = d["feature_names"]
    df["target"] = d["target"]
    return {"raw_data.csv": df}

iris_pipeline.run()
See the tutorial notebook for a more comprehensive example.
Once installed, you can create a new pipeline template by running:
pipelines new [pipeline name]