Installation
The SuperAlign SDK and CLI can be installed directly using pip.
For the additional project requirements, we will need to install the following packages.
You can use the following command to install the packages.
Alternatively, put them in a requirements.txt file with the following contents:
Download and load your dataset
Download your dataset from here. Start by creating a function to load the dataset into a DataFrame. We will use the @load_data() decorator from the SuperAlign SDK.
Preprocess the data
We can add a few more functions to preprocess the data, using the @transformer() decorator from the SuperAlign SDK. Add the following additional imports to the top of your file.
A transformer can have multiple parents. In that case, the transformer is executed only after all of its parents have been executed, and the outputs of the parents are passed to it as input.
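The execution rule above can be illustrated with a minimal, self-contained sketch. It does not use the SuperAlign SDK: the step decorator, the run helper, and the step names (load_rows, clean_text, encode_labels, combine) are hypothetical stand-ins, written only to show a step with two parents running after both of them and receiving their outputs as arguments.

```python
# Hypothetical stand-in for a transformer registry; NOT the SuperAlign API.
_STEPS = {}


def step(parents=()):
    """Register the decorated function as a step that depends on `parents`."""
    def register(fn):
        _STEPS[fn.__name__] = (fn, tuple(parents))
        return fn
    return register


def run(name, outputs, order):
    """Run `name` after all of its parents, passing their outputs as arguments."""
    if name not in outputs:
        fn, parents = _STEPS[name]
        # Parents are resolved first, so each one runs (once) before the child.
        args = [run(parent, outputs, order) for parent in parents]
        outputs[name] = fn(*args)
        order.append(name)
    return outputs[name]


@step()
def load_rows():
    return [{"text": " Hello ", "label": "pos"}, {"text": "BAD ", "label": "neg"}]


@step(parents=["load_rows"])
def clean_text(rows):
    return [row["text"].strip().lower() for row in rows]


@step(parents=["load_rows"])
def encode_labels(rows):
    return [1 if row["label"] == "pos" else 0 for row in rows]


@step(parents=["clean_text", "encode_labels"])
def combine(texts, labels):
    # Receives one argument per parent, in the order the parents were listed.
    return list(zip(texts, labels))


outputs, order = {}, []
result = run("combine", outputs, order)
print(order)
print(result)
```

In this sketch, load_rows runs only once even though two steps depend on it, because its output is cached and reused. Whether the SDK caches parent outputs in the same way is not stated in this guide.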
Creating a dataset
We can now create a dataset from the pipeline. The dataset is created by executing the pipeline and saving the output of its last transformer. To do this, use the @dataset decorator, which takes the following arguments:
label: The name of the dataset
parent: The name of the transformer whose output will be saved as the dataset
upload: If True, the dataset will be uploaded to the cloud. If False, the dataset will be saved locally.
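To make the three arguments concrete, here is a small sketch of a decorator with the same argument names. It is a hypothetical stand-in, not the SuperAlign implementation: the local_dir parameter, the JSON file layout, and the names reviews_dataset, combine, and create_dataset are assumptions made for illustration, and the upload branch is deliberately left unimplemented.

```python
import json
import tempfile
from pathlib import Path


def dataset(label, parent, upload=False, local_dir=None):
    """Hypothetical stand-in: save the decorated function's output under `label`."""
    def wrap(fn):
        def run(parent_output):
            data = fn(parent_output)
            if upload:
                # A real SDK would send the data to remote storage here.
                raise NotImplementedError("upload is outside the scope of this sketch")
            # upload=False: keep the dataset on the local disk.
            target = Path(local_dir or tempfile.gettempdir()) / f"{label}.json"
            target.write_text(json.dumps({"label": label, "parent": parent, "data": data}))
            return target
        return run
    return wrap


with tempfile.TemporaryDirectory() as tmp:

    @dataset(label="reviews_dataset", parent="combine", upload=False, local_dir=tmp)
    def create_dataset(pairs):
        # Turn (text, label) pairs into the structure that gets saved.
        return {"texts": [t for t, _ in pairs], "labels": [y for _, y in pairs]}

    path = create_dataset([("hello", 1), ("bad", 0)])
    saved = json.loads(path.read_text())

print(saved["label"], saved["parent"], saved["data"]["labels"])
```

In this sketch the parent's output is passed in by hand; in the pipeline described above, the SDK is the one that resolves parent and supplies its output.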
Creating a model to classify the dataset
With the SuperAlign model module, you can create and manage models. SuperAlign helps you train models and track all of your machine learning project information, including ML models and datasets, using semantic versioning and full artifact logging. We can make a separate Python file for the model, containing the model definition and the training code. Let’s start by adding the required imports.
The model is defined using the @model decorator. The decorator takes the model name as its argument, in the format model_name.
The pureml.log function is used here to log the metrics and parameters of the model.
Add prediction to your model
For registered models, the prediction function, along with its requirements and resources, can be logged for use in later steps such as evaluating and packaging. The SuperAlign predict module has an add method. Here we are using the following arguments:
label: The name of the model (model_name:version)
paths: The paths to the predict.py file and the requirements.txt file.
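Before registering a prediction function, it can help to check both arguments locally. The sketch below is not part of the SuperAlign SDK: split_label and missing_paths are hypothetical helpers, pet_classifier:v1 is a made-up label, and passing the paths as a plain list is an assumption, since the exact shape the add method expects for paths is not shown here.

```python
import tempfile
from pathlib import Path


def split_label(label):
    """Split 'model_name:version' into its two parts, rejecting malformed labels."""
    name, sep, version = label.partition(":")
    if not sep or not name or not version:
        raise ValueError(f"expected 'model_name:version', got {label!r}")
    return name, version


def missing_paths(paths):
    """Return the entries of `paths` that are not existing files."""
    return [str(p) for p in paths if not Path(p).is_file()]


name, version = split_label("pet_classifier:v1")

with tempfile.TemporaryDirectory() as tmp:
    predict_file = Path(tmp) / "predict.py"
    predict_file.write_text("# prediction code goes here\n")
    # requirements.txt is intentionally not created, to show the check failing.
    requirements_file = Path(tmp) / "requirements.txt"
    missing = missing_paths([predict_file, requirements_file])

print(name, version)
print(missing)
```

A check like this fails early on a label without a version or on a missing file, before anything is sent to the registry.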
You can learn more about the prediction process here.
Create your first Evaluation
SuperAlign has an eval method that runs a task_type on a label_model using a label_dataset.
Deploy with FastAPI
Your evaluated model can now be deployed using FastAPI, simply by using the pureml.fastapi.run method.
Deploy using Docker
Alternatively, you can use Docker for deployment. To deploy on Docker, you need to create a .env file with the following variables:
You can get your API token and org id from the settings page. If you do not have an API token, or do not remember it, you can generate one.
The deployment is then created using the pureml.docker.create method.