What is ProjectVM

Where you code, develop, build, test and stage models

Project VM is a flexible package of the hardware and software resources needed to perform different data science and machine learning tasks. A single user can create and manage multiple projects, each with its own set of compute resources. The user is responsible for all compute resources created within these projects.

TL;DR

Users can perform different tasks on the project compute using the JupyterLab, RStudio, or VSCode Integrated Development Environments (IDEs). The project compute can be scaled vertically or horizontally to suit the current stage of the data pipeline. Users can securely access data sources such as S3 buckets and databases to build their data pipelines and machine learning models. Each project comes with a managed MLflow server to keep track of experiments, code, and models. Users can deploy models as containers along with a REST API endpoint for integrating machine learning models into business applications.

Data Persistence

An Elastic File System (EFS) file share is created for every user when their first project is created. All of the user's projects have access to this EFS share, so code and data can be shared across projects.

In the following sections, we will discuss different project states:

Create State

During project creation, the following resources are provisioned: an IAM role with access to different data sources; an EC2 instance (hereafter referred to as the Master Node) created from a pre-configured Amazon Machine Image (AMI), with a default size determined by your administrator; an RDS instance to track experiments, runs, and models; and a folder in an S3 bucket to save artifacts.

ON State

In ON state, users can access project compute resources via IDEs like JupyterLab, RStudio, and VSCode. Users can also access all the resources permitted through the IAM role using these IDEs. An experiment tracking server based on MLflow starts automatically on the EC2 instance. A tracking URL is shown on the project details page and is also available as the environment variable TRACKING_URL in JupyterLab.
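As a minimal sketch of how code running in JupyterLab might pick up that variable: the example below assumes the standard `mlflow` Python package is installed on the instance. The helper names `tracking_url_from_env` and `configure_mlflow` are illustrative and not part of the product.

```python
import os


def tracking_url_from_env(env=os.environ):
    """Return the tracking URL from TRACKING_URL, or None if it is unset or blank."""
    url = env.get("TRACKING_URL", "").strip()
    return url or None


def configure_mlflow(env=os.environ):
    """Point the MLflow client at the project's tracking server, if possible.

    Returns the URL that was applied, or None when TRACKING_URL is not set
    or the mlflow package is not importable.
    """
    url = tracking_url_from_env(env)
    if url is None:
        return None
    try:
        import mlflow  # assumption: installed in the project environment
    except ImportError:
        return None
    mlflow.set_tracking_uri(url)
    return url


if __name__ == "__main__":
    applied = configure_mlflow()
    print("Tracking URL:", applied or "not configured")
```

Reading the variable instead of hard-coding the URL keeps a notebook portable across projects, since each project runs its own tracking server.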

Updating Master State

In ON state, the master node can be scaled up or down based on the available compute instances. Open the side panel by clicking on the floating power button on the right and then click on the Update Master Node button. Select one of the available nodes and click on Update. The master node will transition to “Updating Master” state. Once the update is complete, the master node will go back to ON state. In the example below, we have updated the master node from a 2 core, 4 GB instance to a 4 core, 16 GB instance.

Creating Cluster State

When in ON state, users can horizontally scale their project compute by adding child nodes. Open the right-hand side panel, click on Add Child Node, select one of the instance types with the required CPUs and memory, select the number of instances, and click on Update. The project will switch to Creating Cluster state. Cluster creation takes 10-15 minutes, depending on the instance type and the number of nodes in the cluster. The project returns to ON state once cluster creation is complete.

The figure below shows an architecture diagram of the resources that are orchestrated during cluster creation.

Scaling Cluster State - Remove child nodes

Clusters can be scaled up or down, or reduced back to a single node.

To remove all the child nodes, click on Remove Child Node. The project will switch to Scaling Cluster state. Once the child nodes are removed, the project will come back to ON state with only the master node.

Scaling Cluster State - Update child nodes

To add more child nodes or change their instance type, click on Update Child Node. Pick the instance type and the number of nodes, then click on Update. The project will switch to the “Scaling Cluster” state. Once scaling is complete, the project will go back to ON state.

Starting and Stopping States

Transitioning from ON → OFF will be shown as Stopping and OFF → ON as Starting.

OFF State

Users can turn off the entire project using the toggle button on the right side panel. The project will move to Stopping state and then to OFF state.

Deleting State

Users can delete a project that is in OFF state. The “Delete project” option appears when hovering the cursor over the project name. This option is grayed out while the project is turned on and is only active when the project is in OFF state. This action deletes all of the project’s resources: all virtual machines, databases, and project folders. For that reason, the application requires the user to type the project name to confirm the delete action. The project will switch to “Deleting” state and then be removed from the list of projects.
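The states described in the sections above can be summarized as a small transition table. The sketch below is only an illustrative model of the documented behavior, not the application's implementation. The names `State`, `TRANSITIONS`, `can_transition`, and `can_delete` are hypothetical, and the transition from Create to ON is an assumption, since the text does not say which state follows creation.

```python
from enum import Enum


class State(Enum):
    CREATE = "Create"
    ON = "ON"
    UPDATING_MASTER = "Updating Master"
    CREATING_CLUSTER = "Creating Cluster"
    SCALING_CLUSTER = "Scaling Cluster"
    STOPPING = "Stopping"
    OFF = "OFF"
    STARTING = "Starting"
    DELETING = "Deleting"


# Allowed next states for each state, as described in the text.
TRANSITIONS = {
    State.CREATE: {State.ON},  # assumption: not stated explicitly in the text
    State.ON: {
        State.UPDATING_MASTER,   # Update Master Node
        State.CREATING_CLUSTER,  # Add Child Node
        State.SCALING_CLUSTER,   # Update or Remove Child Node
        State.STOPPING,          # toggle the project off
    },
    State.UPDATING_MASTER: {State.ON},
    State.CREATING_CLUSTER: {State.ON},
    State.SCALING_CLUSTER: {State.ON},
    State.STOPPING: {State.OFF},
    State.OFF: {State.STARTING, State.DELETING},
    State.STARTING: {State.ON},
    State.DELETING: set(),  # the project is removed from the list afterwards
}


def can_transition(current, target):
    """Return True if the documented lifecycle allows current -> target."""
    return target in TRANSITIONS[current]


def can_delete(state, project_name, typed_name):
    """Deletion needs the OFF state and the project name typed exactly."""
    return state is State.OFF and typed_name == project_name


if __name__ == "__main__":
    print(can_transition(State.ON, State.DELETING))
    print(can_delete(State.OFF, "demo", "demo"))
```

In this model every scaling or update action starts from ON and returns to ON, and a project can be deleted only from OFF.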


Last updated 4 years ago
