Python Example
This tutorial shows you how to create a simple Python project with Pixi. Pixi supports two manifest formats: pixi.toml and pyproject.toml. This tutorial uses pixi.toml, the Pixi project configuration file, also known as the project manifest. The project is a deep learning example: a Convolutional Neural Network (CNN) built with PyTorch that classifies images from the CIFAR-10 dataset, which consists of 60,000 32x32 color images across 10 classes (planes, cars, birds, and so on).
Create pixi.toml File
To use Pixi on Discovery, you need to load its module.
module load pixi
Then, create a pixi.toml file by running the following command:
$ pixi init pixi_pytorch
✔ Created /fs1/project/hpcteam/tahat/pixi/pixi_pytorch/pixi.toml
This creates a folder named pixi_pytorch and a pixi.toml file that contains the following:
[project]
authors = ["Mohammad <tahat@nmsu.edu>"]
channels = ["conda-forge"]
description = "Add a short description here"
name = "pixi_pytorch"
platforms = ["linux-64"]
version = "0.1.0"
[tasks]
[dependencies]
You can update the options in the [project] table. For example, replace the description with one that fits your project.
The channels and platforms keys are added to the [project] section when the project is created. The conda-forge channel hosts packages much like PyPI does, but it covers many programming languages rather than only Python. The platforms key lists the platforms that the project supports.
Then, change to the project directory pixi_pytorch.
cd pixi_pytorch
Add Project Dependency
This project depends on three software packages: cuda, pytorch, and torchvision. To add these dependencies, run the following command:
$ pixi add cuda pytorch torchvision
✔ Added cuda >=12.6.1,<13
✔ Added pytorch >=2.4.0,<3
✔ Added torchvision >=0.19.0,<0.20
The pixi add command adds dependencies to pixi.toml. It adds a dependency only if a version of the package that satisfies its constraint is compatible with the rest of the project's dependencies.
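The constraints printed by pixi add, such as >=2.4.0,<3, describe a range of acceptable versions. The sketch below illustrates how such a range can be read; it is not Pixi's resolver, it handles only the >= and < operators that appear in the output above, and it assumes purely numeric dotted versions. The function names are hypothetical.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '2.4.0' into (2, 4, 0).

    Short versions are padded with zeros so that '3' compares as (3, 0, 0).
    """
    parts = tuple(int(part) for part in version.split("."))
    return parts + (0,) * max(0, 3 - len(parts))


def satisfies(version: str, constraint: str) -> bool:
    """Check a version against a comma-separated constraint like '>=2.4.0,<3'.

    Illustrative only: supports just the '>=' and '<' operators.
    """
    candidate = parse_version(version)
    for clause in constraint.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            if candidate < parse_version(clause[2:]):
                return False
        elif clause.startswith("<"):
            if candidate >= parse_version(clause[1:]):
                return False
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(satisfies("2.4.0", ">=2.4.0,<3"))      # True: inside the range
print(satisfies("3.0.0", ">=2.4.0,<3"))      # False: upper bound is exclusive
print(satisfies("0.19.0", ">=0.19.0,<0.20"))  # True
```

A real solver also has to reconcile the constraints of every transitive dependency, which this sketch does not attempt.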
Install Dependencies
Install the dependencies you added in the previous section by running the following command:
pixi install
✔ The default environment has been installed.
Using pixi list, you can see what's in the environment.
$ pixi list
...
c-compiler 1.7.0 hd590300_1 6.2 KiB conda c-compiler-1.7.0-hd590300_1.conda
ca-certificates 2024.8.30 hbcca054_0 155.3 KiB conda ca-certificates-2024.8.30-hbcca054_0.conda
cuda 12.6.1 ha804496_0 26.1 KiB conda cuda-12.6.1-ha804496_0.conda
cuda-cccl_linux-64 12.6.37 ha770c72_0 1 MiB conda cuda-cccl_linux-64-12.6.37-ha770c72_0.conda
...
python_abi 3.12 5_cp312 6.1 KiB conda python_abi-3.12-5_cp312.conda
pytorch 2.4.0 cpu_generic_py312h1576ffb_1 24.6 MiB conda pytorch-2.4.0-cpu_generic_py312h1576ffb_1.conda
readline 8.2 h8228510_1 274.9 KiB conda readline-8.2-h8228510_1.conda
sleef 3.6.1 h1b44611_3 1.8 MiB conda sleef-3.6.1-h1b44611_3.conda
sympy 1.13.2 pypyh2585a3b_103 4.4 MiB conda sympy-1.13.2-pypyh2585a3b_103.conda
sysroot_linux-64 2.17 h4a8ded7_16 14.8 MiB conda sysroot_linux-64-2.17-h4a8ded7_16.conda
tk 8.6.13 noxft_h4845f30_101 3.2 MiB conda tk-8.6.13-noxft_h4845f30_101.conda
torchvision 0.19.0 cpu_py312hdb59fe3_0 10 MiB conda torchvision-0.19.0-cpu_py312hdb59fe3_0.conda
typing_extensions 4.12.2 pyha770c72_0 39 KiB conda typing_extensions-4.12.2-pyha770c72_0.conda
...
When installing environments on old versions of Linux, you may encounter the following error:
× The current system has a mismatching virtual package. The project requires '__linux' to be at least version '5.10' but the system has version '4.18.0'
To fix this, edit the pixi.toml file and add the following to lower the project's system requirements:
[system-requirements]
linux = "4.12.14"
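Before lowering the requirement, it can help to confirm which kernel version the system actually reports. The sketch below compares a kernel release string against a required version using only the standard library. The helper names are hypothetical, and the comparison is a simplification that looks only at the leading numeric part of the release string.

```python
import platform
import re


def leading_version(text: str) -> tuple[int, ...]:
    """Extract the leading dotted numbers, e.g. '4.18.0-553.el8' -> (4, 18, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad short versions such as '5.10' to three components.
    return parts + (0,) * max(0, 3 - len(parts))


def kernel_meets(required: str, actual: str) -> bool:
    """Return True when the actual kernel version is at least the required one."""
    return leading_version(actual) >= leading_version(required)


# Values taken from the error message and the fix above.
print(kernel_meets("5.10", "4.18.0"))     # False: this is the mismatch Pixi reports
print(kernel_meets("4.12.14", "4.18.0"))  # True: the lowered requirement is met

# On a Linux machine, check the running kernel as well.
if platform.system() == "Linux":
    release = platform.release()
    print(f"Running kernel {release} meets 4.12.14: {kernel_meets('4.12.14', release)}")
```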
Create Python Script
Create a Python script named image_classifier.py in your desired folder, and add the following code to it:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
# Set the device to GPU if available, otherwise CPU
print(torch.version.cuda)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
print(torch.cuda.is_available())
# Define the transformation to normalize the data
transform = transforms.Compose([
    transforms.ToTensor(),  # Convert the image to a PyTorch tensor
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # Normalize the images
])

# Load the CIFAR-10 training dataset
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=100,
                                          shuffle=True, num_workers=2)

# Load the CIFAR-10 test dataset
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=100,
                                         shuffle=False, num_workers=2)

# Define the classes in CIFAR-10
classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # First convolutional layer: input channels = 3, output channels = 32, kernel size = 3
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        # Second convolutional layer: input channels = 32, output channels = 64, kernel size = 3
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        # Max pooling layer with a 2x2 window
        self.pool = nn.MaxPool2d(2, 2)
        # Fully connected layer: input features = 64*8*8, output features = 512
        self.fc1 = nn.Linear(64 * 8 * 8, 512)
        # Fully connected layer: input features = 512, output features = 10 (for 10 classes)
        self.fc2 = nn.Linear(512, 10)

    def forward(self, x):
        # Apply conv1, followed by ReLU activation, then max pooling
        x = self.pool(torch.relu(self.conv1(x)))
        # Apply conv2, followed by ReLU activation, then max pooling
        x = self.pool(torch.relu(self.conv2(x)))
        # Flatten the feature map for the fully connected layer
        x = x.view(-1, 64 * 8 * 8)
        # Apply fc1 followed by ReLU activation
        x = torch.relu(self.fc1(x))
        # Apply the output layer (fc2)
        x = self.fc2(x)
        return x

# Instantiate the network and move it to the GPU
net = Net().to(device)
# Use CrossEntropyLoss which combines Softmax and Negative Log-Likelihood Loss
criterion = nn.CrossEntropyLoss()
# Use Stochastic Gradient Descent (SGD) with a learning rate of 0.001 and momentum of 0.9
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
# Loop over the dataset multiple times (epochs)
for epoch in range(5):  # Train for 5 epochs
    running_loss = 0.0
    # Iterate over data in batches
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data  # Get the inputs and labels
        inputs, labels = inputs.to(device), labels.to(device)  # Move to the selected device
        optimizer.zero_grad()  # Zero the parameter gradients
        outputs = net(inputs)  # Forward pass
        loss = criterion(outputs, labels)  # Compute the loss
        loss.backward()  # Backward pass
        optimizer.step()  # Optimization step
        running_loss += loss.item()
        if i % 100 == 99:  # Print every 100 mini-batches
            print(f'Epoch {epoch + 1}, Batch {i + 1}: loss {running_loss / 100:.3f}')
            running_loss = 0.0

print('Finished Training')

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        images, labels = images.to(device), labels.to(device)  # Move to the selected device
        outputs = net(images)  # Forward pass
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy of the network on the 10000 test images: {100 * correct / total} %')
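The 64 * 8 * 8 input size of the first fully connected layer follows from the layer settings in the script. The sketch below reproduces that arithmetic in plain Python using the standard output-size formula for convolution and pooling windows; the helper name window_out is hypothetical and PyTorch is not needed to run it.

```python
def window_out(size: int, kernel: int, padding: int = 0, stride: int = 1) -> int:
    """Output length along one axis for a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32                                     # CIFAR-10 images are 32x32
size = window_out(size, kernel=3, padding=1)  # conv1: 3x3 kernel, padding 1 keeps 32
size = window_out(size, kernel=2, stride=2)   # first 2x2 max pool halves it to 16
size = window_out(size, kernel=3, padding=1)  # conv2: 3x3 kernel, padding 1 keeps 16
size = window_out(size, kernel=2, stride=2)   # second 2x2 max pool halves it to 8

flattened = 64 * size * size                  # conv2 produces 64 channels
print(size, flattened)                        # prints: 8 4096
```

If you change the kernel size, padding, or number of pooling steps in the network, rerunning this arithmetic gives the value that fc1 and the x.view call need.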
Create and Submit a Submission Script
To run the Python script on Discovery, create a Slurm submission script (sub_pixi.py) in the same directory as the Python script, and add the following lines:
#!/bin/bash
#SBATCH --job-name=pixi_pytorch
#SBATCH --output=model-%j.out
#SBATCH --ntasks=1
#SBATCH --gpus-per-task=1
##SBATCH --ntasks-per-node=1
#SBATCH --mem-per-gpu=5G
#SBATCH -p normal
#SBATCH --time 00:30:00
#SBATCH --constraint=v100-32g
module purge
module load pixi/2024a
cd /fs1/project/hpcteam/tahat/pixi/pixi_pytorch
pixi run python image_classifier.py
You can also add a task to the [tasks] table of pixi.toml and run that task in the submission script.
Then, submit the script using the following command:
sbatch sub_pixi.py
To run your GPU experiments on discovery-g1, make sure that the CUDA version is 11.
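When the job finishes, check the job output file (the file named by the --output option in the submission script) for the training loss and test accuracy.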
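If you want to review the loss values without scrolling through the whole file, a small parser can pull them out. The sketch below assumes the output file contains the lines printed by the training loop in image_classifier.py ("Epoch N, Batch M: loss X"); the function names parse_losses and print_losses are hypothetical.

```python
import re

# Matches lines printed by the training loop: "Epoch <n>, Batch <m>: loss <value>"
LOSS_LINE = re.compile(r"Epoch (\d+), Batch (\d+): loss (\d+\.\d+)")


def parse_losses(lines):
    """Return (epoch, batch, loss) tuples for every matching line."""
    records = []
    for line in lines:
        match = LOSS_LINE.search(line)
        if match:
            epoch, batch, loss = match.groups()
            records.append((int(epoch), int(batch), float(loss)))
    return records


def print_losses(path):
    """Print one 'epoch batch loss' row per reported loss in the given file."""
    with open(path) as handle:
        for epoch, batch, loss in parse_losses(handle):
            print(epoch, batch, loss)
```

Call print_losses with the path of your job output file. Lines that do not match the pattern, such as the device information and the final accuracy, are skipped.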