---
title: "GPU Utilization in Containerized Environments: A Comprehensive Guide"
description: "A step-by-step guide on setting up and configuring containers for GPU-intensive tasks such as LLM inference or fine-tuning, using nanoGPT as a demo project."
author: "Johnnie Oduro Jnr."
date: "2024-08-20"
tags: ["GPU", "Containerization", "LLM Inference", "nanoGPT", "Docker", "Daytona"]
---

## TL;DR

Setting up and configuring containers for GPU-intensive tasks, such as LLM inference, involves several key steps:

1. **Prerequisites**: Ensure you have an NVIDIA GPU, CUDA drivers, Docker, and NVIDIA Container Toolkit installed.
2. **Docker Setup**: Install Docker and the NVIDIA Container Toolkit to enable GPU support in containers.
3. **Dockerfile Configuration**: Create a Dockerfile with a suitable base image, install necessary libraries, and set up your application environment.
4. **Demo Project**: Use the [nanoGPT](https://github.com/karpathy/nanoGPT) project as an example. Create a Dockerfile for nanoGPT, build the Docker image, and run it in Daytona.
5. **Challenges**: Address common issues such as driver incompatibilities, container crashes, and performance degradation. Utilize tools like `nvidia-smi` for monitoring.
6. **Optional**: Benchmark GPU performance and apply optimization tips for improved efficiency.

---

## GPU Utilization in Containerized Environments: A Comprehensive Guide

As the demand for machine learning, particularly large language models (LLMs), continues to grow, the need for powerful GPU resources has become increasingly critical. GPUs enable faster inference and fine-tuning, making them essential for developers working with LLMs. However, managing GPU resources efficiently, especially in containerized environments, can be challenging. This article provides an in-depth guide on setting up and configuring containers for GPU-intensive tasks, with a detailed example using [nanoGPT](https://github.com/karpathy/nanoGPT) running in Daytona.

### Prerequisites and Setup

#### Hardware and Software Requirements

Before starting, ensure that your setup meets the following requirements:

- **Hardware**: A system with an NVIDIA GPU that supports CUDA (e.g., Tesla, Quadro, GeForce series).
- **Software**:
  - **CUDA Drivers**: Installed on the host machine. Check compatibility with your GPU and CUDA version.
  - **Docker**: Installed and running on your system.
  - **NVIDIA Container Toolkit**: Necessary for enabling GPU support in Docker containers.

##### Checking GPU Compatibility

To confirm your GPU’s compatibility with CUDA, use the following command:

```bash
lspci | grep -i nvidia
```

If you see an output listing your NVIDIA GPU, you’re good to go. Otherwise, ensure that the GPU is correctly installed and recognized by your system.
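
If the NVIDIA driver is already installed, `nvidia-smi` is a quicker sanity check; it reports the driver version and the highest CUDA version that driver supports:

```bash
# Confirm the driver is loaded and check the supported CUDA version
nvidia-smi
```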

#### Installing Docker

Docker is essential for creating and managing containers. Install Docker by following these steps:

1. **Update Your Package List**:

```bash
sudo apt-get update
```

2. **Install Docker**:

On Ubuntu, install Docker with the following command:

```bash
sudo apt-get install docker-ce docker-ce-cli containerd.io
```

3. **Start and Enable Docker**:

```bash
sudo systemctl start docker
sudo systemctl enable docker
```

To verify the installation, run:

```bash
docker --version
```
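
Note that on a fresh Ubuntu system, `apt-get` only finds the `docker-ce` package once Docker's official repository has been added. A minimal sketch of that setup, following the steps in Docker's documentation (verify the URLs against the current docs):

```bash
# Add Docker's official GPG key and apt repository (Ubuntu)
sudo apt-get install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
  https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
```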

#### Installing NVIDIA Container Toolkit

The NVIDIA Container Toolkit allows Docker to access your GPU. To install it:

1. **Set Up the Package Repository**:

```bash
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
```

2. **Install the NVIDIA Container Toolkit**:

```bash
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```

3. **Verify GPU Access in Docker**:

Run the following command to ensure Docker can access your GPU:

```bash
docker run --gpus all nvidia/cuda:11.0-base nvidia-smi
```

This should display the details of your GPU, confirming that Docker is configured correctly.
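
On multi-GPU hosts, the `--gpus` flag also accepts a device list, so you can expose a single GPU to a container rather than all of them:

```bash
# Expose only GPU 0 to the container
docker run --rm --gpus '"device=0"' nvidia/cuda:11.0-base nvidia-smi
```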

### Configuring Docker for GPU Access

Once Docker and the NVIDIA toolkit are set up, the next step is to configure a Dockerfile that leverages GPU resources. The Dockerfile is essentially a blueprint for creating Docker images, defining the environment in which your application will run.

#### Example Dockerfile

Here’s an example Dockerfile designed to set up a Python environment with GPU support. This configuration is suitable for running machine learning models that require heavy computational resources:

```Dockerfile
# Use an official NVIDIA CUDA image as a base
FROM nvidia/cuda:11.3.1-cudnn8-devel-ubuntu20.04

# Install Python and pip
RUN apt-get update && apt-get install -y python3 python3-pip

# Install necessary Python packages
RUN pip3 install torch torchvision torchaudio

# Set the working directory
WORKDIR /app

# Copy your application code into the container
COPY . /app

# Command to run your application
CMD ["python3", "your_script.py"]
```

##### Breakdown of the Dockerfile

- **Base Image**: We start with an NVIDIA CUDA base image that includes development libraries and tools (`nvidia/cuda:11.3.1-cudnn8-devel-ubuntu20.04`).
- **Python Installation**: The `apt-get` commands install Python and pip.
- **Python Packages**: The `pip3` commands install the necessary Python libraries, such as PyTorch, which supports GPU acceleration.
- **Working Directory**: The `WORKDIR` command sets the working directory within the container.
- **Application Code**: The `COPY` command transfers your application code into the container.
- **Execution Command**: The `CMD` command specifies the script to run when the container starts.
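
To sanity-check an image built from this Dockerfile, you can override its default command and ask PyTorch whether it detects the GPU (the `gpu-app` tag below is just an illustrative name):

```bash
# Build the image, then confirm PyTorch can see the GPU inside the container
docker build -t gpu-app .
docker run --rm --gpus all gpu-app python3 -c "import torch; print(torch.cuda.is_available())"
```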

### Demo Project: Running nanoGPT in Daytona

To demonstrate how to utilize GPUs in containerized environments, we'll use [nanoGPT](https://github.com/karpathy/nanoGPT), a lightweight implementation of the GPT model by Andrej Karpathy. This project is ideal for developers experimenting with LLM inference in containers.

#### Cloning the nanoGPT Repository

Start by cloning the nanoGPT repository from GitHub:

```bash
git clone https://github.com/karpathy/nanoGPT.git
cd nanoGPT
```

#### Creating a Dockerfile for nanoGPT

To run nanoGPT in a containerized environment, create a Dockerfile with the following content:

```Dockerfile
# Use an official NVIDIA CUDA image as a base
FROM nvidia/cuda:11.3.1-cudnn8-devel-ubuntu20.04

# Install Python, pip, and git
RUN apt-get update && apt-get install -y python3 python3-pip git

# Install PyTorch plus the other Python packages nanoGPT's README lists as dependencies
RUN pip3 install torch torchvision torchaudio numpy transformers datasets tiktoken wandb tqdm

# Clone the nanoGPT repository into the container
RUN git clone https://github.com/karpathy/nanoGPT.git /nanoGPT

# Set the working directory to nanoGPT
WORKDIR /nanoGPT

# Prepare the sample Shakespeare dataset, then run training
# (nanoGPT takes a config file as its argument rather than a --dataset flag)
CMD ["sh", "-c", "python3 data/shakespeare_char/prepare.py && python3 train.py config/train_shakespeare_char.py"]
```

This Dockerfile sets up the environment required to run nanoGPT, including CUDA and PyTorch, which are essential for leveraging GPU resources.

#### Building the Docker Image

To build the Docker image for nanoGPT, execute the following command within the directory containing your Dockerfile:

```bash
docker build -t nanogpt-gpu .
```

This command builds the Docker image with the tag `nanogpt-gpu`. The build process may take a few minutes, depending on your system’s resources.

#### Running nanoGPT in Daytona

Daytona is a container orchestration platform that simplifies running containerized applications, especially those requiring GPU support. To run nanoGPT in Daytona, follow these steps:

1. **Deploy the Container in Daytona**:
Use Daytona's interface or CLI to deploy the container. Ensure that the GPU resources are correctly allocated by setting the appropriate environment variables.

Example command:

```bash
daytona run --gpus all nanogpt-gpu
```

This command deploys the nanoGPT container in Daytona with full GPU access.

2. **Verify GPU Utilization**:
After deploying the container, you can verify that it is utilizing the GPU by accessing the container and running `nvidia-smi`:

```bash
docker exec -it <container_id> nvidia-smi
```

This command will show the GPU's current usage, confirming that your container is making use of GPU resources.

### Troubleshooting Common Challenges

Running GPU-intensive tasks in containers can present several challenges. Below are some common issues and their solutions:

#### Driver Incompatibilities

One of the most common issues is a mismatch between the CUDA version inside the container and the NVIDIA driver version on the host machine. If you encounter driver-related errors, ensure that the host driver supports the CUDA version used in your Docker image (drivers are backward compatible with older CUDA toolkits, but not forward compatible).

You can check your CUDA version with:

```bash
nvcc --version
```

Ensure that the version displayed matches the one specified in your Dockerfile.
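
Note that `nvcc` reports the CUDA toolkit version, which usually lives inside the container image, while the host driver's maximum supported CUDA version appears in the banner of `nvidia-smi`. A quick way to compare both sides, assuming the `nanogpt-gpu` image built earlier:

```bash
# Host side: driver version installed on the machine
nvidia-smi --query-gpu=driver_version --format=csv

# Container side: CUDA toolkit version baked into the image
docker run --rm --gpus all nanogpt-gpu nvcc --version
```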

#### Container Crashes

If your container crashes at runtime, it might be due to insufficient memory or misconfigured environment variables. You can inspect the container logs for any errors:

```bash
docker logs <container_id>
```

Make sure that your system has enough memory to handle the workload, and adjust your environment variables as necessary.
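
To watch a container's memory footprint while it runs, and to confirm afterwards whether the kernel's out-of-memory killer terminated it, the following commands can help (replace `<container_id>` as before):

```bash
# Live CPU and memory usage for the running container
docker stats <container_id>

# Check whether the container was killed for exceeding available memory
docker inspect <container_id> --format '{{.State.OOMKilled}}'
```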

#### Performance Degradation

If your application is not performing as expected, monitor the GPU utilization using tools like `nvidia-smi`. If the GPU utilization is low, consider optimizing your code or adjusting CUDA settings.
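
For continuous monitoring rather than a one-off snapshot, you can poll `nvidia-smi` from the host:

```bash
# Refresh GPU utilization every second
watch -n 1 nvidia-smi

# Alternatively, stream per-device utilization and memory statistics
nvidia-smi dmon
```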

### Security Considerations

When running GPU-intensive tasks in multi-tenant environments, security is paramount. Ensure that containers are isolated, and only authorized containers have access to GPU resources. Use Docker’s built-in security features, such as user namespaces and capabilities, to limit the privileges of your containers.
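
As a sketch of what this looks like in practice (standard Docker flags; adjust the UID and capability set to your workload's needs, since some applications expect write access to their working directory):

```bash
# Run the nanoGPT container as a non-root user with reduced privileges
docker run --rm --gpus all \
  --user 1000:1000 \
  --cap-drop=ALL \
  --security-opt no-new-privileges \
  nanogpt-gpu
```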

### (Optional) Performance Benchmarks and Optimization Tips

To get the most out of your GPU in a containerized environment, consider running performance benchmarks and applying optimization techniques.

#### Benchmarking GPU Performance

Use tools like `nvprof` or NVIDIA Nsight Systems (`nsys`) to profile your application’s GPU usage. These tools can help you identify bottlenecks and optimize your code.

Example command for profiling:

```bash
nvprof python3 train.py config/train_shakespeare_char.py
```

This command will generate a detailed report on how your application uses GPU resources, helping you to pinpoint areas for improvement.
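
`nvprof` is deprecated on recent GPU architectures, where NVIDIA recommends Nsight Systems instead. A comparable invocation, assuming `nsys` is available on the system:

```bash
# Profile the same training run with Nsight Systems
nsys profile --stats=true -o nanogpt_profile python3 train.py config/train_shakespeare_char.py
```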

#### Optimization Tips

- **Reduce Docker Image Size**: Use smaller base images and multi-stage builds to minimize the size of your Docker images, reducing the overhead when deploying containers.
- **Adjust CUDA Settings**: Fine-tune CUDA environment variables to optimize GPU utilization. For example, you can adjust the `CUDA_VISIBLE_DEVICES` variable to control which GPUs your container can access.
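
For example, to make only the first GPU visible to processes inside the nanoGPT container:

```bash
# Only GPU 0 is visible to CUDA code inside the container
docker run --rm --gpus all -e CUDA_VISIBLE_DEVICES=0 nanogpt-gpu
```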

### Conclusion

In this comprehensive guide, we’ve walked through the process of setting up and configuring containers for GPU-intensive tasks, using nanoGPT as a practical example. By following these detailed steps, developers can efficiently run LLM inference in a containerized environment, particularly within Daytona.

We encourage you to experiment with nanoGPT in Daytona, explore advanced configurations, and share your findings with the community. By optimizing your setup and addressing common challenges, you can fully leverage the power of GPUs in containerized environments.

---

### References
- **[NVIDIA Documentation](https://docs.nvidia.com/)**: Official guides on setting up NVIDIA GPUs with Docker.
- **[CUDA Toolkit Documentation](https://developer.nvidia.com/cuda-toolkit)**: For understanding CUDA setup and GPU programming.
- **[Docker Documentation](https://docs.docker.com/)**: General containerization practices and GPU support.
- **[Docker High Memory Usage Debug](https://www.benjaminrancourt.ca/how-to-debug-high-memory-usage-in-docker/)**: Troubleshooting High Memory Usage in Docker.
- **[nanoGPT GitHub Repository](https://github.com/karpathy/nanoGPT)**: The nanoGPT project by Andrej Karpathy.
- **[Daytona Documentation](https://daytona.readthedocs.io/)**: For running and optimizing containerized applications in Daytona.

---

This article provides a comprehensive and detailed guide for developers looking to harness GPU resources in containerized environments, specifically for LLM inference using the nanoGPT project.
---
Author: Johnnie Oduro Jnr.
Title: DevOps Engineer
Description: Johnnie Oduro Jnr is a skilled DevOps Engineer with expertise in system monitoring, automation, and infrastructure as code. Currently exploring DevOps, SRE, and Data Analysis, Johnnie is committed to continuous learning and collaboration, backed by certifications in AWS and proficiency in Python, Linux, and Git.
Company Name: BlueSPACE Africa Technologies
Author Image: authors/assets/johnnie-oduro-jnr.JPG
