SriPrarabdha/ray_tutorials

A practical workshop on scaling ML workflows with the Ray ecosystem

🚀 Overview

Modern AI systems are no longer bottlenecked by models; they are bottlenecked by infrastructure. Training models, processing terabytes of data, deploying LLMs, and orchestrating GPU clusters all require tooling that simplifies distributed systems.

Ray is that tooling.

This repository contains everything used in my 60-minute workshop on AI Infrastructure with Ray. Each folder contains:

  • A baseline implementation using traditional Python / PyTorch / multiprocessing
  • A Ray-powered implementation showing how the same workflow becomes scalable, cleaner, and fault-tolerant

If you are new to Ray, this repo will help you understand not just the APIs, but how Ray changes the way ML engineers build systems.

📚 What You'll Learn

Inside this repo you will learn how to:

🔹 Scale Python code to clusters without rewriting it

Ray Tasks & Actors turn ordinary Python functions and classes into distributed workloads with minimal code changes.

🔹 Process huge datasets with Ray Data

Load, stream, preprocess, and batch terabytes of text or images using a completely Pythonic interface.

🔹 Train models across multiple GPUs & nodes

Use Ray Train to scale PyTorch/HuggingFace/PEFT models with fault tolerance and automatic checkpointing.

🔹 Serve ML models (including LLMs) in production

Ray Serve lets you deploy, autoscale, route, batch, and version models, including vLLM deployments.

🔹 Run high-throughput LLM inference

Use vLLM with Ray Serve for fast, production-grade LLM inference.


🗂 Repository Structure

ray_tutorials/
├── ray_core/
│   ├── baseline              # Standard Python multiprocessing / threading examples
│   ├── ray_tasks             # Same examples rewritten using Ray Tasks & Actors
│   └── ray_actors            # How to run on real Ray clusters (VMs, LAN, K8s)
│
├── ray_data/
│   ├── baseline              # Pandas, plain Python data pipelines
│   └── ray_version           # Ray Data: distributed loading, batching, streaming
│
├── ray_train/
│   ├── baseline              # Single-GPU PyTorch training (DDP optional)
│   └── ray_version           # Ray Train distributed training, FT, checkpoints
│
├── ray_serve/
│   ├── baseline              # Simple Flask/FastAPI serving patterns
│   └── ray_version           # Ray Serve deployments, autoscaling, routing, batching
│
├── ray_tune/
│   └── examples              # will be added soon
│
└── vllm_examples/

Every module includes:

✔ Baseline Python code
✔ Ray implementation
✔ Explanations + comments
✔ Cluster-ready examples


🧩 Who Is This For?

🎓 Students & Researchers

  • Learn how to scale experiments without rewriting everything
  • Build reproducible ML pipelines
  • Run multi-GPU training in your lab or on the cloud

🛠 ML Engineers

  • Build data pipelines, training pipelines, and serving pipelines
  • Turn your laptop code into distributed code
  • Deploy LLMs with autoscaling and batching

🏢 Tech Teams / Startups

  • Build production ML infra without managing complex distributed systems
  • Replace 5–6 tools with a unified Ray-based workflow
  • Save engineering time and avoid infrastructure glue code

πŸ— Philosophy of This Workshop

This workshop is built around a simple idea:

"Distributed ML should not require learning distributed systems."

Ray lets you scale your code using:

  • your existing Python functions
  • your existing PyTorch models
  • your existing HuggingFace workflows
  • your existing serving patterns

No MPI. No Kubernetes YAML. No Spark jobs. No complicated Docker setups (unless you want them).

You get the power of a large-scale distributed system with the simplicity of standard Python.

🤝 Contributions

Feel free to open issues or PRs if you have improvements, bug fixes, or new examples.


