Low-Cost, High-Performance Llama 2 ML Pipelines on GCP

Video Sections Show timestamps

Tutorial details

Dive deep into the intricacies of running Llama-2 in machine learning pipelines. We unpack the challenges and showcase how to maintain a serverless approach, optimize costs, leverage hardware accelerators, and ensure swift model download.

Highlights:

Setting up Vertex AI Pipelines on Google Cloud
Implementing the Llama model in pipelines using Python within the Kubeflow framework
Using the Hugging Face Transformers PyTorch GPU image
Streamlining the process of model download and chat text generation
Troubleshooting and refining your pipeline

Resources:

Demonstration Code and Diagram

Aug 18, 2023

2113

Views

Challenging

Related Skills:

Vertex AI
Python Programming
Machine Learning
Artificial Intelligence

Low-Cost, High-Performance Llama 2 ML Pipelines on GCP

Video Sections Show timestamps

Tutorial details

Resources:

Related Skills:

Share:

Related Video Tutorials

Gemini in Google Cloud (Hands-On with New Vertex AI LLMs)

NGINX with Redis Caching (Simple Serverless Setup)

Hands-On Introduction to Ray with Vertex AI (Google Cloud)

Low-Cost, High-Performance Llama 2 ML Pipelines on GCP

Video Sections Show timestamps

Tutorial details

Resources:

Related Skills:

Share:

Related Video Tutorials

Gemini in Google Cloud (Hands-On with New Vertex AI LLMs)

NGINX with Redis Caching (Simple Serverless Setup)

Hands-On Introduction to Ray with Vertex AI (Google Cloud)

Cookie Policy