Saturday, 18 January 2025

Hour 11 - Practical Applications with Llama: Hugging Face Models

Lecture Notes: 


1. Concepts

What is Hugging Face?

Hugging Face is an open-source AI community and platform that provides powerful tools for NLP. It offers access to pre-trained models such as BERT, GPT, and Llama for tasks like text generation, translation, and summarization, and it simplifies integrating these models into applications through its APIs.
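
As a quick illustration, the transformers pipeline API wraps a pre-trained model behind a one-line interface. A minimal sketch (the first call downloads a default model for the chosen task):

from transformers import pipeline

# Downloads a default sentiment-analysis model on first use
classifier = pipeline("sentiment-analysis")
print(classifier("Hugging Face makes working with NLP models easy!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.9998...}]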


Llama Models on Hugging Face

Llama, a family of LLMs (Large Language Models) by Meta, can be accessed through Hugging Face. Hugging Face’s transformers library provides seamless integration of these models, allowing you to fine-tune and deploy them for various NLP tasks. Note that the official Llama checkpoints are gated: you must accept Meta’s license on the model page and authenticate with a Hugging Face access token before downloading them.
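
Because the checkpoints are gated, you typically authenticate once per environment before loading them. A minimal sketch (assuming you have created an access token under your Hugging Face account settings):

from huggingface_hub import login

# Authenticate with your Hugging Face access token (read access is enough);
# the "hf_..." value below is a placeholder, not a real token.
login(token="hf_...")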


Practical Applications of Llama Models

Llama models, when fine-tuned for specific tasks, can be applied to real-world scenarios in industries like:

  1. Text Summarization
    • Summarizing long articles, reports, research papers, and news.
  2. Question Answering (QA)
    • Building intelligent chatbots or QA systems.
  3. Text Generation
    • Generating creative writing, code, or completing unfinished sentences.
  4. Named Entity Recognition (NER)
    • Extracting names, dates, locations, and other entities from text.
  5. Translation
    • Language translation and localization.
  6. Sentiment Analysis
    • Determining the sentiment (positive/negative) in customer reviews, social media posts, etc.

2. Key Aspects of Practical Applications

  1. Pre-trained Models

    • Hugging Face provides a range of pre-trained models for various NLP tasks. These models are fine-tuned on diverse datasets and can be used immediately for real-world applications.
  2. Model Fine-Tuning

    • Fine-tuning pre-trained models on task-specific datasets enhances their performance. You can adapt a general-purpose model (like Llama) to your use case.
  3. APIs and Integration

    • Hugging Face offers APIs for easy model deployment and integration with applications. These can be used to make real-time predictions via HTTP requests or integrated into chatbots, websites, etc. (see the sketch after this list).
  4. Datasets for Training

    • Hugging Face also provides datasets for training and fine-tuning models, along with utilities to preprocess data.
  5. Optimized Infrastructure

    • Hugging Face’s infrastructure (Model Hub and Inference API) allows for easy deployment on the cloud with optimized models, saving you from setting up your own infrastructure.
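
As an example of the API-based workflow mentioned above, hosted models can be queried over HTTP without downloading any weights. A minimal sketch against the serverless Inference API (the model ID and response shape here are illustrative and depend on the model you call):

import requests

# Any public model ID from the Hub can be substituted here
API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english"
headers = {"Authorization": "Bearer hf_..."}  # your Hugging Face access token

# POST text to the hosted model and receive predictions as JSON
response = requests.post(API_URL, headers=headers, json={"inputs": "I love this product!"})
print(response.json())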

3. Implementation of Llama Models from Hugging Face

Prerequisites:

  • Install necessary Python packages.
    pip install transformers datasets torch accelerate
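
Since a 7B-parameter model is demanding to run, it is worth confirming the environment is ready before starting. A quick sanity check:

import torch
import transformers

# Verify the installed version and whether a GPU is visible
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())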
    

Example 1: Using Pre-trained Llama Model for Text Generation

This example demonstrates how to use a pre-trained Llama model from Hugging Face to generate text based on a given prompt.

from transformers import LlamaForCausalLM, LlamaTokenizer

# Load the Llama model and tokenizer.
# Note: the official Llama checkpoints are gated; accept Meta's license on the
# model page and log in with a Hugging Face token before downloading.
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = LlamaTokenizer.from_pretrained(model_name)
model = LlamaForCausalLM.from_pretrained(model_name)

# Example prompt for text generation
prompt = "In the field of artificial intelligence,"

# Tokenize the input
inputs = tokenizer(prompt, return_tensors="pt")

# Generate text based on the input prompt (passing **inputs forwards the
# attention mask along with the input IDs)
outputs = model.generate(**inputs, max_new_tokens=100)

# Decode and print the generated text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)

Explanation:

  • The model is initialized from the meta-llama/Llama-2-7b-hf pre-trained checkpoint on Hugging Face (access must be requested on the model page).
  • The tokenizer is used to preprocess the input text (prompt).
  • The model generates text based on the prompt using model.generate().
  • Finally, the output text is decoded using the tokenizer.
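
By default, generate() uses greedy decoding, which can produce repetitive text. A sketch of common sampling settings, reusing the model and tokenizer loaded above (the exact values are illustrative):

# Sampling-based generation: more varied output than greedy decoding
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,    # sample from the distribution instead of taking the top token
    temperature=0.8,   # <1.0 sharpens the distribution, >1.0 flattens it
    top_p=0.95,        # nucleus sampling: keep the smallest token set with 95% probability mass
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))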

Example 2: Fine-Tuning a Llama Model for Text Classification (Sentiment Analysis)

In this example, we fine-tune the Llama model for a sentiment classification task.

  1. Dataset: We will use the IMDb dataset for sentiment analysis, available on Hugging Face Datasets.

  2. Fine-Tuning: We will fine-tune the pre-trained model on the sentiment classification dataset.

from transformers import (
    LlamaForSequenceClassification,
    LlamaTokenizer,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

# Load pre-trained model and tokenizer for sequence classification
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = LlamaTokenizer.from_pretrained(model_name)
model = LlamaForSequenceClassification.from_pretrained(model_name, num_labels=2)

# The Llama tokenizer has no padding token by default; reuse the EOS token
# so that padding="max_length" below works
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

# Load the IMDb dataset
dataset = load_dataset("imdb")

# Preprocess the dataset (tokenization)
def tokenize_function(examples):
    return tokenizer(
        examples["text"], padding="max_length", truncation=True, max_length=512
    )

tokenized_datasets = dataset.map(tokenize_function, batched=True)

# Split the dataset into training and evaluation sets
train_dataset = tokenized_datasets["train"]
eval_dataset = tokenized_datasets["test"]

# Training arguments. Note: fully fine-tuning a 7B-parameter model at this
# batch size requires substantial GPU memory; see the LoRA sketch following
# this example for a lighter-weight alternative.
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
    logging_dir="./logs",
)

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
)

# Train the model
trainer.train()

# Evaluate the model
results = trainer.evaluate()
print(f"Evaluation results: {results}")

Explanation:

  • We load the IMDb dataset for sentiment analysis, which contains movie reviews labeled as positive or negative.
  • The tokenize_function is used to preprocess the text data, making it compatible with the Llama model.
  • We set up the Trainer class with the model and training parameters, and fine-tune the model on the sentiment dataset.
  • Finally, the model is evaluated on a test set.
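
Fully fine-tuning all 7B parameters is often impractical on a single GPU. A common lighter-weight alternative is LoRA via the peft library; a minimal sketch, assuming peft is installed (pip install peft) and reusing the model defined above:

from peft import LoraConfig, TaskType, get_peft_model

# LoRA freezes the base model and trains small low-rank adapter matrices instead
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # sequence classification
    r=8,                         # rank of the adapter matrices
    lora_alpha=16,               # scaling factor for the adapter updates
    lora_dropout=0.05,
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters

# The wrapped model can be passed to Trainer exactly as in the example above.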

Example 3: Question Answering with Llama Model

In this example, we set up a Llama model for extractive question answering. The code below loads the data and demonstrates inference; note that the question-answering head still has to be trained before its predictions are meaningful.

import torch
from transformers import LlamaForQuestionAnswering, LlamaTokenizer
from datasets import load_dataset

# Load pre-trained model and tokenizer for Question Answering.
# Note: the QA head on top of the base model is newly initialized, so the
# model must be fine-tuned on a QA dataset before its predictions are useful.
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = LlamaTokenizer.from_pretrained(model_name)
model = LlamaForQuestionAnswering.from_pretrained(model_name)

# Load the SQuAD dataset
dataset = load_dataset("squad")

# Preprocess the dataset (tokenization). Real training would also compute
# start/end token positions for each answer; this only illustrates how
# question/context pairs are encoded.
def preprocess_data(examples):
    return tokenizer(
        examples["question"], examples["context"],
        truncation=True, padding="max_length", max_length=512,
    )

tokenized_data = dataset.map(preprocess_data, batched=True)

# Example Question and Context
context = "The capital of France is Paris, a city known for its culture and history."
question = "What is the capital of France?"

# Tokenize the inputs
inputs = tokenizer(question, context, return_tensors="pt")

# Get model outputs (no gradients needed for inference)
with torch.no_grad():
    outputs = model(**inputs)
start_scores = outputs.start_logits
end_scores = outputs.end_logits

# Get the most likely start and end positions of the answer
start_idx = torch.argmax(start_scores)
end_idx = torch.argmax(end_scores)

# Decode the answer span
answer = tokenizer.decode(inputs["input_ids"][0][start_idx:end_idx + 1], skip_special_tokens=True)
print(f"Answer: {answer}")

Explanation:

  • This code loads the SQuAD dataset (Stanford Question Answering Dataset), a standard benchmark for extractive question answering, and shows how question/context pairs are tokenized.
  • The LlamaForQuestionAnswering class from Hugging Face's Transformers library adds a span-prediction head on top of the base model. This head is randomly initialized, so the model must be fine-tuned (for example with Trainer, using gold start/end positions) before its answers are reliable.
  • Given a context and a question, the model predicts the answer by scoring the most likely start and end token positions of the answer span in the context (see the training sketch below).
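
For reference, during fine-tuning the model is given the gold answer span as token indices. A minimal sketch of a single training step, where s and e are hypothetical start/end token positions derived from the dataset's character offsets:

# Hypothetical values: token indices of the answer span within the input
s, e = 8, 9

outputs = model(
    **inputs,
    start_positions=torch.tensor([s]),
    end_positions=torch.tensor([e]),
)
loss = outputs.loss  # cross-entropy over the start and end positions
loss.backward()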

4. Real-Life Example: Text Generation for Customer Support Chatbot

In this example, we use the pre-trained Llama model for generating responses in a customer support chatbot. The chatbot takes customer queries and generates text responses.

  1. Objective: Use a pre-trained Llama model to simulate a customer support agent.
  2. Use Case: Handle queries like "How do I reset my password?" or "Where can I find my order history?"
from transformers import LlamaForCausalLM, LlamaTokenizer

# Load pre-trained model and tokenizer
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = LlamaTokenizer.from_pretrained(model_name)
model = LlamaForCausalLM.from_pretrained(model_name)

# Simulate customer query
customer_query = "How do I reset my password?"

# Tokenize input
inputs = tokenizer(customer_query, return_tensors="pt")

# Generate response
response = model.generate(**inputs, max_new_tokens=50)

# Decode and print the response
generated_response = tokenizer.decode(response[0], skip_special_tokens=True)
print(f"Customer Support Response: {generated_response}")

Explanation:

  • A customer query is tokenized and passed to the pre-trained Llama model.
  • The model generates a continuation of the input query. Because this base model is not instruction-tuned, the raw output may read more like a continuation of the question than a helpful answer; see the sketch below.
  • With an instruction-tuned variant and some prompt engineering, this setup can be scaled into intelligent chatbots that handle a wide variety of queries.
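
A base Llama model is trained purely for next-token prediction, so it tends to continue the query rather than answer it. A rough sketch of using the instruction-tuned chat variant instead (assuming access to meta-llama/Llama-2-7b-chat-hf, which is also gated; the [INST] tags follow Llama 2's chat prompt format):

from transformers import LlamaForCausalLM, LlamaTokenizer

chat_model_name = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = LlamaTokenizer.from_pretrained(chat_model_name)
model = LlamaForCausalLM.from_pretrained(chat_model_name)

# Wrap the query in Llama 2's instruction format so the model responds
# as an assistant instead of merely continuing the text
prompt = "[INST] You are a customer support agent. How do I reset my password? [/INST]"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))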

5. Summary

  • Hugging Face's Transformers provides an easy way to deploy and fine-tune Llama models for practical applications.
  • Llama models can be utilized for tasks such as text generation, sentiment analysis, question answering, and chatbot development.
  • Fine-tuning these models on task-specific datasets allows them to adapt and excel in real-world applications.
  • Hugging Face makes model deployment and API integration simple, enabling businesses and developers to leverage powerful NLP models easily.

6. Homework/Practice

  1. Fine-tune a Llama model on a custom dataset for a real-world application (e.g., email classification, FAQ answering).
  2. Build a small chatbot using Llama for customer support, implementing features such as handling product-related questions.
  3. Explore Hugging Face’s Model Hub to experiment with different models and tasks.
  4. Investigate how to deploy a fine-tuned Llama model using Hugging Face’s Inference API.

This concludes Hour 11 on Practical Applications with Llama and Hugging Face Models.
