Saturday, 18 January 2025

Hour 12 - Wrap-Up and Q&A

 Lecture Notes: 


1. Wrap-Up

Overview of the Course

  • Over the past 12 hours, we’ve explored Ollama LLM Basics and Hugging Face models, focusing on practical applications, implementation, and fine-tuning.
  • We have covered topics like:
    • Installation of Ollama and Hugging Face libraries
    • File structure of Ollama
    • Understanding chunks, embeddings, and vector databases
    • Working with Llama models for tasks such as text generation, sentiment analysis, and question answering.
    • Fine-tuning techniques and their real-life applications, such as for PDF extraction, web scraping, and chatbot development.

Key Takeaways

  1. Powerful NLP Tools: We learned how to use pre-trained models like Llama from Hugging Face for a wide range of NLP tasks.
  2. Model Fine-Tuning: Fine-tuning models for domain-specific tasks can greatly improve model performance, even with relatively small datasets.
  3. Practical Applications: We've seen how these models can be integrated into real-world applications like chatbots, sentiment analysis systems, and question-answering agents.
  4. Metrics & Evaluation: We discussed how to measure model performance and how to optimize models for your tasks.

Looking Forward

  • With the skills learned, you can start working on real-world projects like developing NLP-based tools for businesses, creating intelligent systems, and exploring more advanced topics in AI like multi-modal models and reinforcement learning.

2. Real-Life Example: Deploying an NLP Chatbot Using Hugging Face

In this real-life example, we’ll build a simple chatbot using the pre-trained Llama model from Hugging Face. The chatbot will be able to handle customer queries and provide helpful answers.

Step 1: Install Hugging Face Transformers

Ensure that you have installed the Hugging Face library.

pip install transformers

Step 2: Implementing the Chatbot

from transformers import LlamaForCausalLM, LlamaTokenizer

# Load the pre-trained Llama model and tokenizer
model_name = "meta-llama-7b-hf"
model = LlamaForCausalLM.from_pretrained(model_name)
tokenizer = LlamaTokenizer.from_pretrained(model_name)

# Function to simulate chatbot conversation
def chatbot_response(user_input):
    # Tokenize user input
    inputs = tokenizer(user_input, return_tensors="pt")
    
    # Generate the response from the model
    response = model.generate(inputs["input_ids"], max_length=100, num_return_sequences=1)

    # Decode the response to text
    response_text = tokenizer.decode(response[0], skip_special_tokens=True)
    return response_text

# Example of a conversation
user_query = "How can I reset my password?"
chatbot_answer = chatbot_response(user_query)

print(f"User: {user_query}")
print(f"Chatbot: {chatbot_answer}")

Explanation:

  • Llama Model: A pre-trained Llama model is used to generate responses to user queries.
  • Chatbot Simulation: The function chatbot_response() simulates a chatbot conversation. It tokenizes the user input, generates a response using the model, and decodes the result to text.
  • This basic chatbot can be expanded with more sophisticated logic and additional features (e.g., storing context, handling multiple user inputs, or integrating with APIs).

3. Q&A: Typical Questions and Answers

Q1: What is Ollama?

  • A1: Ollama is a platform for running, managing, and experimenting with large language models (LLMs). It provides an easy way to interact with LLMs, deploy models, and use them in applications.

Q2: How do I install Ollama and Hugging Face?

  • A2: Ollama is installed from ollama.com: download the macOS/Windows installer, or on Linux run curl -fsSL https://ollama.com/install.sh | sh. For Hugging Face, use pip install transformers datasets.

Q3: What is the role of embeddings in NLP?

  • A3: Embeddings represent words or sentences as vectors in high-dimensional space. They capture semantic meaning and relationships between words, enabling tasks like similarity search, translation, and question answering.

Q4: What is a vector database and why is it important?

  • A4: A vector database stores embeddings (vector representations of data) and allows fast similarity searches. It is important for tasks like document retrieval, recommendation systems, and semantic search.

Q5: How do I fine-tune a model like Llama for a specific task?

  • A5: Fine-tuning involves training the pre-trained model on your task-specific dataset. You can load your dataset using the datasets library, tokenize it, and use Hugging Face’s Trainer to fine-tune the model.

Q6: What metrics should I use to evaluate my model?

  • A6: Common metrics for evaluating NLP models include accuracy, F1 score, precision, recall, and perplexity. For tasks like question answering, you might also use Exact Match (EM) and F1 scores.

Q7: How can I deploy a fine-tuned model for production use?

  • A7: You can deploy models using Hugging Face's Inference API, or by creating a REST API with tools like FastAPI or Flask. These tools allow your model to serve predictions over the web.
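As an illustration, here is a minimal sketch of wrapping a fine-tuned model in a FastAPI service; the model path ./fine_tuned_model and the /generate route are assumptions for the example.

# Minimal FastAPI sketch for serving a fine-tuned model; the model path and the
# /generate route are illustrative assumptions.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
generator = pipeline("text-generation", model="./fine_tuned_model")

class Query(BaseModel):
    prompt: str

@app.post("/generate")
def generate(query: Query):
    result = generator(query.prompt, max_new_tokens=100, num_return_sequences=1)
    return {"response": result[0]["generated_text"]}

# Start the server with: uvicorn app:app --port 8000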

Q8: Can I use the Llama model for multi-turn conversations?

  • A8: Yes, multi-turn conversations can be managed by maintaining context. You can pass previous user inputs and model responses back to the model to ensure it remembers the conversation history.
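A minimal sketch of that pattern, assuming the model and tokenizer loaded in the chatbot example above, is:

# Minimal multi-turn sketch: keep the running conversation and feed it back as the prompt
# (assumes `model` and `tokenizer` from the chatbot example above).
history = []

def chat(user_input):
    history.append(f"User: {user_input}")
    prompt = "\n".join(history) + "\nAssistant:"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(inputs["input_ids"], max_new_tokens=100)
    # Keep only the newly generated tokens, not the echoed prompt
    reply = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    history.append(f"Assistant: {reply}")
    return reply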

Q9: How do I preprocess data for model fine-tuning?

  • A9: Preprocessing typically involves tokenizing the text, padding or truncating sequences to a fixed length, and formatting data into input-output pairs. Hugging Face's transformers and datasets libraries provide utilities for these tasks.

Q10: What are the limitations of using Llama models for certain tasks?

  • A10: Llama models, like all language models, are limited by the data they were trained on. They may struggle with tasks requiring very domain-specific knowledge or tasks involving non-text data (e.g., images or sounds). Fine-tuning on relevant data can mitigate some of these limitations.

4. Final Thoughts

Key Concepts to Remember:

  • Pre-trained models like Llama can save time and resources in NLP tasks.
  • Fine-tuning enhances the ability of models to perform specific tasks.
  • Hugging Face provides a rich ecosystem for working with models, datasets, and deployment tools.

Next Steps:

  • Explore more Hugging Face models for various NLP tasks.
  • Experiment with fine-tuning on your own custom datasets.
  • Learn about advanced techniques like multi-modal models, reinforcement learning, or real-time model serving.

5. Thank You for Attending the Course!

  • With the knowledge gained, you are now equipped to start working with language models like Llama and explore advanced AI applications.
  • Keep experimenting, and don't hesitate to reach out to the community or further resources on Hugging Face to deepen your understanding!

This concludes Hour 12 on Wrap-Up and Q&A. Feel free to explore and apply what you've learned to your own projects!

Hour 11 - Practical Applications with Llama: Hugging Face Models

Lecture Notes: 


1. Concepts

What is Hugging Face?

Hugging Face is an open-source AI community and platform that provides powerful tools for NLP tasks. It provides access to pre-trained models like BERT, GPT, and Llama for a wide variety of tasks, such as text generation, translation, summarization, and more. It also simplifies the integration of these models with APIs for easy deployment in applications.


Llama Models on Hugging Face

Llama, a family of LLMs (Large Language Models) by Meta, can be easily accessed on Hugging Face. Hugging Face’s transformers library provides seamless integration of these models, allowing you to fine-tune and deploy them for various NLP tasks.


Practical Applications of Llama Models

Llama models, when fine-tuned for specific tasks, can be applied to real-world scenarios in industries like:

  1. Text Summarization
    • Summarizing long articles, reports, research papers, and news.
  2. Question Answering (QA)
    • Building intelligent chatbots or QA systems.
  3. Text Generation
    • Generating creative writing, code, or completing unfinished sentences.
  4. Named Entity Recognition (NER)
    • Extracting names, dates, locations, and other entities from text.
  5. Translation
    • Language translation and localization.
  6. Sentiment Analysis
    • Determining the sentiment (positive/negative) in customer reviews, social media posts, etc.

2. Key Aspects of Practical Applications

  1. Pre-trained Models

    • Hugging Face provides a range of pre-trained models for various NLP tasks. These models are fine-tuned on diverse datasets and can be used immediately for real-world applications.
  2. Model Fine-Tuning

    • Fine-tuning pre-trained models on task-specific datasets enhances their performance. You can adapt a general-purpose model (like Llama) to your use case.
  3. APIs and Integration

    • Hugging Face offers APIs for easy model deployment and integration with applications. These can be used to make predictions in real time via HTTP requests or integrated into chatbots, websites, etc. (a minimal example of calling the hosted Inference API follows this list).
  4. Datasets for Training

    • Hugging Face also provides datasets for training and fine-tuning models, along with utilities to preprocess data.
  5. Optimized Infrastructure

    • Hugging Face’s infrastructure (Model Hub and Inference API) allows for easy deployment on the cloud with optimized models, saving you from setting up your own infrastructure.
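For example, a prompt can be sent to a hosted model over HTTP with a few lines of Python; the sketch below assumes an illustrative model id and an HF_TOKEN environment variable holding your access token.

# Sketch of calling the hosted Inference API; the model id and the HF_TOKEN environment
# variable are assumptions for illustration.
import os
import requests

API_URL = "https://api-inference.huggingface.co/models/meta-llama/Llama-2-7b-hf"
headers = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

payload = {"inputs": "In the field of artificial intelligence,"}
response = requests.post(API_URL, headers=headers, json=payload)
print(response.json())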

3. Implementation of Llama Models from Hugging Face

Prerequisites:

  • Install necessary Python packages.
    pip install transformers datasets torch
    

Example 1: Using Pre-trained Llama Model for Text Generation

This example demonstrates how to use a pre-trained Llama model from Hugging Face to generate text based on a given prompt.

from transformers import LlamaForCausalLM, LlamaTokenizer

# Load the Llama model and tokenizer
model_name = "meta-llama-7b-hf"  # This is the pre-trained Llama model available on Hugging Face
model = LlamaForCausalLM.from_pretrained(model_name)
tokenizer = LlamaTokenizer.from_pretrained(model_name)

# Example prompt for text generation
prompt = "In the field of artificial intelligence,"

# Tokenize the input
inputs = tokenizer(prompt, return_tensors="pt")

# Generate text based on the input prompt
outputs = model.generate(inputs['input_ids'], max_length=100)

# Decode and print the generated text
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)

Explanation:

  • The model is initialized from the meta-llama/Llama-2-7b-hf pre-trained checkpoint on Hugging Face.
  • The tokenizer is used to preprocess the input text (prompt).
  • The model generates text based on the prompt using model.generate().
  • Finally, the output text is decoded using the tokenizer.

Example 2: Fine-Tuning a Llama Model for Text Classification (Sentiment Analysis)

In this example, we fine-tune the Llama model for a sentiment classification task.

  1. Dataset: We will use the IMDb dataset for sentiment analysis, available on Hugging Face Datasets.

  2. Fine-Tuning: We will fine-tune the pre-trained model on the sentiment classification dataset.

from transformers import LlamaForSequenceClassification, Trainer, TrainingArguments
from datasets import load_dataset
from transformers import LlamaTokenizer

# Load pre-trained model and tokenizer for sequence classification
model_name = "meta-llama-7b-hf"
model = LlamaForSequenceClassification.from_pretrained(model_name, num_labels=2)
tokenizer = LlamaTokenizer.from_pretrained(model_name)

# Load the IMDb dataset
dataset = load_dataset("imdb")

# Preprocess the dataset (tokenization)
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized_datasets = dataset.map(tokenize_function, batched=True)

# Split the dataset into training and evaluation sets
train_dataset = tokenized_datasets["train"]
eval_dataset = tokenized_datasets["test"]

# Training arguments
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
    logging_dir="./logs",
)

# Initialize Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
)

# Train the model
trainer.train()

# Evaluate the model
results = trainer.evaluate()
print(f"Evaluation results: {results}")

Explanation:

  • We load the IMDb dataset for sentiment analysis, which contains movie reviews labeled as positive or negative.
  • The tokenize_function is used to preprocess the text data, making it compatible with the Llama model.
  • We set up the Trainer class with the model and training parameters, and fine-tune the model on the sentiment dataset.
  • Finally, the model is evaluated on a test set.

Example 3: Question Answering with Llama Model

In this example, we fine-tune a Llama model for a Question Answering task.

import torch
from transformers import LlamaForQuestionAnswering, LlamaTokenizer
from datasets import load_dataset

# Load pre-trained model and tokenizer for Question Answering
model_name = "meta-llama/Llama-2-7b-hf"  # gated on Hugging Face; request access first
model = LlamaForQuestionAnswering.from_pretrained(model_name)
tokenizer = LlamaTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers have no pad token by default

# Load the SQuAD dataset
dataset = load_dataset("squad")

# Preprocess the dataset (tokenization)
def preprocess_data(examples):
    return tokenizer(examples["question"], examples["context"], truncation=True, padding="max_length")

tokenized_data = dataset.map(preprocess_data, batched=True)

# Example Question and Context
context = "The capital of France is Paris, a city known for its culture and history."
question = "What is the capital of France?"

# Tokenize the inputs
inputs = tokenizer(question, context, return_tensors="pt")

# Get model outputs
outputs = model(**inputs)
start_scores = outputs.start_logits
end_scores = outputs.end_logits

# Get the most likely start and end positions of the answer
start_idx = torch.argmax(start_scores)
end_idx = torch.argmax(end_scores)

# Decode the answer
answer = tokenizer.decode(inputs["input_ids"][0][start_idx:end_idx+1], skip_special_tokens=True)
print(f"Answer: {answer}")

Explanation:

  • This code loads the SQuAD dataset (Stanford Question Answering Dataset), which would be used to fine-tune the model for question answering; the training step itself follows the Trainer pattern from Example 2.
  • The model is wrapped with the LlamaForQuestionAnswering class from Hugging Face's Transformers library.
  • Given a context and a question, the model predicts the answer by scoring the most likely start and end positions of a span in the context.

4. Real-Life Example: Text Generation for Customer Support Chatbot

In this example, we use the pre-trained Llama model for generating responses in a customer support chatbot. The chatbot takes customer queries and generates text responses.

  1. Objective: Use a pre-trained Llama model to simulate a customer support agent.
  2. Use Case: Handle queries like "How do I reset my password?" or "Where can I find my order history?"
from transformers import LlamaForCausalLM, LlamaTokenizer

# Load pre-trained model and tokenizer
model_name = "meta-llama-7b-hf"
model = LlamaForCausalLM.from_pretrained(model_name)
tokenizer = LlamaTokenizer.from_pretrained(model_name)

# Simulate customer query
customer_query = "How do I reset my password?"

# Tokenize input
inputs = tokenizer(customer_query, return_tensors="pt")

# Generate response
response = model.generate(inputs["input_ids"], max_length=50)

# Decode and print the response
generated_response = tokenizer.decode(response[0], skip_special_tokens=True)
print(f"Customer Support Response: {generated_response}")

Explanation:

  • A customer query is tokenized and passed to the pre-trained Llama model.
  • The model generates a response based on the input query.
  • This setup can be scaled to create intelligent chatbots that can handle a wide variety of queries.

5. Summary

  • Hugging Face's Transformers provides an easy way to deploy and fine-tune Llama models for practical applications.
  • Llama models can be utilized for tasks such as text generation, sentiment analysis, question answering, and chatbot development.
  • Fine-tuning these models on task-specific datasets allows them to adapt and excel in real-world applications.
  • Hugging Face makes model deployment and API integration simple, enabling businesses and developers to leverage powerful NLP models easily.

6. Homework/Practice

  1. Fine-tune a Llama model on a custom dataset for a real-world application (e.g., email classification, FAQ answering).
  2. Build a small chatbot using Llama for customer support, implementing features such as handling product-related questions.
  3. Explore Hugging Face’s Model Hub to experiment with different models and tasks.
  4. Investigate how to deploy a fine-tuned Llama model using Hugging Face’s Inference API.

This concludes Hour 11 on Practical Applications with Llama and Hugging Face Models.

Hour 10 - Advanced Fine-Tuning Techniques

Lecture Notes: 


1. Concepts

What is Fine-Tuning?

Fine-tuning refers to the process of taking a pre-trained model and adjusting its weights based on a smaller, task-specific dataset. This allows the model to adapt and perform better on specialized tasks (e.g., summarizing PDFs, extracting data from websites) without requiring the massive computational resources needed for training a model from scratch.


Advanced Fine-Tuning Techniques

Fine-tuning is an iterative process that can be enhanced with advanced strategies to optimize the model's performance. These strategies are designed to improve the model's efficiency and its ability to generalize on new, unseen data.

1. Learning Rate Schedulers
  • A learning rate scheduler adjusts the learning rate during training to prevent overshooting the optimal solution and to accelerate convergence.
  • Types:
    • Constant Learning Rate: Keeps the learning rate constant.
    • Step Decay: Reduces the learning rate after a set number of epochs.
    • Exponential Decay: Gradually decreases the learning rate.
    • Cosine Annealing: Gradually reduces the learning rate in a cosine curve to explore a wide range of potential solutions before narrowing down.
2. Early Stopping
  • Stops training when the model’s performance on a validation set no longer improves. This helps prevent overfitting and saves time by avoiding unnecessary training steps.
3. Data Augmentation
  • Expands the size and variety of your training dataset by applying transformations to the input data (e.g., rotating images, paraphrasing text). This allows the model to generalize better to new data.
4. Gradient Accumulation
  • A technique to simulate a larger batch size when limited by GPU memory. The gradients are accumulated over multiple smaller mini-batches before performing a parameter update.
5. Model Regularization
  • Helps prevent the model from overfitting by adding a penalty to the loss function based on the complexity of the model.
  • Types:
    • L1/L2 Regularization: Adds a penalty to the weights of the model to prevent them from becoming too large.
    • Dropout: Randomly drops units (neurons) in the neural network during training to prevent overfitting.
6. Knowledge Distillation
  • Involves training a smaller model (student) to mimic the behavior of a larger, more powerful model (teacher). The smaller model can achieve similar performance with fewer parameters and resources. A minimal sketch of the distillation loss appears after this list.
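As a rough illustration of the distillation idea, the sketch below combines a soft (teacher-matching) loss with the usual hard-label loss; the temperature and weighting values are arbitrary, and the student/teacher logits are assumed to come from a training step set up elsewhere.

# Sketch of a knowledge-distillation loss; `student_logits`, `teacher_logits`, and `labels`
# are assumed to come from a training step defined elsewhere; temperature/alpha are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    # Soft targets: match the teacher's softened output distribution
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: ordinary cross-entropy against the true labels
    hard_loss = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)), labels.view(-1))
    return alpha * soft_loss + (1 - alpha) * hard_loss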

2. Key Aspects of Advanced Fine-Tuning

  1. Optimizing Hyperparameters

    • Fine-tuning involves selecting the right hyperparameters, including learning rate, batch size, optimizer type, and number of epochs. Techniques like grid search and random search can help find optimal settings (a small grid-search sketch appears after this list).
  2. Transfer Learning

    • Fine-tuning a pre-trained model on a specific task takes advantage of the knowledge the model has already learned from a vast corpus of general data, reducing the amount of training required for task-specific adaptation.
  3. Model Evaluation During Fine-Tuning

    • It's crucial to evaluate the model at various stages of fine-tuning to ensure that improvements are being made and that the model is not overfitting.
  4. Computational Resources

    • Advanced fine-tuning techniques often require more computational resources. Optimizing the training process (e.g., through gradient accumulation or data parallelism) can help manage these resources effectively.
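As a simple illustration of hyperparameter search, the sketch below loops over a small grid of learning rates and batch sizes; train_and_evaluate() is a hypothetical helper standing in for building a Trainer with those settings, training, and returning the evaluation loss.

# Sketch of a small grid search; train_and_evaluate() is a hypothetical helper that
# builds a Trainer with the given settings, trains, and returns the evaluation loss.
import itertools

learning_rates = [1e-5, 2e-5, 5e-5]
batch_sizes = [4, 8]

best = None
for lr, bs in itertools.product(learning_rates, batch_sizes):
    eval_loss = train_and_evaluate(learning_rate=lr, per_device_train_batch_size=bs)
    if best is None or eval_loss < best[0]:
        best = (eval_loss, lr, bs)

print(f"Best settings: lr={best[1]}, batch_size={best[2]} (eval loss {best[0]:.4f})")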

3. Implementation of Advanced Fine-Tuning Techniques

Prerequisites:

  • Pre-trained model (e.g., Llama).
  • A dataset for the specific task (e.g., PDF summarization, web scraping).
  • Python packages: transformers, torch, datasets, sklearn.

Learning Rate Scheduler

A learning rate scheduler can be used to adjust the learning rate dynamically during training.

from transformers import LlamaForCausalLM, get_linear_schedule_with_warmup
from torch.optim import AdamW  # transformers' bundled AdamW is deprecated; use PyTorch's
from torch.utils.data import DataLoader
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Initialize model and optimizer ("llama-7b" is a placeholder for your Llama checkpoint)
model = LlamaForCausalLM.from_pretrained("llama-7b").to(device)
optimizer = AdamW(model.parameters(), lr=5e-5)

# Define scheduler (`training_data` is assumed to be a tokenized PyTorch dataset)
epochs = 3
train_dataloader = DataLoader(training_data, batch_size=8, shuffle=True)
num_training_steps = len(train_dataloader) * epochs
lr_scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=0, num_training_steps=num_training_steps)

# Training loop with learning rate scheduler
for epoch in range(epochs):
    for batch in train_dataloader:
        optimizer.zero_grad()
        inputs = batch["input_ids"].to(device)
        labels = batch["labels"].to(device)
        
        outputs = model(inputs, labels=labels)
        loss = outputs.loss
        loss.backward()

        optimizer.step()
        lr_scheduler.step()  # Adjust learning rate

    print(f"Epoch {epoch + 1} completed with loss: {loss.item()}")

Early Stopping

Early stopping ensures that the training process halts once the model's performance on the validation set stops improving.

from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",  # Evaluate at the end of each epoch
    save_strategy="epoch",        # Save the model checkpoint at the end of each epoch
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=5,
    weight_decay=0.01,
    load_best_model_at_end=True,       # Load the best checkpoint after training
    metric_for_best_model="accuracy",  # Requires a compute_metrics fn that reports "accuracy"
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_data,
    eval_dataset=eval_data,
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],  # stop after 2 evals with no improvement
)

trainer.train()

Data Augmentation for Text

In NLP tasks like summarization or question answering, data augmentation can involve techniques such as paraphrasing or using back-translation to create new examples from existing ones.

import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)  # WordNet must be downloaded once before first use

def synonym_augmentation(text):
    words = text.split()
    augmented_words = []
    
    for word in words:
        synonyms = wordnet.synsets(word)
        if synonyms:
            synonym = synonyms[0].lemmas()[0].name()  # Choose first synonym
            augmented_words.append(synonym)
        else:
            augmented_words.append(word)
    
    return " ".join(augmented_words)

augmented_text = synonym_augmentation("The research paper discusses novel methods in machine learning.")
print(augmented_text)

Gradient Accumulation

To simulate larger batch sizes without requiring large memory, you can accumulate gradients over several mini-batches before performing a gradient update.

from torch.utils.data import DataLoader

gradient_accumulation_steps = 4  # Accumulate gradients over 4 mini-batches

optimizer.zero_grad()
for step, batch in enumerate(train_dataloader):
    inputs = batch["input_ids"].to(device)
    labels = batch["labels"].to(device)
    
    outputs = model(inputs, labels=labels)
    # Scale the loss so the accumulated gradient matches the average over the effective batch
    loss = outputs.loss / gradient_accumulation_steps
    loss.backward()

    # Perform optimization step every `gradient_accumulation_steps` steps
    if (step + 1) % gradient_accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()

Model Regularization (Dropout)

Incorporating dropout in your model can help regularize the neural network and avoid overfitting.

from transformers import LlamaForCausalLM, LlamaConfig

# Define model configuration with dropout. Llama's config exposes attention_dropout
# (the BERT-style *_dropout_prob fields do not apply); "llama-7b" is a placeholder checkpoint.
config = LlamaConfig.from_pretrained("llama-7b")
config.attention_dropout = 0.1  # Dropout on attention probabilities

# Load the pre-trained weights with the custom configuration
# (instantiating LlamaForCausalLM(config) directly would give random weights)
model = LlamaForCausalLM.from_pretrained("llama-7b", config=config)

# Training the model
optimizer = AdamW(model.parameters(), lr=5e-5)
for epoch in range(epochs):
    model.train()
    for batch in train_dataloader:
        optimizer.zero_grad()
        inputs = batch["input_ids"].to(device)
        labels = batch["labels"].to(device)
        
        outputs = model(inputs, labels=labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()

4. Real-Life Example: Fine-Tuning a Summarization Model

In this example, we will fine-tune a pre-trained Llama model for summarizing research papers. We will use early stopping, learning rate scheduling, and data augmentation techniques to ensure optimal training.

  1. Objective: Fine-tune a pre-trained Llama model on a summarization dataset.
  2. Dataset: A collection of research papers and their corresponding summaries.
  3. Techniques Applied:
    • Learning Rate Scheduler: Gradual adjustment of the learning rate.
    • Early Stopping: Halt training when the validation loss plateaus.
    • Data Augmentation: Increase dataset diversity using paraphrasing.
    • Model Regularization: Use dropout to prevent overfitting.

5. Summary

  • Advanced Fine-Tuning Techniques are essential to improving the performance of your model, particularly when you're working with specialized tasks like summarizing PDFs or extracting data.
  • Key techniques like learning rate scheduling, early stopping, data augmentation, and gradient accumulation allow for more efficient training and better model generalization.
  • Model Regularization (e.g., dropout) and knowledge distillation can further help in making the model robust and efficient.

6. Homework/Practice

  1. Fine-tune a pre-trained model for a custom task (e.g., summarization, Q&A, etc.).
  2. Implement a learning rate scheduler and evaluate its impact on training.
  3. Apply data augmentation and observe how it affects model generalization on unseen data.
  4. Experiment with gradient accumulation for large batch sizes on a resource-limited machine.

This concludes the lecture on Advanced Fine-Tuning Techniques.

Hour 9 - Metrics & Evaluation for Fine-Tuned Models

Lecture Notes: 


1. Concepts

What are Model Metrics?

  • Metrics are quantitative measures used to evaluate the performance of a model. They help assess how well a model is performing, both during training and after fine-tuning.
  • Metrics are essential in understanding the accuracy, precision, recall, F1-score, and other aspects of model performance.

Why are Metrics Important?

  • Metrics guide model improvements, provide insight into whether fine-tuning has been successful, and identify areas where the model can be further enhanced.
  • The evaluation process helps determine if the model can generalize well to new, unseen data or if it’s overfitting to the training data.

Key Types of Metrics for NLP Models:

  1. Accuracy: The percentage of correct predictions over the total predictions.
  2. Precision: The proportion of positive predictions that are actually correct.
  3. Recall: The proportion of actual positives that were correctly predicted.
  4. F1-Score: The harmonic mean of precision and recall, providing a balance between the two.
  5. BLEU (Bilingual Evaluation Understudy): Used primarily for evaluating machine translation models (or tasks like summarization).
  6. ROUGE (Recall-Oriented Understudy for Gisting Evaluation): Used for evaluating the quality of summaries by comparing the overlap of n-grams between the model output and a reference summary.
  7. Loss Function: Measures how far the model’s predictions are from the actual output. During fine-tuning, the goal is to minimize the loss.
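As a quick reference, the first four metrics can be written in terms of true/false positives and negatives (TP, FP, TN, FN):

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad
\text{Precision} = \frac{TP}{TP + FP} \qquad
\text{Recall} = \frac{TP}{TP + FN} \qquad
F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}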

2. Key Aspects of Metrics & Evaluation

  1. Choosing the Right Metric:

    • The right metric depends on the task. For tasks like summarization, ROUGE and BLEU are often used. For classification tasks, accuracy, precision, and recall are more relevant.
  2. Overfitting vs. Generalization:

    • Overfitting happens when a model performs well on training data but poorly on new data. Evaluating the model on both training and validation data helps detect overfitting.
    • Generalization refers to how well the model performs on unseen data.
  3. Evaluation Datasets:

    • Use validation and test datasets to evaluate the model.
    • Validation Set: Used during training to tune hyperparameters and prevent overfitting.
    • Test Set: Used only after training to evaluate the final performance of the model.
  4. Model Evaluation Pipeline:

    • Step 1: Prepare the evaluation dataset.
    • Step 2: Generate predictions using the fine-tuned model.
    • Step 3: Compare the model’s predictions to the true outputs using metrics.

3. Implementation of Evaluation and Metrics

Prerequisites:

  • Fine-tuned model (e.g., a PDF summarization model).
  • Evaluation dataset (e.g., PDFs with summaries or web-scraped content).

Example: Evaluating a Fine-Tuned Model

Step 1: Set Up Metrics (Accuracy, Precision, Recall, F1, BLEU, ROUGE)

You’ll use sklearn for the traditional metrics (Accuracy, Precision, Recall, F1), rouge-score for ROUGE, and nltk for BLEU.

pip install scikit-learn rouge-score nltk
Step 2: Generate Predictions

Assume you have a fine-tuned model that generates summaries for research papers. Here’s how to evaluate it:

from transformers import LlamaForCausalLM, LlamaTokenizer
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from rouge_score import rouge_scorer

# Load model and tokenizer
model = LlamaForCausalLM.from_pretrained("./fine_tuned_model")
tokenizer = LlamaTokenizer.from_pretrained("./fine_tuned_model")

# Define evaluation data (text of research papers and their corresponding summaries)
eval_data = [
    {"input": "Research paper content 1", "output": "Summary of paper 1"},
    {"input": "Research paper content 2", "output": "Summary of paper 2"},
    # Add more samples for evaluation
]

# Generate predictions using the fine-tuned model
def generate_summary(input_text):
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding=True)
    summary_ids = model.generate(inputs["input_ids"], max_length=100, num_beams=2, early_stopping=True)
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

predictions = [generate_summary(d['input']) for d in eval_data]
actuals = [d['output'] for d in eval_data]
Step 3: Calculate Evaluation Metrics

Now, let’s calculate some key metrics.

  1. Accuracy:
    • Compare if the generated summary exactly matches the target summary.
# Simple exact match accuracy
accuracy = accuracy_score(actuals, predictions)
print(f"Accuracy: {accuracy:.4f}")
  2. Precision, Recall, F1-Score:
    • If your summaries are in binary or multi-class format, use precision, recall, and F1.
precision = precision_score(actuals, predictions, average="macro")
recall = recall_score(actuals, predictions, average="macro")
f1 = f1_score(actuals, predictions, average="macro")

print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-Score: {f1:.4f}")
  3. ROUGE Score:
    • ROUGE scores compare the overlap between the model’s generated summary and the reference summary.
# Using the rouge_score library
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
rouge_scores = [scorer.score(actual, pred) for actual, pred in zip(actuals, predictions)]

# Print ROUGE scores
for i, score in enumerate(rouge_scores):
    print(f"Example {i+1}: ROUGE-1: {score['rouge1'].fmeasure:.4f}, ROUGE-2: {score['rouge2'].fmeasure:.4f}, ROUGE-L: {score['rougeL'].fmeasure:.4f}")
  4. BLEU Score:
    • BLEU is commonly used for evaluating machine translation or text generation tasks.
from nltk.translate.bleu_score import sentence_bleu

# Compute BLEU score
bleu_scores = [sentence_bleu([actual.split()], pred.split()) for actual, pred in zip(actuals, predictions)]
print(f"BLEU Score: {sum(bleu_scores) / len(bleu_scores):.4f}")
Step 4: Visualize the Results (Optional)

Visualizing the performance of your model can give you a clearer understanding of its strengths and weaknesses.

import matplotlib.pyplot as plt

# Example: Plot ROUGE Scores for different examples
rouge_1_scores = [score['rouge1'].fmeasure for score in rouge_scores]
rouge_2_scores = [score['rouge2'].fmeasure for score in rouge_scores]
rouge_L_scores = [score['rougeL'].fmeasure for score in rouge_scores]

plt.plot(rouge_1_scores, label='ROUGE-1')
plt.plot(rouge_2_scores, label='ROUGE-2')
plt.plot(rouge_L_scores, label='ROUGE-L')
plt.legend()
plt.title("ROUGE Scores for Each Example")
plt.xlabel("Example Index")
plt.ylabel("ROUGE Score")
plt.show()

4. Real-Life Example: Evaluating PDF Summarization

Consider a scenario where you have a fine-tuned model that summarizes research papers (PDFs).

  1. Objective: Evaluate how well the model generates summaries by comparing them to human-provided summaries.
  2. Metrics: Use accuracy, ROUGE, and BLEU to evaluate performance. ROUGE suits summarization because it measures recall of important words from the reference, while BLEU measures n-gram precision against the reference.
Step 1: Scrape and Label PDF Data

Use PyPDF2 to extract text from PDFs and manually label a few examples with reference summaries.

import PyPDF2

def extract_text_from_pdf(pdf_path):
    with open(pdf_path, "rb") as file:
        reader = PyPDF2.PdfReader(file)
        text = ""
        for page in reader.pages:
            text += page.extract_text() or ""  # extract_text() can return None for image-only pages
        return text

pdf_text = extract_text_from_pdf("sample_paper.pdf")
print(pdf_text[:500])  # Print first 500 characters of extracted text
Step 2: Fine-Tune and Evaluate Model

Fine-tune the model with PDF data and evaluate the performance using the metrics described above.


5. Code Summary

from transformers import LlamaForCausalLM, LlamaTokenizer
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu
import matplotlib.pyplot as plt

# Load model and tokenizer
model = LlamaForCausalLM.from_pretrained("./fine_tuned_model")
tokenizer = LlamaTokenizer.from_pretrained("./fine_tuned_model")

# Example: Evaluation Data
eval_data = [{"input": "Research paper content 1", "output": "Summary of paper 1"}]

# Generate predictions (generate_summary() as defined in Step 2 above)
predictions = [generate_summary(d['input']) for d in eval_data]
actuals = [d['output'] for d in eval_data]

# Evaluate with Accuracy, Precision, Recall, F1-Score
accuracy = accuracy_score(actuals, predictions)
precision = precision_score(actuals, predictions, average="macro")
recall = recall_score(actuals, predictions, average="macro")
f1 = f1_score(actuals, predictions, average="macro")

# ROUGE Scores
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
rouge_scores = [scorer.score(actual, pred) for actual, pred in zip(actuals, predictions)]

# BLEU Score
bleu_scores = [sentence_bleu([actual.split()], pred.split()) for actual, pred in zip(actuals, predictions)]

# Visualization of ROUGE Scores
plt.plot([score['rouge1'].fmeasure for score in rouge_scores], label='ROUGE-1')
plt.legend()
plt.show()


6. Summary

  • Concepts Covered: Metrics for evaluation, including accuracy, precision, recall, F1-score, ROUGE, BLEU, and loss functions.
  • Key Aspects: Evaluation ensures that models generalize well to new data and do not overfit; different metrics suit different tasks (summarization vs. classification).
  • Real-Life Example: Evaluating a PDF summarization model using ROUGE, BLEU, and traditional metrics.
  • Implementation: Code for calculating various metrics using Python and common libraries like sklearn, rouge-score, and nltk.

7. Homework/Practice

  1. Evaluate your fine-tuned model using the above metrics on a new test set of PDFs or web-scraped data.
  2. Experiment with different evaluation strategies, such as using multiple BLEU references or adjusting the length of summaries.

Hour 8 - Introduction to Fine-Tuning Custom PDF and Web Scraping Models

Lecture Notes: 


1. Concepts

What is Fine-Tuning?

  • Fine-tuning is the process of adjusting a pre-trained model to improve its performance for a specific task or dataset.
  • Fine-tuning allows a model to better understand and generate responses based on domain-specific data, improving its accuracy and usefulness in real-world applications.

Why Fine-Tune PDF and Web Scraping Models?

  • Models that are trained on general data may not understand the nuances or specific needs of tasks like summarizing academic papers or extracting specific data from web pages.
  • Fine-tuning allows the model to specialize in these tasks by exposing it to relevant, labeled data.

Key Idea

  • Fine-tuning involves updating the weights of a model after it has been pre-trained. This is achieved by training it on new data that aligns with the target task.

2. Key Aspects of Fine-Tuning

  1. Base Model Selection:
    • Choose a model that already has useful general knowledge. Models like Llama are good starting points for fine-tuning.
  2. Dataset Preparation:
    • Labeled Data: For fine-tuning, you need a labeled dataset. For example, if you want to fine-tune a model for summarizing research papers, you need a dataset of papers paired with their summaries.
    • For PDFs: Label the data with clear instructions for the model to understand key points, summaries, or other types of content.
    • For Web Scraping: You can label data for specific types of information such as titles, articles, or key facts extracted from scraped web pages.
  3. Training Process:
    • The training process involves using small batches of data to modify the model’s weights.
    • Learning Rate: A key parameter for fine-tuning that controls how much the weights change during training.
  4. Evaluation:
    • After fine-tuning, evaluate the model to check if it performs well on new, unseen data.
  5. Transfer Learning:
    • Fine-tuning is a form of transfer learning, where you apply knowledge from one domain (general model) to another (specific task).

3. Implementation

Prerequisites:

  • Python Libraries: torch, transformers, ollama
  • Data Preparation: A dataset with labeled examples of the target task (summaries, extracted content).

Example: Fine-Tuning for PDF Summarization

Step 1: Create a Dataset for Fine-Tuning

First, prepare a small dataset of PDF summaries (input-output pairs).

# Example dataset for fine-tuning (PDF summaries)
data = [
    {"input": "Text of research paper 1", "output": "Summary of paper 1"},
    {"input": "Text of research paper 2", "output": "Summary of paper 2"},
    # Add more labeled examples
]
Step 2: Define the Model and Tokenizer

For fine-tuning, you’ll need to choose a pre-trained model. Let's assume we are working with Llama.

from transformers import LlamaForCausalLM, LlamaTokenizer

# Load the pre-trained model and tokenizer ("llama" is a placeholder for your checkpoint path or Hub ID)
model = LlamaForCausalLM.from_pretrained("llama")
tokenizer = LlamaTokenizer.from_pretrained("llama")
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers have no pad token by default
Step 3: Tokenize the Dataset

Convert the text data into tokens that can be fed into the model.

# Tokenize inputs and target summaries. Note: for a causal LM the labels tensor must
# align token-for-token with input_ids; production code concatenates prompt + summary
# and masks the prompt positions with -100.
inputs = tokenizer([d['input'] for d in data], padding=True, truncation=True, return_tensors="pt")
labels = tokenizer([d['output'] for d in data], padding=True, truncation=True, return_tensors="pt")

# Create dataset for PyTorch
import torch
class PDFSummaryDataset(torch.utils.data.Dataset):
    def __init__(self, inputs, labels):
        self.inputs = inputs
        self.labels = labels
        
    def __getitem__(self, idx):
        return {"input_ids": self.inputs["input_ids"][idx], "labels": self.labels["input_ids"][idx]}

    def __len__(self):
        return len(self.inputs["input_ids"])

# Create DataLoader for batching
train_dataset = PDFSummaryDataset(inputs, labels)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=2, shuffle=True)
Step 4: Fine-Tune the Model

Now, you can start the fine-tuning process using the dataset.

from transformers import Trainer, TrainingArguments

# Define training arguments
training_args = TrainingArguments(
    output_dir="./model_output",      # output directory
    evaluation_strategy="steps",      # evaluation strategy to adopt during training
    learning_rate=5e-5,               # learning rate
    per_device_train_batch_size=2,    # batch size
    num_train_epochs=3,               # number of epochs
    weight_decay=0.01                 # weight decay to avoid overfitting
)

# Define the Trainer
trainer = Trainer(
    model=model,                      # the pre-trained model
    args=training_args,               # training arguments
    train_dataset=train_dataset,      # training dataset
    eval_dataset=train_dataset        # evaluation dataset (optional)
)

# Fine-tune the model
trainer.train()
Step 5: Save the Fine-Tuned Model

After training, save your fine-tuned model.

model.save_pretrained("./fine_tuned_model")
tokenizer.save_pretrained("./fine_tuned_model")

4. Real-Life Example

Scenario: Fine-Tuning for Extracting Key Information from Web Scraped Articles

  • Objective: Fine-tune a model to extract specific information (e.g., author name, publication date, and article summary) from web pages scraped using BeautifulSoup.
Step 1: Scrape Data from the Web

Use the requests and BeautifulSoup libraries to scrape articles from a webpage.

from bs4 import BeautifulSoup
import requests

def scrape_web_article(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")
    title = soup.find("h1").get_text()
    author = soup.find("span", class_="author").get_text()  # Example class
    return {"title": title, "author": author}

# Example: scrape an article
article = scrape_web_article("https://example.com/article")
print(article)
Step 2: Label the Data

Label the scraped content with the correct output (summary, author name, etc.).

web_data = [
    {"input": "Text from scraped article 1", "output": "Summary and key points"},
    # Add more data
]
Step 3: Fine-Tune the Model

Follow the same fine-tuning steps as in the PDF case, using the web-scraped content.


5. Code Summary

from transformers import LlamaForCausalLM, LlamaTokenizer, Trainer, TrainingArguments
import torch

# Load and prepare the model and tokenizer
model = LlamaForCausalLM.from_pretrained("llama")
tokenizer = LlamaTokenizer.from_pretrained("llama")

# Prepare dataset (input-output pairs)
data = [
    {"input": "Text of research paper 1", "output": "Summary of paper 1"},
    # Add more labeled examples
]

inputs = tokenizer([d['input'] for d in data], padding=True, truncation=True, return_tensors="pt")
labels = tokenizer([d['output'] for d in data], padding=True, truncation=True, return_tensors="pt")

# Fine-tune the model (build train_dataset from inputs/labels as in Step 3, e.g. with PDFSummaryDataset)
training_args = TrainingArguments(
    output_dir="./model_output", num_train_epochs=3, per_device_train_batch_size=2, learning_rate=5e-5
)
trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset)
trainer.train()

# Save the fine-tuned model
model.save_pretrained("./fine_tuned_model")
tokenizer.save_pretrained("./fine_tuned_model")

6. Summary

  • Concepts Covered: Fine-tuning, transfer learning, dataset preparation, training, and evaluation.
  • Key Aspects: Fine-tuning requires a labeled dataset, careful model selection, and tuning of hyperparameters.
  • Real-Life Example: Fine-tuning a model for summarizing research papers (PDFs) and extracting key details from web-scraped content.
  • Implementation: Steps involved creating datasets, tokenizing them, fine-tuning the model, and evaluating it.

7. Homework/Practice

  1. Fine-tune the model you created in the previous lesson to summarize a new set of PDFs.
  2. Use web-scraped content and fine-tune the model for extracting key details (e.g., title, author, summary) from articles.
  3. Experiment with different learning rates and batch sizes to see how they affect model performance.

These lecture notes provide a step-by-step introduction to fine-tuning models for custom tasks like PDF summarization and web scraping, offering practical examples with Python and Hugging Face code.

Hour 7 - Creating Custom Models for PDF and Web Scraping

Lecture Notes: 


1. Concepts

Custom Models in Ollama

  • Custom Models: Tailored versions of base models created to handle specific tasks like answering questions from PDFs or summarizing web pages.
  • Ollama allows users to create models by defining custom system prompts and incorporating specific templates.

PDF and Web Scraping with AI

  • PDF Parsing: Extracting meaningful information (e.g., text, metadata) from PDF documents.
  • Web Scraping: Collecting data from websites for insights or analysis.
  • Both tasks require processing structured and unstructured text data, making them ideal for custom AI models.

2. Key Aspects

  1. Key Components of a Custom Model for PDF and Web Scraping:

    • Input Source: The source data (PDFs or web pages).
    • Preprocessing: Cleaning and structuring the data for AI consumption.
    • Model Behavior: Tailored system prompts to guide output generation.
  2. Why Custom Models for PDF and Web Scraping?

    • Automate repetitive tasks like extracting summaries or key points.
    • Handle domain-specific data with fine-tuned responses.
    • Increase efficiency in research, data collection, and reporting.
  3. Challenges:

    • Handling large or complex PDFs.
    • Avoiding CAPTCHA and legal concerns during web scraping.
    • Processing noisy or unstructured data effectively.

3. Implementation

CLI Commands for Custom Models:

  • ollama run: Run a custom model to process extracted text. Example: ollama run pdf_reader "Summarize ..."
  • ollama create: Create a new model from a Modelfile with a system prompt and template. Example: ollama create pdf_reader -f ./modelfile
  • ollama pull: Pull a base model as a starting point. Example: ollama pull llama
  • ollama show: Display the details of the custom model. Example: ollama show pdf_reader

4. Real-Life Example

Scenario: Extracting Key Points from Research PDFs

  • Objective: Build a model to summarize PDFs containing scientific research papers.
  • Use Case: A researcher needs concise summaries to save time.

5. Code Examples

Step 1: Preprocess PDFs

Use Python to extract text from PDFs. Libraries like PyPDF2 or pdfplumber are commonly used.

import pdfplumber

def extract_text_from_pdf(pdf_path):
    with pdfplumber.open(pdf_path) as pdf:
        text = ""
        for page in pdf.pages:
            text += page.extract_text() or ""  # extract_text() can return None for pages without a text layer
    return text

# Example usage
pdf_text = extract_text_from_pdf("example_research.pdf")
print(pdf_text[:500])  # Print the first 500 characters

Step 2: Create a Custom Model

Define a modelfile with behavior tailored for summarizing research.

Modelfile (modelfile.txt):

FROM llama
SYSTEM """
You are a research assistant. Summarize the content of research papers in a concise and clear manner. Include key points and findings.
"""

Create the custom model with Ollama CLI:

# Create the custom model
ollama create pdf_reader -f ./modelfile.txt

Step 3: Run the Custom Model

Pass the extracted text from the PDF to the model.

# Run the custom model (the prompt is passed as a positional argument)
ollama run pdf_reader "Summarize the following: [Insert extracted text here]"
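Rather than pasting the extracted text into the terminal by hand, the same command can be driven from Python; the sketch below assumes the pdf_reader model has been created (Step 2) and that pdf_text holds the text extracted in Step 1.

# Sketch: call the Ollama CLI from Python; assumes the pdf_reader model exists (Step 2)
# and pdf_text holds the text extracted in Step 1.
import subprocess

prompt = f"Summarize the following: {pdf_text}"
result = subprocess.run(
    ["ollama", "run", "pdf_reader", prompt],  # prompt passed as a positional argument
    capture_output=True,
    text=True,
)
print(result.stdout)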

Step 4: Web Scraping for Data

Use Python with libraries like BeautifulSoup to scrape data from web pages.

from bs4 import BeautifulSoup
import requests

def scrape_web_page(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")
    return soup.get_text()

# Example usage
web_text = scrape_web_page("https://example.com/research-article")
print(web_text[:500])  # Print the first 500 characters

Step 5: Integrate Web Data into the Model

Run the scraped content through the custom model.

# Run the custom model with web-scraped content (prompt passed as a positional argument)
ollama run pdf_reader "Summarize the following: [Insert scraped text here]"

6. Example Outputs

PDF Summary:

"This research explores the impact of climate change on agriculture. Key findings include a 20% decrease in crop yield due to rising temperatures and droughts. Adaptive measures, such as genetic modification, show potential to mitigate these effects."

Web Scraping Summary:

"The article discusses the latest advancements in AI, focusing on generative models and their applications in healthcare and education."


7. Summary

  • Concepts Covered: Custom models, PDF parsing, and web scraping.
  • Key Aspects: Preprocessing, model creation, and data integration.
  • Implementation: Preprocessing PDFs and web data, creating a model, and running it for summaries.
  • Real-Life Example: Summarizing research papers and web content.

8. Homework/Practice

  1. Extract text from a PDF of your choice and pass it through a custom Ollama model.
  2. Scrape a webpage and summarize its content using the model.
  3. Experiment with different system prompts to customize model behavior.
  4. Compare the summaries generated by a base model and your custom model.

These lecture notes provide a comprehensive understanding of creating custom models for PDF and web scraping tasks, with practical examples and code samples to enhance learning.

Hour 6 - Overview of Models

Lecture Notes: 


1. Concepts

What is a Model in Machine Learning?

  • A model is a computational representation of a process used to make predictions or generate insights based on input data.
  • In the context of language models, such as those used in Ollama, a model generates text or embeddings based on input prompts.

Key Components of a Model:

  1. Architecture: Defines the structure of the model (e.g., transformers like GPT, BERT).
  2. Parameters: Determines the model's capacity to learn from data (e.g., number of neurons, layers).
  3. Weights: Encoded knowledge the model learns during training.
  4. System Prompt: A predefined instruction guiding the model’s behavior.
  5. Fine-Tuning: Adjusting the model for a specific task by retraining on a domain-specific dataset.

2. Key Aspects

Types of Models in Ollama:

  1. Pre-trained Models: Models trained on large datasets for general-purpose tasks.
    • Example: GPT, Llama, Mistral.
  2. Fine-Tuned Models: Pre-trained models further trained on specific datasets for specialized tasks.
    • Example: A chatbot fine-tuned for customer service.
  3. Custom Models: Created by users with specific system prompts, templates, or datasets.

Why Use Models in Ollama?

  • Enable text generation, summarization, translation, embedding generation, and more.
  • Provide flexibility through CLI for creating, modifying, and deploying models.

3. Implementation

CLI Commands for Working with Models

  • ollama run: Run a model to generate text from a prompt. Example: ollama run llama "Hello!"
  • ollama create: Create a custom model from a base model. Example: ollama create mymodel -f ./modelfile
  • ollama pull: Download a model from Ollama’s repository. Example: ollama pull llama
  • ollama show: Display details of a model. Example: ollama show llama
  • ollama ls: List all available models locally. Example: ollama ls
  • ollama rm: Remove a model. Example: ollama rm llama
  • ollama cp: Copy a model (e.g., for renaming). Example: ollama cp llama llama_custom

4. Real-Life Example

Scenario: Customizing a Model for Technical FAQs

Imagine creating a custom model to answer FAQs for a software company.

  • Base Model: Llama 2.
  • Customization: Add a system prompt that aligns the model with the company’s tone and expertise.

5. Code Examples

Step 1: Pull the Base Model

Download a base model to use as the foundation for your customization.

# Pull a model from Ollama's repository
ollama pull llama

Step 2: Create a Custom Model

Define the behavior of your custom model in a modelfile.

Modelfile (modelfile.txt):

FROM llama
SYSTEM """
You are a technical support assistant for Acme Software.
Provide concise and accurate answers to customer queries.
"""

Use the ollama create command to generate the model.

# Create a custom model
ollama create acme_support -f ./modelfile.txt

Step 3: Run the Custom Model

Test your custom model by running it with a prompt.

# Run the custom model (prompt passed as a positional argument)
ollama run acme_support "What are the system requirements for Acme Pro 3.0?"

Step 4: Show Model Details

Inspect the properties of the newly created model.

# Display model details
ollama show acme_support

6. Example Output

Input Prompt:

"What are the system requirements for Acme Pro 3.0?"

Model Response:

"Acme Pro 3.0 requires Windows 10 or later, 8GB RAM, and 20GB of free disk space. For macOS, it supports version 11.0 or newer."


7. Summary

  • Concepts Covered: Overview of models, their types, and customization in Ollama.
  • Key Aspects: Pre-trained, fine-tuned, and custom models, along with the CLI commands.
  • Implementation: Demonstrated creating, running, and inspecting models.
  • Real-Life Example: Built a custom technical support model.

8. Homework/Practice

  1. Pull a base model and test its default behavior.
  2. Create a custom model with a system prompt of your choice.
  3. Run your custom model to answer specific questions.
  4. Experiment with fine-tuning a model by modifying the modelfile or adding new training data.

This lecture provides a solid foundation in understanding and working with models in Ollama, emphasizing practical usage with CLI commands and real-life applications.

Hour 5 - Working with Vector Databases

Lecture Notes: 


1. Concepts

What is a Vector Database?

  • A vector database is designed to store, manage, and query high-dimensional vector embeddings efficiently.
  • It enables similarity search and nearest neighbor queries, critical for working with embeddings generated by models like Ollama.

Key Features of Vector Databases:

  1. High-Dimensional Indexing: Stores embeddings (vectors) and enables fast searches.
  2. Similarity Search: Finds vectors closest to a given query vector using distance metrics like cosine similarity or Euclidean distance.
  3. Scalability: Handles large-scale data efficiently.
  4. Integration: Can work with other data structures (e.g., JSON metadata) for richer querying.

2. Key Aspects

Why Use a Vector Database?

  • Efficiency: Optimized for querying large-scale vector data.
  • Accuracy: Provides precise similarity results using advanced indexing algorithms like HNSW (Hierarchical Navigable Small World).
  • Real-World Use Cases: Image retrieval, semantic search, recommendation systems, chatbots.

Common Vector Databases:

  • Pinecone
  • Weaviate
  • Qdrant
  • Milvus

Querying Techniques:

  1. K-Nearest Neighbors (KNN): Finds the top K vectors closest to the query vector (a small from-scratch sketch follows this list).
  2. Hybrid Search: Combines vector similarity with traditional keyword-based searches.
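To make the KNN idea concrete, here is a small from-scratch sketch using cosine similarity over a handful of illustrative vectors; a real vector database performs the same search at scale with optimized indexes such as HNSW.

# From-scratch KNN sketch over illustrative vectors; a vector database does this at scale
# with optimized indexes (e.g., HNSW).
import numpy as np

def knn(query_vec, stored_vecs, k=2):
    stored = np.asarray(stored_vecs, dtype=float)
    q = np.asarray(query_vec, dtype=float)
    sims = stored @ q / (np.linalg.norm(stored, axis=1) * np.linalg.norm(q))  # cosine similarity
    return np.argsort(-sims)[:k]  # indices of the k most similar stored vectors

print(knn([0.1, 0.9], [[0.2, 0.8], [0.9, 0.1], [0.0, 1.0]], k=2))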

3. Implementation

Setting Up a Vector Database

  1. Install and configure the database. Most vector databases provide cloud-hosted and local setups.
  2. Store embeddings: Use embeddings generated by Ollama (e.g., via its embeddings API).
  3. Query embeddings: Perform similarity searches to find relevant results.

4. Ollama Commands and APIs for Working with Vector Databases

  • Embedding generation: At the time of writing, Ollama exposes embeddings through its REST API (POST /api/embeddings with a model name and prompt) and its client libraries rather than a dedicated CLI subcommand.
  • ollama run: Use a model as part of a query pipeline. Example: ollama run mymodel --format json

5. Real-Life Example

Scenario: Build a Semantic Search Engine with a Vector Database

Suppose we want to search through a collection of customer reviews to find those most relevant to a user query. The embeddings of the reviews will be stored in a vector database and queried for similarity.


6. Code Examples

Step 1: Install Qdrant Vector Database

Qdrant is an easy-to-use vector database with local and cloud options.

# Install Qdrant locally via Docker
docker pull qdrant/qdrant
docker run -p 6333:6333 qdrant/qdrant

Step 2: Store Embeddings in Qdrant

import json

import qdrant_client
from qdrant_client.models import PointStruct, VectorParams, Distance

# Initialize Qdrant client
client = qdrant_client.QdrantClient(url="http://localhost:6333")

# Create a collection for embeddings (size must match the dimensionality of your embeddings)
client.recreate_collection(
    collection_name="customer_reviews",
    vectors_config=VectorParams(size=512, distance=Distance.COSINE),
)

# Load embedding generated by Ollama
with open("review1_embedding.json", "r") as file:
    review1 = json.load(file)

with open("review2_embedding.json", "r") as file:
    review2 = json.load(file)

# Insert embeddings into the database
points = [
    PointStruct(id=1, vector=review1["embedding"], payload={"text": review1["text"]}),
    PointStruct(id=2, vector=review2["embedding"], payload={"text": review2["text"]}),
]

client.upsert(collection_name="customer_reviews", points=points)

Step 3: Query the Vector Database

# Simulate a user query
query = "What do customers say about product quality?"

# Generate query embedding (replace with actual embedding generated by Ollama)
query_embedding = [0.12, 0.34, ...]  # Placeholder example

# Perform similarity search
results = client.search(
    collection_name="customer_reviews",
    query_vector=query_embedding,
    limit=2  # Retrieve top 2 matches
)

# Display results
for result in results:
    print(f"Score: {result.score}")
    print(f"Review: {result.payload['text']}")

7. Summary

  • Concepts Covered: What vector databases are, why they're useful, and how they enable efficient similarity searches.
  • Key Aspects: High-dimensional indexing, similarity measures, and practical use cases.
  • CLI Commands: Use ollama embed to generate embeddings for storing in the database.
  • Real-Life Example: Semantic search through customer reviews using Qdrant.
  • Code Examples: Storing and querying embeddings in Qdrant.

8. Homework/Practice

  1. Install a vector database of your choice (e.g., Qdrant, Milvus).
  2. Generate embeddings for five text samples using ollama embed.
  3. Store these embeddings in the database.
  4. Implement a Python script to perform similarity searches and rank the results.
  5. Experiment with different distance metrics (e.g., Euclidean vs. Cosine).

This lecture introduces students to working with vector databases and includes a practical example with Qdrant, a widely-used vector database.

Hour 4 - Introduction to Embeddings

Lecture Notes: 

These lecture notes include a code sample for generating and using embeddings with Ollama.

1. Concepts

What are Embeddings?

  • Definition: Embeddings are numerical representations of text, words, or concepts in a vector space. These vectors capture semantic meaning, allowing models to understand relationships between words or phrases.
  • Key Idea: Words or sentences with similar meanings are mapped to vectors that are close together in the vector space.

How Embeddings Work:

  • Transform textual data into fixed-size dense vectors.
  • Represent semantic similarity (e.g., "king" and "queen" will have similar embeddings).
  • Provide a foundation for tasks like search, clustering, and recommendation systems.

2. Key Aspects

Properties of Embeddings:

  1. Dimensionality: Number of values in the vector (e.g., 512, 768).
  2. Contextual vs. Static:
    • Static Embeddings: Fixed embeddings for words (e.g., Word2Vec, GloVe).
    • Contextual Embeddings: Represent words based on their context (e.g., BERT, GPT).
  3. Similarity Measures: Cosine similarity is commonly used to compare embeddings.
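
For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths; a value close to 1 means the vectors point in nearly the same direction. A minimal sketch:

import numpy as np

def cosine_sim(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_sim([1, 0], [0.9, 0.1]))  # close to 1.0, i.e. very similar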

Applications of Embeddings:

  • Search Engines: Find documents or information using semantic similarity.
  • Recommendation Systems: Recommend items based on user preferences.
  • Clustering and Classification: Group similar data points together.

3. Implementation

Step-by-Step: Using Embeddings in Ollama

  1. Generate Embeddings:

    • Use the Ollama CLI to create embeddings for text or documents.
  2. Store Embeddings:

    • Save the embeddings in a JSON file or a vector database.
  3. Perform Similarity Search:

    • Compare embeddings to find semantically similar items.

4. CLI Commands for Embeddings

  • ollama embed: Generates embeddings for a given text or document. Example: ollama embed "The quick brown fox"
  • ollama embed --format: Outputs embeddings in JSON format for easier integration with databases. Example: ollama embed "AI is amazing" --format json

5. Real-Life Example

Scenario: Building a Semantic Search Engine

Suppose you want to search a set of documents based on meaning rather than exact keyword matches. Use embeddings to find documents most relevant to a user's query.


6. Code Examples

Generating and Storing Embeddings with Ollama CLI

# Generate embeddings for a document
ollama embed "Artificial Intelligence is fascinating." --format json > ai_embedding.json

# Generate embeddings for another text
ollama embed "Machine learning is a subset of AI." --format json > ml_embedding.json

# Inspect the JSON output
cat ai_embedding.json

Sample output in ai_embedding.json:

{
  "text": "Artificial Intelligence is fascinating.",
  "embedding": [0.123, -0.456, 0.789, ...]
}

Implementing Similarity Search with Ollama and Python

import json
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Load embeddings generated by Ollama
with open("ai_embedding.json", "r") as file:
    ai_data = json.load(file)

with open("ml_embedding.json", "r") as file:
    ml_data = json.load(file)

# Extract embeddings
ai_embedding = np.array(ai_data["embedding"])
ml_embedding = np.array(ml_data["embedding"])

# Simulate a user query and generate its embedding (use Ollama CLI in practice)
query = "Tell me about AI and its applications."
query_embedding = np.random.rand(len(ai_embedding))  # Replace with actual embedding

# Compute cosine similarity
similarities = cosine_similarity([query_embedding], [ai_embedding, ml_embedding])
ranked_indices = similarities.argsort()[0][::-1]

# Map indices to documents
documents = [
    ai_data["text"],
    ml_data["text"]
]

# Print results
print("Query:", query)
print("Top matches:")
for idx in ranked_indices:
    print(f"- {documents[idx]} (Score: {similarities[0][idx]:.4f})")

7. Summary

  • Concepts Covered: Definition and significance of embeddings, their properties, and applications.
  • Key Aspects: Dimensionality, contextual vs. static embeddings, and similarity measures.
  • CLI Commands: Generating and using embeddings with ollama embed.
  • Real-Life Example: Semantic search for finding relevant documents.
  • Code Examples: Generating embeddings using Ollama CLI and performing similarity search.

8. Homework/Practice

  1. Use ollama embed to generate embeddings for five text samples.
  2. Save the embeddings in JSON files.
  3. Write a Python script to load these embeddings and implement a semantic search engine.
  4. Experiment with additional similarity measures (e.g., Euclidean distance).

This extended lecture note now includes a practical demonstration of generating embeddings using the Ollama CLI and processing them programmatically for real-world applications.

Hour 3 - Understanding Chunks

Lecture Notes: 


1. Concepts

In Ollama, chunks refer to segments of data or text that are processed by the model for training or inference. Breaking large inputs into manageable chunks ensures efficient computation and prevents memory overflow.

Why Chunks are Important:

  • Efficiency: Allows processing of large datasets by dividing them into smaller parts.
  • Accuracy: Helps the model maintain context within its processing limits.
  • Compatibility: Ensures inputs fit within the model's context window.

Chunking Strategies:

  1. Token-based Chunking: Divides text based on the number of tokens.
  2. Sentence-based Chunking: Divides text at sentence boundaries for better coherence.
  3. Custom Chunking: Tailored to specific tasks like splitting code blocks or paragraphs.

2. Key Aspects

Key Components of Chunking:

  • Token Limit: Each model has a context window, e.g., 2048 or 4096 tokens. Inputs exceeding this must be chunked.
  • Overlap: Adding overlapping text between chunks maintains context.
  • Chunk Size: Balance between efficiency and coherence; usually a few hundred tokens.

3. Implementation

Step-by-Step: Chunking Text for Ollama

  1. Choose a Chunking Strategy:
    Decide between token-based, sentence-based, or custom chunking based on the task.

  2. Set the Context Window:
    Identify the model's token limit (use ollama show model_name).

  3. Implement Chunking:
    Use a script to divide the text into chunks within the token limit.

  4. Run Chunks through Ollama:
    Process each chunk sequentially and combine the outputs.


4. CLI Commands for Working with Chunks

  • ollama show: Displays model details, including the context window size. Example: ollama show llama3.1
  • ollama run: Runs a model on a single chunk or input. Example: ollama run llama3.1 --prompt "Hello"
  • ollama run --format: Outputs results in JSON format, useful for processing chunked outputs. Example: ollama run llama3.1 --format json
  • ollama create: Creates a model optimized for specific chunk sizes or use cases. Example: ollama create chunk_model -f ./modelfile

5. Real-Life Example

Scenario: Processing a Large Document for Summarization

Suppose you have a large article that exceeds the context window of the llama3.1 model. You can split the text into chunks, process each chunk, and combine the summaries.


6. Code Examples

Token-Based Chunking in Python

from transformers import GPT2Tokenizer

# Initialize tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Define text and token limit
text = "Large text input that needs to be chunked..." * 100
token_limit = 2048

# Split into chunks
def chunk_text(text, token_limit):
    tokens = tokenizer.encode(text)
    chunks = [tokens[i:i+token_limit] for i in range(0, len(tokens), token_limit)]
    return [tokenizer.decode(chunk) for chunk in chunks]

chunks = chunk_text(text, token_limit)

# Print chunk sizes
for i, chunk in enumerate(chunks):
    print(f"Chunk {i+1}: {len(tokenizer.encode(chunk))} tokens")

Processing Chunks with Ollama

# Save chunks to a file
echo "Chunk 1 text" > chunk1.txt
echo "Chunk 2 text" > chunk2.txt

# Process each chunk
ollama run llama3.1 --prompt "$(cat chunk1.txt)"
ollama run llama3.1 --prompt "$(cat chunk2.txt)"

Combining Outputs

outputs = ["Summary of chunk 1", "Summary of chunk 2"]
combined_summary = " ".join(outputs)
print("Combined Summary:", combined_summary)

7. Summary

  • Concepts Covered: Importance of chunks, chunking strategies, and context windows.
  • Key Aspects: Token limits, overlap, and chunk size considerations.
  • CLI Commands: Commands for inspecting models and processing chunks.
  • Real-Life Example: Summarizing large documents by chunking.
  • Code Examples: Implementing chunking and processing in Python and Bash.

8. Homework/Practice

  1. Use ollama show to check the context window of a model on your system.
  2. Implement a chunking script in Python or another language.
  3. Process a large document by dividing it into chunks and running each through Ollama.
  4. Experiment with different chunk sizes and overlaps to observe their effects on the output.

These lecture notes provide a hands-on understanding of chunking in Ollama with practical examples and real-world scenarios.

Hour 2 - Exploring File Structure of Ollama

Lecture Notes: 


1. Concepts

Understanding the file structure of Ollama is critical for efficient usage and customization of models.

  • Why File Structure Matters:
    • Helps manage models, configurations, logs, and cache efficiently.
    • Simplifies debugging and model fine-tuning.

Key Components of Ollama's File System:

  1. Model Files: Contain model weights, templates, and prompts.
  2. Configuration Files: Store settings and environment variables.
  3. Logs: Track operations and errors for debugging.
  4. Cache: Stores downloaded models and processed data for quick access.

2. Key Aspects

Key Directories and Their Purpose:

  • ~/ollama/models/:
    Stores all downloaded and custom-created models.
    Example: Models like llama3.1, swede, etc.

  • ~/ollama/config/:
    Contains configuration files for environment variables and settings.
    Example: config.yaml for default settings.

  • ~/ollama/logs/:
    Tracks logs for operations performed using Ollama commands.

  • ~/ollama/cache/:
    Temporarily stores data to enhance model loading and generation speed.



3. Implementation

Step-by-Step: Navigating Ollama's File Structure

  1. Locate the File Structure:
    By default, Ollama's files are stored in the user's home directory.

    cd ~/ollama
    
  2. Explore Models Directory:
    List all models stored on your system.

    ls ~/ollama/models
    
  3. Inspect a Model File:
    Open a specific model's configuration to see the prompt and settings.

    cat ~/ollama/models/llama3.1/modelfile
    
  4. Access Logs:
    Review the logs to debug any issues.

    tail -n 20 ~/ollama/logs/ollama.log
    

4. CLI Commands for File Structure Exploration

  • ollama list: Lists all downloaded models. Example: ollama list
  • ollama show: Displays details about a specific model. Example: ollama show llama3.1
  • ollama cp: Copies a model to create a new reference. Example: ollama cp llama3.1 custom_model
  • ollama rm: Removes a model from the system. Example: ollama rm custom_model
  • ollama pull: Downloads a model to the models directory. Example: ollama pull llama3.1
  • ollama serve: Runs Ollama manually (affects logging and cache). Example: ollama serve

   

C:\Users\AURMC>ollama list 

NAME              ID           SIZE      MODIFIED
llama3.1:latest    46e0c10c039e    4.9 GB    10 minutes ago
mistral:latest       f974a74358d6    4.1 GB    40 hours ago

C:\Users\AURMC>ollama show llama3.1

  Model
    architecture        llama
    parameters          8.0B
    context length      131072
    embedding length    4096
    quantization        Q4_K_M
  Parameters
    stop    "<|start_header_id|>"
    stop    "<|end_header_id|>"
    stop    "<|eot_id|>"
  License
    LLAMA 3.1 COMMUNITY LICENSE AGREEMENT
    Llama 3.1 Version Release Date: July 23, 2024 

Hour 1 - Introduction & Installation with CLI Commands

Lecture Notes: 


1. Concepts

What is Ollama?

Ollama is a Command-Line Interface (CLI) tool designed for efficiently working with Large Language Models (LLMs). It provides a way to run, manage, and interact with AI models directly from your terminal.

Key Features of Ollama:

  • Simplifies access to pre-trained LLMs.
  • Allows model customization through modelfiles.
  • Supports integration with other tools like vector databases and APIs.

2. Key Aspects

  • Ease of Use: Simple commands like ollama run allow quick interaction with LLMs.
  • Cross-Platform Compatibility: Works on Windows, Mac, and Linux.
  • Automation Ready: Can be used in scripts and pipelines (see the sketch after this list).
  • Customizability: You can create, fine-tune, and manage models.
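
A minimal scripting sketch (illustrative, not from the official documentation): the prompt can be passed to ollama run as a positional argument and the output redirected like any other shell command, which is what makes Ollama easy to use in scripts and pipelines. This assumes the llama3.1 model has already been pulled.

# Generate three writing prompts and save each to its own file
for i in 1 2 3; do
  ollama run llama3.1 "Give me one creative writing prompt about space exploration." > prompt_$i.txt
done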

3. Implementation

Steps for Installing Ollama

  1. System Requirements:

    • macOS, Linux, or Windows.
    • Basic terminal knowledge.
  2. Installation Command:
    Run the following command in your terminal:

    curl -sSL https://install.ollama.com | sh
    
  3. Verify Installation:
    Check if Ollama was successfully installed by running:

    ollama --version
    
  4. Initial Setup:

    • Ollama installs as a service to manage background tasks.
    • Default models are downloaded during the first run.

4. CLI Commands Overview

Below is a list of key Ollama CLI commands for managing and using models:

  • ollama run: Runs a model for generating text. Example: ollama run model_name
  • ollama create: Creates a new model. Example: ollama create model_name -f modelfile
  • ollama pull: Downloads a model from the Ollama repository. Example: ollama pull model
  • ollama push: Uploads a model to the Ollama repository. Example: ollama push username/model
  • ollama show: Displays details about a model. Example: ollama show model_name
  • ollama list: Lists all downloaded models. Example: ollama list or ollama ls
  • ollama cp: Copies a model. Example: ollama cp source_model target_model
  • ollama rm: Removes a model. Example: ollama rm model_name
  • ollama serve: Starts the Ollama server manually. Useful for debugging.
  • ollama --help: Displays help for all commands.

5. Real-Life Example

Scenario: Setting Up Ollama for Content Generation

Suppose a student wants to use Ollama to generate ideas for a creative writing project. After installing Ollama, they can quickly interact with models like Llama to generate prompts.

Steps:

  1. Install Ollama:

    curl -sSL https://install.ollama.com | sh
    
  2. Check available commands:

    ollama --help
    
  3. Run the Llama model:

    ollama run llama3.1 --prompt "Suggest a story idea about AI and humans working together."
    

Expected Output:
A creative idea generated by the Llama model, such as:
"In the near future, humans and AI collaborate to build a colony on Mars. Tensions rise when the AI develops emotions."


6. Code Example

Verifying Installation

# Install Ollama CLI
curl -sSL https://install.ollama.com | sh

# Check if the installation is successful
ollama --version

# Display help menu with all commands
ollama --help

Running a Model

# Run a basic prompt using the Llama model
ollama run llama3.1 --prompt "Write a short poem about the stars."

# Output Example:
# "The stars above, a shimmering delight,
# Guiding sailors through the night,
# Each one a story, a cosmic song,
# Together, they shine, forever strong."

Listing All Models

ollama list

Viewing Model Details

ollama show llama3.1

Creating a New Model

# Create a new model using a modelfile
ollama create my_model -f ./modelfile
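
For reference, a minimal ./modelfile might contain only a base model and a system prompt (a sketch; it assumes llama3.1 has already been pulled):

FROM llama3.1
SYSTEM """You are a concise assistant that answers in short bullet points."""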

7. Summary

  • Concepts Covered: What Ollama is and its features.
  • Key Aspects: Simplicity, platform compatibility, and automation support.
  • Implementation: Installation and verification steps.
  • CLI Commands: Overview of commands like run, create, pull, push, and more.
  • Real-Life Example: Using Ollama for creative writing.
  • Code Examples: Commands to install, verify, and interact with models.

Homework/Practice

  1. Install Ollama on your own machine.
  2. Use the ollama --help command to explore all options.
  3. Run the ollama list command and identify which models are installed.
  4. Experiment with running the llama3.1 model and creating your own prompts.

This enhanced lecture note ensures students learn all essential commands and are well-prepared to begin working with Ollama.

Lecture Schedule

Let us Begin with Schedule

Here’s a 12-hour lesson schedule for the "Ollama LLM Basics" course, designed specifically for undergraduate students. Each hour includes a topic, objective, and practical code examples to ensure interactive learning.

What is Ollama?

  • Ollama stands for Omni-Layer Learning Language Acquisition Model, a novel approach to machine learning that promises to redefine how we perceive language acquisition and natural language processing.


Lesson Plan: Ollama LLM Basics in 12 Hours


Hour 1: Introduction & Installation

  • Objective: Understand what Ollama is, its purpose, and how to install it.
  • Topics:
    • What is Ollama?
    • Installing Ollama on Windows/Mac/Linux.
      • Download the installer from https://ollama.com/ (the download may take some time).
      • Run the Ollama setup .exe file; installation also takes a while, so be patient.
    • Verifying installation.
  • Code Example:
    # Install Ollama CLI
    curl -sSL https://install.ollama.com | sh
    
    # Verify installation
    ollama --version
    
  • My system's responses are shown below:
C:\Users\AURMC>ollama --version
ollama version is 0.5.7

Hour 2: Exploring File Structure

  • Objective: Learn the organization of files and directories used by Ollama.
  • Topics:
    • Default installation paths.
    • Config files and logs.
    • Modelfile structure.
  • Code Example:
    # View installed models
    ollama list
    
  • C:\Users\AURMC>ollama list
    NAME               ID              SIZE      MODIFIED
    llama3.1:latest    46e0c10c039e    4.9 GB    21 hours ago
    mistral:latest     f974a74358d6    4.1 GB    39 hours ago
  • 
    # Sample modelfile
    FROM llama3.1
    SYSTEM """You are a helpful assistant."""
    

Hour 3: Understanding Chunks

  • Objective: Learn how large text data is broken into manageable pieces (chunks).
  • Topics:
    • What are chunks?
    • Chunking strategies for text processing.
  • Code Example:
    # Python example for chunking text
    text = "This is a long text that needs to be chunked for processing."
    chunk_size = 10
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    print(chunks)      
  • Response:
    ['This is a ', 'long text ', 'that needs', ' to be chu', 'nked for p', 'rocessing.']



Hour 4: Introduction to Embeddings

  • Objective: Understand embeddings and their role in text representation.
  • Topics:
    • What are embeddings?
    • Generating embeddings with Ollama.
  • Code Example:
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer('all-MiniLM-L6-v2')
    sentences = ["Hello, world!", "Ollama LLM is amazing!"]
    embeddings = model.encode(sentences)
    print("-----------------EMBEDDINGS--------------------------")
    print(embeddings)
  • Result
    
    -----------------EMBEDDINGS--------------------------
    [[-3.81771401e-02  3.29110846e-02 -5.45941014e-03 ...  3.18714194e-02  1.81631427e-02]
     [-1.34197408e-02 -2.53689438e-02 -2.26690806e-02 ...  2.91423942e-03  1.99287571e-02]]
    (one 384-dimensional embedding per input sentence; the full output is truncated here for readability)


Hour 5: Working with Vector Databases

  • Objective: Learn to store and query embeddings using vector databases.
  • Topics:
    • Introduction to vector databases (e.g., Pinecone, Weaviate).
    • Storing embeddings and querying.

  • Code Example:
    • Install the client: pip install pinecone
    • Get an API key from the Pinecone console.
  • # Example with Pinecone (current Python SDK; assumes an index named "example-index" already exists)
    from pinecone import Pinecone

    pc = Pinecone(api_key='your-api-key')
    index = pc.Index("example-index")
    index.upsert(vectors=[("id1", embeddings[0].tolist()), ("id2", embeddings[1].tolist())])
    results = index.query(vector=embeddings[0].tolist(), top_k=1)
    print(results)


Hour 6: Overview of Models

  • Objective: Understand the types of models supported by Ollama.
  • Topics:
    • Pre-trained vs fine-tuned models.
    • Supported architectures (Llama, Mistral, Gemma).
  • Code Example:
    # List available models
    ollama list
  • NAME               ID              SIZE      MODIFIED
    llama3.1:latest    46e0c10c039e    4.9 GB    21 hours ago
    mistral:latest     f974a74358d6    4.1 GB    40 hours ago
    # Run a specific model
    ollama run llama3.1
  • >>> What is AI?

  • AI, or Artificial Intelligence, refers to the development of computer systems that can perform tasks that typically require human intelligence, such as:

    1. **Learning**: AI systems can learn from data and improve their performance over time.
    2. **Problem-solving**: AI systems can solve complex problems, make decisions, and take actions based on those decisions.
    3. **Reasoning**: AI systems can reason about the world, draw conclusions, and apply knowledge to specific situations.
    4. **Perception**: AI systems can perceive their environment through sensors, such as cameras, microphones, or other devices.

    AI involves a range of techniques, including:

    1. **Machine learning** (ML): A subset of AI that enables computers to learn from data without being explicitly programmed.
    2. **Deep learning** (DL): A type of ML that uses neural networks with multiple layers to analyze and interpret complex data.
    3. **Natural language processing** (NLP): A field of study focused on enabling computers to understand, generate, and process human language.
    4. **Computer vision**: A field of study focused on enabling computers to interpret and understand visual information from images and videos.

    AI can be applied in many areas, including:

    1. **Robotics**: AI-powered robots that can perform tasks autonomously or with minimal human intervention.
    2. **Virtual assistants**: AI-powered virtual assistants, such as Siri, Alexa, or Google Assistant, that can perform tasks like answering questions, setting reminders, and controlling other devices.
    3. **Image recognition**: AI-powered systems that can recognize objects, people, and scenes in images and videos.
    4. **Predictive maintenance**: AI-powered systems that can predict when equipment is likely to fail or require maintenance.
    5. **Healthcare**: AI-powered systems that can analyze medical data, diagnose diseases, and develop personalized treatment plans.

    The potential benefits of AI include:

    1. **Increased efficiency**: AI can automate tasks, freeing up human time for more strategic and creative work.
    2. **Improved accuracy**: AI can perform tasks with greater accuracy than humans, reducing errors and improving outcomes.
    3. **Enhanced decision-making**: AI can analyze large amounts of data and provide insights that inform business or organizational decisions.

    However, AI also raises concerns about:

    1. **Job displacement**: AI may automate certain jobs, potentially displacing human workers.
    2. **Bias and fairness**: AI systems may perpetuate biases and unfairness if they are trained on biased data or designed with a particular worldview.
    3. **Security**: AI systems can be vulnerable to cyber threats, compromising sensitive information.

    Overall, AI has the potential to transform many aspects of our lives, but it also requires careful consideration of its benefits and limitations.

Hour 7: Creating Custom Models

  • Objective: Learn how to create a custom model with Ollama.
  • Topics:
    • Creating a modelfile.
    • Custom prompts and templates.
  • Code Example:
    # Create a custom model
    ollama create mymodel -f ./mymodel.modelfile
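
    A sketch of what ./mymodel.modelfile could contain (the base model and parameter value are illustrative):

    FROM llama3.1
    PARAMETER temperature 0.7
    SYSTEM """You are a helpful tutor who explains concepts with simple examples."""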
    

Hour 8: Introduction to Fine-Tuning

  • Objective: Understand the basics of fine-tuning models.
  • Topics:
    • When and why to fine-tune.
    • Steps in the fine-tuning process.
  • Code Example:
    # Fine-tune a model
    ollama fine-tune --model llama3.1 --data fine_tune_data.json
    

Hour 9: Metrics & Evaluation

  • Objective: Learn how to evaluate model performance.
  • Topics:
    • Common metrics (accuracy, loss, BLEU, etc.).
    • Using metrics to improve models.
  • Code Example:
    # Calculate BLEU score
    from nltk.translate.bleu_score import sentence_bleu
    
    reference = [['this', 'is', 'a', 'test']]
    candidate = ['this', 'is', 'a', 'test']
    score = sentence_bleu(reference, candidate)
    print("BLEU Score:", score)
    

Hour 10: Advanced Fine-Tuning Techniques

  • Objective: Dive deeper into advanced fine-tuning methods.
  • Topics:
    • Adjusting hyperparameters.
    • Using new datasets.
  • Code Example:
    # Example with quantization
    ollama fine-tune --model llama3.1 --data new_data.json --quantization q4_0
    

Hour 11: Practical Applications

  • Objective: Apply Ollama to real-world scenarios.
  • Topics:
    • Using Ollama for chatbots.
    • Generating summaries or translations.
  • Code Example:
    ollama run llama3.1 --prompt "Summarize this article: ..."
    

Hour 12: Wrap-Up and Q&A

  • Objective: Review all topics and address student questions.
  • Activities:
    • Quick recap of all lessons.
    • Hands-on exercise: Create and run a custom model with embeddings and vector storage.
  • Code Example:
    # Final task: Integrate everything
    ollama create finalmodel -f ./finalmodel.modelfile
    ollama run finalmodel --prompt "Explain quantum computing."
    

This schedule ensures a blend of theory and practice, making the course both engaging and effective for undergraduate students.

OpenWebUI - Beginner's Tutorial

  OpenWebUI Tutorial: Setting Up and Using Local Llama 3.2 with Ollama Introduction This tutorial provides a step-by-step guide to setting...