%% Cell type:markdown id: tags:
### **_Deep Learning - BSc Data Science for Responsible Business - Centrale Lyon_**
2024-2025
Emmanuel Dellandréa
%% Cell type:markdown id: tags:
# Practical Session 7 – Large Language Models
The objective of this tutorial is to learn to work with LLMs for sentence generation and classification. The pretrained models and tokenizers will be obtained from the [Hugging Face platform](https://huggingface.co/).
This notebook contains 8 parts:
1. Using a Hugging Face text generation model
2. Using the Hugging Face pipeline for text classification
3. Using a pipeline with a specific model and tokenizer from Hugging Face
4. Experimenting with models from Hugging Face
5. Training an LLM for sentence classification using the **Trainer** class
6. Fine-tuning an LLM with a custom head
7. Sharing a model on the Hugging Face platform
8. Further experiments
Before going further into the experiments, your work is to understand the provided code, which gives an overview of using LLMs with Hugging Face.
**This code is intentionally not commented. It is your responsibility to add all the necessary comments to ensure your proper understanding of the code.**
You might frequently rely on [Hugging Face’s documentation](https://huggingface.co/docs).
---
As the computation can be heavy, particularly during training, we encourage you to use a GPU. If your laptop is not equipped with one, you may use one of these remote Jupyter servers, where you can select execution on GPU:
1) [jupyter.mi90.ec-lyon.fr](https://jupyter.mi90.ec-lyon.fr/)
This server is accessible within the campus network. From outside, you need to use a VPN. Before executing the notebook, select the kernel "Python PyTorch" to run it on GPU and have access to the PyTorch module.
2) [Google Colaboratory](https://colab.research.google.com/)
Before executing the notebook, select execution on GPU: "Runtime" -> "Change runtime type" -> "T4 GPU".
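%% Cell type:markdown id: tags:
Once the kernel is selected, a quick sanity check (a minimal sketch, assuming PyTorch is already available in the environment) confirms that an accelerator is actually visible:
%% Cell type:code id: tags:
``` python
import torch

# CUDA covers the remote Jupyter servers and Colab GPUs; MPS covers Apple Silicon laptops.
print("CUDA available:", torch.cuda.is_available())
print("MPS available:", torch.backends.mps.is_available())
```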
%% Cell type:markdown id: tags:
### Installing required libraries
%% Cell type:code id: tags:
``` python
!pip install huggingface_hub
!pip install ipywidgets
!pip install transformers
!pip install datasets
!pip install accelerate
!pip install scikit-learn
```
%% Cell type:markdown id: tags:
### Then login to Hugging Face
%% Cell type:code id: tags:
``` python
from huggingface_hub import notebook_login
notebook_login()
```
%% Cell type:markdown id: tags:
### Part 1 - Using a Hugging Face text generation model
%% Cell type:code id: tags:
``` python
from transformers import AutoTokenizer, AutoModelForCausalLM
# model_name = "mistralai/Mistral-7B"
# model_name = "deepseek-ai/DeepSeek-R1"
# model_name = "meta-llama/Llama-3.2-3B-Instruct"
# model_name = "homebrewltd/AlphaMaze-v0.2-1.5B"
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
```
%% Cell type:code id: tags:
``` python
input_text = "Hello. Who are you?"
encoded_input = tokenizer(input_text, return_tensors="pt")
output = model.generate(
    input_ids=encoded_input["input_ids"],
    attention_mask=encoded_input["attention_mask"],
    max_length=100,
    do_sample=True,  # required for temperature to have an effect
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id  # GPT-2 has no pad token, so reuse EOS
)
```
%% Cell type:code id: tags:
``` python
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```
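%% Cell type:markdown id: tags:
To see how decoding choices shape the output, here is a small comparison sketch (an illustrative addition, reusing the same `model`, `tokenizer` and `encoded_input` as above): greedy decoding is deterministic, while sampling with a temperature produces a different continuation on each run.
%% Cell type:code id: tags:
``` python
# Greedy decoding: always picks the most probable next token.
greedy = model.generate(**encoded_input, max_new_tokens=40, do_sample=False,
                        pad_token_id=tokenizer.eos_token_id)
print("Greedy:", tokenizer.decode(greedy[0], skip_special_tokens=True))

# Sampling: a higher temperature flattens the distribution and increases diversity.
sampled = model.generate(**encoded_input, max_new_tokens=40, do_sample=True,
                         temperature=1.2, top_k=50,
                         pad_token_id=tokenizer.eos_token_id)
print("Sampled:", tokenizer.decode(sampled[0], skip_special_tokens=True))
```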
%% Cell type:markdown id: tags:
### Part 2 - Using the Hugging Face pipeline for text classification
%% Cell type:code id: tags:
``` python
from transformers import pipeline
classifier = pipeline("text-classification")
```
%% Cell type:code id: tags:
``` python
classifier("We are very happy to welcome you at Centrale Lyon.")
```
%% Cell type:code id: tags:
``` python
results = classifier(["We are very happy to welcome you at Centrale Lyon.", "We hope you don't hate it."])
for result in results:
    print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
```
%% Cell type:markdown id: tags:
### Part 3 - Using a pipeline with a specific model and tokenizer from Hugging Face
%% Cell type:code id: tags:
``` python
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
```
%% Cell type:code id: tags:
``` python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
%% Cell type:code id: tags:
``` python
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
classifier("We are very happy to present this incredible model to you.")
```
%% Cell type:markdown id: tags:
### Part 4 - Experimenting with models from Hugging Face
%% Cell type:code id: tags:
``` python
from transformers import AutoTokenizer
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
%% Cell type:code id: tags:
``` python
encoding = tokenizer("We are very happy to welcome you at Centrale Lyon.")
print(encoding)
```
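%% Cell type:markdown id: tags:
To make the encoding less opaque, this short sketch (an illustrative addition) maps the produced ids back to the subword tokens they stand for:
%% Cell type:code id: tags:
``` python
# Each input id corresponds to one subword token from the model's vocabulary.
print(tokenizer.convert_ids_to_tokens(encoding["input_ids"]))
# decode() reassembles the tokens into a string, including special tokens like [CLS] and [SEP].
print(tokenizer.decode(encoding["input_ids"]))
```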
%% Cell type:code id: tags:
``` python
batch = tokenizer(
    ["We are very happy to welcome you at Centrale Lyon.", "We hope you don't hate it."],
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt",
)
print(batch)
```
%% Cell type:code id: tags:
``` python
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained(model_name, torch_dtype="auto")
print(model)
```
%% Cell type:code id: tags:
``` python
outputs = model(**batch)
print(outputs)
```
%% Cell type:code id: tags:
``` python
from torch import nn
predictions = nn.functional.softmax(outputs.logits, dim=-1)
print(predictions)
```
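%% Cell type:markdown id: tags:
The softmax scores are easier to read once mapped to their label names, which the model ships in its configuration (a minimal sketch, an illustrative addition):
%% Cell type:code id: tags:
``` python
import torch

# id2label maps class indices to human-readable names (here, 1 to 5 stars).
best = torch.argmax(predictions, dim=-1)
for i, idx in enumerate(best):
    print(f"Sentence {i}: {model.config.id2label[idx.item()]} (p={predictions[i, idx].item():.3f})")
```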
%% Cell type:code id: tags:
``` python
save_directory = "./save_pretrained"
tokenizer.save_pretrained(save_directory)
model.save_pretrained(save_directory)
```
%% Cell type:code id: tags:
``` python
loaded_model = AutoModelForSequenceClassification.from_pretrained("./save_pretrained")
```
%% Cell type:markdown id: tags:
### Part 5 - Training an LLM for sentence classification using the **Trainer** class
%% Cell type:code id: tags:
``` python
from transformers import AutoModelForSequenceClassification
model_name = "distilbert/distilbert-base-uncased"
model = AutoModelForSequenceClassification.from_pretrained(model_name, torch_dtype="auto")
```
%% Cell type:code id: tags:
``` python
from transformers import TrainingArguments
training_args = TrainingArguments(
    output_dir="save_folder/",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=2,
)
```
%% Cell type:code id: tags:
``` python
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
%% Cell type:code id: tags:
``` python
from datasets import load_dataset
dataset = load_dataset("rotten_tomatoes")
```
%% Cell type:code id: tags:
``` python
def tokenize_dataset(dataset):
    return tokenizer(dataset["text"], truncation=True)
```
%% Cell type:code id: tags:
``` python
dataset = dataset.map(tokenize_dataset, batched=True)
```
%% Cell type:code id: tags:
``` python
from transformers import DataCollatorWithPadding
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
```
%% Cell type:code id: tags:
``` python
from transformers import Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    processing_class=tokenizer,
    data_collator=data_collator,
)
```
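%% Cell type:markdown id: tags:
By default the `Trainer` only reports the loss. If you also want accuracy on the evaluation set, you can pass a metrics function (a minimal sketch, an optional addition using scikit-learn, which is already installed; pass it as `compute_metrics=compute_metrics` when building the `Trainer`):
%% Cell type:code id: tags:
``` python
import numpy as np
from sklearn.metrics import accuracy_score

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair provided by the Trainer at evaluation time.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": accuracy_score(labels, predictions)}
```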
%% Cell type:code id: tags:
``` python
trainer.train()
```
%% Cell type:code id: tags:
``` python
save_directory = "./tomatoes_save_pretrained"
tokenizer.save_pretrained(save_directory)
model.save_pretrained(save_directory)
```
%% Cell type:code id: tags:
``` python
model = AutoModelForSequenceClassification.from_pretrained(save_directory)
tokenizer = AutoTokenizer.from_pretrained(save_directory)
```
%% Cell type:code id: tags:
``` python
from transformers import pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
```
%% Cell type:code id: tags:
``` python
t = dataset['test'][345]
print(t)
classifier(t['text'])
```
%% Cell type:markdown id: tags:
### Part 6 - Fine-tuning an LLM with a custom head
%% Cell type:code id: tags:
``` python
from datasets import load_dataset
from transformers import DistilBertTokenizer, DistilBertModel
import torch
from torch.utils.data import DataLoader
from torch.optim import AdamW
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
import numpy as np
```
%% Cell type:code id: tags:
``` python
dataset = load_dataset("imdb")
```
%% Cell type:code id: tags:
``` python
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
```
%% Cell type:code id: tags:
``` python
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=512)

tokenized_datasets = dataset.map(tokenize_function, batched=True)
tokenized_datasets = tokenized_datasets.remove_columns(["text"])
tokenized_datasets = tokenized_datasets.rename_column("label", "labels")
tokenized_datasets.set_format("torch")
train_dataset = tokenized_datasets["train"]
test_dataset = tokenized_datasets["test"]
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=8)
```
%% Cell type:code id: tags:
``` python
bert_model = DistilBertModel.from_pretrained("distilbert-base-uncased")
for param in bert_model.parameters():
    param.requires_grad = False
```
%% Cell type:code id: tags:
``` python
class CustomBERTModel(torch.nn.Module):
    def __init__(self, bert_model):
        super(CustomBERTModel, self).__init__()
        self.bert = bert_model
        self.custom_head = torch.nn.Sequential(
            torch.nn.Linear(self.bert.config.hidden_size, 128),
            torch.nn.ReLU(),
            torch.nn.Dropout(0.1),
            torch.nn.Linear(128, 2)
        )

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        outputs = self.custom_head(outputs.last_hidden_state[:, 0, :])  # Use [CLS] token output
        return outputs
```
%% Cell type:code id: tags:
``` python
bert_model = DistilBertModel.from_pretrained("distilbert-base-uncased")
for param in bert_model.parameters():
    param.requires_grad = False

model = CustomBERTModel(bert_model)
# Pick whichever accelerator is available (CUDA on the remote servers / Colab, MPS on Apple Silicon).
device = torch.device("cuda" if torch.cuda.is_available()
                      else "mps" if torch.backends.mps.is_available()
                      else "cpu")
model.to(device)
```
%% Cell type:code id: tags:
``` python
optimizer = AdamW(model.parameters(), lr=2e-5)
criterion = torch.nn.CrossEntropyLoss()
```
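%% Cell type:markdown id: tags:
Since the DistilBERT backbone is frozen, only the head's parameters actually receive gradients; passing just those to the optimizer is slightly leaner and makes the intent explicit (an optional variant, a minimal sketch):
%% Cell type:code id: tags:
``` python
# Equivalent in effect here, because frozen parameters get no gradient anyway.
optimizer = AdamW((p for p in model.parameters() if p.requires_grad), lr=2e-5)
```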
%% Cell type:code id: tags:
``` python
def train_epoch(model, data_loader, optimizer, criterion, device):
    model.train()
    total_loss = 0
    for batch in data_loader:
        optimizer.zero_grad()
        input_ids = batch["input_ids"].to(device)
        attention_mask = batch["attention_mask"].to(device)
        labels = batch["labels"].to(device)
        outputs = model(input_ids=input_ids, attention_mask=attention_mask)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(data_loader)
```
%% Cell type:code id: tags:
``` python
def evaluate(model, data_loader, criterion, device):
    model.eval()
    total_loss = 0
    all_predictions = []
    all_labels = []
    with torch.no_grad():
        for batch in data_loader:
            input_ids = batch["input_ids"].to(device)
            attention_mask = batch["attention_mask"].to(device)
            labels = batch["labels"].to(device)
            outputs = model(input_ids=input_ids, attention_mask=attention_mask)
            loss = criterion(outputs, labels)
            total_loss += loss.item()
            predictions = torch.argmax(outputs, dim=-1)
            all_predictions.extend(predictions.cpu().numpy())
            all_labels.extend(labels.cpu().numpy())
    accuracy = accuracy_score(all_labels, all_predictions)
    precision, recall, f1, _ = precision_recall_fscore_support(all_labels, all_predictions, average="binary")
    return total_loss / len(data_loader), accuracy, precision, recall, f1
```
%% Cell type:code id: tags:
``` python
num_epochs = 3
for epoch in range(num_epochs):
    print(f"Epoch {epoch + 1}/{num_epochs}")
    train_loss = train_epoch(model, train_loader, optimizer, criterion, device)
    print(f"Train Loss: {train_loss:.4f}")
    val_loss, val_accuracy, val_precision, val_recall, val_f1 = evaluate(model, test_loader, criterion, device)
    print(f"Validation Loss: {val_loss:.4f}")
    print(f"Accuracy: {val_accuracy:.4f}, Precision: {val_precision:.4f}, Recall: {val_recall:.4f}, F1 Score: {val_f1:.4f}")
    torch.save(model.state_dict(), f"custom_bert_epoch_{epoch + 1}.pth")

# (After 76 minutes of training)
# Epoch 1/3
# Train Loss: 0.6708
# Validation Loss: 0.6415
# Accuracy: 0.7917, Precision: 0.8218, Recall: 0.7450, F1 Score: 0.7815
# Epoch 2/3
# Train Loss: 0.6172
# Validation Loss: 0.5825
# Accuracy: 0.8051, Precision: 0.8142, Recall: 0.7907, F1 Score: 0.8023
# Epoch 3/3
# Train Loss: 0.5634
# Validation Loss: 0.5300
# Accuracy: 0.8098, Precision: 0.8339, Recall: 0.7738, F1 Score: 0.8027
```
%% Cell type:code id: tags:
``` python
model_save_path = "custom_bert_model.pth"
torch.save(model.state_dict(), model_save_path)
```
%% Cell type:code id: tags:
``` python
loaded_bert_model = DistilBertModel.from_pretrained("distilbert-base-uncased")
loaded_model = CustomBERTModel(loaded_bert_model)
loaded_model.load_state_dict(torch.load(model_save_path))
loaded_model.to(device)
```
%% Cell type:code id: tags:
``` python
batch = next(iter(test_loader))
ids = batch['input_ids'][0]
attention_mask = batch['attention_mask'][0]
label = batch['labels'][0]
ids = ids.to(device)
attention_mask = attention_mask.to(device)
text = tokenizer.decode(ids, skip_special_tokens=True)
print(text)
print(label)
loaded_model.eval()
with torch.no_grad():
    output = loaded_model(input_ids=ids.unsqueeze(0), attention_mask=attention_mask.unsqueeze(0))
output = output.squeeze(0)
print(output)
prediction = torch.argmax(output, dim=-1)
print(prediction)
print(label)
print(prediction.cpu() == label)
```
%% Cell type:markdown id: tags:
### Part 7 - Sharing a model on the Hugging Face platform
%% Cell type:code id: tags:
``` python
from transformers import DistilBertPreTrainedModel, DistilBertModel, DistilBertConfig
import torch.nn as nn

# A dedicated config class (with its own model_type) so that AutoConfig /
# AutoModel can later map a saved checkpoint back to the custom class.
class CustomDistilBertConfig(DistilBertConfig):
    model_type = "custom-distilbert"

class CustomDistilBERTModel(DistilBertPreTrainedModel):
    config_class = CustomDistilBertConfig

    def __init__(self, config, freeze_backbone=True):
        super().__init__(config)
        self.distilbert = DistilBertModel(config)
        self.classifier = nn.Sequential(
            nn.Linear(config.hidden_size, 128),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(128, config.num_labels)  # num_labels outputs (2 here: binary classification)
        )
        self.init_weights()
        # Freeze DistilBERT backbone if specified
        if freeze_backbone:
            for param in self.distilbert.parameters():
                param.requires_grad = False

    def forward(self, input_ids, attention_mask=None, labels=None):
        outputs = self.distilbert(input_ids=input_ids, attention_mask=attention_mask)
        logits = self.classifier(outputs.last_hidden_state[:, 0, :])  # Use [CLS] token output
        return logits
```
%% Cell type:code id: tags:
``` python
from transformers import AutoConfig, AutoModel

AutoConfig.register("custom-distilbert", CustomDistilBertConfig)
AutoModel.register(CustomDistilBertConfig, CustomDistilBERTModel)
```
%% Cell type:code id: tags:
``` python
from transformers import DistilBertTokenizer

# Initialize the configuration with custom attributes
config = CustomDistilBertConfig.from_pretrained("distilbert-base-uncased", num_labels=2)
config.architectures = ["CustomDistilBERTModel"]

# Initialize the model (loading the pretrained backbone weights; the new head stays
# randomly initialized) and the tokenizer
model = CustomDistilBERTModel.from_pretrained("distilbert-base-uncased", config=config)
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")

# Save locally
model.save_pretrained("custom_distilbert_model")
tokenizer.save_pretrained("custom_distilbert_model")
print("Custom model and tokenizer saved locally!")
```
%% Cell type:code id: tags:
``` python
device = torch.device("cuda" if torch.cuda.is_available()
                      else "mps" if torch.backends.mps.is_available()
                      else "cpu")
model = model.to(device)
```
%% Cell type:code id: tags:
``` python
optimizer = AdamW(model.parameters(), lr=2e-5)  # re-create the optimizer for the new model's parameters
num_epochs = 3
for epoch in range(num_epochs):
    print(f"Epoch {epoch + 1}/{num_epochs}")
    train_loss = train_epoch(model, train_loader, optimizer, criterion, device)
    print(f"Train Loss: {train_loss:.4f}")
    val_loss, val_accuracy, val_precision, val_recall, val_f1 = evaluate(model, test_loader, criterion, device)
    print(f"Validation Loss: {val_loss:.4f}")
    print(f"Accuracy: {val_accuracy:.4f}, Precision: {val_precision:.4f}, Recall: {val_recall:.4f}, F1 Score: {val_f1:.4f}")
    torch.save(model.state_dict(), f"custom_bert_epoch_{epoch + 1}.pth")
```
%% Cell type:code id: tags:
``` python
model.push_to_hub("custom-distilbert-model")
tokenizer.push_to_hub("custom-distilbert-model")
```
%% Cell type:code id: tags:
``` python
from transformers import AutoTokenizer, AutoModel
loaded_tokenizer = AutoTokenizer.from_pretrained("your_hf_id/custom-distilbert-model")
loaded_model = AutoModel.from_pretrained("your_hf_id/custom-distilbert-model")
```
%% Cell type:markdown id: tags:
### Part 8 - Further experiments
%% Cell type:markdown id: tags:
Now that you know the basics of manipulating LLMs through the Hugging Face platform, it is time to experiment with:
- different [NLP tasks](https://huggingface.co/tasks)
- different [models](https://huggingface.co/models?pipeline_tag=text-classification&sort=trending)
- different [datasets](https://huggingface.co/datasets?task_categories=task_categories:text-classification&sort=trending)
... and to share your finetuned models on the platform.
Besides, don't forget to monitor your trainings through [Weights & Biases](https://wandb.ai/home).
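%% Cell type:markdown id: tags:
With the `Trainer` API, Weights & Biases logging only requires pointing `report_to` at it (a minimal sketch, assuming you have a W&B account; the `wandb` package will prompt for your API key on first use):
%% Cell type:code id: tags:
``` python
!pip install wandb

from transformers import TrainingArguments
training_args = TrainingArguments(
    output_dir="save_folder/",
    report_to="wandb",   # stream training metrics to Weights & Biases
    logging_steps=50,    # how often metrics are logged
)
```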