diff --git a/Practical_sessions/Session_7/Subject_7_LLM.ipynb b/Practical_sessions/Session_7/Subject_7_LLM.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..ca9fb93f78e42a0092550d4575e53a7df3d45602
--- /dev/null
+++ b/Practical_sessions/Session_7/Subject_7_LLM.ipynb
@@ -0,0 +1,923 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### **_Deep Learning - Bsc Data Science for Responsible Business - Centrale Lyon_**\n",
+ "\n",
+ "2024-2025\n",
+ "\n",
+ "Emmanuel Dellandréa\t "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Practical Session 7 – Large Language Models\n",
+ "\n",
+ "The objective of this tutorial is to learn to work with LLMs for sentence generation and classification. The pretrained models and tokenizers will be obtained from the [Hugging Face platform](https://huggingface.co/).\n",
+ "\n",
+ "This notebook contains 8 parts:\n",
+ "1. Using a Hugging Face text generation model\n",
+ "2. Using the Hugging Face Pipeline for text classification\n",
+ "3. Using a Pipeline with a specific Hugging Face model and tokenizer\n",
+ "4. Experimenting with models from Hugging Face\n",
+ "5. Training an LLM for sentence classification using the **Trainer** class\n",
+ "6. Fine-tuning an LLM with a custom head\n",
+ "7. Sharing a model on the Hugging Face platform\n",
+ "8. Further experiments\n",
+ "\n",
+ "Before going further into experiments, your work is to understand the provided code, which gives an overview of using LLMs with Hugging Face.\n",
+ "\n",
+ "**This code is intentionally not commented. It is your responsibility to add all the necessary comments to ensure your proper understanding of the code.**\n",
+ "\n",
+ "\n",
+ "---\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "As the computation can be heavy, particularly during training, we encourage you to use a GPU. If your laptop is not equipped with one, you may use one of these remote Jupyter servers, where you can select execution on a GPU:\n",
+ "\n",
+ "1) [jupyter.mi90.ec-lyon.fr](https://jupyter.mi90.ec-lyon.fr/)\n",
+ "\n",
+ "This server is accessible within the campus network. From outside, you need to use a VPN. Before executing the notebook, select the kernel \"Python PyTorch\" to run it on GPU and have access to the PyTorch module.\n",
+ "\n",
+ "2) [Google Colaboratory](https://colab.research.google.com/)\n",
+ "\n",
+ "Before executing the notebook, select execution on GPU: \"Runtime\" -> \"Change runtime type\" -> \"T4 GPU\"."
+ ]
+ },
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Installing required librairies " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!pip install huggingface_hub\n", + "!pip install ipywidgets\n", + "!pip install transformers\n", + "!pip install datasets\n", + "!pip install accelerate\n", + "!pip install scikit-learn\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Then login to Hugging Face" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from huggingface_hub import notebook_login\n", + "notebook_login()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Part 1 - Using a Hugging Face text generation model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from transformers import AutoTokenizer, AutoModelForCausalLM\n", + "\n", + "# model_name = \"mistralai/Mistral-7B\"\n", + "# model_name = \"deepseek-ai/DeepSeek-R1\"\n", + "# model_name = \"meta-llama/Llama-3.2-3B-Instruct\"\n", + "# model_name = \"homebrewltd/AlphaMaze-v0.2-1.5B\"\n", + "model_name = \"gpt2\"\n", + "\n", + "\n", + "tokenizer = AutoTokenizer.from_pretrained(model_name)\n", + "model = AutoModelForCausalLM.from_pretrained(model_name)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "input_text = \"Hello. Who are you ?\"\n", + "encoded_input = tokenizer(input_text, return_tensors=\"pt\")\n", + "\n", + "output = model.generate(\n", + " input_ids=encoded_input[\"input_ids\"],\n", + " attention_mask=encoded_input[\"attention_mask\"],\n", + " max_length=100,\n", + " temperature=0.8,\n", + " pad_token_id=tokenizer.pad_token_id\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "generated_text = tokenizer.decode(output[0], skip_special_tokens=True)\n", + "print(generated_text)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Part 2 - Using Pipeline of Hugging Face for text classification" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from transformers import pipeline\n", + "\n", + "classifier = pipeline(\"text-classification\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "classifier(\"We are very happy to welcome you at Centrale Lyon.\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "results = classifier([\"We are very happy to welcome you at Centrale Lyon.\", \"We hope you don't hate it.\"])\n", + "for result in results:\n", + " print(f\"label: {result['label']}, with score: {round(result['score'], 4)}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Part 3 - Using Pipeline with a specific model and tokenizer of Hugging Face" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "model_name = \"nlptown/bert-base-multilingual-uncased-sentiment\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from transformers import AutoTokenizer, AutoModelForSequenceClassification\n", + "\n", + "model = 
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Part 3 - Using a Pipeline with a specific Hugging Face model and tokenizer"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "model_name = \"nlptown/bert-base-multilingual-uncased-sentiment\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from transformers import AutoTokenizer, AutoModelForSequenceClassification\n",
+ "\n",
+ "model = AutoModelForSequenceClassification.from_pretrained(model_name)\n",
+ "tokenizer = AutoTokenizer.from_pretrained(model_name)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "classifier = pipeline(\"text-classification\", model=model, tokenizer=tokenizer)\n",
+ "classifier(\"We are very happy to present you this incredible model.\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Part 4 - Experimenting with models from Hugging Face"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from transformers import AutoTokenizer\n",
+ "\n",
+ "model_name = \"nlptown/bert-base-multilingual-uncased-sentiment\"\n",
+ "tokenizer = AutoTokenizer.from_pretrained(model_name)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "encoding = tokenizer(\"We are very happy to welcome you to Centrale Lyon.\")\n",
+ "print(encoding)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "batch = tokenizer(\n",
+ "    [\"We are very happy to welcome you to Centrale Lyon.\", \"We hope you don't hate it.\"],\n",
+ "    padding=True,\n",
+ "    truncation=True,\n",
+ "    max_length=512,\n",
+ "    return_tensors=\"pt\",\n",
+ ")\n",
+ "print(batch)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from transformers import AutoModelForSequenceClassification\n",
+ "\n",
+ "model = AutoModelForSequenceClassification.from_pretrained(model_name, torch_dtype=\"auto\")\n",
+ "print(model)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "outputs = model(**batch)\n",
+ "print(outputs)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from torch import nn\n",
+ "\n",
+ "predictions = nn.functional.softmax(outputs.logits, dim=-1)\n",
+ "print(predictions)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "save_directory = \"./save_pretrained\"\n",
+ "tokenizer.save_pretrained(save_directory)\n",
+ "model.save_pretrained(save_directory)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "loaded_model = AutoModelForSequenceClassification.from_pretrained(\"./save_pretrained\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Part 5 - Training an LLM for sentence classification using the **Trainer** class"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from transformers import AutoModelForSequenceClassification\n",
+ "\n",
+ "model_name = \"distilbert/distilbert-base-uncased\"\n",
+ "\n",
+ "model = AutoModelForSequenceClassification.from_pretrained(model_name, torch_dtype=\"auto\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from transformers import TrainingArguments\n",
+ "\n",
+ "training_args = TrainingArguments(\n",
+ "    output_dir=\"save_folder/\",\n",
+ "    learning_rate=2e-5,\n",
+ "    per_device_train_batch_size=8,\n",
+ "    per_device_eval_batch_size=8,\n",
+ "    num_train_epochs=2,\n",
+ ")"
+ ]
+ },
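+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Optional addition (not part of the original notebook): the `Trainer` created below reports only the loss during evaluation. If you also want accuracy, you can define a `compute_metrics` function such as the sketch below (it relies on scikit-learn, installed at the top of the notebook) and pass it to the `Trainer` with `compute_metrics=compute_metrics`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Optional helper: accuracy computed from the logits and labels provided by the Trainer.\n",
+ "import numpy as np\n",
+ "from sklearn.metrics import accuracy_score\n",
+ "\n",
+ "def compute_metrics(eval_pred):\n",
+ "    logits, labels = eval_pred\n",
+ "    predictions = np.argmax(logits, axis=-1)\n",
+ "    return {\"accuracy\": accuracy_score(labels, predictions)}"
+ ]
+ },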
"outputs": [], + "source": [ + "from transformers import AutoTokenizer\n", + "\n", + "tokenizer = AutoTokenizer.from_pretrained(model_name)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from datasets import load_dataset\n", + "\n", + "dataset = load_dataset(\"rotten_tomatoes\") " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def tokenize_dataset(dataset):\n", + " return tokenizer(dataset[\"text\"])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataset = dataset.map(tokenize_dataset, batched=True)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from transformers import DataCollatorWithPadding\n", + "\n", + "data_collator = DataCollatorWithPadding(tokenizer=tokenizer)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from transformers import Trainer\n", + "\n", + "trainer = Trainer(\n", + " model=model,\n", + " args=training_args,\n", + " train_dataset=dataset[\"train\"],\n", + " eval_dataset=dataset[\"test\"],\n", + " processing_class=tokenizer,\n", + " data_collator=data_collator,\n", + ") " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "trainer.train()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "save_directory = \"./tomatoes_save_pretrained\"\n", + "tokenizer.save_pretrained(save_directory)\n", + "model.save_pretrained(save_directory)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "model = AutoModelForSequenceClassification.from_pretrained(save_directory)\n", + "tokenizer = AutoTokenizer.from_pretrained(save_directory)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from transformers import pipeline\n", + "classifier = pipeline(\"text-classification\", model=model, tokenizer=tokenizer)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "t = dataset['test'][345]\n", + "print(t)\n", + "classifier(t['text'])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Part 6 - Fine tuning a LLM model with a custom head" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from datasets import load_dataset\n", + "from transformers import DistilBertTokenizer, DistilBertModel\n", + "import torch\n", + "from torch.utils.data import DataLoader\n", + "from torch.optim import AdamW\n", + "from sklearn.metrics import accuracy_score, precision_recall_fscore_support\n", + "import numpy as np" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "dataset = load_dataset(\"imdb\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "tokenizer = DistilBertTokenizer.from_pretrained(\"distilbert-base-uncased\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def tokenize_function(examples):\n", + " return tokenizer(examples[\"text\"], padding=\"max_length\", 
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def tokenize_function(examples):\n",
+ "    return tokenizer(examples[\"text\"], padding=\"max_length\", truncation=True, max_length=512)\n",
+ "\n",
+ "tokenized_datasets = dataset.map(tokenize_function, batched=True)\n",
+ "\n",
+ "\n",
+ "tokenized_datasets = tokenized_datasets.remove_columns([\"text\"])\n",
+ "tokenized_datasets = tokenized_datasets.rename_column(\"label\", \"labels\")\n",
+ "tokenized_datasets.set_format(\"torch\")\n",
+ "\n",
+ "train_dataset = tokenized_datasets[\"train\"]\n",
+ "test_dataset = tokenized_datasets[\"test\"]\n",
+ "\n",
+ "train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True)\n",
+ "test_loader = DataLoader(test_dataset, batch_size=8)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "bert_model = DistilBertModel.from_pretrained(\"distilbert-base-uncased\")\n",
+ "\n",
+ "for param in bert_model.parameters():\n",
+ "    param.requires_grad = False\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "class CustomBERTModel(torch.nn.Module):\n",
+ "    def __init__(self, bert_model):\n",
+ "        super(CustomBERTModel, self).__init__()\n",
+ "        self.bert = bert_model\n",
+ "        self.custom_head = torch.nn.Sequential(\n",
+ "            torch.nn.Linear(self.bert.config.hidden_size, 128),\n",
+ "            torch.nn.ReLU(),\n",
+ "            torch.nn.Dropout(0.1),\n",
+ "            torch.nn.Linear(128, 2)\n",
+ "        )\n",
+ "\n",
+ "    def forward(self, input_ids, attention_mask):\n",
+ "        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)\n",
+ "        outputs = self.custom_head(outputs.last_hidden_state[:, 0, :]) # Use [CLS] token output\n",
+ "        return outputs"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "bert_model = DistilBertModel.from_pretrained(\"distilbert-base-uncased\")\n",
+ "\n",
+ "for param in bert_model.parameters():\n",
+ "    param.requires_grad = False\n",
+ "\n",
+ "model = CustomBERTModel(bert_model)\n",
+ "\n",
+ "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"mps\" if torch.backends.mps.is_available() else \"cpu\")\n",
+ "\n",
+ "model.to(device)"
+ ]
+ },
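+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Optional sanity check (not part of the original notebook): since the DistilBERT backbone is frozen, only the parameters of the custom classification head should require gradients."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Optional check: count trainable vs. total parameters of the assembled model.\n",
+ "trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
+ "total_params = sum(p.numel() for p in model.parameters())\n",
+ "print(f\"Trainable parameters: {trainable_params:,} / {total_params:,}\")"
+ ]
+ },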
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "optimizer = AdamW(model.parameters(), lr=2e-5)\n",
+ "criterion = torch.nn.CrossEntropyLoss()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def train_epoch(model, data_loader, optimizer, criterion, device):\n",
+ "    model.train()\n",
+ "    total_loss = 0\n",
+ "    for batch in data_loader:\n",
+ "        optimizer.zero_grad()\n",
+ "        input_ids = batch[\"input_ids\"].to(device)\n",
+ "        attention_mask = batch[\"attention_mask\"].to(device)\n",
+ "        labels = batch[\"labels\"].to(device)\n",
+ "\n",
+ "        outputs = model(input_ids=input_ids, attention_mask=attention_mask)\n",
+ "        loss = criterion(outputs, labels)\n",
+ "        loss.backward()\n",
+ "        optimizer.step()\n",
+ "\n",
+ "        total_loss += loss.item()\n",
+ "    return total_loss / len(data_loader)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def evaluate(model, data_loader, criterion, device):\n",
+ "    model.eval()\n",
+ "    total_loss = 0\n",
+ "    all_predictions = []\n",
+ "    all_labels = []\n",
+ "\n",
+ "    with torch.no_grad():\n",
+ "        for batch in data_loader:\n",
+ "            input_ids = batch[\"input_ids\"].to(device)\n",
+ "            attention_mask = batch[\"attention_mask\"].to(device)\n",
+ "            labels = batch[\"labels\"].to(device)\n",
+ "\n",
+ "            outputs = model(input_ids=input_ids, attention_mask=attention_mask)\n",
+ "            loss = criterion(outputs, labels)\n",
+ "            total_loss += loss.item()\n",
+ "\n",
+ "            predictions = torch.argmax(outputs, dim=-1)\n",
+ "            all_predictions.extend(predictions.cpu().numpy())\n",
+ "            all_labels.extend(labels.cpu().numpy())\n",
+ "\n",
+ "    accuracy = accuracy_score(all_labels, all_predictions)\n",
+ "    precision, recall, f1, _ = precision_recall_fscore_support(all_labels, all_predictions, average=\"binary\")\n",
+ "    return total_loss / len(data_loader), accuracy, precision, recall, f1"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "num_epochs = 3\n",
+ "\n",
+ "for epoch in range(num_epochs):\n",
+ "    print(f\"Epoch {epoch + 1}/{num_epochs}\")\n",
+ "\n",
+ "    train_loss = train_epoch(model, train_loader, optimizer, criterion, device)\n",
+ "    print(f\"Train Loss: {train_loss:.4f}\")\n",
+ "\n",
+ "    val_loss, val_accuracy, val_precision, val_recall, val_f1 = evaluate(model, test_loader, criterion, device)\n",
+ "    print(f\"Validation Loss: {val_loss:.4f}\")\n",
+ "    print(f\"Accuracy: {val_accuracy:.4f}, Precision: {val_precision:.4f}, Recall: {val_recall:.4f}, F1 Score: {val_f1:.4f}\")\n",
+ "\n",
+ "    torch.save(model.state_dict(), f\"custom_bert_epoch_{epoch + 1}.pth\")\n",
+ "\n",
+ "\n",
+ "# (After 76 minutes of training)\n",
+ "# Epoch 1/3\n",
+ "# Train Loss: 0.6708\n",
+ "# Validation Loss: 0.6415\n",
+ "# Accuracy: 0.7917, Precision: 0.8218, Recall: 0.7450, F1 Score: 0.7815\n",
+ "# Epoch 2/3\n",
+ "# Train Loss: 0.6172\n",
+ "# Validation Loss: 0.5825\n",
+ "# Accuracy: 0.8051, Precision: 0.8142, Recall: 0.7907, F1 Score: 0.8023\n",
+ "# Epoch 3/3\n",
+ "# Train Loss: 0.5634\n",
+ "# Validation Loss: 0.5300\n",
+ "# Accuracy: 0.8098, Precision: 0.8339, Recall: 0.7738, F1 Score: 0.8027"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "model_save_path = \"custom_bert_model.pth\"\n",
+ "\n",
+ "torch.save(model.state_dict(), model_save_path)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "loaded_bert_model = DistilBertModel.from_pretrained(\"distilbert-base-uncased\")\n",
+ "\n",
+ "loaded_model = CustomBERTModel(loaded_bert_model)\n",
+ "\n",
+ "loaded_model.load_state_dict(torch.load(model_save_path))\n",
+ "\n",
+ "loaded_model.to(device)\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "batch = next(iter(test_loader))\n",
+ "\n",
+ "ids = batch['input_ids'][0]\n",
+ "attention_mask = batch['attention_mask'][0]\n",
+ "label = batch['labels'][0]\n",
+ "\n",
+ "ids = ids.to(device)\n",
+ "attention_mask = attention_mask.to(device)\n",
+ "\n",
+ "text = tokenizer.decode(ids, skip_special_tokens=True)\n",
+ "print(text)\n",
+ "print(label)\n",
+ "\n",
+ "loaded_model.eval()\n",
+ "output = loaded_model(input_ids=ids.unsqueeze(0), attention_mask=attention_mask.unsqueeze(0))\n",
+ "\n",
+ "output = output.squeeze(0)\n",
+ "print(output)\n",
+ "prediction = torch.argmax(output, dim=-1)\n",
+ "print(prediction)\n",
+ "print(label)\n",
+ "print(prediction == label)\n",
+ "\n"
+ ]
+ },
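+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Optional extra check (not part of the original notebook): run the reloaded model on a brand-new sentence rather than on a sample from the test loader. The example sentence is arbitrary."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Optional: classify an arbitrary sentence with the reloaded model.\n",
+ "sample_text = \"An absolutely wonderful film, I enjoyed every minute of it.\"\n",
+ "encoded = tokenizer(sample_text, return_tensors=\"pt\", truncation=True, max_length=512)\n",
+ "\n",
+ "loaded_model.eval()\n",
+ "with torch.no_grad():\n",
+ "    logits = loaded_model(\n",
+ "        input_ids=encoded[\"input_ids\"].to(device),\n",
+ "        attention_mask=encoded[\"attention_mask\"].to(device),\n",
+ "    )\n",
+ "print(logits)\n",
+ "print(torch.argmax(logits, dim=-1))  # 1 = positive, 0 = negative in the IMDB dataset"
+ ]
+ },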
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Part 7 - Sharing a model on the Hugging Face platform"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from transformers import DistilBertPreTrainedModel, DistilBertModel\n",
+ "import torch.nn as nn\n",
+ "\n",
+ "class CustomDistilBERTModel(DistilBertPreTrainedModel):\n",
+ "    def __init__(self, config, freeze_backbone=True):\n",
+ "        super().__init__(config)\n",
+ "        self.distilbert = DistilBertModel(config)\n",
+ "        self.classifier = nn.Sequential(\n",
+ "            nn.Linear(config.hidden_size, 128),\n",
+ "            nn.ReLU(),\n",
+ "            nn.Dropout(0.1),\n",
+ "            nn.Linear(128, config.num_labels) # Binary classification\n",
+ "        )\n",
+ "        self.init_weights()\n",
+ "\n",
+ "        # Freeze DistilBERT backbone if specified\n",
+ "        if freeze_backbone:\n",
+ "            for param in self.distilbert.parameters():\n",
+ "                param.requires_grad = False\n",
+ "\n",
+ "    def forward(self, input_ids, attention_mask=None, labels=None):\n",
+ "        outputs = self.distilbert(input_ids=input_ids, attention_mask=attention_mask)\n",
+ "        logits = self.classifier(outputs.last_hidden_state[:, 0, :]) # Use [CLS] token output\n",
+ "        return logits\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from transformers import AutoConfig, AutoModel, DistilBertConfig\n",
+ "\n",
+ "class CustomDistilBertConfig(DistilBertConfig):\n",
+ "    model_type = \"custom-distilbert\"\n",
+ "\n",
+ "AutoConfig.register(\"custom-distilbert\", CustomDistilBertConfig)\n",
+ "AutoModel.register(CustomDistilBertConfig, CustomDistilBERTModel)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from transformers import DistilBertTokenizer\n",
+ "\n",
+ "# Initialize the configuration with custom attributes\n",
+ "config = CustomDistilBertConfig.from_pretrained(\"distilbert-base-uncased\", num_labels=2)\n",
+ "config.architectures = [\"CustomDistilBERTModel\"]\n",
+ "\n",
+ "# Initialize the model and tokenizer\n",
+ "model = CustomDistilBERTModel(config)\n",
+ "tokenizer = DistilBertTokenizer.from_pretrained(\"distilbert-base-uncased\")\n",
+ "\n",
+ "# Save locally\n",
+ "model.save_pretrained(\"custom_distilbert_model\")\n",
+ "tokenizer.save_pretrained(\"custom_distilbert_model\")\n",
+ "\n",
+ "print(\"Custom model and tokenizer saved locally!\")\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"mps\" if torch.backends.mps.is_available() else \"cpu\")\n",
+ "model = model.to(device)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "num_epochs = 3\n",
+ "\n",
+ "optimizer = AdamW(model.parameters(), lr=2e-5)\n",
+ "\n",
+ "for epoch in range(num_epochs):\n",
+ "    print(f\"Epoch {epoch + 1}/{num_epochs}\")\n",
+ "\n",
+ "    train_loss = train_epoch(model, train_loader, optimizer, criterion, device)\n",
+ "    print(f\"Train Loss: {train_loss:.4f}\")\n",
+ "\n",
+ "    val_loss, val_accuracy, val_precision, val_recall, val_f1 = evaluate(model, test_loader, criterion, device)\n",
+ "    print(f\"Validation Loss: {val_loss:.4f}\")\n",
+ "    print(f\"Accuracy: {val_accuracy:.4f}, Precision: {val_precision:.4f}, Recall: {val_recall:.4f}, F1 Score: {val_f1:.4f}\")\n",
+ "\n",
+ "    torch.save(model.state_dict(), f\"custom_bert_epoch_{epoch + 1}.pth\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "model.push_to_hub(\"custom-distilbert-model\")\n",
+ "tokenizer.push_to_hub(\"custom-distilbert-model\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from transformers import AutoTokenizer, AutoModel\n",
+ "loaded_tokenizer = AutoTokenizer.from_pretrained(\"your_hf_id/custom-distilbert-model\")\n",
+ "loaded_model = AutoModel.from_pretrained(\"your_hf_id/custom-distilbert-model\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Part 8 - Further experiments"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Now that you know the basics of manipulating LLMs through the Hugging Face platform, it is time to experiment with:\n",
+ "- different [NLP tasks](https://huggingface.co/tasks)\n",
+ "- different [models](https://huggingface.co/models?pipeline_tag=text-classification&sort=trending)\n",
+ "- different [datasets](https://huggingface.co/datasets?task_categories=task_categories:text-classification&sort=trending)\n",
+ "\n",
+ "... and to share your fine-tuned models on the platform.\n",
+ "\n",
+ "Besides, don't forget to monitor your training runs through [Weights & Biases](https://wandb.ai/home)."
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "td_llm",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.11"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}