Release assignment 1

Commit 724fa496 authored 6 months ago by Romain Vuillemot
--- a/assignments/README.md
+++ b/assignments/README.md
-# UE5 Fundamentals of Algorithms
-### Bachelor of Science in Data Science for Responsible Business
-Instructor: [Romain Vuillemot](romain.vuillemot@ec-lyon.fr)
-<div style="text-align: center;">
-    <img src="../figures/logo-ecl.png" width="40%" style="width:20%; display:inline-block; vertical-align:middle;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-    <img src="../figures/logo-emlyon.png" width="10%" style="width:20%; display:inline-block; vertical-align:middle;">
-</div>
-## Assignments
--- a/assignments/assignment-01.ipynb
+++ b/assignments/assignment-01.ipynb
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "a1d61493",
+   "metadata": {},
+   "source": [
+    "NAME:"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "db477b79",
+   "metadata": {},
+   "source": [
+    "<center>\n",
+    "    <h3>ASSIGNMENT#1 (UE5 Algorithms)</h3>\n",
+    "</center>\n",
+    "\n",
+    "Due date: **November 8th, 2024, 5pm** (no extension will be allowed).\n",
+    "\n",
+    "- Submit your solution by email: [romain.vuillemot@ec-lyon.fr](romain.vuillemot@ec-lyon.fr)\n",
+    "\n",
+    "- You have to work alone for this assignment.\n",
+    "\n",
+    "## Goal of this assignment\n",
+    "\n",
+    "In this assignment we provide you with a dataset of characters from a movie. Your role will be to answer the questions below programmatically, using Python. **Please note that you need to answer with fully working Python code embedded in this notebook as your solution (no external modules or files can be included).** You may then replace the code below with your answer for each question:\n",
+    "\n",
+    "```python\n",
+    "# YOUR CODE HERE\n",
+    "raise NotImplementedError()\n",
+    "```\n",
+    "\n",
+    "Do not forget to also provide 1) examples of use of the solution, and 2) several tests (using `assert`).\n",
+    "\n",
+    "## Grading\n",
+    "\n",
+    "- 30% for the results to the questions\n",
+    "- 60% for the code quality\n",
+    "- 10% for the notebook presentation\n",
+    "- +10% bonus question\n",
+    "\n",
+    "## Getting started\n",
+    "\n",
+    "We provide you with a dataset containing movie characters. Each character is represented as a row, along with connections to other characters, based on the movie script. You will use those connections to create a graph-based data structure and answer the questions. \n",
+    "\n",
+    "To get you started with the dataset, we provide you with the code that loads it:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cf409997-537f-4f89-a039-c2b3494972f2",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "with open('users.csv', 'r') as file:\n",
+    "    lines = file.readlines()\n",
+    "    data = [tuple(line.strip().split(',')) for line in lines[1:]]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c48b6391-d858-4a35-b428-b99434ee2d57",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "The `data` variable contains the list of characters. You may look at the `users.csv` file to grasp the values of this variable. Here is a sample:\n",
+    "\n",
+    "```\n",
+    "[('Tony_Stark',\n",
+    "  '40',\n",
+    "  'Male',\n",
+    "  '1.85',\n",
+    "  'Steve_Rogers Natasha_Romanoff Bruce_Banner Thor_Odinson'),\n",
+    " ('Steve_Rogers',\n",
+    "  '98',\n",
+    "  'Male',\n",
+    "  '1.88',\n",
+    "  'Tony_Stark Natasha_Romanoff Sam_Wilson Bucky_Barnes'),\n",
+    "...\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "70e93e78-f4dd-435f-b0f7-d56d928d7051",
+   "metadata": {},
+   "source": [
+    "**Question 1 -** How many characters are there in the dataset? Are there any duplicates?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9022b450-6fa6-41d0-999e-e019917cda79",
+   "metadata": {
+    "deletable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "6474e3afbe1bbdb5b06ce64fc0534c4f",
+     "grade": false,
+     "grade_id": "cell-f45150625899eb53",
+     "locked": false,
+     "schema_version": 3,
+     "solution": true,
+     "task": false
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# YOUR CODE HERE\n",
+    "raise NotImplementedError()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0a1a4c6c-f95c-4dc3-8bb3-863d1048ae79",
+   "metadata": {},
+   "source": [
+    "**Question 2 -** Write a function that turns the `data` variable into a dictionary data structure like the one below:\n",
+    "\n",
+    "```\n",
+    "{'Tony_Stark': ['Steve_Rogers', 'Natasha_Romanoff', 'Bruce_Banner', 'Thor_Odinson'], 'Steve_Rogers': ['Tony_Stark', 'Natasha_Romanoff', 'Sam_Wilson', 'Bucky_Barnes'], ..\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6cd4d69c-f7fb-4661-a737-42bea97431c7",
+   "metadata": {
+    "deletable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "e414bc5c79fc378ab2466fc5df186739",
+     "grade": false,
+     "grade_id": "cell-b624a3dccb825990",
+     "locked": false,
+     "schema_version": 3,
+     "solution": true,
+     "task": false
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# YOUR CODE HERE\n",
+    "raise NotImplementedError()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a25a532d-6ec8-4982-9ec5-31428e4b98a5",
+   "metadata": {},
+   "source": [
+    "**Question 3 -** Count the number of friends for each character, and return it as a dictionary where 1) the key is the character, and 2) the value is the number of friends."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b4b930d6-eaea-4106-b6c8-a69c5ee581fa",
+   "metadata": {
+    "deletable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "94fb1535f4fee547db2bf0d600f73f8e",
+     "grade": false,
+     "grade_id": "cell-4762ba6b7ddfc059",
+     "locked": false,
+     "schema_version": 3,
+     "solution": true,
+     "task": false
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# YOUR CODE HERE\n",
+    "raise NotImplementedError()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9f852068-4b41-4ff3-82c7-c76ae0e544d2",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "**Question 4 -** Return the character with the most friends and their count, as a `Tuple` (a pair where the first element is the character, and the second one the number of friends). If several characters tie for the most friends, return a `Tuple` whose first element is the list of those characters."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b3f79181-c16c-4b42-92f3-809695ac7b70",
+   "metadata": {
+    "deletable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "9cc09f861147be1d77b137692f826dab",
+     "grade": false,
+     "grade_id": "cell-c7ae717bc690c271",
+     "locked": false,
+     "schema_version": 3,
+     "solution": true,
+     "task": false
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# YOUR CODE HERE\n",
+    "raise NotImplementedError()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4a62b350-2bc8-476d-b9c6-13a08d649bdc",
+   "metadata": {},
+   "source": [
+    "**Question 5 -** Use the `Set` data structure to write a function that counts the number of distinct `Gender` values in the original dataset."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8ad718e6-ad13-4197-aee6-d9a6d51c3c93",
+   "metadata": {
+    "deletable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "8539c3a7566b0552ef9b0b7c7323d3aa",
+     "grade": false,
+     "grade_id": "cell-a3e91eca6f8d5368",
+     "locked": false,
+     "schema_version": 3,
+     "solution": true,
+     "task": false
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# YOUR CODE HERE\n",
+    "raise NotImplementedError()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "74ba60d1-ac85-4d61-969a-3bdaa14fbc45",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "## Bonus\n",
+    "\n",
+    "**Question 6** - Find interesting facts in this dataset (in case you need some [background](https://en.wikipedia.org/wiki/Marvel_Cinematic_Universe)) and use the above functions or new ones to support your findings. Examples of facts could be relationships between characters, distributions of attributes (e.g., age), particular groups of characters, etc."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "22507d4f-41ff-479a-930f-1ebdd154e7dd",
+   "metadata": {
+    "deletable": false,
+    "nbgrader": {
+     "cell_type": "code",
+     "checksum": "a2ae9546a36fd1fe98820cb6ca9a8787",
+     "grade": false,
+     "grade_id": "cell-f49266562ffd4509",
+     "locked": false,
+     "schema_version": 3,
+     "solution": true,
+     "task": false
+    },
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "# YOUR CODE HERE\n",
+    "raise NotImplementedError()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
+%% Cell type:markdown id:a1d61493 tags:
+NAME:
+%% Cell type:markdown id:db477b79 tags:
+<center>
+    <h3>ASSIGNMENT#1 (UE5 Algorithms)</h3>
+</center>
+Due date: **November 8th, 2024, 5pm** (no extension will be allowed).
+- Submit your solution by email: [romain.vuillemot@ec-lyon.fr](romain.vuillemot@ec-lyon.fr)
+- You have to work alone for this assignment.
+## Goal of this assignment
+In this assignment we provide you with a dataset of characters from a movie. Your role will be to answer the questions below programmatically, using Python. **Please note that you need to answer with fully working Python code embedded in this notebook as your solution (no external modules or files can be included).** You may then replace the code below with your answer for each question:
+```python
+# YOUR CODE HERE
+raise NotImplementedError()
+```
+Do not forget to also provide 1) examples of use of the solution, and 2) several tests (using `assert`).
+## Grading
+- 30% for the results to the questions
+- 60% for the code quality
+- 10% for the notebook presentation
+- +10% bonus question
+## Getting started
+We provide you with a dataset containing movie characters. Each character is represented as a row, along with connections to other characters, based on the movie script. You will use those connections to create a graph-based data structure and answer the questions.
+To get you started with the dataset, we provide you with the code that loads it:
+%% Cell type:code id:cf409997-537f-4f89-a039-c2b3494972f2 tags:
+``` python
+with open('users.csv', 'r') as file:
+    lines = file.readlines()
+    data = [tuple(line.strip().split(',')) for line in lines[1:]]
+```
+%% Cell type:markdown id:c48b6391-d858-4a35-b428-b99434ee2d57 tags:
+The `data` variable contains the list of characters. You may look at the `users.csv` file to grasp the values of this variable. Here is a sample:
+```
+[('Tony_Stark',
+  '40',
+  'Male',
+  '1.85',
+  'Steve_Rogers Natasha_Romanoff Bruce_Banner Thor_Odinson'),
+ ('Steve_Rogers',
+  '98',
+  'Male',
+  '1.88',
+  'Tony_Stark Natasha_Romanoff Sam_Wilson Bucky_Barnes'),
+...
+```
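To see how one line of the file becomes one tuple, here is a minimal sketch that applies the same parsing to an in-memory string instead of `users.csv`. The header names are assumptions (the real header row is not shown here); the values are copied from the sample above.

```python
# Assumed header names; only the two data rows come from the sample above.
sample_csv = (
    "Name,Age,Gender,Height,Connections\n"
    "Tony_Stark,40,Male,1.85,Steve_Rogers Natasha_Romanoff Bruce_Banner Thor_Odinson\n"
    "Steve_Rogers,98,Male,1.88,Tony_Stark Natasha_Romanoff Sam_Wilson Bucky_Barnes\n"
)

# Same parsing as the loader: skip the header, split each line on commas.
lines = sample_csv.splitlines()
sample_data = [tuple(line.strip().split(',')) for line in lines[1:]]

# Every field is a string; the last one holds space-separated connections.
name, age, gender, height, connections = sample_data[0]
print(name, int(age), float(height), connections.split())
```

Note that numeric fields such as the age stay strings until they are converted explicitly.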
+%% Cell type:markdown id:70e93e78-f4dd-435f-b0f7-d56d928d7051 tags:
+**Question 1 -** How many characters are there in the dataset? Are there any duplicates?
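One possible approach, sketched on a tiny hand-made list rather than on the real `users.csv`. The helper name `find_duplicates` is hypothetical, the third row is added on purpose to create a repeat, and a duplicate is assumed to mean a repeated character name.

```python
def find_duplicates(rows):
    """Return the names that appear more than once, in order of first repeat."""
    seen = set()
    duplicates = []
    for row in rows:
        name = row[0]  # assumption: the first field is the character name
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates

# The first two rows follow the sample; the third is a deliberate repeat.
rows = [
    ('Tony_Stark', '40', 'Male', '1.85',
     'Steve_Rogers Natasha_Romanoff Bruce_Banner Thor_Odinson'),
    ('Steve_Rogers', '98', 'Male', '1.88',
     'Tony_Stark Natasha_Romanoff Sam_Wilson Bucky_Barnes'),
    ('Tony_Stark', '40', 'Male', '1.85',
     'Steve_Rogers Natasha_Romanoff Bruce_Banner Thor_Odinson'),
]

print(len(rows))              # number of rows, duplicates included
print(find_duplicates(rows))  # names seen more than once
```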
+%% Cell type:code id:9022b450-6fa6-41d0-999e-e019917cda79 tags:
+``` python
+# YOUR CODE HERE
+raise NotImplementedError()
+```
+%% Cell type:markdown id:0a1a4c6c-f95c-4dc3-8bb3-863d1048ae79 tags:
+**Question 2 -** Write a function that turns the `data` variable into a dictionary data structure like the one below:
+```
+{'Tony_Stark': ['Steve_Rogers', 'Natasha_Romanoff', 'Bruce_Banner', 'Thor_Odinson'], 'Steve_Rogers': ['Tony_Stark', 'Natasha_Romanoff', 'Sam_Wilson', 'Bucky_Barnes'], ..
+```
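A dictionary of this shape is an adjacency list: each key is a character and each value is the list of characters it is connected to. A minimal sketch, assuming the name is the first field and the connections are the last field; the function name `build_graph` is hypothetical:

```python
def build_graph(rows):
    """Map each character name to the list of its connections."""
    graph = {}
    for row in rows:
        name, connections = row[0], row[-1]
        # str.split() with no argument splits on whitespace and
        # returns [] for an empty string (a character with no connection).
        graph[name] = connections.split()
    return graph

# Two rows copied from the sample shown earlier.
rows = [
    ('Tony_Stark', '40', 'Male', '1.85',
     'Steve_Rogers Natasha_Romanoff Bruce_Banner Thor_Odinson'),
    ('Steve_Rogers', '98', 'Male', '1.88',
     'Tony_Stark Natasha_Romanoff Sam_Wilson Bucky_Barnes'),
]

graph = build_graph(rows)
print(graph['Tony_Stark'])
```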
+%% Cell type:code id:6cd4d69c-f7fb-4661-a737-42bea97431c7 tags:
+``` python
+# YOUR CODE HERE
+raise NotImplementedError()
+```
+%% Cell type:markdown id:a25a532d-6ec8-4982-9ec5-31428e4b98a5 tags:
+**Question 3 -** Count the number of friends for each character, and return it as a dictionary where 1) the key is the character, and 2) the value is the number of friends.
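As an illustration only, the count can be derived from a name-to-friends dictionary with a dictionary comprehension. The function name `count_friends` and the small `graph` below are hypothetical stand-ins for the real data:

```python
def count_friends(graph):
    """Return a dictionary mapping each character to its number of friends."""
    return {name: len(friends) for name, friends in graph.items()}

# Hypothetical adjacency dictionary in the shape described in Question 2.
graph = {
    'Tony_Stark': ['Steve_Rogers', 'Natasha_Romanoff', 'Bruce_Banner', 'Thor_Odinson'],
    'Steve_Rogers': ['Tony_Stark', 'Natasha_Romanoff', 'Sam_Wilson', 'Bucky_Barnes'],
    'Character_X': [],  # placeholder character with no friends
}

counts = count_friends(graph)
print(counts)
```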
+%% Cell type:code id:b4b930d6-eaea-4106-b6c8-a69c5ee581fa tags:
+``` python
+# YOUR CODE HERE
+raise NotImplementedError()
+```
+%% Cell type:markdown id:9f852068-4b41-4ff3-82c7-c76ae0e544d2 tags:
+**Question 4 -** Return the character with the most friends and their count, as a `Tuple` (a pair where the first element is the character, and the second one the number of friends). If several characters tie for the most friends, return a `Tuple` whose first element is the list of those characters.
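The tie rule is the subtle part, so here is a hedged sketch of one way to handle it. It assumes the input is a name-to-count dictionary; the function name `most_friends` and the placeholder names are hypothetical:

```python
def most_friends(counts):
    """Return (character, count), or ([characters], count) when several tie."""
    if not counts:
        raise ValueError("counts must not be empty")
    best = max(counts.values())
    # Keep every character that reaches the maximum, in insertion order.
    winners = [name for name, count in counts.items() if count == best]
    if len(winners) == 1:
        return (winners[0], best)
    return (winners, best)

# Placeholder counts, not taken from the real dataset.
single = most_friends({'Character_A': 2, 'Character_B': 3})
tied = most_friends({'Character_A': 3, 'Character_B': 3, 'Character_C': 1})
print(single)
print(tied)
```

Returning two different shapes follows the wording of the question; a caller can tell them apart with `isinstance(result[0], list)`.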
+%% Cell type:code id:b3f79181-c16c-4b42-92f3-809695ac7b70 tags:
+``` python
+# YOUR CODE HERE
+raise NotImplementedError()
+```
+%% Cell type:markdown id:4a62b350-2bc8-476d-b9c6-13a08d649bdc tags:
+**Question 5 -** Use the `Set` data structure to write a function that counts the number of distinct `Gender` values in the original dataset.
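A set keeps only one copy of each value, so its length is the number of distinct values. A minimal sketch, assuming the gender is the third field of each row as in the sample; the function name `count_genders`, the `gender_index` parameter, and the placeholder rows are hypothetical:

```python
def count_genders(rows, gender_index=2):
    """Count the distinct values found in the gender column."""
    # A set comprehension discards repeated values automatically.
    genders = {row[gender_index] for row in rows}
    return len(genders)

# Placeholder rows, not taken from the real dataset.
rows = [
    ('Character_A', '30', 'Male', '1.80', 'Character_B'),
    ('Character_B', '25', 'Female', '1.70', 'Character_A'),
    ('Character_C', '41', 'Male', '1.75', ''),
]

print(count_genders(rows))
```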
+%% Cell type:code id:8ad718e6-ad13-4197-aee6-d9a6d51c3c93 tags:
+``` python
+# YOUR CODE HERE
+raise NotImplementedError()
+```
+%% Cell type:markdown id:74ba60d1-ac85-4d61-969a-3bdaa14fbc45 tags:
+## Bonus
+**Question 6** - Find interesting facts in this dataset (in case you need some [background](https://en.wikipedia.org/wiki/Marvel_Cinematic_Universe)) and use the above functions or new ones to support your findings. Examples of facts could be relationships between characters, distributions of attributes (e.g., age), particular groups of characters, etc.
+%% Cell type:code id:22507d4f-41ff-479a-930f-1ebdd154e7dd tags:
+``` python
+# YOUR CODE HERE
+raise NotImplementedError()
+```