Commit 724fa496 authored by Romain Vuillemot

Release assignment 1

parent abc40d69
# UE5 Fundamentals of Algorithms
### Bachelor of Science in Data Science for Responsible Business
Instructor: [Romain Vuillemot](romain.vuillemot@ec-lyon.fr)
<div style="text-align: center;">
<img src="../figures/logo-ecl.png" width="40%" style="width:20%; display:inline-block; vertical-align:middle;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<img src="../figures/logo-emlyon.png" width="10%" style="width:20%; display:inline-block; vertical-align:middle;">
</div>
## Assignments
%% Cell type:markdown id:a1d61493 tags:
NAME:
%% Cell type:markdown id:db477b79 tags:
<center>
<h3>ASSIGNMENT#1 (UE5 Algorithms)</h3>
</center>
Due date: **November 8th, 2024, 5pm** (no extension will be allowed).
- Submit your solution by email: [romain.vuillemot@ec-lyon.fr](romain.vuillemot@ec-lyon.fr)
- You must work alone on this assignment.
## Goal of this assignment
In this assignment, we provide you with a dataset of characters from a movie. Your role is to answer the questions below programmatically, using Python. **Please note that you need to answer with fully working Python code embedded in this notebook as your solution (no external modules or files can be included).** For each question, replace the code below with your answer:
```python
# YOUR CODE HERE
raise NotImplementedError()
```
Do not forget to also provide 1) examples of use of your solution, and 2) several tests (using `assert`).
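As an illustration of the expected shape of an answer, here is a minimal sketch on an unrelated toy problem. The function `count_vowels` is hypothetical and is not part of the assignment; it only shows a solution followed by an example of use and a few `assert` tests.

```python
def count_vowels(text):
    """Return the number of vowels (a, e, i, o, u) in `text`, ignoring case."""
    return sum(1 for char in text.lower() if char in "aeiou")


# 1) Example of use
example_count = count_vowels("Algorithms")
print(example_count)

# 2) Several tests, including edge cases such as the empty string
assert count_vowels("") == 0
assert count_vowels("xyz") == 0
assert count_vowels("Algorithms") == 3
assert count_vowels("AEIOU") == 5
```

A failing `assert` raises an `AssertionError`, so a cell that runs silently to the end means every test passed.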
## Grading
- 30% for the results to the questions
- 60% for the code quality
- 10% for the notebook presentation
- +10% bonus question
## Getting started
We provide you with a dataset containing movie characters. Each character is represented as a row, along with connections to other characters, based on the movie script. You will use those connections to create a graph-based data structure and answer the questions.
To get you started with the dataset, we provide you with the code that loads it:
%% Cell type:code id:cf409997-537f-4f89-a039-c2b3494972f2 tags:
``` python
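# Read all lines, skip the first one (assumed to be the header),
# and turn each remaining row into a tuple of strings.
with open('users.csv', 'r') as file:
    lines = file.readlines()
data = [tuple(line.strip().split(',')) for line in lines[1:]]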
```
%% Cell type:markdown id:c48b6391-d858-4a35-b428-b99434ee2d57 tags:
The `data` variable contains the list of characters. You may look at the `users.csv` file to grasp the values of this variable. Here is a sample:
```
[('Tony_Stark',
'40',
'Male',
'1.85',
'Steve_Rogers Natasha_Romanoff Bruce_Banner Thor_Odinson'),
('Steve_Rogers',
'98',
'Male',
'1.88',
'Tony_Stark Natasha_Romanoff Sam_Wilson Bucky_Barnes'),
...
```
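To see how one row can be taken apart, here is a minimal sketch that uses the first row of the sample above rather than the file. The field names (`name`, `age`, `gender`, `height`, `connections`) are assumptions inferred from the sample values; check the first line of `users.csv` for the actual column names.

```python
# First row of the sample above, copied by hand (the file itself is not read here).
row = ('Tony_Stark', '40', 'Male', '1.85',
       'Steve_Rogers Natasha_Romanoff Bruce_Banner Thor_Odinson')

# Field names are assumptions inferred from the sample values.
name, age, gender, height, connections = row

# Every field is loaded as a string, so numeric values need an explicit conversion.
age = int(age)
height = float(height)

# Connected characters are separated by spaces inside the last field.
connected_names = connections.split()

print(name, age, gender, height, connected_names)
```

Note that every value in `data` is a string, including the numeric ones, which matters as soon as you compare or sum them.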
%% Cell type:markdown id:70e93e78-f4dd-435f-b0f7-d56d928d7051 tags:
**Question 1 -** How many characters are there in the dataset? Are there any duplicates?
%% Cell type:code id:9022b450-6fa6-41d0-999e-e019917cda79 tags:
``` python
# YOUR CODE HERE
raise NotImplementedError()
```
%% Cell type:markdown id:0a1a4c6c-f95c-4dc3-8bb3-863d1048ae79 tags:
**Question 2 -** Write a function that turns the `data` variable into a dictionary data structure like the one below:
```
{'Tony_Stark': ['Steve_Rogers', 'Natasha_Romanoff', 'Bruce_Banner', 'Thor_Odinson'], 'Steve_Rogers': ['Tony_Stark', 'Natasha_Romanoff', 'Sam_Wilson', 'Bucky_Barnes'], ..
```
%% Cell type:code id:6cd4d69c-f7fb-4661-a737-42bea97431c7 tags:
``` python
# YOUR CODE HERE
raise NotImplementedError()
```
%% Cell type:markdown id:a25a532d-6ec8-4982-9ec5-31428e4b98a5 tags:
**Question 3 -** Count the number of friends for each character, and return the result as a dictionary where 1) the key is the character, and 2) the value is the number of friends.
%% Cell type:code id:b4b930d6-eaea-4106-b6c8-a69c5ee581fa tags:
``` python
# YOUR CODE HERE
raise NotImplementedError()
```
%% Cell type:markdown id:9f852068-4b41-4ff3-82c7-c76ae0e544d2 tags:
**Question 4 -** Return the character with the most friends and their count, as a `Tuple` (a pair where the first element is the character, and the second one is the number of friends). If several characters share the highest count, return them as a `Tuple` whose first element is a list of those characters.
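The two return shapes can be illustrated with hand-written values. The names and counts below are placeholders chosen for illustration, not the answer for the dataset.

```python
# Placeholder values for illustration only; not computed from the dataset.

# A single character has the highest count: (character, count)
single_result = ('Character_A', 4)

# Several characters share the highest count: (list of characters, count)
tied_result = (['Character_A', 'Character_B'], 4)

# A caller can tell the two shapes apart by the type of the first element.
for characters, count in (single_result, tied_result):
    if isinstance(characters, list):
        print(f"{len(characters)} characters tied with {count} friends")
    else:
        print(f"{characters} has the most friends ({count})")
```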
%% Cell type:code id:b3f79181-c16c-4b42-92f3-809695ac7b70 tags:
``` python
# YOUR CODE HERE
raise NotImplementedError()
```
%% Cell type:markdown id:4a62b350-2bc8-476d-b9c6-13a08d649bdc tags:
**Question 5 -** Use the `Set` data structure to write a function that counts the number of distinct `Gender` values in the original dataset.
%% Cell type:code id:8ad718e6-ad13-4197-aee6-d9a6d51c3c93 tags:
``` python
# YOUR CODE HERE
raise NotImplementedError()
```
%% Cell type:markdown id:74ba60d1-ac85-4d61-969a-3bdaa14fbc45 tags:
## Bonus
**Question 6** - Find interesting facts in this dataset (in case you need some [background](https://en.wikipedia.org/wiki/Marvel_Cinematic_Universe)) and use the above functions or new ones to support your findings. Examples of facts could be relationships between characters, the distribution of attributes (e.g., age), particular groups of characters, etc.
%% Cell type:code id:22507d4f-41ff-479a-930f-1ebdd154e7dd tags:
``` python
# YOUR CODE HERE
raise NotImplementedError()
```