diff --git a/labs-solutions/05-programming-strategies-exercises.ipynb b/labs-solutions/05-programming-strategies-exercises.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..fef8eae75fde3d4d7a8b3ef0f59c8618c7b5d077 --- /dev/null +++ b/labs-solutions/05-programming-strategies-exercises.ipynb @@ -0,0 +1,794 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "e58599e3-9ab7-4d43-bb22-aeccade424ce", + "metadata": {}, + "source": [ + "# UE5 Fundamentals of Algorithms\n", + "# Lab 5: Programming strategies: Brute force and greedy" + ] + }, + { + "cell_type": "markdown", + "id": "691b3c38-0e83-4bb2-ac90-ef76d2dd9a7a", + "metadata": {}, + "source": [ + "---" + ] + }, + { + "cell_type": "markdown", + "id": "3a1dbf76-c34e-4859-ab46-6ff8ba5d4f19", + "metadata": {}, + "source": [ + "<details style=\"border: 1px\">\n", + "<summary> How to use these notebooks</summary>\n", + " \n", + "For each of the following questions:\n", + "- In the `# YOUR CODE HERE` cell, remove `raise NotImplementedError()` and write your code\n", + "- Write an example of use of your code, or make sure the given examples and tests pass\n", + "- Add extra tests in the `# Tests` cell\n", + " \n", + "</details>" + ] + }, + { + "cell_type": "markdown", + "id": "8f128fee-a1d6-4106-b997-0ed27b5ed91d", + "metadata": { + "tags": [] + }, + "source": [ + "# Exercise 1: Knapsack problem\n", + "\n", + "The goal of the knapsack problem is to select items (each with a weight, coming from a limited set) so that the total weight packed in the knapsack is as large as possible without exceeding its limited capacity (e.g. a total weight). 
The problem can be formulated as follows:\n", + "\n", + "$\\max \\sum_{i=1}^n w_i x_i$\n", + "\n", + "- **$x_i \\in \\{0,1\\}$:** the binary variable indicating whether item $i$ (out of $n$) is picked\n", + "\n", + "Subject to:\n", + "\n", + "$\\sum_{i=1}^n w_i x_i \\leq W \\quad \\text{and} \\quad x_i \\in \\{0,1\\}$\n", + "\n", + "- **W:** The maximum weight capacity of the knapsack\n", + "- **w:** A list containing the $n$ weights of the available items\n", + "\n", + "We first assume that each item can be picked at most once. Here is an example of a solution:" + ] + }, + { + "cell_type": "code", + "execution_count": 96, + "id": "3c4b8bae-2220-4fff-b3fc-189585e25fb1", + "metadata": {}, + "outputs": [], + "source": [ + "weights = [4, 5, 7, 8]  # limited set of items with weights\n", + "capacity = 17  # maximum capacity of the knapsack\n", + "selected_items = [1, 1, 0, 1]  # x_i: item of weight 4 picked, item of weight 5 picked, etc." + ] + }, + { + "cell_type": "markdown", + "id": "72b396ed-8504-4d5d-a16d-d1d9b21b09c7", + "metadata": { + "tags": [] + }, + "source": [ + "The above selection of items is correct (and optimal) as its total weight equals the knapsack capacity:\n", + "\n", + "$(4 \\times 1) + (5 \\times 1) + (7 \\times 0) + (8 \\times 1) = 4 + 5 + 0 + 8 = 17$" + ] + }, + { + "cell_type": "markdown", + "id": "c49a5db7-16d5-4809-827a-c167af01d8fd", + "metadata": {}, + "source": [ + "## 1.1 Check if a solution is correct" + ] + }, + { + "cell_type": "markdown", + "id": "5379ed20-2701-4475-b3a2-baaf7b0054c6", + "metadata": {}, + "source": [ + "Write a function that checks whether a given list of selected items fits into the knapsack (the selection does not have to be optimal):" + ] + }, + { + "cell_type": "code", + "execution_count": 97, + "id": "e114caea-cf9d-4fe3-bd01-8a747667f6da", + "metadata": { + "nbgrader": { + "grade": false, + "grade_id": "cell-efc715872a4a89ca", + "locked": false, + "schema_version": 3, + "solution": true, + "task": false + }, + "tags": [] + }, + "outputs": [], + "source": [ + "def 
check_knapsack_solution(weights, capacity, selected_items):\n", + "    ### BEGIN SOLUTION\n", + "    total_weight = sum(weights[i] * selected_items[i] for i in range(len(weights)))\n", + "    return total_weight <= capacity\n", + "    ### END SOLUTION" + ] + }, + { + "cell_type": "code", + "execution_count": 98, + "id": "5ef1cc51-63e7-404f-8e05-f450ea264797", + "metadata": {}, + "outputs": [], + "source": [ + "assert check_knapsack_solution(weights, capacity, selected_items)  # it fits" + ] + }, + { + "cell_type": "markdown", + "id": "e1ea24d2-df15-4b8a-923b-4006d0911a81", + "metadata": { + "tags": [] + }, + "source": [ + "## 1.2 Brute force approach" + ] + }, + { + "cell_type": "markdown", + "id": "d8552dac-ea9d-4ded-a57f-da86dbcfff47", + "metadata": {}, + "source": [ + "A brute force solution to the knapsack problem tries **all possible combinations of items** and checks whether they fit. You may implement such a solution as follows:\n", + "\n", + "- Generate every possible selection of items and compute its total weight\n", + "- Check whether each selection is within the knapsack capacity\n", + "- Return the best (valid) selection that fits within the capacity\n", + "\n", + "You may use the Cartesian product ([doc](https://docs.python.org/3/library/itertools.html#itertools.product)) to create the list of all possible selections by enumerating every assignment of the $x_i$. This creates one tuple per selection (`0` meaning the item is not selected, `1` meaning it is selected); with $n$ items there are $2^n$ such tuples. So `(0, 0, 0, 0)` means we do not pick any item, and `(1, 1, 1, 1)` means we pick them all. 
Here is the code for the product:" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "id": "e688c41f-ef8c-413b-b86a-c116c457d8b1", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[(0, 0, 0, 0),\n", + " (0, 0, 0, 1),\n", + " (0, 0, 1, 0),\n", + " (0, 0, 1, 1),\n", + " (0, 1, 0, 0),\n", + " (0, 1, 0, 1),\n", + " (0, 1, 1, 0),\n", + " (0, 1, 1, 1),\n", + " (1, 0, 0, 0),\n", + " (1, 0, 0, 1),\n", + " (1, 0, 1, 0),\n", + " (1, 0, 1, 1),\n", + " (1, 1, 0, 0),\n", + " (1, 1, 0, 1),\n", + " (1, 1, 1, 0),\n", + " (1, 1, 1, 1)]" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from itertools import product\n", + "list(product([0, 1], repeat=len(weights)))" + ] + }, + { + "cell_type": "markdown", + "id": "aa8969ef-8ec2-4765-979a-d0f3a363943f", + "metadata": {}, + "source": [ + "The brute force function returns a tuple containing the best total weight and the corresponding selection of $x_i$ (itself a tuple)." + ] + }, + { + "cell_type": "code", + "execution_count": 99, + "id": "4b9a42ff-416b-45ef-b7e9-de1ed6f6a478", + "metadata": { + "nbgrader": { + "grade": false, + "grade_id": "cell-4b4e08cac35afd19", + "locked": false, + "schema_version": 3, + "solution": true, + "task": false + }, + "tags": [] + }, + "outputs": [], + "source": [ + "def brute_force_knapsack(weights, capacity):\n", + "    ### BEGIN SOLUTION\n", + "    n = len(weights)\n", + "    best_total_weight = 0\n", + "    best_combination = None\n", + "\n", + "    # generate every possible selection of items\n", + "    for selection in product([0, 1], repeat=n):\n", + "        total_weight = sum(weights[i] * selection[i] for i in range(n))\n", + "\n", + "        # keep the selection if it fits and improves on the best total so far\n", + "        if total_weight <= capacity and total_weight > best_total_weight:\n", + "            best_total_weight = total_weight\n", + "            best_combination = selection\n", + "\n", + "    return best_total_weight, best_combination\n", + "    ### END SOLUTION" + ] + }, + { + "cell_type": "code", + 
"execution_count": 101, + "id": "54d8d9d5-e996-4502-a9ac-0fcf9e9072bf", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "# Tests\n", + "assert brute_force_knapsack(weights, capacity) == (17, (1, 1, 0, 1))  # (sum, (x_i))" + ] + }, + { + "cell_type": "markdown", + "id": "d3142897-f48c-4d61-8322-dec75b79da50", + "metadata": { + "tags": [] + }, + "source": [ + "To better interpret the solution, write a function that converts the $x_i$ into the list of weights of the selected items:" + ] + }, + { + "cell_type": "code", + "execution_count": 102, + "id": "6450ebd1-b2cb-4c13-be89-c8c1727030e0", + "metadata": { + "nbgrader": { + "grade": false, + "grade_id": "cell-9cb73b6cad9ce8c8", + "locked": false, + "schema_version": 3, + "solution": true, + "task": false + }, + "tags": [] + }, + "outputs": [], + "source": [ + "def get_used_weights(weights, best_combination):\n", + "    ### BEGIN SOLUTION\n", + "    if best_combination is None:\n", + "        return []\n", + "\n", + "    used_weights = [weights[i] for i in range(len(weights)) if best_combination[i] == 1]\n", + "\n", + "    return used_weights\n", + "    ### END SOLUTION" + ] + }, + { + "cell_type": "code", + "execution_count": 103, + "id": "24f10578-4d4e-4921-92f1-4db5073e64ec", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "assert get_used_weights(weights, (1, 1, 0, 1)) == [4, 5, 8]" + ] + }, + { + "cell_type": "markdown", + "id": "9786bab1-ef77-4a8c-9988-19b1dea15620", + "metadata": { + "tags": [] + }, + "source": [ + "## 1.3 Greedy approach\n", + "\n", + "Now use a greedy approach to solve this problem. \n", + "\n", + "**Important:** we now have an **unlimited** number of copies of each item to take from. You may use the following steps:\n", + "\n", + "1. Start by initializing a list containing the indices of all remaining items\n", + "2. Begin with an empty knapsack, having a total weight of 0\n", + "3. 
In each iteration, select the lightest of the remaining items that still fits into the remaining knapsack capacity\n", + "4. Add this item to the knapsack as many times as it fits, then remove it from the list of available items\n", + "5. After completing the process, return the total weight of the knapsack and the quantity selected for each item\n", + "\n", + "Tip: to implement the greedy choice (picking the items with the lowest weight first), you may sort the list of indices by ascending weight using:\n", + "\n", + "```python\n", + "sorted_indices = sorted(range(n), key=lambda i: weights[i])\n", + "```\n", + "\n", + "The function should return its result in a format similar to the brute force one. " + ] + }, + { + "cell_type": "code", + "execution_count": 113, + "id": "14b9d2b0-813b-4c6c-91e4-a2049274726d", + "metadata": { + "nbgrader": { + "grade": false, + "grade_id": "cell-ef92479f0b318cdf", + "locked": false, + "schema_version": 3, + "solution": true, + "task": false + }, + "tags": [] + }, + "outputs": [], + "source": [ + "def greedy_knapsack(weights, W):\n", + "    ### BEGIN SOLUTION\n", + "    n = len(weights)\n", + "    remaining_W = W\n", + "    selected_items = [0] * n\n", + "\n", + "    sorted_indices = sorted(range(n), key=lambda i: weights[i])\n", + "\n", + "    for i in sorted_indices:\n", + "        if remaining_W <= 0:\n", + "            break\n", + "\n", + "        max_possible_quantity = remaining_W // weights[i]\n", + "\n", + "        selected_items[i] = max_possible_quantity\n", + "        remaining_W -= max_possible_quantity * weights[i]\n", + "\n", + "    total_weight = W - remaining_W  # the weight used in the knapsack\n", + "    return total_weight, selected_items\n", + "    ### END SOLUTION" + ] + }, + { + "cell_type": "code", + "execution_count": 114, + "id": "92be4d4b-6ac9-4ada-801d-d9ef21a67b92", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(10, [5, 0, 0])" + ] + }, + "execution_count": 114, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "weights = 
[2, 3, 5]\n", + "capacity = 11\n", + "greedy_knapsack(weights, capacity)" + ] + }, + { + "cell_type": "code", + "execution_count": 115, + "id": "71cf817f-34f9-4c52-ab84-0902964bfd62", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "assert brute_force_knapsack(weights, capacity)[0] == greedy_knapsack(weights, capacity)[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 112, + "id": "87bf1ffe-9e68-4f64-a197-16364a11ee47", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "weights = [2, 3, 6]\n", + "capacity = 11\n", + "assert brute_force_knapsack(weights, capacity)[0] > greedy_knapsack(weights, capacity)[0]" + ] + }, + { + "cell_type": "markdown", + "id": "e87958bc-bacc-4dd6-85b5-45af8cb59e67", + "metadata": {}, + "source": [ + "As you can see, the brute force approach outperforms the greedy one in this particular scenario: it fills the knapsack completely (2 + 3 + 6 = 11), whereas the greedy approach stops at 10 (five items of weight 2). " + ] + }, + { + "cell_type": "markdown", + "id": "e7602801-5fb5-4904-bafa-810d9787c6f5", + "metadata": { + "nbgrader": { + "grade": false, + "grade_id": "cell-9e7356aa23ccfb4c", + "locked": false, + "schema_version": 3, + "solution": false, + "task": false + }, + "tags": [] + }, + "source": [ + "## 1.4 Greedy with limited availability\n", + "\n", + "Now write a new version of the greedy function with a `max_quantity` parameter that limits how many times each item can be picked. E.g. with `max_quantity = 2`, each item can be picked at most twice."
+ ] + }, + { + "cell_type": "code", + "execution_count": 118, + "id": "755116cb-456e-48ad-8dfe-bd72617488d9", + "metadata": { + "nbgrader": { + "grade": false, + "grade_id": "cell-978651102277b091", + "locked": false, + "schema_version": 3, + "solution": true, + "task": false + }, + "tags": [] + }, + "outputs": [], + "source": [ + "def greedy_knapsack_with_max_quantity(weights, capacity, max_quantity):\n", + "    ### BEGIN SOLUTION\n", + "    n = len(weights)\n", + "    remaining_capacity = capacity\n", + "    selected_items = []\n", + "\n", + "    sorted_indices = sorted(range(n), key=lambda i: weights[i])\n", + "\n", + "    for i in sorted_indices:\n", + "        if remaining_capacity <= 0:\n", + "            break\n", + "\n", + "        max_possible_quantity = min(max_quantity, remaining_capacity // weights[i])\n", + "\n", + "        for _ in range(max_possible_quantity):\n", + "            selected_items.append(i)\n", + "\n", + "        remaining_capacity -= max_possible_quantity * weights[i]\n", + "\n", + "    total_weight = capacity - remaining_capacity\n", + "    return total_weight, selected_items\n", + "    ### END SOLUTION" + ] + }, + { + "cell_type": "code", + "execution_count": 119, + "id": "cae704d4-4904-4b6c-89de-3029bd1b5fbd", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(13, [0, 0, 1])" + ] + }, + "execution_count": 119, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "weights = [4, 5, 7, 8]  # item weights\n", + "capacity = 15  # knapsack capacity\n", + "max_quantity = 2  # maximum number of times each item can be selected\n", + "\n", + "greedy_knapsack_with_max_quantity(weights, capacity, max_quantity)" + ] + }, + { + "cell_type": "markdown", + "id": "e5b7304d-7b64-42df-9465-8f4491525a8b", + "metadata": {}, + "source": [ + "# Exercise 2: Organize a schedule\n", + "\n", + "Write a greedy algorithm that returns a list of time slots that do not overlap. 
The time slots are provided using the `(start, end)` format, where `start` and `end` are integers (instead of times). " + ] + }, + { + "cell_type": "markdown", + "id": "2698e92a-188d-4dbf-8e76-f7b7374db2b0", + "metadata": {}, + "source": [ + "Let's begin by writing a function that checks that a given schedule is valid:\n", + "\n", + "- Each time slot has `start < end` (i.e. a duration > 0)\n", + "- No two time slots overlap" + ] + }, + { + "cell_type": "code", + "execution_count": 120, + "id": "95c36e13-10c5-4922-9211-f8fc269de372", + "metadata": { + "nbgrader": { + "grade": false, + "grade_id": "cell-9e31fffcdff1654c", + "locked": false, + "schema_version": 3, + "solution": true, + "task": false + }, + "tags": [] + }, + "outputs": [], + "source": [ + "def check_schedule_solution(schedule):\n", + "    ### BEGIN SOLUTION\n", + "    for start, end in schedule:\n", + "        if start >= end:\n", + "            return False\n", + "\n", + "    sorted_schedule = sorted(schedule, key=lambda x: x[0])\n", + "\n", + "    for i in range(1, len(sorted_schedule)):\n", + "        prev_end = sorted_schedule[i - 1][1]\n", + "        current_start = sorted_schedule[i][0]\n", + "\n", + "        if current_start < prev_end:\n", + "            return False\n", + "    ### END SOLUTION\n", + "    return True" + ] + }, + { + "cell_type": "code", + "execution_count": 121, + "id": "90a4a8a7-4ee7-4bca-9bae-e8372e9bc047", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "assert check_schedule_solution([(0, 2), (2, 4)])\n", + "assert not check_schedule_solution([(0, 3), (2, 4)])" + ] + }, + { + "cell_type": "markdown", + "id": "b8219a51-e70c-4885-941f-287bff3eadfc", + "metadata": {}, + "source": [ + "Now write the greedy algorithm that picks time slots without overlaps. In this question you may prioritize the ones that end first, so you may [sort](https://docs.python.org/3/howto/sorting.html) the time slots by end time. 
E.g.:\n", + "\n", + "```python\n", + "intervals.sort(key=lambda x: x[1])\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": 122, + "id": "e0ed0841-1255-4bdb-b7f8-e689c2a953af", + "metadata": { + "nbgrader": { + "grade": false, + "grade_id": "interval_scheduling", + "locked": false, + "schema_version": 3, + "solution": true, + "task": false + }, + "tags": [] + }, + "outputs": [], + "source": [ + "def interval_scheduling(intervals):\n", + "    ### BEGIN SOLUTION\n", + "    if not intervals:\n", + "        return []\n", + "\n", + "    intervals.sort(key=lambda x: x[1])  # earliest end time first\n", + "\n", + "    selected_intervals = [intervals[0]]\n", + "    current_end_time = intervals[0][1]\n", + "\n", + "    for interval in intervals[1:]:\n", + "        start_time, end_time = interval\n", + "        if start_time >= current_end_time:\n", + "            selected_intervals.append(interval)\n", + "            current_end_time = end_time\n", + "\n", + "    return selected_intervals\n", + "    ### END SOLUTION" + ] + }, + { + "cell_type": "code", + "execution_count": 123, + "id": "d6a50280-f016-4686-8515-1b4a136c0fd9", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[(0, 2), (2, 4)]" + ] + }, + "execution_count": 123, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "interval_scheduling([(0, 2), (2, 4), (1, 3)])" + ] + }, + { + "cell_type": "code", + "execution_count": 124, + "id": "bd38d673-077d-44b1-b81c-ac7448f5c01c", + "metadata": { + "nbgrader": { + "grade": true, + "grade_id": "correct_interval_scheduling", + "locked": true, + "points": 1, + "schema_version": 3, + "solution": false, + "task": false + }, + "tags": [] + }, + "outputs": [], + "source": [ + "assert interval_scheduling([(0, 2), (2, 4), (1, 3)]) == [(0, 2), (2, 4)]" + ] + }, + { + "cell_type": "markdown", + "id": "c142c3b3-2f01-48d0-8596-601bb133542b", + "metadata": { + "tags": [] + }, + "source": [ + "Write another version of the greedy algorithm that prioritizes the time slots that are the 
longest." + ] + }, + { + "cell_type": "code", + "execution_count": 125, + "id": "095836b8-2612-4d58-a9ce-b7968489418c", + "metadata": { + "nbgrader": { + "grade": false, + "grade_id": "interval_scheduling_longest", + "locked": false, + "schema_version": 3, + "solution": true, + "task": false + }, + "tags": [] + }, + "outputs": [], + "source": [ + "def interval_scheduling_longest(intervals):\n", + "    ### BEGIN SOLUTION\n", + "    intervals.sort(key=lambda x: x[1] - x[0], reverse=True)  # longest first\n", + "\n", + "    selected_intervals = []\n", + "\n", + "    for interval in intervals:\n", + "        start_time, end_time = interval\n", + "        # slots are no longer in chronological order,\n", + "        # so compare against every slot already selected\n", + "        if all(end_time <= s or start_time >= e for s, e in selected_intervals):\n", + "            selected_intervals.append(interval)\n", + "    return selected_intervals\n", + "    ### END SOLUTION" + ] + }, + { + "cell_type": "code", + "execution_count": 85, + "id": "afcf3a69-aa36-4b71-93a8-218337ed88b9", + "metadata": { + "nbgrader": { + "grade": true, + "grade_id": "correct_interval_scheduling_longest", + "locked": true, + "points": 1, + "schema_version": 3, + "solution": false, + "task": false + }, + "tags": [] + }, + "outputs": [], + "source": [ + "assert interval_scheduling_longest([(0, 4), (3, 5), (5, 7)]) == [(0, 4), (5, 7)]\n", + "assert interval_scheduling([(0, 4), (3, 5), (5, 7)]) == interval_scheduling_longest(([(0, 4), (3, 5), (5, 7)]))" + ] + }, + { + "cell_type": "markdown", + "id": "a559e4db-9ef2-4d20-bb9f-cb638d8c1f24", + "metadata": {}, + "source": [ + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "id": "46947633-3f5f-420b-b416-500a0dab4fcb", + "metadata": { + "nbgrader": { + "grade": false, + "grade_id": "cell-0bf75b2bfe8e2e4c", + "locked": false, + "schema_version": 3, + "solution": true, + "task": false + }, + "tags": [] + }, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + 
"language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}