Instructions

Commit 4192979a authored 4 months ago by Bouyahya Zied
--- a/EDA Application/Instructions.md
+++ b/EDA Application/Instructions.md
+# Practical Work statement: Exploratory Data Analysis (EDA) on the cars dataset
+
+**Objective:**  
+The goal of this practical work is to further familiarize you with the key steps of Exploratory Data Analysis (EDA) using Python. You will manipulate a dataset, clean it, and analyze the data to extract useful insights.
+
+---
+
+## Steps to follow:
+
+1. **Importing required libraries for EDA**  
+   - Import essential Python libraries for data analysis, such as `pandas`, `numpy`, `matplotlib`, `seaborn`, and `scipy`.
+
+2. **Loading the data into a dataframe**  
+   - Load the provided dataset into a dataframe using `pandas`.  
+   - Display the first few rows of the dataset to get an overview of the data.
+
+3. **Checking data types**  
+   - Identify the data types of each column in the dataset (e.g., integer, float, object).  
+   - Ensure that the data types are appropriate for the analysis.
+
+4. **Dropping irrelevant columns**  
+   - Remove any columns that are not relevant to the analysis or do not contribute to the insights.
+
+5. **Renaming columns**  
+   - Rename columns to make them more descriptive or easier to work with.
+
+6. **Dropping duplicate rows**  
+   - Identify and remove any duplicate rows in the dataset to ensure data integrity.
+
+7. **Handling missing or null values**  
+   - Detect missing or null values in the dataset.  
+   - Decide on a strategy to handle them (e.g., imputation, removal).
+
+8. **Detecting outliers**  
+   - Identify outliers in the dataset using statistical methods or visualization techniques.  
+   - Decide whether to remove, transform, or keep the outliers based on the context.
+
+9. **Univariate, bivariate, and multivariate analysis**  
+   - Perform univariate analysis to understand the distribution of individual variables.  
+   - Conduct bivariate analysis to explore relationships between two variables.  
+   - Perform multivariate analysis to understand interactions between multiple variables.  
+   - Use visualizations (e.g., histograms, scatter plots, heatmaps) to support your analysis.
\ No newline at end of file
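
The steps in the statement above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the expected solution: the toy dataframe, the column names (`Make`, `Engine HP`, `MSRP`, `Market Category`), the commented-out file name `cars.csv`, and the helper names `clean` and `iqr_outlier_mask` are all hypothetical, since the statement does not describe the actual file. Missing values are handled by removal here, and outliers are flagged with the common 1.5 × IQR rule; both are just one possible choice among those the statement allows.

```python
import numpy as np
import pandas as pd

# Step 2: in the real exercise the data would be loaded from the provided file,
# e.g. raw = pd.read_csv("cars.csv")  # hypothetical file name
# Here a small made-up dataframe stands in for it (all names and values are assumptions).
raw = pd.DataFrame({
    "Make": ["A", "B", "B", "C", "D", "E", "F"],
    "Engine HP": [120.0, 150.0, 150.0, np.nan, 200.0, 180.0, 900.0],
    "MSRP": [20000, 25000, 25000, 22000, 40000, 35000, 500000],
    "Market Category": ["x", "y", "y", "z", "x", "y", "z"],
})

print(raw.head())    # step 2: first rows for a quick overview
print(raw.dtypes)    # step 3: data type of each column


def clean(df):
    """Apply steps 4 to 7 and return a new dataframe."""
    df = df.drop(columns=["Market Category"])    # step 4: drop an irrelevant column
    df = df.rename(columns={                     # step 5: shorter column names
        "Make": "make", "Engine HP": "hp", "MSRP": "price",
    })
    df = df.drop_duplicates()                    # step 6: remove duplicate rows
    print(df.isna().sum())                       # step 7: count missing values per column
    df = df.dropna()                             # step 7: removal (imputation is the alternative)
    return df.reset_index(drop=True)


def iqr_outlier_mask(s, k=1.5):
    """Step 8: flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return (s < q1 - k * iqr) | (s > q3 + k * iqr)


cleaned = clean(raw)

# Step 8: rows whose price is flagged as an outlier by the IQR rule
print(cleaned[iqr_outlier_mask(cleaned["price"])])

# Step 9: numeric summaries that the plots would accompany
print(cleaned["price"].describe())        # univariate
print(cleaned[["hp", "price"]].corr())    # bivariate (Pearson correlation)
```

For the visual part of step 9, the same `cleaned` dataframe could then be passed to `matplotlib` or `seaborn` (histograms for single columns, scatter plots for pairs, a heatmap of the correlation matrix).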