Commit 4192979a authored by Bouyahya Zied

Instructions

parent ed10c8be
# Practical Work statement: Exploratory Data Analysis (EDA) on the cars dataset
**Objective:**
The goal of this practical work is to further familiarize you with the key steps of Exploratory Data Analysis (EDA) using Python. You will manipulate a dataset, clean it, and analyze the data to extract useful insights.
---
## Steps to follow:
1. **Importing required libraries for EDA**
- Import essential Python libraries for data analysis, such as `pandas`, `numpy`, `matplotlib`, `seaborn`, and `scipy`.
2. **Loading the data into a dataframe**
- Load the provided dataset into a dataframe using `pandas`.
- Display the first few rows of the dataset to get an overview of the data.
3. **Checking data types**
- Identify the data types of each column in the dataset (e.g., integer, float, object).
- Ensure that the data types are appropriate for the analysis.
4. **Dropping irrelevant columns**
- Remove any columns that are not relevant to the analysis or do not contribute to the insights.
5. **Renaming columns**
- Rename columns to make them more descriptive or easier to work with.
6. **Dropping duplicate rows**
- Identify and remove any duplicate rows in the dataset to ensure data integrity.
7. **Handling missing or null values**
- Detect missing or null values in the dataset.
- Decide on a strategy to handle them (e.g., imputation, removal).
8. **Detecting outliers**
- Identify outliers in the dataset using statistical methods or visualization techniques.
- Decide whether to remove, transform, or keep the outliers based on the context.
9. **Univariate, bivariate, and multivariate analysis**
- Perform univariate analysis to understand the distribution of individual variables.
- Conduct bivariate analysis to explore relationships between two variables.
- Perform multivariate analysis to understand interactions between multiple variables.
- Use visualizations (e.g., histograms, scatter plots, heatmaps) to support your analysis.
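
The cleaning and outlier steps above could be sketched roughly as follows. This is a minimal illustration, not a reference solution: the small dataframe and its column names (`Make`, `Engine HP`, `MSRP`, `Market Category`) are hypothetical stand-ins for the provided cars dataset, dropping rows with nulls is only one of the possible strategies, and the 1.5 × IQR rule is one common choice among several outlier criteria.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the cars dataset; the real columns may differ.
raw = pd.DataFrame({
    "Make": ["BMW", "BMW", "Audi", "Audi", "Ford", "Ford", "Kia", "Fiat"],
    "Engine HP": [300.0, 300.0, 250.0, np.nan, 160.0, 170.0, 140.0, 100.0],
    "MSRP": [46000, 46000, 40000, 38000, 25000, 27000, 21000, 500000],
    "Market Category": ["Luxury"] * 4 + ["N/A"] * 4,
})

# Steps 2 and 3: overview of the data and of the column types.
print(raw.head())
print(raw.dtypes)


def clean(df):
    """Steps 4 to 7: drop, rename, deduplicate, handle missing values."""
    # Assumption: this column is judged irrelevant to the analysis.
    df = df.drop(columns=["Market Category"])
    df = df.rename(columns={"Make": "make", "Engine HP": "hp", "MSRP": "price"})
    df = df.drop_duplicates()
    # Simplest strategy: remove rows with nulls (imputation is an alternative).
    df = df.dropna()
    return df.reset_index(drop=True)


def iqr_outliers(series, k=1.5):
    """Step 8: flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


cars = clean(raw)
mask = iqr_outliers(cars["price"])

# Step 9: numeric summaries that usually accompany the plots.
print(cars["price"].describe())      # univariate
corr = cars[["hp", "price"]].corr()  # bivariate (Pearson correlation)
print(cars[mask])                    # rows flagged as price outliers
```

On the real dataset, the same functions would be applied to the dataframe returned by `pd.read_csv`, and the correlation matrix is a natural input for a heatmap. Whether flagged rows are removed, transformed, or kept remains a decision to justify from the context.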