Small project
Data visualization and exploration
Due date: 11:59pm on Sunday, 13 October 2024.
Checklist:
Use the corresponding invite link in this Google Doc (accessible with your EPFL account) to accept the project, and either join an existing team or create a new one. Once this is done, go to the course GitHub organization and locate the repo titled mini-project-TEAM-NAME to get started.
The goal of this project is data exploration. Find a data set that is interesting (in the sense that it interests you!) and
- lay out some questions about the data
- describe the data
- explore the data
- visualize the data
- use more detailed visualization techniques to hint at answers to your questions
These steps correspond to the first three stages of the data science life cycle.
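As a rough illustration of what the describe and explore steps might look like in practice, here is a minimal sketch using only the Python standard library. The data is a made-up toy table (not real match statistics), and the column names `possession_pct` and `goals` as well as the helper functions `load_rows`, `describe` and `pearson` are hypothetical names chosen for this example. In a real project you would most likely use a data-frame and plotting library of your choice instead.

```python
import csv
import io
import statistics

# Made-up toy data, for illustration only (not real match statistics).
RAW = """possession_pct,goals
62,2
48,1
55,
39,0
70,3
51,1
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, turning empty cells into None."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: (float(value) if value != "" else None) for key, value in row.items()}
        for row in reader
    ]


def describe(rows, column):
    """Describe one numeric column: size, missing values, basic statistics."""
    values = [row[column] for row in rows if row[column] is not None]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


def pearson(rows, x, y):
    """Pearson correlation of two columns, skipping rows with missing values."""
    pairs = [(r[x], r[y]) for r in rows if r[x] is not None and r[y] is not None]
    xs, ys = zip(*pairs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


rows = load_rows(RAW)

# Describe: how much data is there, and how much of it is missing?
summary = describe(rows, "goals")
print(summary)

# Explore: is there any hint of a relationship between the two columns?
r = pearson(rows, "possession_pct", "goals")
print(f"correlation(possession, goals) = {r:.2f}")

# Visualize (crudely): one text bar per complete row, sorted by possession.
complete = [row for row in rows if row["goals"] is not None]
for row in sorted(complete, key=lambda row: row["possession_pct"]):
    print(f"{row['possession_pct']:5.0f}% | " + "#" * int(row["goals"]))
```

A correlation on six invented rows proves nothing, of course. The point is only the shape of the workflow: check what you have, summarize it, then look for patterns worth a proper plot.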
Note that the purpose of this project is to play around, demonstrating your data exploration, wrangling and visualization skills. Hopefully, you will also find scientifically interesting questions or questions of personal interest, e.g., does ball possession matter in a game of football? Is there a link between the GDP of a country and its contribution to global warming? Try, though, to avoid Kaggle data sets that have been analyzed a zillion times before.
Here is an example of what your report could look like. See this page for some tips and resources (e.g., example datasets).