Small project

Data visualization and exploration

Important

Due date: 11:59pm on Sunday, 13 October 2024.

Checklist:

  1. report.pdf of no more than 10 pages, in the GitHub repository
  2. source code in the GitHub repository (it should run from top to bottom)
  3. README.md with instructions on how to run the code and reproduce the PDF report

Use the corresponding invite link in this Google Doc (accessible with your EPFL account) to accept the project, and either join an existing team or create a new one. Once this is done, go to the course GitHub organization and locate the repo titled mini-project-TEAM-NAME to get started.

The goal of this project is data exploration. Find an interesting (in the sense it interests you!) data set and

These steps correspond to the first three stages of the data science life cycle.

Note that the purpose of this project is to play around and demonstrate your data exploration, wrangling, and visualization skills. Hopefully, you will also find scientifically interesting questions or questions of personal interest, e.g., does ball possession matter in a game of football? Is there a link between the GDP of a country and its contribution to global warming? Try, though, to avoid Kaggle data sets that have been analyzed a zillion times before.

Here is an example of what your report could look like. See this page for some tips and resources (e.g., example datasets).