Assignment: Data Science Scenarios

In this first assignment, we ask you to think about real-life processes or problems in different problem domains, and how you could improve them using the Data Science process. Think about the following:

  1. Which data can you collect?
  2. How would you collect it?
  3. How would you store the data? How large is the data likely to be?
  4. Which insights might you be able to get from this data? Which decisions could you make based on the data?

Try to think about 3 different problems/processes and describe each of the points above for each problem domain.

Here are some problem domains and problems to get you started:

  1. How can you use data to improve the education process for children in schools?
  2. How can you use data to control vaccination during the pandemic?
  3. How can you use data to make sure you are being productive at work?

Instructions

Fill in the following table (replace the suggested problem domains with your own if needed):

| Problem Domain | Problem | Which data to collect | How to store the data | Which insights/decisions we can make |
|---|---|---|---|---|
| Education | Improving exam performance of high school students | Attendance records, previous exam scores, hours spent studying in the library | CSV files initially, then a relational database (PostgreSQL) | Identify factors that increase or decrease performance; improve teaching strategies |
| Vaccination | Monitor vaccination coverage during a pandemic | Vaccination dates, age, location, vaccine type, side effects | SQL database | Detect low-coverage areas, prioritize vaccine distribution |
| Productivity | Improve personal work productivity | Task completion time, task count, break duration, app usage | SQLite or cloud spreadsheets | Optimize work schedules, reduce wasted time |
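
As one possible illustration of the Productivity row, the sketch below stores task records in an in-memory SQLite database and computes the average completion time per task category. The `tasks` table, its `category` and `minutes` columns, the helper name `average_minutes_by_category`, and the sample rows are all hypothetical and chosen only for illustration; a real solution would depend on the data you actually collect.

```python
import sqlite3


def average_minutes_by_category(rows):
    """Store (category, minutes) records in SQLite and return the mean minutes per category."""
    # An in-memory database keeps the sketch self-contained; a file path would persist the data.
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema: one row per completed task.
        conn.execute(
            "CREATE TABLE tasks (category TEXT NOT NULL, minutes REAL NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO tasks (category, minutes) VALUES (?, ?)", rows
        )
        cursor = conn.execute(
            "SELECT category, AVG(minutes) FROM tasks "
            "GROUP BY category ORDER BY category"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


# Made-up sample values, not real measurements.
sample_rows = [("email", 10), ("email", 20), ("coding", 90), ("coding", 30)]
averages = average_minutes_by_category(sample_rows)
print(averages)
```

An aggregate like this is one way to move from stored data to a decision, for example spotting which kind of task takes the most time on average before adjusting a work schedule.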

Rubric

| Exemplary | Adequate | Needs Improvement |
|---|---|---|
| One was able to identify reasonable data sources, ways of storing data and possible decisions/insights for all problem domains | Some of the aspects of the solution are not detailed, data storage is not discussed, at least 2 problem domains are described | Only parts of the data solution are described, only one problem domain is considered. |