Assignment: Data Science Scenarios
In this first assignment, we ask you to think about some real-life processes or problems in different domains, and how you can improve them using the Data Science process. Consider the following:
- What data can you collect?
- How would you collect it?
- How would you store the data? How large is the data likely to be?
- What insights might you be able to derive from this data? What decisions could be made based on the data?
Try to think about three different problems or processes and describe each of the points above for each domain.
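For the question "How large is the data likely to be?", a back-of-the-envelope estimate is usually enough. The sketch below multiplies the number of items collected per day by an average item size and the collection period. The function name `estimate_storage_bytes` and every number in the example are assumptions chosen for illustration, not measurements.

```python
def estimate_storage_bytes(items_per_day: int, bytes_per_item: int, days: int) -> int:
    """Rough storage estimate: items per day x average item size x number of days."""
    return items_per_day * bytes_per_item * days


# Assumed figures for illustration only: 400 items per day,
# about 2 MB per item, collected over 150 days.
total_bytes = estimate_storage_bytes(items_per_day=400, bytes_per_item=2_000_000, days=150)
print(f"Estimated storage: {total_bytes / 1e9:.1f} GB")
```

An estimate like this helps decide whether a spreadsheet, a single database, or dedicated file storage is a reasonable place to keep the data.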
Here are some domains and problems to help you start thinking:
- How can you use data to improve the education process for children in schools?
- How can you use data to manage vaccination during a pandemic?
- How can you use data to ensure you are being productive at work?
Instructions
Fill in the following table (replace the suggested domains with your own if needed):
Problem Domain | Problem | What data to collect | How to store the data | What insights/decisions we can make |
---|---|---|---|---|
Education | At universities, lecture attendance is often low, and we hypothesize that students who attend lectures more frequently tend to perform better in exams. We want to encourage attendance and test this hypothesis. | Attendance can be tracked using photos taken by security cameras in classrooms or by tracking the Bluetooth/Wi-Fi addresses of students' mobile phones in class. Exam data is already available in the university database. | If we use security camera images, we need to store a few (5-10) photos taken during class (unstructured data) and then use AI to identify students' faces (convert data to structured form). | We can calculate average attendance for each student and check for correlations with exam grades. We'll discuss correlation further in the probability and statistics section. To encourage attendance, we can publish weekly attendance rankings on the school portal and hold prize draws for students with the highest attendance. |
Vaccination | | | | |
Productivity | | | | |
We provide just one example answer to give you an idea of what is expected in this assignment.
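The analysis in the Education row (an attendance rate per student, then a correlation check against exam grades) could be sketched as follows. This is only an illustration: the student IDs, attendance counts, and grades are invented, and `pearson` is a hypothetical helper written for this example, not part of the course material.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Invented records: student id -> (lectures attended, lectures held, exam grade)
records = {
    "s01": (12, 14, 88),
    "s02": (5, 14, 61),
    "s03": (14, 14, 93),
    "s04": (9, 14, 70),
    "s05": (3, 14, 55),
}

# Attendance rate per student, as a fraction of lectures held
attendance_rate = {sid: attended / held for sid, (attended, held, _) in records.items()}
grades = {sid: grade for sid, (_, _, grade) in records.items()}

student_ids = sorted(records)
r = pearson(
    [attendance_rate[sid] for sid in student_ids],
    [grades[sid] for sid in student_ids],
)
print(f"Correlation between attendance rate and exam grade: {r:.2f}")
```

A coefficient close to +1 would be consistent with the hypothesis, but on its own it does not show that attending lectures causes better grades.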
Rubric
Exemplary | Adequate | Needs Improvement |
---|---|---|
Reasonable data sources, storage methods, and possible decisions/insights are identified for all domains | Some aspects of the solution lack detail and data storage is not discussed, but at least two domains are described | Only parts of the data solution are described, and only one domain is considered |
Disclaimer:
This document has been translated using the AI translation service Co-op Translator. While we aim for accuracy, please note that automated translations may include errors or inaccuracies. The original document in its native language should be regarded as the authoritative source. For critical information, professional human translation is advised. We are not responsible for any misunderstandings or misinterpretations resulting from the use of this translation.