Data Science in the Real World

Data Science In The Real World - Sketchnote by @nitya (@sketchthedocs)

We're nearing the end of this learning journey!

We began by defining data science and ethics, explored tools and techniques for data analysis and visualization, reviewed the data science lifecycle, and examined how to scale and automate workflows using cloud computing services. Now you might be wondering: "How do I apply everything I've learned to real-world scenarios?"

In this lesson, we'll delve into real-world applications of data science across industries and explore specific examples in research, digital humanities, and sustainability. We'll also discuss student project opportunities and wrap up with resources to help you continue your learning journey.

Pre-Lecture Quiz

Pre-lecture quiz

Data Science + Industry

The democratization of AI has made it easier for developers to design and integrate AI-driven decision-making and data-driven insights into user experiences and development workflows. Here are some examples of how data science is applied in real-world industry scenarios:

  • Google Flu Trends used data science to correlate search terms with flu trends. Although the approach had flaws, it highlighted the potential (and challenges) of data-driven healthcare predictions.

  • UPS Routing Predictions - explains how UPS uses data science and machine learning to predict optimal delivery routes, factoring in weather, traffic, deadlines, and more.

  • NYC Taxicab Route Visualization - data obtained through freedom of information laws was used to visualize a day in the life of NYC cabs, providing insights into navigation, earnings, and trip durations over a 24-hour period.

  • Uber Data Science Workbench - leverages data from millions of daily Uber trips (pickup/dropoff locations, trip durations, preferred routes, etc.) to build analytics tools for pricing, safety, fraud detection, and navigation decisions.

  • Sports Analytics - focuses on predictive analytics (team and player analysis, e.g., Moneyball) and data visualization (team dashboards, fan engagement, etc.) with applications like talent scouting, sports betting, and venue management.

  • Data Science in Banking - showcases the role of data science in finance, including risk modeling, fraud detection, customer segmentation, real-time predictions, and recommender systems. Predictive analytics also support critical measures like credit scores.

  • Data Science in Healthcare - highlights applications such as medical imaging (MRI, X-Ray, CT-Scan), genomics (DNA sequencing), drug development (risk assessment, success prediction), predictive analytics (patient care and logistics), and disease tracking/prevention.
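
The Google Flu Trends example above rests on a simple idea: check whether search volume moves together with reported cases. A minimal sketch of that idea, using a Pearson correlation, might look like the following. The weekly numbers are made up for illustration and the function name `pearson` is our own; this is not the model Google actually used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up weekly numbers, for illustration only (not real data).
weekly_searches = [120, 150, 310, 480, 460, 300, 180]
weekly_cases = [10, 14, 30, 52, 49, 28, 15]

r = pearson(weekly_searches, weekly_cases)
print(f"correlation between search volume and cases: {r:.2f}")
```

A high correlation on past data does not guarantee good predictions later, which is one reason the approach drew criticism.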

Data Science Applications in the Real World. Image credit: Data Flair, 6 Amazing Data Science Applications

The figure illustrates other domains and examples of data science applications. Interested in exploring more? Check out the Review & Self Study section below.

Data Science + Research

Data Science & Research - Sketchnote by @nitya (@sketchthedocs)

While industry applications often focus on large-scale use cases, research projects can provide valuable insights in two key areas:

  • Innovation opportunities - rapid prototyping of advanced concepts and testing user experiences for next-generation applications.
  • Deployment challenges - identifying potential harms or unintended consequences of data science technologies in real-world contexts.

For students, research projects offer learning and collaboration opportunities that deepen understanding and foster connections with experts in areas of interest. What do research projects look like, and how can they make an impact?

Consider the MIT Gender Shades Study by Joy Buolamwini (MIT Media Lab), co-authored with Timnit Gebru (then at Microsoft Research). This study focused on:

  • What: Evaluating bias in automated facial analysis algorithms and datasets based on gender and skin type.
  • Why: Facial analysis is used in critical areas like law enforcement, airport security, and hiring systems, where inaccuracies (e.g., due to bias) can lead to economic and social harm. Addressing bias is essential for fairness.
  • How: Researchers noted that existing benchmarks predominantly featured lighter-skinned subjects. They curated a new dataset (1000+ images) balanced by gender and skin type, which was used to evaluate the accuracy of three gender classification products (Microsoft, IBM, Face++).

The results showed that although overall accuracy was good, error rates differed noticeably across subgroups: misgendering was more frequent for women and for people with darker skin tones, which indicates bias.
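
To see why a single overall accuracy figure can hide this kind of disparity, it helps to break errors down by subgroup. The sketch below does that for a handful of hypothetical records. The group names, labels, and the helper `error_rates_by_group` are invented for illustration and are not taken from the study's data or code.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Return {group: error_rate} for (group, true_label, predicted_label) records."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical toy records, NOT results from the Gender Shades study.
records = [
    ("subgroup A", "male", "male"),
    ("subgroup A", "female", "female"),
    ("subgroup A", "male", "male"),
    ("subgroup A", "female", "female"),
    ("subgroup B", "female", "male"),
    ("subgroup B", "female", "female"),
    ("subgroup B", "male", "male"),
    ("subgroup B", "female", "male"),
]

overall_accuracy = sum(t == p for _, t, p in records) / len(records)
rates = error_rates_by_group(records)

print(f"overall accuracy: {overall_accuracy:.2f}")
for group, rate in sorted(rates.items()):
    print(f"{group}: error rate {rate:.2f}")
```

In this toy data the overall accuracy looks reasonable, yet every error falls in one subgroup. That is the pattern a disaggregated evaluation is designed to surface.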

Key Outcomes: The study emphasized the need for more representative datasets (balanced subgroups) and inclusive teams (diverse backgrounds) to identify and address biases early in AI solutions. Such research has influenced organizations to adopt principles and practices for responsible AI to enhance fairness in their AI products and processes.

Interested in Microsoft research efforts?

Data Science + Humanities

Data Science & Digital Humanities - Sketchnote by @nitya (@sketchthedocs)

Digital Humanities is defined as "a collection of practices and approaches combining computational methods with humanistic inquiry." Stanford projects like "rebooting history" and "poetic thinking" demonstrate the connection between Digital Humanities and Data Science, using techniques like network analysis, information visualization, spatial analysis, and text analysis to revisit historical and literary datasets for new insights.

Want to explore a project in this field?

Check out "Emily Dickinson and the Meter of Mood" by Jen Looper. This project examines how data science can reinterpret familiar poetry and reevaluate its meaning and the author's contributions. For example, can we predict the season in which a poem was written by analyzing its tone or sentiment? What does this reveal about the author's mindset during that time?

To explore this, follow the data science lifecycle:

This workflow allows you to explore seasonal impacts on poem sentiment and develop your own interpretations of the author. Try it out, then extend the notebook to ask new questions or visualize the data differently!
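
As a rough illustration of the question posed above (does sentiment vary by season?), the sketch below scores short texts with a toy word list and averages the scores per season. Everything in it is an assumption made for this example: the word lists, the function names, and the sample lines, which were written for this sketch and are not quotations from any poem. A real project would use a proper sentiment library or cloud service instead of this toy scorer.

```python
from collections import defaultdict

# Tiny hypothetical word lists, for illustration only.
POSITIVE = {"bright", "joy", "bloom", "warm", "sing"}
NEGATIVE = {"cold", "dark", "grief", "bare", "silent"}


def sentiment_score(text):
    """Score in [-1, 1]: (positive - negative) / matched words; 0.0 if none match."""
    words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0


def mean_sentiment_by_season(poems):
    """poems: iterable of (season, text) pairs. Returns {season: mean score}."""
    scores = defaultdict(list)
    for season, text in poems:
        scores[season].append(sentiment_score(text))
    return {season: sum(vals) / len(vals) for season, vals in scores.items()}


# Placeholder lines written for this example, not quotations.
poems = [
    ("spring", "Bright fields bloom and birds sing"),
    ("spring", "A warm morning, a little grief"),
    ("winter", "Cold and silent, the bare hills"),
    ("winter", "A dark road, yet joy at the door"),
]

by_season = mean_sentiment_by_season(poems)
for season, score in sorted(by_season.items()):
    print(f"{season}: mean sentiment {score:+.2f}")
```

Swapping in a different scorer only requires replacing `sentiment_score`; the per-season grouping stays the same.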

Use tools from the Digital Humanities toolkit to pursue similar inquiries.

Data Science + Sustainability

Data Science & Sustainability - Sketchnote by @nitya (@sketchthedocs)

The 2030 Agenda For Sustainable Development, adopted by all United Nations Member States in 2015, outlines 17 goals, including those aimed at Protecting the Planet from degradation and climate change. The Microsoft Sustainability initiative supports these goals by leveraging technology to build a more sustainable future, focusing on four key objectives: being carbon negative, water positive, zero waste, and bio-diverse by 2030.

Addressing these challenges requires large-scale data and cloud-based solutions. The Planetary Computer initiative provides four components to assist data scientists and developers:

  • Data Catalog - offers petabytes of Earth Systems data (free and Azure-hosted).

  • Planetary API - enables users to search for relevant data across space and time.

  • Hub - provides a managed environment for processing massive geospatial datasets.

  • Applications - showcases use cases and tools for sustainability insights.

The Planetary Computer Project is currently in preview (as of Sep 2021). Here's how you can start contributing to sustainability solutions using data science:

  • Request access to begin exploring and connect with others.

  • Explore documentation to learn about supported datasets and APIs.

  • Check out applications like Ecosystem Monitoring for inspiration on project ideas.
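
Searching "across space and time", as the Planetary API description above puts it, comes down to two filters: does an item's footprint overlap the area of interest, and does its date fall inside the requested window? The sketch below shows that logic on a small in-memory list. It does not call the Planetary Computer service; the catalog entries, field names, and the `search` helper are all hypothetical and exist only to illustrate the concept.

```python
from datetime import date


def bbox_intersects(a, b):
    """True if two (min_lon, min_lat, max_lon, max_lat) boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(items, bbox, start, end):
    """Return ids of items whose footprint overlaps bbox and whose date is in [start, end]."""
    return [
        item["id"]
        for item in items
        if bbox_intersects(item["bbox"], bbox) and start <= item["date"] <= end
    ]


# Hypothetical catalog entries, invented for this example.
catalog = [
    {"id": "scene-1", "bbox": (-123.0, 47.0, -121.0, 48.0), "date": date(2021, 6, 15)},
    {"id": "scene-2", "bbox": (-123.0, 47.0, -121.0, 48.0), "date": date(2020, 1, 10)},
    {"id": "scene-3", "bbox": (10.0, 50.0, 12.0, 52.0), "date": date(2021, 6, 20)},
]

area_of_interest = (-122.5, 47.4, -122.0, 47.8)
matches = search(catalog, area_of_interest, date(2021, 1, 1), date(2021, 12, 31))
print(matches)
```

This simple overlap test ignores boxes that cross the antimeridian, and a real catalog would index its items rather than scan a list. Consult the Planetary Computer documentation for the actual API.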

Consider how you can use data visualization to highlight or amplify insights into issues like climate change and deforestation. Or think about how these insights can be leveraged to design new user experiences that encourage behavioral changes for more sustainable living.

Data Science + Students

We've discussed real-world applications in industry and research, and looked at examples of data science applications in digital humanities and sustainability. So how can you develop your skills and share your knowledge as data science beginners?

Here are some examples of student data science projects to inspire you:

🚀 Challenge

Look for articles that suggest beginner-friendly data science projects - like these 50 topic areas, these 21 project ideas, or these 16 projects with source code that you can analyze and remix. And don't forget to blog about your learning experiences and share your insights with the community.

Post-Lecture Quiz

Post-lecture quiz

Review & Self Study

Want to dive deeper into use cases? Here are some relevant articles:

Assignment

Explore A Planetary Computer Dataset


Disclaimer:
This document has been translated using the AI translation service Co-op Translator. While we aim for accuracy, please note that automated translations may include errors or inaccuracies. The original document in its native language should be regarded as the authoritative source. For critical information, professional human translation is advised. We are not responsible for any misunderstandings or misinterpretations resulting from the use of this translation.