Introduction to Data Ethics

Data Science Ethics - Sketchnote by @nitya (@sketchthedocs)

All of us na data people wey dey live for world wey full with data.

Market trend dey show say by 2022, 1 out of 3 big companies go dey buy and sell their data for online Marketplaces and Exchanges. As App Developers, e go dey easier and cheaper for us to use data to give insight and use algorithm to make things automatic for everyday user experience. But as AI dey everywhere now, we go need understand the wahala wey fit happen if people use this kind algorithm like weapon for big scale.

Trend dey show say by 2025, we go dey create and use over 180 zettabytes of data. For Data Scientists, this plenty information go give us chance to get access to personal and behavior data wey never happen before. With this power, we fit build detailed user profile and dey influence decision-making small small—sometimes in way wey go make people think say dem dey choose freely when e no be so. This kind thing fit help push users go where we want, but e also dey bring big question about data privacy, freedom, and the limit wey dey for how algorithm fit dey influence people.

Data ethics na necessary guardrails for data science and engineering, e dey help us reduce wahala and things wey we no plan for wey fit happen because of how we dey use data. The Gartner Hype Cycle for AI dey show trend for digital ethics, responsible AI, and AI governance as big things wey dey push democratization and industrialization of AI.

Gartner's Hype Cycle for AI - 2020

For this lesson, we go look the interesting area of data ethics - from the main ideas and wahala, to case studies and how AI dey work like governance - wey dey help build ethics culture for teams and companies wey dey work with data and AI.

Pre-lecture quiz 🎯

Basic Definitions

Make we first understand the basic words.

The word "ethics" come from the Greek word "ethikos" (and the root "ethos") wey mean character or moral nature.

Ethics na the shared values and moral rules wey dey guide how we dey behave for society. Ethics no dey based on law but on the things wey people agree say na "right vs. wrong". But, ethics fit affect how companies dey do their governance and how government dey make rules wey go make people follow.

Data Ethics na new branch of ethics wey dey "study and check moral wahala wey dey related to data, algorithms and the way we dey use them". For here, "data" dey talk about things like how we dey create, record, arrange, process, share, and use data, "algorithms" dey focus on AI, agents, machine learning, and robots, and "practices" dey talk about things like responsible innovation, programming, hacking, and ethics codes.

Applied Ethics na the practical way we dey use moral ideas. E mean say we dey look into ethical wahala for real-world actions, products and processes, and dey take action to make sure say dem dey follow the ethical values wey we don set.

Ethics Culture na about making applied ethics work to make sure say the ethical rules and ways wey we dey do things dey used well and fit work for the whole company. Good ethics culture dey define company-wide ethical rules, give good reason for people to follow, and dey encourage and show the kind behavior wey we want for every level of the company.

Ethics Concepts

For this part, we go talk about things like shared values (principles) and ethical wahala (problems) for data ethics - and we go look case studies wey go help you understand these ideas for real-world situations.

1. Ethics Principles

Every data ethics plan dey start with defining ethical principles - the "shared values" wey dey describe the kind behavior wey we go accept, and dey guide the actions wey go follow the rules, for our data & AI projects. You fit define these for individual or team level. But, most big companies dey write these for ethical AI mission statement or framework wey dem dey set for company level and dey make sure say all teams dey follow am.

Example: Microsoft's Responsible AI mission statement talk say: "We dey committed to the growth of AI wey dey follow ethical rules wey dey put people first" - dem identify 6 ethical principles for the framework wey dey below:

Responsible AI at Microsoft

Make we look these principles small. Transparency and accountability na the main values wey other principles dey build on - so make we start from there:

  • Accountability dey make people wey dey work with data & AI responsible for their actions, and make sure say dem dey follow these ethical rules.
  • Transparency dey make sure say data and AI actions dey clear (understandable) to users, explaining wetin and why dem make decisions.
  • Fairness - dey focus on making sure say AI dey treat everybody fairly, and dey fix any bias wey dey for data and systems.
  • Reliability & Safety - dey make sure say AI dey behave well with the values wey we don set, and dey reduce wahala or things wey we no plan for.
  • Privacy & Security - dey talk about understanding data lineage, and dey give data privacy and protection to users.
  • Inclusiveness - dey talk about designing AI solutions wey go fit meet plenty human needs & abilities.

🚨 Think about wetin your data ethics mission statement fit be. Check ethical AI frameworks from other companies - here na examples from IBM, Google, and Facebook. Wetin dem get in common for their shared values? How these principles dey connect to the AI product or industry wey dem dey work for?

2. Ethics Challenges

After we don define ethical principles, the next thing na to check our data and AI actions to see if dem dey follow those shared values. Think about your actions for two parts: data collection and algorithm design.

For data collection, actions go involve personal data or personally identifiable information (PII) wey fit identify living people. This one include different types of non-personal data wey together fit identify person. Ethical wahala fit dey about data privacy, data ownership, and things like informed consent and intellectual property rights for users.

For algorithm design, actions go involve collecting & arranging datasets, then using them to train & deploy data models wey dey predict things or dey make decisions for real-world situations. Ethical wahala fit come from dataset bias, data quality problems, unfairness, and misrepresentation for algorithms - including some wahala wey dey inside the system.

For both cases, ethics wahala dey show where our actions fit no follow our shared values. To find, reduce, or remove these problems - we need ask moral "yes/no" questions about our actions, then take action to fix am. Make we look some ethical wahala and the moral questions wey dem dey bring:

2.1 Data Ownership

Data collection dey involve personal data wey fit identify the people wey the data belong to. Data ownership na about control and user rights for how data dey created, processed, and shared.

The moral questions wey we need ask na:

  • Who get the data? (user or company)
  • Wetin be the rights wey data people get? (ex: access, erasure, portability)
  • Wetin be the rights wey companies get? (ex: fix bad user reviews)

2.2 Informed Consent

Informed consent na when users agree to something (like data collection) with full understanding of the facts like the purpose, risks, and other options.

Questions to ask na:

  • The user (data person) give permission for data collection and use?
  • The user understand the reason why dem collect the data?
  • The user understand the risks wey fit happen because dem participate?

2.3 Intellectual Property

Intellectual property na the things wey people create wey fit get economic value for them or their business.

Questions to ask na:

  • The data wey dem collect get economic value for user or business?
  • The user get intellectual property for this matter?
  • The company get intellectual property for this matter?
  • If these rights dey, how we dey protect them?

2.4 Data Privacy

Data privacy or information privacy na about keeping user privacy and protecting their identity for personal data.

Questions to ask na:

  • The users' (personal) data dey safe from hacks and leaks?
  • The users' data dey accessible only to people wey suppose see am?
  • The users' anonymity dey kept when data dey shared or spread?
  • Person fit still find out who dey inside data wey dem don anonymize (re-identification)?
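One simple way to reason about di re-identification question na k-anonymity: for every combination of quasi-identifiers (like zip code plus age), how many people share am? Dis na small Python sketch - di records, field names, and values na just example wey we imagine for dis lesson, no be real data:

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return di smallest group size wey share di same quasi-identifier combo.
    If k na 1, at least one person dey unique - re-identification risk dey."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(combos.values())

# hypothetical "anonymized" records - name don comot but zip + age still dey
records = [
    {"zip": "100271", "age": 34, "rating": 5},
    {"zip": "100271", "age": 34, "rating": 2},
    {"zip": "900288", "age": 51, "rating": 4},
]

k = k_anonymity(records, ["zip", "age"])
print(k)  # 1 - di last record dey unique, so person fit match am with outside data
```

Dis kind check na wetin di Netflix case study show: even data wey look anonymized fit dey matched with external datasets if di quasi-identifier groups too small.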

2.5 Right To Be Forgotten

The Right To Be Forgotten or Right to Erasure dey give extra protection for personal data to users. E dey allow users request make their personal data dey deleted or removed from Internet searches and other places, under certain conditions - so dem fit start fresh online without their past actions dey follow them.

Questions to ask na:

  • The system dey allow data people request make their data dey deleted?
  • If user withdraw their consent, e suppose trigger automatic deletion?
  • Dem collect data without consent or in illegal way?
  • We dey follow government rules for data privacy?
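Di questions for dis list fit turn to system design requirements. Dis na small Python sketch (di class and method names na our own assumption, no be any real library) wey show di idea say when user withdraw consent, e suppose trigger automatic deletion:

```python
class UserDataStore:
    """Small sketch of a store wey support Right to Erasure."""

    def __init__(self):
        self.records = {}   # user_id -> personal data
        self.consent = {}   # user_id -> True/False

    def save(self, user_id, data, consent=True):
        self.records[user_id] = data
        self.consent[user_id] = consent

    def withdraw_consent(self, user_id):
        # if user withdraw consent, e trigger automatic deletion
        self.consent[user_id] = False
        self.erase(user_id)

    def erase(self, user_id):
        # delete di personal data for di user
        self.records.pop(user_id, None)

store = UserDataStore()
store.save("ada", {"email": "ada@example.com"})
store.withdraw_consent("ada")
print("ada" in store.records)  # False - di personal data don comot
```

For real system, erasure go also need cover backups, logs, and data wey dem don share with third parties - di sketch na just di main idea.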

2.6 Dataset Bias

Dataset or Collection Bias na when we select data wey no represent everybody for algorithm development, e fit cause unfair results for different groups. Types of bias include selection or sampling bias, volunteer bias, and instrument bias.

Questions to ask na:

  • We recruit data people wey represent everybody well?
  • We test the data wey we collect or arrange for different bias?
  • We fit reduce or remove any bias wey we find?
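To answer di first two questions, one simple check na to compare di share of each group for your sample with di share wey dem suppose get for di population. Dis na small Python sketch with imagined numbers (di sample and di population shares na assumption for illustration):

```python
from collections import Counter

def representation_gap(samples, population_share, group_key):
    """Compare di share of each group for di sample with di population share.
    Positive gap mean di group dey over-represented; negative mean under-represented."""
    counts = Counter(s[group_key] for s in samples)
    total = sum(counts.values())
    return {g: round(counts.get(g, 0) / total - share, 3)
            for g, share in population_share.items()}

# hypothetical survey: 80 people from group A, 20 from group B,
# but di real population na 50/50
samples = [{"group": "A"}] * 80 + [{"group": "B"}] * 20
population_share = {"A": 0.5, "B": 0.5}

gaps = representation_gap(samples, population_share, "group")
print(gaps)  # group B short by 30 percentage points - collection bias dey
```

Dis na di same kind wahala wey di Street Bump case study show: di app sample no represent everybody, so di model no see di potholes for low-income areas.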

2.7 Data Quality

Data Quality dey check the dataset wey we arrange for our algorithms, to see if the features and records dey meet the level of accuracy and consistency wey we need for our AI purpose.

Questions to ask na:

  • We capture valid features for our use case?
  • The data wey we capture dey consistent across different sources?
  • The dataset dey complete for different conditions or situations?
  • Di information wey dem capture dey accurate to show wetin dey happen for real life?
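Di completeness and validity questions fit dey automated as simple dataset checks. Dis na small Python sketch (di column names, rows, and valid ranges na example wey we imagine) wey flag missing values and values wey no make sense:

```python
def quality_report(rows, required, valid_ranges):
    """Simple completeness + validity checks for one dataset.
    Return list of (row index, column, problem)."""
    issues = []
    for i, row in enumerate(rows):
        for col in required:
            if row.get(col) is None:
                issues.append((i, col, "missing"))
        for col, (lo, hi) in valid_ranges.items():
            v = row.get(col)
            if v is not None and not (lo <= v <= hi):
                issues.append((i, col, "out of range"))
    return issues

rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 48000},   # missing value
    {"age": 230, "income": 61000},    # age wey no make sense
]

issues = quality_report(rows, required=["age", "income"], valid_ranges={"age": (0, 120)})
print(issues)  # [(1, 'age', 'missing'), (2, 'age', 'out of range')]
```

Checks like dis suppose run before training, so bad records no go quietly enter di model.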

2.8 Algorithm Fairness

Algorithm Fairness dey check whether di way dem design di algorithm dey do bad thing against some group of people wey dey inside di data. Dis one fit cause wahala for allocation (where dem no gree give resources or dem hold am back for di group) and quality of service (where AI no dey accurate for some group like e dey for others).

Questions wey you fit ask for dis one na:

  • We don check di model accuracy for different group of people and conditions?
  • We don look di system well to see if e fit cause wahala (like stereotyping)?
  • We fit change di data or train di model again to stop di wahala wey we don see?

Make you check resources like AI Fairness checklists to sabi more about how to put dis checks inside your workflow.
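Di first question for di list - checking model accuracy for different groups - fit look like dis small Python sketch (di labels, predictions, and group names na imagined example, no be real evaluation):

```python
def accuracy_by_group(y_true, y_pred, groups):
    """Compute model accuracy separately for each group of people."""
    stats = {}
    for t, p, g in zip(y_true, y_pred, groups):
        correct, total = stats.get(g, (0, 0))
        stats[g] = (correct + (t == p), total + 1)
    return {g: c / n for g, (c, n) in stats.items()}

# hypothetical labels and predictions for two groups
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 0]
groups = ["men", "men", "men", "women", "women", "women"]

result = accuracy_by_group(y_true, y_pred, groups)
print(result)  # men get higher accuracy than women - quality-of-service harm dey
```

Dis na di same kind disaggregated evaluation wey di Gender Shades study use to show say gender classification products no dey accurate for women and people of color.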

2.9 Misrepresentation

Data Misrepresentation na to ask whether di way we dey show di data insight dey honest or we dey use am lie to push wetin we want make people believe.

Questions wey you fit ask for dis one na:

  • We dey report data wey no complete or wey no correct?
  • Di way we dey show di data fit make people misunderstand wetin e mean?
  • We dey use some kind statistical method to change di result?
  • Other explanation dey wey fit show different conclusion?

2.10 Free Choice

Di Illusion of Free Choice na when di system dey use decision-making algorithm to push people make dem choose wetin di system prefer, but e go look like say di people get options and control. Dis dark patterns fit cause social and economic wahala for users. And since di choice wey users dey make fit enter their behavior profile, e fit make di wahala big or make am last long.

Questions wey you fit ask for dis one na:

  • Di user sabi wetin e mean to make dat choice?
  • Di user sabi di other options wey dey and di good and bad side of each one?
  • Di user fit change di choice wey di system or influence make am choose later?

3. Case Studies

To understand how dis ethical wahala dey affect real life, e good to look case studies wey dey show di wahala and di effect for people and society when dem no follow ethics.

Here be some examples:

| Ethics Wahala | Case Study |
| --- | --- |
| Informed Consent | 1972 - Tuskegee Syphilis Study - Dem promise di African American men wey join di study free medical care, but na deceive dem deceive dem. Di researchers no tell dem say dem get di sickness or say treatment dey. Many people die, and di wahala affect their partners and children; di study last for 40 years. |
| Data Privacy | 2007 - Di Netflix data prize give researchers 10M anonymized movie rankings from 50K customers to help improve recommendation algorithms. But researchers fit match di anonymized data with personal data from external datasets (like IMDb comments) - so dem "de-anonymize" some Netflix subscribers. |
| Collection Bias | 2013 - Di City of Boston develop Street Bump, one app wey citizens fit use report potholes, to help di city get better road data to fix di wahala. But many people for low income group no get cars and phones, so di wahala for their roads no show for di app. Di developers work with academics to fix di wahala of equitable access and digital divides for fairness. |
| Algorithmic Fairness | 2018 - Di MIT Gender Shades Study check di accuracy of gender classification AI products, and dem see say e no dey accurate for women and people of color. For 2019, e look like say di Apple Card give women less credit than men. Both show di wahala of algorithm bias wey dey cause socio-economic harm. |
| Data Misrepresentation | 2020 - Di Georgia Department of Public Health release COVID-19 charts wey mislead citizens about di trend of confirmed cases because of how dem arrange di x-axis. Dis one show misrepresentation through di way dem show di data. |
| Illusion of Free Choice | 2020 - Learning app ABCmouse pay $10M to settle FTC complaint say dem trap parents make dem dey pay for subscription wey dem no fit cancel. Dis one show dark patterns for di way dem design di choice system, wey dey push users make dem make choices wey fit harm dem. |
| Data Privacy & User Rights | 2021 - Facebook Data Breach expose data of 530M users, and dem pay $5B settlement to FTC. But dem no gree tell users about di breach, wey go against user rights for data transparency and access. |

You wan see more case studies? Check dis resources:

🚨 Think about di case studies wey you don see - you don experience or dem don affect you with similar ethical wahala for your life? You fit think of at least one other case study wey show one of di ethical wahala wey we don talk for dis section?

Applied Ethics

We don talk about ethics concepts, wahala, and case studies for real life. But how we go start to apply ethical principles and practices for our projects? And how we go operationalize dis practices for better governance? Make we look some real-life solutions:

1. Professional Codes

Professional Codes na one way wey organizations fit use to "encourage" members to support their ethical principles and mission statement. Codes na moral guidelines for professional behavior, wey dey help employees or members make decisions wey match di organization principles. Di code go work well if members dey follow am by choice; but many organizations dey give reward and punishment to make members follow am.

Examples na:

🚨 You dey part of any professional engineering or data science organization? Check their site to see if dem get professional code of ethics. Wetin e talk about their ethical principles? How dem dey "encourage" members to follow di code?

2. Ethics Checklists

While professional codes dey show di ethical behavior wey dem expect from practitioners, dem get known limitations for enforcement, especially for big projects. Instead, many data science experts dey support checklists, wey fit connect principles to practices in ways wey dem fit track and use.

Checklists dey turn questions to "yes/no" tasks wey dem fit use for standard product release workflows.
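Di "yes/no" idea fit even dey coded as gate inside di release workflow. Dis na small Python sketch - di checklist questions and function names na our own assumption, just to show di pattern:

```python
# hypothetical ethics checklist wey turn di questions from dis lesson to yes/no gate
CHECKLIST = [
    "Di data subjects give informed consent?",
    "We don test di dataset for collection bias?",
    "We don check model accuracy for different groups?",
    "Users fit request make dem delete their data?",
]

def release_gate(answers):
    """Block di release if any checklist answer no be yes.
    Return (ok, list of checks wey fail)."""
    failed = [q for q, ok in zip(CHECKLIST, answers) if not ok]
    return (len(failed) == 0, failed)

ok, failed = release_gate([True, True, False, True])
print(ok)      # False - one check fail, so release no go happen
print(failed)  # di fairness check na im fail
```

Because di gate dey run for every release, di checklist no go turn to document wey people sign once and forget - e dey enforce di principles as part of di normal workflow.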

Examples na:

3. Ethics Regulations

Ethics na to define shared values and do di right thing by choice. Compliance na to follow di law if e dey. Governance na di way organizations dey operate to enforce ethical principles and follow di law.

Today, governance dey two ways for organizations. First, na to define ethical AI principles and set practices to make sure dem dey use am for all AI-related projects for di organization. Second, na to follow all government-mandated data protection regulations for di places wey dem dey operate.

Examples of data protection and privacy regulations:

🚨 Di European Union wey define GDPR (General Data Protection Regulation) na one of di most important data privacy regulations today. You sabi say e also define 8 user rights to protect citizens' digital privacy and personal data? Learn about wetin dem be and why dem matter.

4. Ethics Culture

Make you sabi say e get one gap between compliance (to do wetin di law talk) and di way we dey address big wahala (like ossification, information asymmetry, and distributional unfairness) wey fit make AI turn weapon fast.

To fix dis one, e need collaborative ways to define ethics cultures wey go build emotional connection and shared values across organizations for di industry. Dis one dey call for more formalized data ethics cultures for organizations - e go allow anybody to pull di Andon cord (to raise ethics wahala early for di process) and make ethical assessments (like for hiring) one important criteria for team formation for AI projects.


Post-lecture quiz 🎯

Review & Self Study

Courses and books dey help to understand di main ethics concepts and wahala, while case studies and tools dey help for applied ethics practices for real life. Here be some resources to start with.

Assignment

Write A Data Ethics Case Study


Disclaimer:
Dis dokyument don use AI translet service Co-op Translator do di translet. Even as we dey try make am correct, abeg make you sabi say machine translet fit get mistake or no dey accurate well. Di original dokyument for im native language na di one wey you go take as di correct source. For important mata, e good make professional human translet am. We no go fit take blame for any misunderstanding or wrong interpretation wey fit happen because you use dis translet.