diff --git a/03-defining-data/README.md b/03-defining-data/README.md
index e73dc4e6..3617486c 100644
--- a/03-defining-data/README.md
+++ b/03-defining-data/README.md
@@ -1,12 +1,44 @@
 # Defining Data
+## Introduction
+Data are facts, information, observations and measurements that are used to make discoveries and to support informed decisions. A dataset, which is a collection of data, may come in different formats and structures, usually determined by its source, or where the data came from. For example, a company's monthly earnings might be in a spreadsheet, but hourly heart rate data from a smartwatch may be in [JSON](https://stackoverflow.com/a/383699) format. It's common for data scientists to work with different types of data within a dataset.
+
+
+This lesson focuses on identifying and classifying data by its characteristics and its sources.
+
 ## Pre-Lecture Quiz
 [Pre-lecture quiz]()
-# What is Data?
-#
+
+## The 5 V's of Big Data
+
+
+
+### Velocity
+
+The speed at which data is collected.
+
+### Veracity
+
+The quality of the data. Was it collected ethically?
+
+### Variety
+
+Structured
+Semi-Structured
+Unstructured
+
+### Value
+
+Is the data complete?
+
+### Volume
+
+The amount of data collected.
+
+## Sources of Data