structured data details

pull/38/head
Jasmine 4 years ago
parent 401629fbf4
commit 257d31423b

@ -1,7 +1,7 @@
# Defining Data
## Introduction
Data are facts, information, observations and measurements that are used to make discoveries and to support informed decisions. A dataset, which is a collection of data may come in different formats and structures, and will usually be based on its source, ot where the data came from. For example, a company's monthly earnings might be in a spreadsheet but hourly heart rate data from a smartwatch may be in [JSON](https://stackoverflow.com/a/383699) format. It's common for data scientists to work with different types of data within a dataset.
Data are facts, information, observations, and measurements that are used to make discoveries and to support informed decisions. A dataset, which is a collection of data, may come in different formats and structures, usually determined by its source, or where the data came from. For example, a company's monthly earnings might be in a spreadsheet, while hourly heart rate data from a smartwatch may be in [JSON](https://stackoverflow.com/a/383699) format. It's common for data scientists to work with different types of data within a dataset.
This lesson focuses on identifying and classifying data by its characteristics and its sources.
@ -21,14 +21,24 @@ This lesson focuses on identifying and classifying data by its characteristics a
## How Data is Structured
## Structured Data
Structured data is data that is organized into rows and columns, where each row will have the same set of columns. A benefit of structured data is that it can be organized in such a way that it can be related to other structured data. However, because the data is designed to be organized in a specific way, making changes to its overall structure can take a lot of effort to do.
Structured data is data that is organized into rows and columns, where each row has the same set of columns. Each column holds values of a particular type and is identified by a name describing what those values represent, while rows contain the actual values. A benefit of structured data is that it can be organized so that it relates to other structured data. However, because the data is designed to be organized in a specific way, changing its overall structure can take a lot of effort. For example, imagine you've been asked to add data that doesn't currently exist in a structured dataset. One option would be to add a new column, but you'll also need to decide whether and how to add values for the existing rows.
Examples of structured data: Spreadsheets, relational databases
Examples of structured data: spreadsheets, relational databases, phone numbers, bank statements
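The trade-off of adding a column can be sketched in a few lines of Python. This is a minimal illustration, not part of the lesson's own material: the sample rows, the `email` column, and the `add_column` helper are all hypothetical, and filling existing rows with a placeholder is only one of several possible choices.

```python
# Hypothetical structured dataset: every row has the same set of columns.
rows = [
    {"name": "Ada", "phone": "555-0100"},
    {"name": "Grace", "phone": "555-0101"},
]


def add_column(rows, column, default=None):
    """Return new rows that include `column`.

    Existing rows have no value for the new column, so each one
    receives `default` as a placeholder.
    """
    return [{**row, column: row.get(column, default)} for row in rows]


# Adding an "email" column forces a decision about the existing rows;
# here they are filled with None to mark the value as missing.
updated = add_column(rows, "email")
for row in updated:
    print(row)
```

Every row still ends up with the same set of columns, which is what keeps the data structured, but the placeholder values now have to be handled wherever the dataset is used.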
## Semi-structured
![image of type]()
*type description*
## Unstructured Data
Unstructured data
Unstructured data typically cannot be categorized into rows or columns, or doing so takes more effort than it would with structured data. Unstructured data is not organized into a specific format, which makes it easier to add new information.
Examples of unstructured data: email, text files, social media, text messages, video files
![image of type]()
*type description*
## Semi-structured
## Sources of Data
### Internet
#### APIs
