You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
app-ideas/Projects/Podcast-Directory-App.md

89 lines
3.5 KiB

# Podcast Directory
**Tier:** 2-Intermediate
In the [GitHub Status](./GitHub-Status-App.md) app you learned how to use the
Request package to scrape information from a web page. The Podcast Directory
continues this process and introduces you to another web scraping package -
[Puppeteer](https://github.com/GoogleChrome/puppeteer).
Although Request is a useful package it isn't built specifically for web
scraping like Puppeteer. As you gain experience with web scraping you'll find
that there are web sites and applications where web scraping is made easier
by using a tool like Puppeteer that is specifically built for scaping.
It is important to note that while web scraping has its place, the use of
an API or a data source such as a file or database is always preferrable to
scraping information from a page. The reason being that even minor changes to
page styling can render your web scraper inoperable. For example, the change
of a CSS class name your scraping logic is dependent on.
The goal of the Podcast Directory app is to pull the most recent episodes of
the [Javascript Jabber](https://www.podbean.com/podcast-detail/d4un8-57595/JavaScript-Jabber-Podcast)
and [Techpoint Charlie](https://www.podbean.com/podcast-detail/k76vd-8adc7/Techpoint-Charlie-Podcast)
podcasts from [Podbean](https://www.podbean.com) and create a new page that
creates a combined list of episodes, sorted by broadcast date.
## User Stories
- [ ] User can see a table of podcast episodes
- [ ] User can see rows in this table showing a clickable episode icon, the
title of the episode, and the date it was originally broadcast.
- [ ] User can scroll through the list
- [ ] User can click on the episode icon to display that episodes page on
the Podbean web site.
## Bonus features
- [ ] User can see rows in the episode table have alternating background
colors.
- [ ] User can see a summary above the episode table showing the number
of episodes for each podcast.
- [ ] User can see a clickable checkbox next to each podcast name in the
summary above the episode table.
- [ ] User can click the radio button next to the podcast name to include
episodes for that podcast in the episode table.
## Useful links and resources
- [Puppeteer](https://github.com/GoogleChrome/puppeteer)
- [Web Scraping with a Headless Browser: A Puppeteer Tutorial](https://www.toptal.com/puppeteer/headless-browser-puppeteer-tutorial)
- [querySelectorAll](https://developer.mozilla.org/en-US/docs/Web/API/ParentNode/querySelectorAll)
- Hint! You can use the following code to help you get started with this
project. You can execute this using the command line command `node puptest`.
```
// puptest.js
const puppeteer = require('puppeteer');
const run = async () => {
return new Promise(async (resolve, reject) => {
try {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto("https://www.podbean.com/podcast-detail/d4un8-57595/JavaScript-Jabber-Podcast");
let episodeLinks = await page.evaluate(() => {
return Array.from(document.querySelectorAll('a.title')).map((item) => ({
url: item.getAttribute('href'),
text: item.innerText
})
);
});
browser.close();
return resolve(episodeLinks);
} catch (e) {
return reject(e);
}
})
}
run()
.then(console.log)
.catch(console.error);
```
- When you have completed this project check out the advanced project
[MyPodcast Library](./Projects/MyPodcast-Library-app.md)
## Example projects
N/a