
jsaifc 20 2021-08-24 Speech Recognition


Pitchfork Reviews through 12/6/17 (http://ds.jsai.org.cn/)

Context

This data set consists of the archive of Pitchfork reviews from January 5th, 1999 to December 6th, 2017 for a total of ~19,500 reviews. The data was scraped from Pitchfork.com in multiple sessions on 12/6/2017 using code written in Python with the BeautifulSoup library. Full code for scraping and analysis can be found [here][1].

Content

Features scraped include the artist, album, date, genre, score, whether the album was given a "best" tag, and the text of the review itself.

Acknowledgements

Thank you to Pitchfork for making their website easily accessible to scraping!

Inspiration

Possible questions to explore:

- How does word usage (especially adjectives) change by genre?
- Do reviews with higher scores tend to have different language used?
- What are the words identified most with each genre of review?
- How does word usage change over time?
- How does average score change by genre, and over time?
- Can we accurately predict the genre of a review just based on its words?

[1]: https://github.com/evanm31/p4k-scraper
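As a starting point for the question of how average score changes by genre and over time, here is a minimal sketch using only the Python standard library. It is not taken from the linked scraper repository. The field names (`genre`, `date`, `score`), the `YYYY-MM-DD` date format, and the function name `mean_score_by_genre_and_year` are assumptions for illustration, and the sample rows are made-up placeholders, not real Pitchfork data.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Made-up placeholder rows for illustration only; NOT real Pitchfork data.
# The field names and the date format are assumptions about the data set.
sample_reviews = [
    {"genre": "Rock", "date": "2001-03-01", "score": 7.0},
    {"genre": "Rock", "date": "2001-09-15", "score": 8.0},
    {"genre": "Rock", "date": "2002-05-20", "score": 6.0},
    {"genre": "Electronic", "date": "2001-07-04", "score": 9.0},
]


def mean_score_by_genre_and_year(reviews):
    """Return {(genre, year): mean score} for an iterable of review dicts."""
    buckets = defaultdict(list)
    for row in reviews:
        # Group each review by its genre and the year of its review date.
        year = datetime.strptime(row["date"], "%Y-%m-%d").year
        buckets[(row["genre"], year)].append(float(row["score"]))
    return {key: mean(scores) for key, scores in buckets.items()}


averages = mean_score_by_genre_and_year(sample_reviews)
for (genre, year), avg in sorted(averages.items()):
    print(f"{genre} {year}: {avg:.2f}")
```

With the actual data, the rows could be read with `csv.DictReader` instead of the hard-coded list, after adjusting the column names and date format to match the file.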

