About
The purpose of this website is to provide an easily searchable index of YouTube datasets created at different points in time.
You can search the index in two ways: by text, or by supplying a video ID.
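As a rough illustration of how the two search modes could be told apart, the sketch below classifies a query as a video ID or as free text. It is not this site's actual code: the `classify_query` function and `VIDEO_ID_RE` pattern are hypothetical names, and the pattern assumes the commonly observed video ID shape of 11 characters drawn from letters, digits, `-` and `_`.

```python
import re

# Assumption: a video ID is exactly 11 characters from A-Z, a-z, 0-9, "-" and "_".
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def classify_query(query: str) -> str:
    """Return "video_id" if the query looks like a video ID, otherwise "text"."""
    stripped = query.strip()
    if VIDEO_ID_RE.fullmatch(stripped):
        return "video_id"
    return "text"
```

Any single 11-character word also matches the pattern, so a heuristic like this can misclassify some text queries. Letting the user pick the search mode explicitly avoids that ambiguity.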
How was the data obtained?
All data in the database currently comes from existing datasets, each collected at a different point in time.
Which datasets were used?
Here's a list:
YouTube Crawl Survey Dataset 2009–2010 by chfoo
Items: 105,527,188
Date Range: ~2009-08-01 - 2010-11-15
Source: https://archive.org/details/YouTubeCrawlSurveyDataset2009-2010
Fields covered: title, duration, uploaded_at, view_count, rating, rates, favorite_count
YouTube Tagging Dataset (2006–2007) by Samuel A. Burns and Gary Geisler
Items: 1,092,310
Date Range: 2006-11-02 - 2007-01-29
Source: https://snrubmas.github.io/youtube-tagging-dataset-2006/
Fields covered: title, description, duration, uploaded_at, uploader, view_count, rating, rates, comment_count
Dataset for Statistics of YouTube Videos 2007–2008 by Xu Cheng, Cameron Dale and Jiangchuan Liu at Simon Fraser University
Items: 8,386,391
Date Range: 2007-02-22 - 2008-08-27
Source: https://netsg.cs.sfu.ca/youtubedata/
Fields covered: duration, uploaded_at, uploader, category, view_count, rating, rates, comment_count
YouTube snapshots from Common Crawl 2012 by Common Crawl [data] and SindexMon [parsing]
Items: 149,399,162
Date Range: ~2012
Source: https://archive.org/details/youtube_150m
Fields covered: title, description, uploaded_at, uploader, view_count, duration
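Because each dataset covers a different set of fields, one way to hold entries from all four in a single index is a record whose fields are optional. The sketch below is only an illustration, not this site's actual schema: the `VideoRecord` class, the `source` label, the `covered_fields` helper and the chosen types are all assumptions. Only the field names come from the lists above.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class VideoRecord:
    """One indexed video; any field a dataset does not cover stays None."""
    video_id: str
    source: str  # hypothetical label naming the dataset the entry came from
    title: Optional[str] = None
    description: Optional[str] = None
    duration: Optional[int] = None      # assumed to be in seconds
    uploaded_at: Optional[str] = None   # assumed to be a date string
    uploader: Optional[str] = None
    category: Optional[str] = None
    view_count: Optional[int] = None
    rating: Optional[float] = None
    rates: Optional[int] = None
    favorite_count: Optional[int] = None
    comment_count: Optional[int] = None


def covered_fields(record: VideoRecord) -> List[str]:
    """List the metadata fields that are actually present on a record."""
    skip = {"video_id", "source"}
    return [
        f.name
        for f in fields(record)
        if f.name not in skip and getattr(record, f.name) is not None
    ]


# Made-up example values, for illustration only.
example = VideoRecord(video_id="xxxxxxxxxxx", source="example", title="Example", view_count=10)
print(covered_fields(example))  # prints ['title', 'view_count']
```

Keeping absent fields as `None` rather than filling in defaults makes it possible to tell "not covered by this dataset" apart from a real value such as a view count of zero.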
Thanks to everyone mentioned here for making the datasets freely available to use.
More information will be added here in the future.