GitHub - NotJoeMartinez/yt-fts: Youtube Full Text Search - Search all of a YouTube's subtitles from the command line


yt-fts

yt-fts is a simple Python script that uses yt-dlp to scrape all of a YouTube channel's subtitles and load them into an SQLite database that is searchable from the command line. It lets you query a channel for a specific keyword or phrase and generates time-stamped YouTube URLs to the videos containing it.
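
As a rough illustration of the idea described above, the sketch below stores a few subtitle lines in an in-memory SQLite table and searches them with a plain LIKE query. The table name, column names, and the `build_demo_db` and `search_subtitles` helpers are hypothetical; they are not taken from yt-fts, which may use a different schema or SQLite's full-text search features.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical subtitles table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE subtitles (video_id TEXT, start_time TEXT, text TEXT)"
    )
    rows = [
        # The first two rows reuse quotes from the sample search output.
        ("CJ_KAsz8rjQ", "01:50:07.790", "life in the big city Dan is wearing the"),
        ("nKcqbHQndFQ", "00:31:05.669", "the show life in the big city love these"),
        # Made-up row, included only so the query has something to filter out.
        ("nKcqbHQndFQ", "00:31:09.000", "an unrelated line of dialogue"),
    ]
    conn.executemany("INSERT INTO subtitles VALUES (?, ?, ?)", rows)
    return conn


def search_subtitles(conn, phrase):
    """Return (video_id, start_time, text) rows whose text contains phrase."""
    cur = conn.execute(
        "SELECT video_id, start_time, text FROM subtitles "
        "WHERE text LIKE ? ORDER BY video_id, start_time",
        (f"%{phrase}%",),
    )
    return cur.fetchall()


conn = build_demo_db()
for video_id, start_time, text in search_subtitles(conn, "big city"):
    print(video_id, start_time, text)
```

A parameterized query is used so that the search phrase is never spliced directly into the SQL string.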

Installation

git clone https://github.com/NotJoeMartinez/yt-fts
cd yt-fts
python3 -m venv .env
source .env/bin/activate
pip install -r requirements.txt

This project requires yt-dlp to be installed globally. See the yt-dlp installation instructions if you have issues.

pip

python3 -m pip install -U yt-dlp

homebrew

brew install yt-dlp

Usage

Usage: yt_fts.py [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  delete    delete [channel id]
  download  download [channel url]
  export    export [channel id] [search text]
  list      Lists channels
  search    search [channel id] [search text]

download

Downloads all of a channel's VTT subtitle files into your database.

python yt_fts.py download "https://www.youtube.com/@TimDillonShow/videos"

If this fails, you can supply the channel ID manually with the --channel-id flag:

python yt_fts.py download "https://www.youtube.com/@TimDillonShow/videos" --channel-id "UC4woSp8ITBoYDmjkukhEhxg"
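
To give a sense of what loading VTT files into a database involves, here is a minimal sketch that pulls the start time, end time, and text out of each WebVTT cue. It is not the parser yt-fts uses; `parse_vtt` is a hypothetical helper, it only handles `HH:MM:SS.mmm` timestamps, and it does not strip the inline tags that auto-generated YouTube captions can contain. The end time in the sample cue is made up.

```python
import re

# Matches the timing line of a cue, e.g. "00:27:17.549 --> 00:27:20.000".
CUE_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})"
)


def parse_vtt(vtt_text):
    """Return a list of (start, end, text) tuples, one per cue."""
    cues = []
    # Cues are separated by blank lines.
    for block in vtt_text.strip().split("\n\n"):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIMING.match(line)
            if match:
                # Everything after the timing line is the cue text.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append((match.group(1), match.group(2), text))
                break
    return cues


sample = """WEBVTT

00:27:17.549 --> 00:27:20.000
life in the big city it was one of my
"""

for start, end, text in parse_vtt(sample):
    print(start, end, text)
```

Blocks without a timing line, such as the `WEBVTT` header, are skipped because the regular expression never matches them.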

list

Lists all of your downloaded channels.

python yt_fts.py list

output:

Listing channels
channel_id               channel_name        channel_url
------------------------ ------------------- ---------------------------------------------------------------
UC4woSp8ITBoYDmjkukhEhxg The Tim Dillon Show https://www.youtube.com/channel/UC4woSp8ITBoYDmjkukhEhxg/videos

search

Searches a channel for text, using the channel ID you give it, and prints a URL that jumps to that point in the video.

python yt_fts.py search [channel_id] "text you want to find"

Example:

python yt_fts.py search UC4woSp8ITBoYDmjkukhEhxg "life in the big city"

output:

Video title"("#208 - Let's Have A Party | The Tim Dillon Show - YouTube",)"
 Quote: "life in the big city Dan is wearing the"
 Time Stamp: 01:50:07.790
 Link: https://youtu.be/CJ_KAsz8rjQ?t=6604
Video title"('#176 - The Florida Project | The Tim Dillon Show - YouTube',)"
 Quote: "the show life in the big city love these"
 Time Stamp: 00:31:05.669
 Link: https://youtu.be/nKcqbHQndFQ?t=1862
Video title"('164 - Life In The Big City - YouTube',)"
 Quote: "life in the big city it was one of my"
 Time Stamp: 00:27:17.549
 Link: https://youtu.be/dqGyCTbzYmc?t=1634
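
In the sample output above, each link's `t` value is the time stamp converted to whole seconds minus 3 (for example, 01:50:07.790 is 6607 seconds and the link uses 6604). The sketch below reproduces that arithmetic. The 3-second lead-in is inferred from these three results only, not confirmed from the yt-fts source, and `timestamp_to_seconds` and `build_link` are hypothetical helper names.

```python
def timestamp_to_seconds(time_stamp):
    """Convert an 'HH:MM:SS.mmm' time stamp to whole seconds (fraction dropped)."""
    hours, minutes, seconds = time_stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + int(float(seconds))


def build_link(video_id, time_stamp, lead_in=3):
    """Build a youtu.be link that starts lead_in seconds before the quote.

    The default of 3 seconds is an assumption inferred from the sample output.
    """
    start = max(timestamp_to_seconds(time_stamp) - lead_in, 0)
    return f"https://youtu.be/{video_id}?t={start}"


print(build_link("CJ_KAsz8rjQ", "01:50:07.790"))
print(build_link("nKcqbHQndFQ", "00:31:05.669"))
print(build_link("dqGyCTbzYmc", "00:27:17.549"))
```

Starting slightly before the matched cue gives the viewer a moment of context before the quoted words are spoken.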

export

Similar to search, except it exports all of the search results to a CSV file with the headers: Video Title,Quote,Time Stamp,Link
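
A minimal sketch of writing results in that CSV layout with Python's standard `csv` module follows. The `write_results_csv` helper is hypothetical and is not taken from yt-fts; the single result row reuses values from the sample search output.

```python
import csv
import io

# Header row as described for the export command.
HEADERS = ["Video Title", "Quote", "Time Stamp", "Link"]


def write_results_csv(results, out_file):
    """Write the header row followed by one row per search result."""
    writer = csv.writer(out_file)
    writer.writerow(HEADERS)
    writer.writerows(results)


results = [
    (
        "#208 - Let's Have A Party | The Tim Dillon Show - YouTube",
        "life in the big city Dan is wearing the",
        "01:50:07.790",
        "https://youtu.be/CJ_KAsz8rjQ?t=6604",
    )
]

# An in-memory buffer stands in for a real file opened with newline="".
buffer = io.StringIO()
write_results_csv(results, buffer)
print(buffer.getvalue())
```

Using `csv.writer` rather than joining strings by hand means that titles or quotes containing commas or quotation marks are escaped correctly.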

delete

Deletes a channel from your database.

python yt_fts.py delete [channel_id]
