23
2mon
12

Made a script that pulls MP3s from YouTube podcasts, transcribes them with WhisperX, and then uses an LLM to generate a summary. It works surprisingly well.

https://git.sr.ht/~yogthos/transcribe-yt/tree/main/item/README.md
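
For anyone curious what the "pull the MP3" step can look like, here is a minimal sketch. It assumes `yt-dlp` is installed and on the PATH, which the post does not confirm; `build_audio_command` and `download_audio` are made-up names for illustration, not functions from the linked repo.

```python
import subprocess
from typing import List


def build_audio_command(video_url: str, out_template: str = "%(id)s.%(ext)s") -> List[str]:
    """Build a yt-dlp command that extracts the audio track as MP3.

    Assumption: yt-dlp is the downloader; the actual script may use something else.
    """
    return [
        "yt-dlp",
        "--extract-audio",        # keep only the audio track
        "--audio-format", "mp3",  # convert the audio to MP3
        "--output", out_template, # output filename template
        video_url,
    ]


def download_audio(video_url: str) -> None:
    """Run the command; raises CalledProcessError if yt-dlp exits with an error."""
    subprocess.run(build_audio_command(video_url), check=True)
```

The command is built separately from running it, so it can be printed or inspected without touching the network.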
Commiejones - 2mon

That sounds awesome. I have a thing that pulls the latest Mercouris rant and converts it to an MP3 at 2x speed. I got DeepSeek to make mine, though, because I don't know anything about coding. Every few weeks it stops working, so I have to go back to DeepSeek and get it to fix it.
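
A rough sketch of the 2x-speed conversion, assuming `ffmpeg` is installed and using its `atempo` audio filter. This is not the commenter's actual script; `build_speedup_command` is a hypothetical helper name.

```python
from typing import List


def build_speedup_command(src: str, dst: str, speed: float = 2.0) -> List[str]:
    """Build an ffmpeg command that re-encodes src to dst at the given tempo.

    Assumption: ffmpeg is on the PATH. The range check is deliberately
    conservative, since older ffmpeg builds limit one atempo filter to 0.5-2.0.
    """
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    return [
        "ffmpeg",
        "-y",                              # overwrite the output file if it exists
        "-i", src,                         # input file
        "-vn",                             # drop any video stream
        "-filter:a", f"atempo={speed}",    # change tempo without changing pitch
        dst,
    ]
```

Passing the returned list to `subprocess.run(..., check=True)` would perform the conversion.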

5
☆ Yσɠƚԋσʂ ☆ - 2mon

lol not having the time to listen to his latest rant was the motivation. I was curious whether he had any new juicy tidbits, but I just didn't feel like listening through the whole video

originally I just used this site to get a transcript https://www.youtranscripts.com/ and then threw it in DeepSeek, and then I was like wait, I can totally script this. The script is mostly vibe coded too 🤣

only downside right now is that WhisperX isn't terribly fast, so I'm trying to see if Parakeet might work better

updated to use Parakeet, and it's way faster. Also realized I can use subtitles when they're available, skipping the transcription step entirely

6
Commiejones - 2mon

What if you could get the AI summary to check itself against previous summaries of videos by the same channel? Maybe it could avoid too much brevity on some specifics while avoiding repetition of recurring summarized points.

2
☆ Yσɠƚԋσʂ ☆ - 2mon

yeah you could totally keep history of summaries per channel, and then have it read through them too, definitely lots of opportunities for improvement here
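
Rough idea of how the per-channel history could work, as a sketch only. Everything here is hypothetical (the JSON-file layout, the function names, the prompt wording), and it assumes the channel name is already safe to use as a filename.

```python
import json
from pathlib import Path
from typing import List


def load_history(history_dir: Path, channel: str) -> List[str]:
    """Return earlier summaries for a channel, oldest first (empty if none)."""
    path = history_dir / f"{channel}.json"
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def save_summary(history_dir: Path, channel: str, summary: str, keep: int = 5) -> None:
    """Append a summary and keep only the most recent `keep` entries."""
    history_dir.mkdir(parents=True, exist_ok=True)
    history = load_history(history_dir, channel)
    history.append(summary)
    path = history_dir / f"{channel}.json"
    path.write_text(json.dumps(history[-keep:], ensure_ascii=False), encoding="utf-8")


def build_prompt(transcript: str, history: List[str]) -> str:
    """Build an LLM prompt that includes earlier summaries when there are any."""
    parts = ["Summarize the transcript below."]
    if history:
        parts.append(
            "Previous summaries from this channel are listed first. "
            "Mention recurring points only briefly and give more detail on anything new."
        )
        parts.extend(
            f"Previous summary {i}:\n{text}" for i, text in enumerate(history, 1)
        )
    parts.append(f"Transcript:\n{transcript}")
    return "\n\n".join(parts)
```

Capping the history with `keep` stops the prompt from growing without bound as more videos get summarized.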

2
Commiejones - 2mon

Oh! Oh! and then you get the AI to make a video of the original presenter reading the summary.

3
☆ Yσɠƚԋσʂ ☆ - 2mon

love it 🤣

2
mistermodal @lemmy.ml - 2mon

I mean at this point you should know what he is going to say

3
Commiejones - 2mon

90% of it yeah. But that 10% would take hours to dig up on my own.

4
mistermodal @lemmy.ml - 2mon

I just wonder why I bother with them sometimes lol. That's fair though.

3
Commiejones - 2mon

It's just noise while I walk my dog.

4
Commiejones - 2mon

Just had a thought. Maybe you could just snatch the YouTube-generated transcript instead.

1
☆ Yσɠƚԋσʂ ☆ - 2mon

Yeah, the script will try doing that by default now, and if it can't then it falls back to transcribing the audio. Also switched to Parakeet from WhisperX cause it's way faster.
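
The subtitles-first, transcribe-as-fallback logic can be sketched roughly like this. It is not the code from the repo: `get_transcript` is a made-up name, and the two callables are placeholders for whatever actually fetches subtitles and runs the speech-to-text model.

```python
from typing import Callable, Optional, Tuple


def get_transcript(
    video_url: str,
    fetch_subtitles: Callable[[str], Optional[str]],
    transcribe_audio: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (transcript, source), preferring subtitles that already exist.

    fetch_subtitles and transcribe_audio are hypothetical hooks supplied
    by the caller; source is either "subtitles" or "transcription".
    """
    try:
        subtitles = fetch_subtitles(video_url)
    except Exception:
        # Treat any subtitle failure as "no subtitles" and fall through.
        subtitles = None

    if subtitles and subtitles.strip():
        return subtitles, "subtitles"

    # Slow path: download the audio and run the speech-to-text model.
    return transcribe_audio(video_url), "transcription"
```

Returning the source alongside the text makes it easy to log how often the slow transcription path actually gets used.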

2