Made a script that pulls MP3s from YouTube podcasts, transcribes them with WhisperX, and then uses an LLM to generate a summary. It works surprisingly well.
That sounds awesome. I have a thing that pulls the latest Mercouris rant and converts it to an MP3 at 2x speed. I got DeepSeek to make mine though, because I don't know anything about coding. Every few weeks it stops working, so I have to go back to DeepSeek and get it to fix it.
5
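The script described above isn't shown in the thread, so this is only a rough sketch of one way to do the "pull the audio and speed it up 2x" step. It assumes `yt-dlp` and `ffmpeg` are installed and on the PATH; the function names and the output filename template are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_download_command(url: str, out_template: str = "%(id)s.%(ext)s") -> list[str]:
    """yt-dlp invocation that extracts the audio track and converts it to MP3."""
    return ["yt-dlp", "-x", "--audio-format", "mp3", "-o", out_template, url]


def build_speedup_command(src: Path, dst: Path, speed: float = 2.0) -> list[str]:
    """ffmpeg invocation that re-encodes src at `speed` times the original tempo."""
    if not 0.5 <= speed <= 2.0:
        # Older ffmpeg builds limit a single atempo filter to this range,
        # so the sketch stays inside it rather than chaining filters.
        raise ValueError("speed must be between 0.5 and 2.0")
    # atempo changes playback speed without shifting the pitch; -vn drops any video stream.
    return ["ffmpeg", "-y", "-i", str(src), "-vn", "-filter:a", f"atempo={speed}", str(dst)]


def run(cmd: list[str]) -> None:
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)
```

Building the commands as lists keeps them easy to print and check before anything is actually run, which might help when the thing breaks every few weeks.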
☆ Yσɠƚԋσʂ ☆ - 2mon
lol not having the time to listen to his latest rant was the motivation. I was curious whether he's got any new juicy tidbits, but I just don't feel like listening through the whole video
originally I just used this site to get a transcript https://www.youtranscripts.com/ and then threw it in DeepSeek, and then I was like wait, I can totally script this. The script is mostly vibe coded too 🤣
only downside right now is that WhisperX isn't terribly fast, I'm trying to see if Parakeet might work better
updated to use Parakeet, and it's way faster. Also realized I can use subtitles when they're available, which skips the transcription step entirely
6
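For the subtitles route, here is a minimal sketch of what fetching and cleaning captions could look like. It is not taken from the linked script. It assumes `yt-dlp` is used to download WebVTT subtitles, and the helper names (`build_subtitle_command`, `vtt_to_text`) are hypothetical. The cleanup is deliberately simple and does not handle numeric cue identifiers or multi-line NOTE blocks.

```python
import re

# Matches cue timing lines such as "00:00:01.000 --> 00:00:03.000 align:start".
TIMESTAMP = re.compile(r"^(\d+:)?\d{2}:\d{2}\.\d{3} --> ")
# Matches inline markup such as <c> or <00:00:01.500> inside caption text.
TAG = re.compile(r"<[^>]+>")


def build_subtitle_command(url: str, lang: str = "en") -> list[str]:
    """yt-dlp invocation that downloads only subtitles (uploaded or auto-generated)."""
    return ["yt-dlp", "--skip-download", "--write-subs", "--write-auto-subs",
            "--sub-langs", lang, "--sub-format", "vtt", "-o", "%(id)s.%(ext)s", url]


def vtt_to_text(vtt: str) -> str:
    """Reduce a WebVTT file to plain text suitable for an LLM prompt."""
    lines: list[str] = []
    for raw in vtt.splitlines():
        line = TAG.sub("", raw).strip()
        if not line or line.startswith("WEBVTT") or TIMESTAMP.match(line):
            continue
        if line.startswith(("Kind:", "Language:", "NOTE")):
            continue
        # Auto-generated captions tend to repeat the previous line in the next cue.
        if lines and lines[-1] == line:
            continue
        lines.append(line)
    return " ".join(lines)
```

Dropping consecutive duplicate lines matters mostly for auto-generated captions, which would otherwise roughly double the transcript length.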
Commiejones - 2mon
What if you could get the AI summary to check itself against previous summaries of videos by the same channel? Maybe it could avoid too much brevity on some specifics while avoiding repetition of recurring summarized points.
2
☆ Yσɠƚԋσʂ ☆ - 2mon
yeah you could totally keep history of summaries per channel, and then have it read through them too, definitely lots of opportunities for improvement here
2
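A rough sketch of the per-channel history idea, assuming summaries are stored as a JSON list in one file per channel and that the channel identifier is safe to use as a filename. Nothing here comes from the actual script: the storage layout, the function names, and the prompt wording are all placeholders.

```python
import json
from pathlib import Path


def load_history(store: Path, channel: str, limit: int = 5) -> list[str]:
    """Return up to `limit` of the most recent summaries saved for a channel."""
    path = store / f"{channel}.json"
    if not path.exists() or limit <= 0:
        return []
    return json.loads(path.read_text(encoding="utf-8"))[-limit:]


def save_summary(store: Path, channel: str, summary: str) -> None:
    """Append a new summary to the channel's history file."""
    store.mkdir(parents=True, exist_ok=True)
    path = store / f"{channel}.json"
    history = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    history.append(summary)
    path.write_text(json.dumps(history, ensure_ascii=False, indent=2), encoding="utf-8")


def build_prompt(transcript: str, previous: list[str]) -> str:
    """Build an LLM prompt that includes earlier summaries as context."""
    parts = ["Summarize the transcript below."]
    if previous:
        parts.append("Earlier summaries from the same channel follow. "
                     "Mention points already covered there only briefly, "
                     "and go into detail on anything new.")
        parts.extend(f"Previous summary {i}:\n{s}" for i, s in enumerate(previous, 1))
    parts.append(f"Transcript:\n{transcript}")
    return "\n\n".join(parts)
```

Capping the history with `limit` keeps the prompt from growing without bound as more videos get summarized.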
Commiejones - 2mon
Oh! Oh! and then you get the AI to make a video of the original presenter reading the summary.
3
☆ Yσɠƚԋσʂ ☆ - 2mon
love it 🤣
2
mistermodal @lemmy.ml - 2mon
I mean at this point you should know what he is going to say
3
Commiejones - 2mon
90% of it yeah. But that 10% would take hours to dig up on my own.
4
mistermodal @lemmy.ml - 2mon
I just wonder why I bother with them sometimes lol. That's fair though
3
Commiejones - 2mon
It's just noise while I walk my dog.
4
Commiejones - 2mon
Just had a thought. Maybe you could just snatch the YouTube-generated transcript instead.
1
☆ Yσɠƚԋσʂ ☆ - 2mon
Yeah, the script will try doing that by default now, and if it can't then it falls back to transcribing the audio. Also switched to parakeet from whisperx cause it's way faster.
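The "try subtitles first, fall back to transcribing the audio" behaviour could be sketched roughly like this. It is an illustration only, not the script's real code. Both callables are placeholders passed in by the caller: `fetch_subtitles` stands in for whatever grabs existing captions, and `transcribe_audio` stands in for the speech-to-text step (Parakeet, WhisperX, or anything else).

```python
from typing import Callable, Optional


def get_transcript(
    url: str,
    fetch_subtitles: Callable[[str], Optional[str]],
    transcribe_audio: Callable[[str], str],
) -> tuple[str, str]:
    """Return (transcript, source), where source is 'subtitles' or 'audio'."""
    try:
        text = fetch_subtitles(url)
    except Exception:
        # In this sketch any subtitle failure just means "not available".
        text = None
    if text and text.strip():
        return text, "subtitles"
    # No usable subtitles, so pay the cost of transcribing the audio.
    return transcribe_audio(url), "audio"
```

Returning the source alongside the text makes it easy to log how often the slow path is actually needed.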
yogthos in programming
https://git.sr.ht/~yogthos/transcribe-yt/tree/main/item/README.md