Woke up at 3:30am thinking about #1016522 and #1017884.
To spend my very early Sunday morning productively, I:
  1. Leeched the whole 8h41m19s livestream of #1018281
  2. Demultiplexed the audio
  3. Cut it into 1-minute chunks so it doesn't need 50TB of RAM
  4. Ran the chunks through Whisper, which spits out the text of what was said
  5. Combined the transcribed chunks. The text is now 450 kB (!!!)
  6. Ran the full text through spaCy for normalization
  7. Ran the normalized, lemmatized text through a bert-base-uncased transformer
  8. Asked it: "what should policy support?"
It answered: modernization, ideology, geopolitics. That's nice and concise and now I don't have to watch nearly 9 hours of talks. (kidding, but this is still useful)
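For anyone who wants to try step 3, here is a rough sketch of how the 1-minute cuts could be planned. It is not the script I actually ran: the file names `audio.wav` and `chunk_0000.wav` are made up for illustration, and using ffmpeg for the cutting is an assumption. The code only builds the commands, it does not execute them.

```python
import re

def parse_duration(text):
    """Turn a string such as '8h41m19s' into a number of seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text)
    if not text or match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds

def plan_chunks(total_seconds, chunk_seconds=60):
    """Return (start, length) pairs that cover the whole recording."""
    return [
        (start, min(chunk_seconds, total_seconds - start))
        for start in range(0, total_seconds, chunk_seconds)
    ]

def ffmpeg_command(source, index, start, length):
    """Build (but do not run) an ffmpeg call that copies one chunk.

    The output name pattern is hypothetical.
    """
    return [
        "ffmpeg",
        "-ss", str(start),   # seek to the chunk start, in seconds
        "-t", str(length),   # read only this many seconds
        "-i", source,
        "-c", "copy",        # copy the audio stream without re-encoding
        f"chunk_{index:04d}.wav",
    ]

total = parse_duration("8h41m19s")
chunks = plan_chunks(total)
print(len(chunks), "chunks, last one is", chunks[-1][1], "seconds long")
print(" ".join(ffmpeg_command("audio.wav", 0, *chunks[0])))
```

Each command could then be passed to `subprocess.run`, and each resulting chunk transcribed on its own, so memory use stays bounded by one minute of audio.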
  • Total time spent coding: 1h (next time: at most 10 minutes)
  • Total compute time:
    • leech & demux: 6m
    • whisper: 1h12m
    • NLP: 1m
    • transformers: 9m
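Adding up the compute figures above gives 1h28m in total, most of it Whisper. A quick sketch of that arithmetic, compared against the length of the livestream (coding time is left out, and the helper names are just for illustration):

```python
import re

def to_seconds(text):
    """Turn a string such as '1h12m' into a number of seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text)
    if not text or match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds

def to_text(seconds):
    """Format a number of seconds as e.g. '1h28m00s'."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h{minutes:02d}m{secs:02d}s"

# Compute times as listed in the post.
compute = {
    "leech & demux": "6m",
    "whisper": "1h12m",
    "NLP": "1m",
    "transformers": "9m",
}

total_compute = sum(to_seconds(value) for value in compute.values())
stream = to_seconds("8h41m19s")

print("total compute:", to_text(total_compute))
print(f"livestream length / compute time: {stream / total_compute:.1f}x")
```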