The hardest part is data cleaning. What source/API do you use to scrape for "news"?
Do these sources have limits or require subscriptions?
How do you know an article is about a particular country/region?
Maybe the article doesn't mention any country by name and instead uses tangential references like "Putin", "Federal Reserve", etc.
You can just search Google News for "bitcoin [country name]"
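A rough sketch of what that search could look like in Python. The function name `google_news_query_url` and the country list are made up for illustration, and the URL pattern (`news.google.com/search` with a `q` parameter) is an assumption about the public search page, not a documented API. It only builds the query URLs; fetching and parsing results is left out.

```python
from urllib.parse import urlencode


def google_news_query_url(topic: str, country: str) -> str:
    """Build a Google News search URL for '<topic> <country>'.

    Assumption: the public search page takes the query in the `q` parameter.
    """
    query = f"{topic} {country}"
    # urlencode escapes spaces and special characters in the query string.
    return "https://news.google.com/search?" + urlencode({"q": query})


# Hypothetical country list, just to show the loop.
for country in ["Russia", "United States", "Japan"]:
    print(google_news_query_url("bitcoin", country))
```

Note this only finds articles that match the country name as a search term, so it doesn't help with the "Putin" / "Federal Reserve" case above.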
You're absolutely right.
IMO, there's one angle: use a language model.
I'm sorry it's that angle... the alternative is to "incentivize users to provide (only) clean data".
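For what it's worth, the language-model angle could be wired up roughly like this. Everything here is hypothetical: `build_prompt`, `tag_country`, and the country list are illustrative names, and `ask_model` is just a callable you'd back with whatever model you actually use. The `fake_model` stand-in is hard-coded so the sketch runs offline; it is not a real model.

```python
from typing import Callable, List, Optional

# Hypothetical list of countries to tag against.
COUNTRIES = ["Russia", "United States", "Japan"]


def build_prompt(article_text: str, countries: List[str]) -> str:
    """Ask the model to pick exactly one country from a closed list."""
    options = ", ".join(countries)
    return (
        "Which of these countries is the article mainly about? "
        f"Options: {options}, or NONE. Answer with one option only.\n\n"
        f"Article:\n{article_text}"
    )


def tag_country(
    article_text: str,
    ask_model: Callable[[str], str],
    countries: List[str] = COUNTRIES,
) -> Optional[str]:
    """Return the country the model picked, or None if the answer is unusable."""
    answer = ask_model(build_prompt(article_text, countries)).strip()
    # Only accept answers from the allowed list; anything else stays untagged.
    for country in countries:
        if answer.lower() == country.lower():
            return country
    return None


def fake_model(prompt: str) -> str:
    """Offline stand-in for a real model call (assumption, for demo only)."""
    article = prompt.split("Article:\n", 1)[1]
    if "Putin" in article:
        return "Russia"
    if "Federal Reserve" in article:
        return "United States"
    return "NONE"


print(tag_country("Putin signs crypto mining law", fake_model))
```

Restricting the model to a closed list and rejecting anything off-list keeps bad answers from leaking into the data, which is the cleaning problem this thread started with.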