So if I look at what the FOSS implementations do, it's the same as what the image analyzers do: sliding a smaller window over the input to reduce context. So perhaps the issue is that there is too much information inside each window when the speed is too high (because the windowing is probably static?).
For example, here is how SonicVerse operates on large audio files (note the 10 s chunks):
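
To make the static-window idea concrete, here's a rough sketch of fixed-length chunking in plain Python. This is not SonicVerse's actual code: `chunk_bounds`, `beats_per_window`, and the 16 kHz sample rate are names and values I made up for illustration, and only the 10 s chunk length comes from the above.

```python
def chunk_bounds(num_samples, sample_rate, chunk_seconds=10.0, hop_seconds=None):
    """Return (start, end) sample indices for fixed-length windows.

    Hypothetical helper. The window length is static: it does not
    adapt to how dense the audio content is.
    """
    if num_samples < 0 or sample_rate <= 0 or chunk_seconds <= 0:
        raise ValueError("num_samples must be >= 0; rate and chunk length must be > 0")

    chunk = int(round(chunk_seconds * sample_rate))
    # With no hop given, windows sit back to back (no overlap).
    hop = chunk if hop_seconds is None else int(round(hop_seconds * sample_rate))
    if chunk <= 0 or hop <= 0:
        raise ValueError("chunk and hop must be at least one sample")

    bounds = []
    start = 0
    while start < num_samples:
        # The last window is cut short at the end of the file.
        bounds.append((start, min(start + chunk, num_samples)))
        start += hop
    return bounds


def beats_per_window(bpm, chunk_seconds=10.0):
    """Number of beats that land in one static window at a given tempo."""
    return bpm / 60.0 * chunk_seconds


if __name__ == "__main__":
    sample_rate = 16_000  # assumed for the example
    for start, end in chunk_bounds(25 * sample_rate, sample_rate):
        print(f"{start / sample_rate:5.1f}s -> {end / sample_rate:5.1f}s")
    for bpm in (120, 180):
        print(bpm, "bpm:", beats_per_window(bpm), "beats per 10 s window")
```

With a static 10 s window, a 180 bpm track puts 30 beats into each chunk versus 20 at 120 bpm, which is the "too much information within the window" effect I'm guessing at. A tempo-aware window would have to shrink `chunk_seconds` as the tempo goes up.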