Gotta push the limits. Also, the README says it's multimodal, so I was expecting a JPG lol.
It's multimodal for input, not output unfortunately.
I wonder how much it could be improved by removing 139 languages and the audio and video modalities.