
Can see a bunch of PoW went into this, so kudos! 🚀

Can you talk a bit about all of these models you are currently using? How did you arrive at your current list?

  • Text = Mixtral 8x7B-Instruct
  • Audio = tortoise-tts model
  • Image = Stable Diffusion XL
  • ‘Vision’ = LLaVA-13b

Thank you! PoW is the only way!

We are constantly monitoring for the latest and greatest models. If a new model performs better, we deploy it! We are also tracking new things (Vision is a good example), and if we see value in them, we put them on the platform.

But to answer your question specifically:

Text:

  • Mixtral 8x7B is currently the best Open-Source model based on our internal tests as well as multiple benchmarks. It's also very efficient, which is what allows us to charge only 21 sats per prompt (see the sketch after this list if you want to try the open model yourself).
  • We also offer a "Code" model called Code Llama 70B, which can produce better results than GPT-4 on this specific task.
  • We are looking into adding another totally uncensored model, where no subjects or topics are off-limits.
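
For anyone who wants to poke at Mixtral outside our platform, here's a minimal sketch using the Hugging Face transformers library. This is the open model itself, not our API; the prompt is just illustrative:

```python
# Minimal sketch: querying Mixtral 8x7B-Instruct locally via transformers.
# Needs a lot of VRAM (or quantization); device_map="auto" requires accelerate.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Explain proof-of-work in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```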

Image:

  • Stable Diffusion is, for now, the best Open-Source image model. Other models exist that could be cheaper, but given the performance and the limits (see the comment above) of the best model, we don't think they actually bring a lot of value.
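
If you're curious, here's a minimal local sketch of SDXL with the diffusers library. Again, this is the open model, not our endpoint, and the prompt is made up:

```python
# Minimal sketch: one image from Stable Diffusion XL via diffusers on a GPU.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(prompt="a lighthouse at dawn, oil painting").images[0]
image.save("lighthouse.png")
```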

Vision:

  • LLaVA is an incredible model that came out just a few days after GPT-4 Vision and is the best we've tested so far. Multimodality is key to unlocking new use cases for LLMs.
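
Here's roughly what querying LLaVA locally looks like with transformers. The model id and chat format below follow the llava-hf community ports, so treat them as assumptions and double-check against the release you use:

```python
# Minimal sketch: asking LLaVA a question about a local image.
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-13b-hf"  # community port; id is an assumption
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id, device_map="auto")

image = Image.open("photo.jpg")  # any local image
prompt = "USER: <image>\nWhat is in this picture?\nASSISTANT:"

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```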

Audio:

  • This is more of a "toy" model. It's fun to try, but it's still very limited in its current form. We just wanted to put it out there so people can see another "side" of AI.
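
If you want to hear it for yourself, here's a minimal sketch following the tortoise-tts repo's README. Treat the exact calls as an assumption on my part and check the repo before running:

```python
# Minimal sketch: synthesizing a short clip with tortoise-tts.
import torchaudio
from tortoise.api import TextToSpeech
from tortoise.utils.audio import load_voice

tts = TextToSpeech()
# "tom" is one of the demo voices bundled with the repo.
voice_samples, conditioning_latents = load_voice("tom")

gen = tts.tts_with_preset(
    "PoW is the only way!",
    voice_samples=voice_samples,
    conditioning_latents=conditioning_latents,
    preset="fast",  # trades quality for speed; "standard" is slower but better
)
torchaudio.save("output.wav", gen.squeeze(0).cpu(), 24000)
```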