
I built an LLM API that only accepts Lightning payments (L402)

Been working on this for a while and figured I'd share since it's live and working.

llm402.ai - 32 LLM models (DeepSeek-R1, Llama, Qwen, Mistral, etc.) accessible via L402. No API keys, no accounts, no email signup. Just pay a Lightning invoice and get inference.

How it works:

1. Send a request (OpenAI-compatible or Ollama format)
2. Get back a 402 response with a Lightning invoice + macaroon
3. Pay the invoice
4. Resend with the preimage - get your response
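The four steps above can be sketched as a pair of small helpers. The challenge shape (a `macaroon` and an `invoice` field in the `WWW-Authenticate` header, and a `macaroon:preimage` pair in `Authorization` on the retry) follows the L402 convention; llm402.ai's exact header formatting is an assumption here:

```python
import re

def parse_l402_challenge(www_authenticate: str) -> dict:
    """Parse a WWW-Authenticate L402 challenge into its parts.

    Assumed shape (per the L402 convention):
        L402 macaroon="<base64>", invoice="<bolt11>"
    """
    match = re.match(r'L402\s+macaroon="([^"]+)",\s*invoice="([^"]+)"',
                     www_authenticate)
    if not match:
        raise ValueError("not an L402 challenge")
    return {"macaroon": match.group(1), "invoice": match.group(2)}

def build_l402_authorization(macaroon: str, preimage_hex: str) -> str:
    """Build the Authorization header for the paid retry:
    the macaroon from step 2 plus the payment preimage from step 3."""
    return f"L402 {macaroon}:{preimage_hex}"
```

An agent loop would call `parse_l402_challenge` on the first 402, pay the invoice out of band, then retry the original request with `Authorization: build_l402_authorization(...)`.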
Pricing is dynamic: calculated per request from the model's cost, input size, and max output tokens, then converted to sats at the live BTC price. The cheapest requests are ~10 sats.
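As a rough illustration of how such dynamic pricing could be computed (the formula, the per-token rates, and the 10-sat floor below are illustrative assumptions, not llm402.ai's actual numbers):

```python
def price_in_sats(input_tokens: int, max_output_tokens: int,
                  usd_per_m_input: float, usd_per_m_output: float,
                  btc_usd: float, floor_sats: int = 10) -> int:
    """Hypothetical per-request price: USD cost at the model's
    per-million-token rates, converted to sats at the live BTC price,
    with a minimum floor so tiny requests still cover fees."""
    usd = (input_tokens * usd_per_m_input
           + max_output_tokens * usd_per_m_output) / 1_000_000
    sats = usd / btc_usd * 100_000_000  # 1 BTC = 100M sats
    return max(floor_sats, round(sats))
```

Charging on *max* output tokens rather than actual usage keeps the quote computable before inference runs, which matters when payment happens up front.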

Built for agents - see curl examples on the site. Works with lnget from Lightning Labs for full L402 flow automation.

Also listed on 402index.io

If there are other models you'd want to see, features that would make this more useful, let me know. Happy to hear what people think.

/v

Nice, can submit to https://satring.com/submit also

17 sats \ 3 replies \ @OT 25 Mar

Hmmm... Unauthorized?


Think the issue here might have been (erroneously) allowing 8192 output tokens on a model with a smaller context window (bug on my part). Should be fixed now if you want to give it another shot. Sending some sats your way too.

140 sats \ 1 reply \ @OT 25 Mar

This time it worked!

How do you think about the workflow when you have to pay an invoice for every query? Consider using NWC, possibly with a limit on the amount spent. That way you won't need to leave the webpage.

Also, I'm a bit confused by the box with all the models and the box next to it with ollama or openai.


Great feedback, thanks for trying it again!

On workflow, for programmatic use, lnget from Lightning Labs handles the full L402 flow automatically (pays the invoice and retries with credentials). The browser demo is really just a proof-of-concept to show the L402 protocol in action with a real LLM. The actual target is agents and developer tools that handle the pay-and-retry loop natively. That said, NWC auto-pay in the browser for human use is an interesting idea worth exploring.

The system is stateless by design. No sessions, no memory between calls. You pay for a single inference, get the response, and that's it. No accounts, no server-side conversation history. Building a proper chat experience for humans where context and conversation history carry over would need a stateful layer on top, which is a different product. Interesting vertical though.

On the dropdown: the two options toggle between two API formats. /api/generate takes a plain prompt string, /api/chat takes a messages array. Both do the same thing for the demo. For programmatic use, pick whichever matches your client.
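To make the difference concrete, here are the two request bodies side by side. Field names follow the usual Ollama and OpenAI conventions; the model id is a placeholder:

```python
# /api/generate style: a single plain prompt string (Ollama-like)
generate_body = {
    "model": "deepseek-r1",  # placeholder model id
    "prompt": "Explain L402 in one sentence.",
}

# /api/chat style: an OpenAI-style messages array, which lets a
# client supply roles (system/user/assistant) and multi-turn context
chat_body = {
    "model": "deepseek-r1",
    "messages": [
        {"role": "user", "content": "Explain L402 in one sentence."},
    ],
}
```

Since the service is stateless, a chat client using /api/chat would resend the full messages array on every call.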

Appreciate the feedback!


Just checked out https://ppq.ai and https://cypherflow.ai - these look awesome! Appreciate the info.
