
Still working on Innocuous, a way to encode/decode messages in LLM-generated output: https://github.com/sutt/innocuous
I like my new example. This text:
Amidst the ancient forest, dwelt a wondrous Wizard renowned for his arcane might. One day, he unearthed true power not in enchantments, but in wellaimed words. Fearing misuse, he penned this power three ways in his magnum opus obscuring it with tedious trifles.
Time passed, the wizard departed, while countless seekers puzzled over his laborious tome. Until one sage, whose curiosity was matched only by fortitude, finally discovered the cryptic keys tucked deep within these words un
Decodes to: "pip install innocuous"
Still need to wrap my head around this. In simple terms, are initial_prompt, chunk_size, num_logprobs, and encoded_prompt the only things I'll need to make sure to remember or save somewhere when I decide to decode the output?
LLMs are changing really fast. Is this compatible with any model?
You mention encoding PGP keys, URLs, cryptocurrency addresses, and nostr pubkeys as example use cases... Would you trust this method to hide wallet seeds, the way traditional steganography does?
Thanks for taking a look! The point you're getting at is quite important. I imagine a "standard" where the first two bytes represent a "version number" that fixes the values of the free-floating parameters, with lots of different initial_prompts available to produce a variety of texts.

version = 96

initial_prompt = "Once upon a time, in a kingdom far away, there lived a"
chunk_size = 3
model = Mistral7Bv0.2Q4

version = 154

initial_prompt = "The algorithm processes data by first analyzing the input and then"
chunk_size = 2
model = Llama4.1-8B-Q6
You will only need to remember encoded_prompt, which will have the data + version number encoded in it. So it should eventually work like opening a .docx in MS Word: if it's a valid text created by the encoder, it will open up the message; if it's invalid, it will fail or show random characters.
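To make the version-number idea concrete, here is a minimal Python sketch of what such a header could look like at the byte level. It is not the Innocuous API: `VersionParams`, `REGISTRY`, `add_header`, and `split_header` are hypothetical names, and the big-endian two-byte layout is an assumption. The two registry entries just reuse the example values above. It also only covers the bytes after they have been recovered from the text; how a decoder would read the header before it knows the parameters is not settled in this thread.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionParams:
    """Hypothetical bundle of the parameters a version number would pin down."""
    initial_prompt: str
    chunk_size: int
    model: str


# Hypothetical registry; the entries reuse the two example versions from the thread.
REGISTRY = {
    96: VersionParams(
        "Once upon a time, in a kingdom far away, there lived a",
        3,
        "Mistral7Bv0.2Q4",
    ),
    154: VersionParams(
        "The algorithm processes data by first analyzing the input and then",
        2,
        "Llama4.1-8B-Q6",
    ),
}


def add_header(version: int, message: bytes) -> bytes:
    """Prefix the message with a two-byte, big-endian version number (assumed layout)."""
    if version not in REGISTRY:
        raise ValueError(f"unknown version {version}")
    return struct.pack(">H", version) + message


def split_header(payload: bytes) -> tuple[VersionParams, bytes]:
    """Read the two-byte version, look up its parameters, and return the remaining bytes."""
    if len(payload) < 2:
        raise ValueError("payload too short to contain a version header")
    (version,) = struct.unpack(">H", payload[:2])
    params = REGISTRY.get(version)
    if params is None:
        raise ValueError(f"unknown version {version}")
    return params, payload[2:]


if __name__ == "__main__":
    payload = add_header(96, b"pip install innocuous")
    params, message = split_header(payload)
    print(params.model, params.chunk_size, message)
```

With a scheme like this, the model would be part of what the version number implies, so it would not have to be remembered separately as long as both sides share the same registry.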
I get that people's instinct is "let me hide my seed phrase in there" because it's the most obvious thing to hide, but it's not really the right fit (IMO). There are other opportunities that I see opening up once people have thought about this concept for a week...
You will only need to remember encoded_prompt which will have the data + version number encoded in it.
Are you sure that will be enough? I feel the model is another detail to remember. Or can any model be used?
Why not a seed phrase, and what are the other opportunities you see on the horizon?