Can't help it if AI was trained on the way people like me write 🤷🏻‍♂️
Looking back at that specific phrase, it's indeed very bot-like.
It does sound like a model orienting itself for a reply.
(I tend to draw pretty heavily on this finding from image diffusion models to understand how LLMs build coherent output. I'm probably overgeneralizing it.)
Apparently people actually use em dashes out in the wild: #1406132
The ambiguity in comparing those model outputs highlights an important point in this discussion: you'll need a labeled ground-truth dataset on which to test the quality of the model outputs. You could probably construct this by gathering a bunch of comments known to be relevant (zapped more than once, posted by trusted users, etc.) and a bunch of comments known to be LLM/spam, then testing the model's ability to pick the spam out from the relevant.
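A rough sketch of what that evaluation could look like. The sample comments, the `classify()` stub, and its keyword heuristic are all placeholders; you'd swap in the real labeled comments and whatever model you're actually testing:

```python
# Minimal evaluation harness: score a classifier against a hand-labeled
# ground-truth set. All data here is placeholder material for illustration.
from sklearn.metrics import precision_recall_fscore_support

# Comments known to be relevant (zapped more than once, trusted users, etc.)
relevant = [
    "Fees spiked again today, worth watching the mempool.",
    "Interesting take, though I'd weight custodial risk higher.",
]
# Comments known to be LLM/spam
spam = [
    "Great post! As an AI language model, I find this very insightful.",
]

texts = relevant + spam
truth = [0] * len(relevant) + [1] * len(spam)  # 1 = LLM/spam

def classify(text: str) -> int:
    """Stand-in for the model under test; replace with a real model call."""
    return 1 if "as an ai language model" in text.lower() else 0

preds = [classify(t) for t in texts]
precision, recall, f1, _ = precision_recall_fscore_support(
    truth, preds, average="binary", zero_division=0
)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

I'd watch precision especially: flagging a real user's comment as spam is probably costlier than letting the occasional bot comment slip through.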
I'd also probably reduce the dimensionality of the assignment to make the classification task simpler: just relevant yes/no and LLM yes/no is where I'd start.
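Concretely, the label schema could be as simple as two independent booleans per comment (field names here are made up):

```python
# Two independent binary labels instead of one multiclass label: a comment
# can be LLM-written yet still relevant, and this keeps the two questions
# separable so each can be evaluated on its own.
from dataclasses import dataclass

@dataclass
class LabeledComment:
    text: str
    relevant: bool  # on-topic / useful to the thread?
    llm: bool       # machine-generated?

dataset = [
    LabeledComment("Fees spiked again today.", relevant=True, llm=False),
    LabeledComment("Great question! Let's delve deeper.", relevant=False, llm=True),
]

# Each axis gets its own classifier (or its own prompt), scored separately.
for c in dataset:
    print(f"{c.text!r}: relevant={c.relevant}, llm={c.llm}")
```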