firstly, thanks for expressing your bafflement!
what your original post was asking
sparking conversation, with the poll options in case someone reads, has an opinion, and doesn't wish to waste verbal entropy [let alone CCs] in comments
sounds like more of a ~devs question though
yeah it's the most likely crossposting target; however, my concern is with SN specifically, not some born-yesterday "what's a robots file good for" chitchat
some people are actively anti-crawler, which I think is a bit of an extreme stance: they prefer that folks who wish to deploy spiders literally shake hands and reach an agreement, written or otherwise, rather than releasing them across the whirled wild web and hoping for good faith and mutually profitable results. "devs" could have all sorts of opinions; I'm interested specifically in how people think this website should behave when visited by crawlers.
an important note: the following is the only information I have personally reviewed about the site's behavior towards crawlers:
- your lack of a proper /robots.txt [cute gif! sad code...]
- Archive.today's /wip/ dynamic tracing monitor
- Wayback Machine's /save/ summary resource listing
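for reference, a minimal robots.txt the site could serve looks something like the following. this is purely illustrative: the paths and delay are made up, not SN's actual routes or policy:

```
# hypothetical policy, paths are illustrative
User-agent: *
Crawl-delay: 10
Disallow: /api/
Allow: /
```

even a permissive file like this would be an improvement over serving nothing, since it gives well-behaved crawlers an explicit signal to honor.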
most importantly, I've not actually read the site's ~200K sloc, despite actually cloning the repo and telling myself several times that doing so would be a better use of my electricity bill than running a lightning node.
@remindme in 1 year
... want them to be kind and considerate. Does that clear this up?
honestly, it only clears things up as far as my own personal use of the site goes.
once I begin imagining even something as "common good" as a script for archiving "best of" posts and comments, I run up against the following problem: all the socioeconomic machinery that you and your community are innovating is irrelevant, from the perspective of a client without even the occasional temptation to comment anonymously.
right now, I'm guessing that robots either ignore the site completely, or crawl it without much consideration for anything beyond HTTP errors and IP bans, which is a sad state of affairs. it should be possible to reach a better compromise.
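a sketch of what that better compromise could look like on the client side, using Python's stdlib urllib.robotparser. the robots.txt contents, paths, and user-agent here are all hypothetical (the site currently serves no such file); the point is checking the policy before fetching, instead of reacting only to HTTP errors and IP bans:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy text; illustrative only, not SN's actual routes.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /api/
"""

def make_parser(robots_text: str) -> RobotFileParser:
    """Parse a robots.txt body without a network fetch."""
    rp = RobotFileParser()
    rp.parse(robots_text.splitlines())
    return rp

rp = make_parser(ROBOTS_TXT)

# A considerate crawler checks permission and honors the crawl delay
# before each request, rather than hammering and retrying on bans.
print(rp.can_fetch("my-archiver/0.1", "/items/1"))      # allowed: not disallowed
print(rp.can_fetch("my-archiver/0.1", "/api/graphql"))  # blocked: /api/ is off-limits
print(rp.crawl_delay("my-archiver/0.1"))                # seconds to wait between fetches
```

in practice the client would fetch the live file with RobotFileParser.set_url() and read(), and sleep for crawl_delay() between requests.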
from the perspective of a client without even the occasional temptation to comment anonymously
The perspective of anyone is out of our control.
Robots may be inconsiderate, and that is sad, but being permissive to those we think are humans will let some robots in, and being restrictive to those we think are robots will keep some humans out.
It was unclear what your original post was asking. That's why I didn't vote in the poll. It sounds like more of a ~devs question though.