A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code
arxiv.org/abs/2508.18106
32 sats \ 0 comments \ @optimism 1 Sep AI
related
MCP-Bench: Benchmarking Tool-Using LLM Agents
arxiv.org/abs/2508.20453
239 sats \ 0 comments \ @optimism 30 Aug AI
Claude 3.5 Sonnet
www.anthropic.com/news/claude-3-5-sonnet
411 sats \ 0 comments \ @k00b 21 Jun 2024 tech
LLM Rankings: programming | OpenRouter
openrouter.ai/rankings/programming
96 sats \ 0 comments \ @m0wer 28 May tech
Qwen3-235B-A22B-2507
xcancel.com/Alibaba_Qwen/status/1947344511988076547
218 sats \ 0 comments \ @m0wer 24 Jul AI
"Benchwashing" - how do you defend against this?
1648 sats \ 10 comments \ @optimism 9 Aug AskSN
Alibaba has released its flagship Qwen3-Max model with a trillion parameters
chat.qwen.ai/
167 sats \ 0 comments \ @lunin 25 Sep AI
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
arxiv.org/abs/2509.03867
306 sats \ 0 comments \ @optimism 7 Sep AI
Researchers discover impressive learning capabilities in long-context LLMs
venturebeat.com/ai/deepmind-researchers-discover-impressive-learning-capabilities-in-long-context-llms/
297 sats \ 0 comments \ @ch0k1 25 Apr 2024 tech
Wairdle
1133 sats \ 4 comments \ @crrdlx 9 Aug AI
LLM Alignment: Reward-Based vs Reward-Free Methods
towardsdatascience.com/llm-alignment-reward-based-vs-reward-free-methods-ef0c0f6e8d88?gi=90f7a78bfcff
17 sats \ 0 comments \ @ch0k1 6 Jul 2024 news
My lived experience writing with ChatGPT
567 sats \ 10 comments \ @realBitcoinDog 15 Apr BooksAndArticles
The flagship model, Qwen3-Max-Preview, has been released
100 sats \ 0 comments \ @lunin 5 Sep AI
No More Floating Points, The Era of 1.58-bit Large Language Models
medium.com/ai-insights-cobet/no-more-floating-points-the-era-of-1-58-bit-large-language-models-b9805879ac0a
100 sats \ 1 comment \ @0xbitcoiner 11 Mar 2024 science freebie
Vals AI — Finance Agent Benchmark
www.vals.ai/benchmarks/finance_agent-04-22-2025?utm_campaign=wp_the_technology_202&utm_medium=email&utm_source=newsletter
54 sats \ 3 comments \ @BlokchainB 24 Apr AI
pylint MCP provider
1428 sats \ 6 comments \ @optimism 4 Jun builders
To Understand AI, Watch How It Evolves
www.quantamagazine.org/to-understand-ai-watch-how-it-evolves-20250924/
100 sats \ 0 comments \ @0xbitcoiner 24 Sep AI
Meet Open Interpreter: Open-Source Project that Lets GPT-4 Execute Python Code
www.marktechpost.com/2024/03/28/meet-open-interpreter-an-open-source-project-that-lets-gpt-4-execute-python-code-locally/
126 sats \ 1 comment \ @ch0k1 3 Apr 2024 devs
Self Healing Code – Auto patching vulns with Gen AI
www.dylandavis.net/2024/11/self-healing-code/
271 sats \ 0 comments \ @aljaz 4 Nov 2024 devs
Building LLMs is probably not going to be a brilliant business
calpaterson.com/porter.html
446 sats \ 3 comments \ @byzantine 29 Nov 2024 tech
NVIDIA: Transforming LLM Alignment with Efficient Reinforcement Learning
www.marktechpost.com/2024/05/05/nvidia-ai-open-sources-nemo-aligner-transforming-large-language-model-alignment-with-efficient-reinforcement-learning/
20 sats \ 0 comments \ @ch0k1 7 May 2024 tech
OpenAI o1 vs GPT 4o – Is it worth paying 6x more? - Bind AI
blog.getbind.co/2024/09/13/openai-o1-vs-gpt-4o-is-it-worth-paying-6x-more/
110 sats \ 0 comments \ @ch0k1 15 Sep 2024 tech