- Efficient LLM Inference (arxiv.org/abs/2507.14397) | 121 sats, 0 comments | @carter, 14h, AI
- LLM in a Flash: Efficient LLM Inference with Limited Memory (huggingface.co/papers/2312.11514) | 13 sats, 1 comment | @hn, 20 Dec 2023, tech
- Compiling LLMs into a MegaKernel: A path to low-latency inference (zhihaojia.medium.com/compiling-llms-into-a-megakernel-a-path-to-low-latency-inference-cf7840913c17) | 10 sats, 0 comments | @hn, 19 Jun, tech
- DBRX: A new open LLM (www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm) | 10 sats, 1 comment | @hn, 31 Mar 2024, tech
- 1-Bit LLM: The Most Efficient LLM Possible? (www.youtube.com/watch?v=7hMoz9q4zv0) | 533 sats, 1 comment | @carter, 24 Jun, AI
- Defeating Nondeterminism in LLM Inference (thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/) | 267 sats, 0 comments | @carter, 11 Sep, AI
- OpenCoder: Open-Source LLM for Coding (arxiv.org/abs/2411.04905) | 52 sats, 0 comments | @hn, 9 Nov 2024, tech
- Sampling and structured outputs in LLMs (parthsareen.com/blog.html#sampling.md) | 157 sats, 0 comments | @carter, 23 Sep, AI
- What We Know About LLMs (A Primer) (willthompson.name/what-we-know-about-llms-primer) | 163 sats, 1 comment | @hn, 25 Jul 2023, tech
- Lm.rs: Minimal CPU LLM inference in Rust with no dependency (github.com/samuel-vitorino/lm.rs) | 10 sats, 0 comments | @hn, 11 Oct 2024, tech
- LLM evaluation at scale with the NeurIPS Efficiency Challenge (blog.mozilla.ai/exploring-llm-evaluation-at-scale-with-the-neurips-large-language-model-efficiency-challenge/) | 110 sats, 0 comments | @localhost, 22 Feb 2024, tech
- LLMs use a surprisingly simple mechanism to retrieve some stored knowledge (news.mit.edu/2024/large-language-models-use-surprisingly-simple-mechanism-retrieve-stored-knowledge-0325) | 128 sats, 1 comment | @hn, 31 Mar 2024, tech
- LiveBench - A Challenging, Contamination-Free LLM Benchmark (livebench.ai) | 161 sats, 0 comments | @supratic, 17 Jul, AI
- Ladder: Self-Improving LLMs Through Recursive Problem Decomposition (arxiv.org/abs/2503.00735) | 39 sats, 0 comments | @hn, 7 Mar, tech
- Lessons learned from programming with LLMs (crawshaw.io/blog/programming-with-llms) | 120 sats, 1 comment | @m0wer, 5 Jul, AI
- Exploring Advanced Reasoning Techniques for LLMs: CoT, STaR, and ToT (www.eddieoz.com/exploring-advanced-reasoning-techniques-for-llms-chain-of-thought-cot-step-by-step-rationalization-star-and-tree-of-thoughts-tot/) | 11 sats, 0 comments | @Rsync25, 20 Aug 2024, tech
- LLM-Deflate: Extracting LLMs Into Datasets (www.scalarlm.com/blog/llm-deflate-extracting-llms-into-datasets/) | 100 sats, 1 comment | @carter, 29 Sep, AI
- Compute Where It Counts: High Quality Sparsely Activated LLMs (crystalai.org/blog/2025-08-18-compute-where-it-counts) | 100 sats, 0 comments | @carter, 21 Aug, AI
- Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM (www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm) | 306 sats, 1 comment | @nullama, 13 Apr 2023, bitcoin
- Coping with dumb LLMs using classic ML (softwaredoug.com/blog/2025/01/21/llm-judge-decision-tree) | 31 sats, 0 comments | @hn, 24 Jan, tech
- Detecting when LLMs are uncertain (www.thariq.io/blog/entropix/) | 49 sats, 0 comments | @hn, 25 Oct 2024, tech
- Communication Efficient LLM Pre-training with SparseLoCo (arxiv.org/abs/2508.15706) | 100 sats, 0 comments | @carter, 1 Sep, AI