Blog on EleutherAI Blog
AI
Updated 2026-05-15 01:28
50 items in total
1. Early Indicators of Reward Hacking via Reasoning Interpolation
2. Reward Hacking Research Update
3. Pretraining Data Filtering for Open-Weight AI Safety
4. Attention Probes
5. Research Update: Applications of Local Volume Measurement
6. Studying inductive biases of random networks via local volumes
7. The Common Pile v0.1
8. Product Key Memory Sparse Coders
9. SAEs trained on the same data don’t learn the same features
10. Partially rewriting an LLM in natural language
11. Third-party evaluation to identify risks in LLMs’ training data
12. Mechanistic Anomaly Detection Research Update 2
13. RLHF and RLAIF in GPT-NeoX
14. The Practitioner's Guide to the Maximal Update Parameterization
15. Mechanistic Anomaly Detection Research Update
16. Open Source Automated Interpretability for Sparse Autoencoder Features
17. Experiments in Weak-to-Strong Generalization
18. Free Form Least-Squares Concept Erasure Without Oracle Concept Labels
19. VINC-S: Closed-form Optionally-supervised Knowledge Elicitation with Paraphrase Invariance
20. Pile-T5
21. Yi-34B, Llama 2, and common practices in LLM training: a fact check of the New York Times
22. The Foundation Model Development Cheatsheet
23. Least-Squares Concept Erasure with Oracle Concept Labels
24. Diff-in-Means Concept Editing is Worst-Case Optimal
25. The third New England RLHF Hackers Hackathon
26. Extending the RoPE
27. How the Foundation Model Transparency Index Distorts Transparency
28. Llemma: An Open Language Model For Mathematics
29. The second New England RLHF Hackers Hackathon
30. Contributor Spotlight: Mohammad Aflah Khan
31. The first New England RLHF Hackers Hackathon
32. EleutherAI's Thoughts on the EU AI Act
33. Minetester: A fully open RL environment built on Minetest
34. 🐶Safetensors audited as really safe and becoming the default
35. Alignment Research @ EleutherAI
36. Transformer Math 101
37. Exploratory Analysis of TRLX RLHF Transformers with TransformerLens
38. EleutherAI Second Retrospective: The long version
39. The View from 30,000 Feet: Preface to the Second EleutherAI Retrospective
40. Announcing GPT-NeoX-20B
41. A Preliminary Exploration into Factored Cognition with Language Models
42. Multiple Choice Normalization in LM Evaluation
43. Downstream Evaluations of Rotary Position Embeddings
44. What A Long, Strange Trip It's Been: EleutherAI One Year Retrospective
45. Why Release a Large Language Model?
46. On the Sizes of OpenAI API Models
47. Evaluating Different Fewshot Description Prompts on GPT-3
48. Finetuning Models on Downstream Tasks
49. Activation Function Ablation
50. Rotary Embeddings: A Relative Revolution