stanford-crfm-website
AI
Updated 2026-05-15 01:28
10 items
1. HELM Arabic
2. HELM Arabic
3. HELM Long Context
4. Reliable and Efficient Amortized Model-Based Evaluation
5. Surprisingly Fast AI-Generated Kernels We Didn’t Mean to Publish (Yet)
6. BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems
7. HELM Capabilities: Evaluating LMs Capability by Capability
8. General-Purpose AI Needs Coordinated Flaw Reporting
9. HELM Safety: Towards Standardized Safety Evaluations of Language Models
10. Advancing Customizable Benchmarking in HELM via Unitxt Integration