Hugging Face Daily Papers

Development
Updated 2026-05-15 01:27 · 43 papers
  1. FeatCal: Feature Calibration for Post-Merging Models
  2. MAP: A Map-then-Act Paradigm for Long-Horizon Interactive Agent Reasoning
  3. Context Training with Active Information Seeking
  4. RoboEvolve: Co-Evolving Planner-Simulator for Robotic Manipulation with Limited Data
  5. Revisiting DAgger in the Era of LLM-Agents
  6. Asymmetric Flow Models
  7. Retrieval is Cheap, Show Me the Code: Executable Multi-Hop Reasoning for Retrieval-Augmented Generation
  8. Many-Shot CoT-ICL: Making In-Context Learning Truly Learn
  9. FrameSkip: Learning from Fewer but More Informative Frames in VLA Training
  10. Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling
  11. Qwen-Image-VAE-2.0 Technical Report
  12. Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context
  13. RealICU: Do LLM Agents Understand Long-Context ICU Data? A Benchmark Beyond Behavior Imitation
  14. AgentLens: Revealing The Lucky Pass Problem in SWE-Agent Evaluation
  15. MinT: Managed Infrastructure for Training and Serving Millions of LLMs
  16. Frequency Bias and OOD Generalization in Neural Operators under a Variable-Coefficient Wave Equation
  17. AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
  18. F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking
  19. Vividh-ASR: A Complexity-Tiered Benchmark and Optimization Dynamics for Robust Indic Speech Recognition
  20. PersonalAI 2.0: Enhancing knowledge graph traversal/retrieval with planning mechanism for Personalized LLM Agents
  21. IndicMedDialog: A Parallel Multi-Turn Medical Dialogue Dataset for Accessible Healthcare in Indic Languages
  22. Learning to Explore: Scaling Agentic Reasoning via Exploration-Aware Policy Optimization
  23. ShapeCodeBench: A Renewable Benchmark for Perception-to-Program Reconstruction of Synthetic Shape Scenes
  24. Position: LLM Inference Should Be Evaluated as Energy-to-Token Production
  25. Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?
  26. Orthrus: Memory-Efficient Parallel Token Generation via Dual-View Diffusion
  27. PresentAgent-2: Towards Generalist Multimodal Presentation Agents
  28. Learning Agentic Policy from Action Guidance
  29. The DAWN of World-Action Interactive Models
  30. TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking
  31. WriteSAE: Sparse Autoencoders for Recurrent State
  32. Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling
  33. M2Retinexformer: Multi-Modal Retinexformer for Low-Light Image Enhancement
  34. MemReread: Enhancing Agentic Long-Context Reasoning via Memory-Guided Rereading
  35. HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution
  36. MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image
  37. From Pixels to Concepts: Do Segmentation Models Understand What They Segment?
  38. Offline Preference Optimization for Rectified Flow with Noise-Tracked Pairs
  39. The Extrapolation Cliff in On-Policy Distillation of Near-Deterministic Structured Outputs
  40. FAAST: Forward-Only Associative Learning via Closed-Form Fast Weights for Test-Time Supervised Adaptation
  41. Retrieval from Within: An Intrinsic Capability of Attention-Based Models
  42. Results and Retrospective Analysis of the CODS 2025 AssetOpsBench Challenge
  43. SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety