1. Introducing voice finder: a new tool to quickly find the right voice for your app from 600+ voices
2. Serving DeepSeek-V4: why million-token context is an inference systems problem
3. Deploy and run inference on any model from Hugging Face
4. Foundational research powering efficient inference at scale
5. Announcing the Together AI and Adaption partnership
6. From 732 bytes to nowhere: shutting down Copy Fail in production
7. DeepSeek-V4 Pro now available on Together AI
8. Together AI Brings NVIDIA Nemotron 3 Nano Omni to Developers on Day 0
9. Accelerate RL rollouts by up to 50% with distribution-aware speculative decoding
10. Capacity without conflict: A guide to multi-tenant GPU cluster design for AI-native teams
11. Parcae: Doing more with fewer parameters using stable looped models
12. EinsteinArena: Harnessing the collective intelligence of agents in the wild to advance science
13. What is an AI Native Cloud?
14. Wan 2.7 video model suite now available on Together AI
15. AI for Systems: Using LLMs to Optimize Database Query Execution
16. Deepgram speech-to-text and voice models now available natively on Together AI
17. Inside the Together AI kernels team
18. Aurora
19. Plan, divide, and conquer: How weak models excel at long context tasks
20. Together AI expands fine-tuning service with tool calling, reasoning, and vision support
21. Mamba-3
22. Together AI at NVIDIA GTC 2026: Explore our latest innovations across research and products
23. Build real-time voice agents on Together AI
24. Together AI Brings NVIDIA Nemotron 3 to Developers on Day 0
25. New in Together GPU Clusters: Autoscaling, observability, and self-healing
26. FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling
27. Key research and product announcements at the AI Native Conf
28. Cache-aware prefill–decode disaggregation (CPD) for up to 40% faster long-context LLM serving
29. Introducing Together AI’s new look
30. CoderForge-Preview: SOTA open dataset for training efficient coding agents
31. How speech models fail where it matters the most and what to do about it
32. Consistency diffusion language models: Up to 14x faster inference without sacrificing quality
33. Introducing Dedicated Container Inference: Delivering 2.6x faster inference for custom AI models
34. What do LLMs think when you don't tell them what to think about?
35. Rime Arcana V3 Turbo and Rime Arcana V3 now available on Together AI
36. Together AI welcomes Alon Gavrielov as VP of Infrastructure Strategy
37. Together Evaluations now supports comparing top commercial APIs vs. open source models
38. Fine-tuning open LLM judges to outperform GPT-5.2
39. DSGym: A holistic framework for evaluating and training data science agents
40. Optimizing inference speed and costs: Lessons learned from large-scale deployments
41. Learn how Cursor partnered with Together AI to deliver real-time, low-latency inference at scale
42. Inside multi-node training: How to scale model training across GPU clusters
43. How to choose the right open model for production
44. MiniMax Speech 2.6 Turbo now available natively on Together AI
45. Rime voice models now available on Together AI
46. Research POV: Yes, AGI Can Happen – A Computational Perspective
47. Announcing native availability of NVIDIA Nemotron 3 Nano, NVIDIA’s latest reasoning model
48. Announcing Together Python SDK v2.0
49. How to run TorchForge reinforcement learning pipelines in the Together AI Native Cloud
50. Together AI and Meta partner to bring PyTorch Reinforcement Learning to the AI Native Cloud
51. Introducing AutoJudge: Streamlined inference acceleration via automated dataset curation
52. Together AI delivers fastest inference for the top open-source models
53. FLUX.2: Multi-reference image generation now available on Together AI
54. How to evaluate and benchmark Large Language Models (LLMs)
55. Announcing the fastest inference for realtime voice AI agents
56. Dynamic AI agent testing for the real world with Collinear Simulations and Together Evals
57. Large Reasoning Models Fail to Follow Instructions During Reasoning: A Benchmark Study
58. Expanding Together AI Model Library into multimedia generation with 40+ new image and video models
59. Announcing the Together AI Startup Accelerator, purpose-built for AI Native Apps
60. AdapTive-LeArning Speculator System (ATLAS): A New Paradigm in LLM Inference via Runtime-Learning Accelerators
61. Improved Batch Inference API: Enhanced UI, Expanded Model Support, and 3000× Rate Limit Increase
62. Fine-Tuning Platform Upgrades: Larger Models, Longer Contexts, Enhanced Hugging Face Integrations
63. Together AI welcomes Mahadev Konar as SVP for Infrastructure Engineering
64. Announcing General Availability of Together Instant Clusters, offering ready-to-use, self-service NVIDIA GPUs
65. DeepSeek-V3.1: Hybrid Thinking Model Now Available on Together AI
66. How Together AI Uses AI Agents to Automate Complex Engineering Tasks: Lessons from Developing Efficient LLM Inference Systems
67. Transform OpenAI gpt-oss Models into Domain Experts with Together AI Fine-Tuning
68. Fine-Tuning Small Open-Source LLMs to Outperform Large Closed-Source Models by 60% on Specialized Tasks
69. OpenAI's New Open gpt-oss Models vs o4-mini: A Real-World Comparison
70. Announcing the Availability of OpenAI's Open Models on Together AI
71. VirtueGuard: Enterprise-Grade AI Security and Safety Now on Together AI
72. Together Evaluations: Benchmark Models for Your Tasks
73. Qwen3-Coder: The Most Capable Agentic Coding Model Now Available on Together AI
74. Back to The Future: Evaluating AI Agents on Predicting Future Events
75. Together AI Delivers Top Speeds for DeepSeek-R1-0528 Inference on NVIDIA Blackwell
76. Kimi K2: Leading Open-Source Model Now Available on Together AI
77. Together AI Launches Speech-to-Text: High-Performance Whisper APIs
78. Powering Secure AI: Together AI Achieves SOC 2 Type 2 Compliance
79. DeepSWE: Training a Fully Open-sourced, State-of-the-Art Coding Agent by Scaling RL
80. From Zero to One: Building An Autonomous and Open Data Scientist Agent from Scratch
81. Bringing 100,000 GPUs to Europe
82. Introducing the Together AI Batch API: Process Thousands of LLM Requests at 50% Lower Cost
83. The Frontier is Open
84. Model-Preserving Adaptive Rounding with YAQA
85. FLUX.1 Kontext models: Character consistency and precise image editing without fine-tuning
86. Mixture-of-Agents Alignment: Harnessing the Collective Intelligence of Open-Source LLMs to Improve Post-Training
87. Together Code Sandbox: the most robust infrastructure for building AI coding products at scale
88. Introducing Together Code Sandbox & Together Code Interpreter: SOTA code execution for AI
89. Together Code Interpreter: execute LLM-generated code seamlessly with a simple API call
90. Together AI acquires Refuel.ai to unlock data for developers and businesses building production-grade AI applications
91. Boosting DeepSeek-R1’s Speed with Customized Speculative Decoding
92. From AWS to Together Dedicated Endpoints: Arcee AI's journey to greater inference flexibility
93. Salesforce, Zoom, InVideo Train Faster with Together AI Turbocharged with NVIDIA Blackwell
94. Chipmunk: Training-Free Acceleration of Diffusion Transformers with Dynamic Column-Sparse Deltas
95. Continued Fine-tuning of LLMs: A Technical Deep Dive
96. Direct Preference Optimization: A Technical Deep Dive
97. Together Fine-Tuning Platform, Now With Preference Optimization and Continued Training
98. Open Deep Research
99. DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level
100. Together AI partners with Meta to offer Llama 4: SOTA Multimodal MoE Models