周天财经

Nvidia Open Sources Latest VLA—Can It Break Through L4 Autonomous Driving Barriers?

December 3, 2025 · Business


Image Source: AI-generated

As large AI models are increasingly integrated into the automotive industry, competition in the sector is shifting from basic functionality to high-level intelligent driving capabilities. The VLA (Vision-Language-Action Model) is now seen as the key variable driving the next wave of technological advancement.

On December 1, NVIDIA officially announced the open-sourcing of its latest autonomous driving Vision-Language-Action (VLA) model, Alpamayo-R1. This model can simultaneously process vehicle camera footage and textual instructions to output driving decisions. It is now open-sourced on both GitHub and Hugging Face, with the release of the Cosmos Cookbook development toolkit.
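The interface described here maps camera footage plus a text instruction to a driving decision with an attached explanation. A minimal sketch of that contract, with all class and field names hypothetical (this is not the actual Alpamayo-R1 API, which lives in the GitHub and Hugging Face releases):

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DrivingDecision:
    waypoints: List[Tuple[float, float]]  # (x, y) positions in the ego frame
    rationale: str                        # natural-language reasoning trace


class VlaPolicy:
    """Toy stand-in for a vision-language-action policy.

    A real model fuses the camera frames and the instruction through a
    VLM backbone; this stub just emits a straight-ahead path with a
    canned rationale to show the input/output shape.
    """

    def decide(self, frames: List[bytes], instruction: str) -> DrivingDecision:
        path = [(0.0, float(i)) for i in range(1, 6)]  # five waypoints ahead
        return DrivingDecision(path, f"Proceeding straight; instruction: {instruction}")


policy = VlaPolicy()
decision = policy.decide(frames=[b"<front-cam-jpeg>"], instruction="keep lane")
print(decision.rationale)
```

The key point the sketch illustrates is that the output is not only a trajectory but also a textual rationale, which is what distinguishes a VLA from a plain end-to-end planner.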

This marks the industry's first open-source VLA model dedicated to autonomous driving. NVIDIA aims to use this move to provide core technical support for the adoption of L4-level autonomous driving.

Notably, compared with traditional black-box autonomous driving algorithms, NVIDIA's Alpamayo-R1 emphasizes explainability: it can provide the reasoning behind its decisions. This assists with safety validation, regulatory review, and accident liability determination. Accompanying tools like the Cosmos Cookbook make it easier for companies and developers to efficiently train, evaluate, and deploy solutions.

Industry experts believe NVIDIA is attempting to lower development barriers, accelerate the standardization of the software stack, and break away from the costly "fully in-house development" approach prevalent in Robotaxi operations. The goal is to create an "Android-style" ecosystem that allows for rapid modular assembly.

However, some insiders told the author that NVIDIA's open-sourcing of Alpamayo-R1 is similar to Baidu's Apollo initiative: valuable for newcomers to the autonomous driving field, but not particularly significant for established, specialized companies.

Currently, VLA technology is widely recognized as the next-generation core for intelligent driving, prompting increased investment from major players. In China, companies such as Li Auto, XPeng Motors, GWM (already applied in the Wey Lanshan model), and DeepRoute have all achieved mass production deployments based on VLA.

Addressing the Pain Points of Traditional End-to-End Models

Traditional end-to-end models often function as a black box—they may be "visible but not understandable," and are prone to failure when encountering long-tail scenarios such as illegal left turns or pedestrians darting into the road.

Compared with traditional end-to-end models, VLA introduces language modality as an intermediate layer, transforming visual perception into an interpretable logical chain. This enhances its potential to handle long-tail and complex, unpredictable scenarios, allowing the machine to observe, reason, and decide like a human, rather than merely mapping massive amounts of data into outputs.

In the field of autonomous driving, the VLA (Vision-Language-Action) large model represents a technological direction that deeply integrates visual perception, language understanding, and decision-making control. It can directly output driving actions, and its core advantages include more powerful environmental comprehension and reasoning abilities, more efficient integrated decision-making, superior handling of long-tail situations, more transparent human-machine interaction and trust-building, and more natural vehicle control methods.

The VLA model Alpamayo-R1, newly open-sourced by NVIDIA, is trained on an entirely new Chain of Causation (CoC) dataset. Each segment of driving data is annotated not only with what the vehicle did, but also with why it took that action.

For example: "The vehicle slowed down and changed lanes to the left because there was a moped stopped at the red light ahead, and the left lane was clear." This means the model learns reasoning based on causality, rather than memorizing fixed patterns by rote.
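Concretely, one can imagine a Chain-of-Causation record pairing an action with its stated causes along the following lines. The field names are purely illustrative, not the dataset's actual schema:

```python
# Illustrative shape of a single Chain-of-Causation (CoC) record:
# the driving action is stored together with the causes behind it,
# so the model is trained on "what" and "why" jointly.
coc_record = {
    "clip_id": "demo-0001",  # hypothetical identifier
    "action": "decelerate_and_change_lane_left",
    "causes": [
        "moped stopped at the red light ahead",
        "left lane is clear",
    ],
    "rationale": (
        "The vehicle slowed down and changed lanes to the left because "
        "there was a moped stopped at the red light ahead, and the left "
        "lane was clear."
    ),
}

# Training on such records teaches cause -> action mappings,
# rather than rote pattern memorization.
print(sorted(coc_record))
```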

At the same time, thanks to a modular VLA architecture, NVIDIA's Alpamayo-R1 combines Cosmos-Reason—a vision-language model pretrained for physical AI applications—with a diffusion-based trajectory decoder, enabling real-time generation of feasible dynamic plans. In addition, a multi-stage training strategy is used: supervised fine-tuning first strengthens reasoning ability, followed by reinforcement learning (RL) to optimize reasoning quality, leveraging feedback from large reasoning models and ensuring consistency between reasoning and action.
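As a loose illustration of the diffusion-decoder idea: trajectories begin as noise and are iteratively denoised toward a plan consistent with the conditioning signal coming from the reasoning model. The sketch below reduces that conditioning signal to a single target value; it is a toy update rule, nothing like the real network:

```python
import random


def diffusion_decode(cond: float, n_waypoints: int = 5, steps: int = 20) -> list:
    """Toy diffusion-style trajectory decoder.

    Waypoints start as pure Gaussian noise and each "denoising" step
    nudges them toward the conditioning value `cond`, which stands in
    for the reasoning model's conditioning signal.
    """
    rng = random.Random(0)  # fixed seed for reproducibility
    traj = [rng.gauss(0.0, 1.0) for _ in range(n_waypoints)]
    for _ in range(steps):
        traj = [w + 0.3 * (cond - w) for w in traj]  # crude denoising update
    return traj


plan = diffusion_decode(cond=1.0)
print(all(abs(w - 1.0) < 0.01 for w in plan))
```

After 20 steps the residual noise shrinks by a factor of 0.7 per step, so every waypoint ends up close to the conditioning target; real diffusion decoders learn this denoising function from data instead of hard-coding it.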

According to data released by NVIDIA, Alpamayo-R1 achieves a 12% improvement in trajectory planning performance in complex scenarios, a 25% reduction in close-range collision rates, a 45% enhancement in reasoning quality, and a 37% boost in reasoning-action consistency. As the model parameters expanded from 0.5B to 7B, performance continued to improve. On-road vehicle testing has verified its real-time capability (with a latency of 99 milliseconds) and the feasibility of deployment in urban environments.

As a result, NVIDIA』s Alpamayo-R1 is expected to enable a major leap in capabilities for L4 autonomous driving, paving the way for Robotaxi services to safely integrate into real-world, chaotic public roads.

Becoming the "Android" of the Autonomous Driving Arena

The open-sourcing of Alpamayo-R1 once again demonstrates NVIDIA's ambition in the autonomous driving sector. The company is no longer content with merely being a hardware supplier—it aims to become the "Android" of the autonomous driving industry.

In fact, as early as this October, NVIDIA quietly released its Alpamayo-R1 large model to the public. At the Washington GTC conference, NVIDIA also unveiled its autonomous driving platform—NVIDIA DRIVE AGX Hyperion 10.

Hyperion 10 is regarded as the "body" of NVIDIA's autonomous driving solution, while Alpamayo-R1 is its "brain."

Notably, Hyperion 10 achieves a closed-loop process "from simulation to real vehicle": in the cloud, DGX supercomputers use DRIVE Sim to generate high-fidelity simulation data, which is used to train DRIVE AV models; on the vehicle end, sensor data from Hyperion 10 integrates seamlessly with Thor chips.

Thus, if an automaker wants to quickly launch a model with L4 capabilities, it no longer needs to build separate, large-scale teams for hardware integration, software algorithms, and data training. By adopting NVIDIA's full-stack solution, it can achieve rapid vehicle deployment.

At the same time, NVIDIA is building an "Android-style" Robotaxi ecosystem and has announced a clear rollout timeline: deployment of 100,000 Robotaxis starting in 2027.

Currently, NVIDIA has announced partnerships with Uber, Mercedes-Benz, Stellantis, Lucid, and others to jointly build the "world's largest L4-level autonomous driving fleet." As of October 2025, NVIDIA's cloud platform had accumulated over 5 million hours of real-world road data.

NVIDIA's entry is shifting Robotaxi competition from a pure technology race to a battle of ecosystems.

A closed ecosystem not only leads to redundant R&D investment; its deeper drawback is the creation of data silos. For example, Waymo's driving data from U.S. roads is of little use to Chinese automakers, and each player climbs the technology curve independently, and slowly.

NVIDIA's open ecosystem creates an opportunity for players within it to share anonymized feature data while ensuring data privacy and security. For instance, if Automaker A encounters an extreme scenario at a specific intersection, that scenario's data can be anonymized and converted into training features, helping Automaker B's models more quickly recognize similar risks.
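A minimal sketch of what such sharing could look like: drop direct vehicle identifiers, replace the raw location with a hash, and keep only the scenario features. All field names are hypothetical, and a production pipeline would need far stronger privacy engineering than an unsalted hash:

```python
import hashlib


def anonymize_scenario(record: dict) -> dict:
    """Strip direct identifiers and hash the raw location before a
    scenario leaves the originating fleet. Illustrative only: real
    systems need rigorous privacy guarantees, not a bare SHA-256.
    """
    shared = {k: v for k, v in record.items()
              if k not in ("vin", "plate", "intersection")}
    shared["location_id"] = hashlib.sha256(
        record["intersection"].encode("utf-8")).hexdigest()[:12]
    return shared


scenario = {
    "vin": "LSVAU2180N2183294",          # hypothetical identifiers
    "plate": "粤A-12345",
    "intersection": "5th & Main, Cityville",
    "event": "pedestrian_darting_from_occlusion",
    "ego_speed_mps": 8.2,
}
print(sorted(anonymize_scenario(scenario)))
```

The receiving automaker sees the event type and dynamics needed for training, but not which vehicle produced them or the plain-text location.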

If NVIDIA can become the Android of the autonomous driving sector, it could shift the entire industry's technological evolution from a linear pace to exponential acceleration. This is more than technology sharing; it is also cost sharing. The marginal cost of collectively addressing long-tail scenarios, the industry's biggest challenge, will keep falling as the ecosystem expands.

According to Zhou Guang, CEO of Yuanrong Qixing, the VLA could bring a leapfrog lead and become the critical variable in the next round of competition.

Tian Shan, CTO of DeepWay, told the author that VLA is currently a hot direction in autonomous driving, attracting many researchers, and that it can significantly improve both the generalization and reasoning capabilities of autonomous driving models. NVIDIA's open-sourcing of Alpamayo-R1 lets more people participate in research on this promising technology, which will actively promote the development and deployment of VLA. Moreover, the technology can also be applied to embodied intelligence and other physical-AI scenarios.

Invisible Barriers Still Lie Ahead

However, for Alpamayo-R1 to meet automotive-grade latency requirements, it still needs to run on top-tier cards such as the RTX PRO 6000 Blackwell, whose INT8 performance reaches roughly 4,000 TOPS, about six times that of Thor.

NVIDIA's business model means its open-source initiatives are ultimately designed to better sell its hardware and full-stack solutions. The Alpamayo-R1 model is deeply integrated with NVIDIA's chips (such as Thor) and development platforms (such as DRIVE), allowing for greater computational efficiency.

In other words, joining the NVIDIA ecosystem brings convenience, but it also creates a deep dependency on NVIDIA for core computing power.

Additionally, as DeepWay CTO Tian Shan pointed out, whether VLA is the best autonomous driving technology remains an open question. The Alpamayo-R1 toolchain is built on NVIDIA's platforms, which is a limitation for many developers; meanwhile, other technologies and computing platforms are also pushing autonomous driving forward.

In Tian Shan's view, most companies should focus more on the application of technology in real-world scenarios—that is, the engineering implementation of technology. Addressing practical, real-life problems and achieving a commercially viable closed loop for intelligent driving technology as soon as possible will be more beneficial for the industry's long-term, healthy development.

Furthermore, the large-scale commercialization and implementation of L4 autonomous driving—or Robotaxi services—are closely tied to policies and regulations. The ability to operate within compliance frameworks, undergo safety assessments, and strike a balance between data utilization and privacy protection is just as important as technological capability itself.

Jensen Huang, the founder and CEO of NVIDIA, has always regarded Robotaxi as the "first commercial application of robotics technology." Rather than building a single driverless taxi, NVIDIA's goal has always been to provide the technological foundation that enables all players to create their own driverless taxis. Now, he is attempting to establish a rapidly replicable production line for this application through the open-sourcing of VLA.

However, whether open source can truly lower the barriers to entry and accelerate the arrival of L4 autonomous driving—ultimately unleashing technology's full potential across broader commercial horizons—remains to be seen. The open-sourcing of NVIDIA's Alpamayo-R1 model is only the beginning of the game. More hurdles still lie ahead, and the market will have to validate its true potential. (Writing by Zhang Min; Editing by Chelsea Sun and Li Chengcheng)

© 2025 广州小舟天传媒有限公司 by 周天财经 - 粤 ICP 备 2025452169 号-1
