Thursday, July 2, 2026
周天财经

NVIDIA Open-Sources Its Latest VLA: Can It Break Through the Barriers to L4 Autonomous Driving?

December 3, 2025

Image Source: AI-generated

As large AI models are increasingly integrated into the automotive industry, competition in the sector is shifting from basic functionality to high-level intelligent driving capabilities. The VLA (Vision-Language-Action Model) is now seen as the key variable driving the next wave of technological advancement.

On December 1, NVIDIA announced that it had open-sourced Alpamayo-R1, its latest Vision-Language-Action (VLA) model for autonomous driving. The model takes vehicle camera footage and text instructions as joint input and outputs driving decisions. It is available on both GitHub and Hugging Face, and was released alongside the Cosmos Cookbook development toolkit.

This is the industry's first open-source VLA model dedicated to autonomous driving. NVIDIA intends the release to provide core technical support for the adoption of L4 autonomous driving.

Unlike traditional black-box autonomous driving algorithms, Alpamayo-R1 emphasizes explainability: it can state the reasoning behind its decisions, which helps with safety validation, regulatory review, and accident liability determination. Accompanying tools such as the Cosmos Cookbook make it easier for companies and developers to train, evaluate, and deploy solutions.

Industry experts believe NVIDIA is trying to lower development barriers, accelerate standardization of the software stack, and break away from the costly "fully in-house development" approach prevalent among Robotaxi operators. The goal is an "Android-style" ecosystem that allows rapid modular assembly.

However, some insiders told the author that NVIDIA's open-sourcing of Alpamayo-R1 resembles Baidu's Apollo initiative: valuable for newcomers to autonomous driving, but of limited significance for established, specialized companies.

VLA is widely regarded as the core of next-generation intelligent driving, and major players are increasing their investment. In China, Li Auto, XPeng Motors, GWM (which already uses it in the Wey Lanshan model), and DeepRoute have all brought VLA-based systems into mass production.

Addressing the Pain Points of Traditional End-to-End Models

Traditional end-to-end models often function as a black box: they can see a scene without being able to explain it, and they are prone to failure in long-tail scenarios such as illegal left turns or pedestrians darting into the road.

VLA instead introduces the language modality as an intermediate layer, turning visual perception into an interpretable chain of logic. This improves its potential to handle long-tail and unpredictable scenarios, allowing the machine to observe, reason, and decide like a human rather than merely mapping massive amounts of data to outputs.

In autonomous driving, a VLA model integrates visual perception, language understanding, and decision-making and control, and can output driving actions directly. Its core advantages include stronger environmental comprehension and reasoning, more efficient integrated decision-making, better handling of long-tail situations, more transparent human-machine interaction that builds trust, and more natural vehicle control.

Alpamayo-R1 is trained on a new Chain of Causation (CoC) dataset. Each segment of driving data is annotated not only with what the vehicle did, but also with why it took that action.

For example: "The vehicle slowed down and changed lanes to the left because there was a moped stopped at the red light ahead, and the left lane was clear." The model therefore learns causal reasoning rather than memorizing fixed patterns by rote.
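To make the idea of a cause-annotated driving clip concrete, here is a minimal Python sketch of what one such record could look like. The class name, field names, and clip ID are hypothetical; the article does not describe the actual CoC schema, so this only illustrates pairing an action with its causes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CausalAnnotation:
    """One driving clip labeled with both the action and its causes.

    All field names are hypothetical; the real CoC schema is not
    described in the article.
    """
    clip_id: str
    action: str               # what the vehicle did
    causes: Tuple[str, ...]   # why it did so (observed factors)

    def as_training_text(self) -> str:
        # Join the action and its causes into a single sentence.
        return f"{self.action} because {' and '.join(self.causes)}."


example = CausalAnnotation(
    clip_id="clip-0001",
    action="The vehicle slowed down and changed lanes to the left",
    causes=(
        "a moped was stopped at the red light ahead",
        "the left lane was clear",
    ),
)
print(example.as_training_text())
```

Keeping the causes as separate items, rather than one free-text string, is a design choice of this sketch: it would let a later check confirm that a generated explanation mentions each labeled cause.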

Alpamayo-R1 also uses a modular VLA architecture that combines Cosmos-Reason, a vision-language model pretrained for physical AI applications, with a diffusion-based trajectory decoder, enabling real-time generation of dynamically feasible plans. Training proceeds in multiple stages: supervised fine-tuning first strengthens reasoning ability, and reinforcement learning (RL) then optimizes reasoning quality, drawing on feedback from large reasoning models and enforcing consistency between reasoning and action.
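The modular split described above, a reasoning stage followed by a trajectory decoder, can be sketched in a few lines of Python. Everything here is a placeholder: the rule-based `reason` function stands in for the vision-language model, and `decode_trajectory` uses plain linear interpolation rather than diffusion. The scene keys, field names, and numbers are assumptions chosen for illustration, not NVIDIA's interfaces.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Waypoint = Tuple[float, float]  # (forward, left) offsets in meters


@dataclass
class ReasoningOutput:
    explanation: str      # natural-language reasoning trace
    target_speed: float   # m/s, hypothetical decision variable
    lateral_shift: float  # meters to the left over the planning horizon


def reason(scene: Dict) -> ReasoningOutput:
    """Stand-in for the vision-language stage (rules, not a real model)."""
    if scene.get("obstacle_ahead") and scene.get("left_lane_clear"):
        return ReasoningOutput(
            "Obstacle ahead and left lane clear: slow down and move left.",
            5.0,
            3.5,
        )
    return ReasoningOutput("Road clear: keep lane and speed.", scene["speed"], 0.0)


def decode_trajectory(plan: ReasoningOutput,
                      horizon_s: float = 3.0,
                      steps: int = 6) -> List[Waypoint]:
    """Stand-in for the trajectory decoder: linear interpolation, not diffusion."""
    dt = horizon_s / steps
    return [
        (plan.target_speed * dt * i, plan.lateral_shift * i / steps)
        for i in range(1, steps + 1)
    ]


plan = reason({"obstacle_ahead": True, "left_lane_clear": True, "speed": 10.0})
print(plan.explanation)
print(decode_trajectory(plan))
```

The point of the structure is that the explanation and the waypoints come from the same intermediate plan, so the two can be checked against each other.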

According to data released by NVIDIA, Alpamayo-R1 achieves a 12% improvement in trajectory planning performance in complex scenarios, a 25% reduction in the close-range collision rate, a 45% improvement in reasoning quality, and a 37% improvement in reasoning-action consistency. Performance continued to improve as the model was scaled from 0.5B to 7B parameters. On-road vehicle testing has verified its real-time capability (a latency of 99 milliseconds) and the feasibility of deployment in urban environments.
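As a rough sense of scale for the 99-millisecond figure, the short Python calculation below converts it into a planning rate and into the distance a vehicle covers during one planning cycle. The 60 km/h speed is an assumed example value, and the rate is only an upper bound that assumes cycles run back to back with no other overhead.

```python
LATENCY_S = 0.099  # 99 ms latency, as reported in the article


def max_planning_rate_hz(latency_s: float) -> float:
    """Upper bound on plans per second if cycles run back to back."""
    return 1.0 / latency_s


def distance_per_cycle_m(speed_kmh: float, latency_s: float) -> float:
    """Distance traveled while a single plan is being computed."""
    return speed_kmh / 3.6 * latency_s


rate = max_planning_rate_hz(LATENCY_S)
gap = distance_per_cycle_m(60.0, LATENCY_S)  # 60 km/h is an assumed urban speed
print(f"About {rate:.1f} plans per second")
print(f"About {gap:.2f} m traveled per planning cycle at 60 km/h")
```

Under these assumptions the model can replan roughly ten times per second, and the vehicle moves a little under two meters between consecutive plans.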

As a result, NVIDIA's Alpamayo-R1 is expected to enable a major leap in capabilities for L4 autonomous driving, paving the way for Robotaxi services to safely integrate into real-world, chaotic public roads.

Becoming the "Android" of the Autonomous Driving Arena

The open-sourcing of Alpamayo-R1 once again demonstrates NVIDIA's ambition in autonomous driving. The company is no longer content to be merely a hardware supplier; it aims to become the "Android" of the autonomous driving industry.

In fact, NVIDIA had already quietly released the Alpamayo-R1 model to the public in October. At its GTC conference in Washington, NVIDIA also unveiled its autonomous driving platform, NVIDIA DRIVE AGX Hyperion 10.

Hyperion 10 is regarded as the "body" of NVIDIA's autonomous driving solution, while Alpamayo-R1 is its "brain."

Notably, Hyperion 10 closes the loop "from simulation to real vehicle": in the cloud, DGX supercomputers use DRIVE Sim to generate high-fidelity simulation data for training DRIVE AV models; on the vehicle, Hyperion 10's sensor data feeds directly into Thor chips.

Thus, an automaker that wants to quickly launch a model with L4 capabilities no longer needs to build separate, large-scale teams for hardware integration, software algorithms, and data training. It can achieve rapid vehicle deployment by adopting NVIDIA's full-stack solution.

At the same time, NVIDIA is building an "Android-style" Robotaxi ecosystem and has announced a clear rollout timeline: deployment of 100,000 Robotaxis starting in 2027.

NVIDIA has announced partnerships with Uber, Mercedes-Benz, Stellantis, Lucid, and others to jointly build the "world's largest L4-level autonomous driving fleet." As of October 2025, NVIDIA's cloud platform had accumulated over 5 million hours of real-world road data.

NVIDIA's entry is shifting Robotaxi competition from a pure technology race to a battle of ecosystems.

A closed ecosystem not only leads to redundant R&D investment; its deeper drawback is the creation of data silos. For example, Waymo's driving data from U.S. roads can hardly benefit Chinese automakers, and each player climbs the technology curve independently and slowly.

NVIDIA's open ecosystem gives participants an opportunity to share anonymized feature data while preserving data privacy and security. For instance, if Automaker A encounters an extreme scenario at a specific intersection, that scenario's data can be anonymized and converted into training features, helping Automaker B's models recognize similar risks more quickly.
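One simple reading of "anonymized and converted into training features" is to drop fields that identify the vehicle or its maker and to coarsen the location before a record is shared. The Python sketch below does only that. The field names, sample values, and 0.01-degree grid are all hypothetical, and the article does not say how NVIDIA or its partners would actually anonymize data.

```python
from typing import Any, Dict

# Hypothetical names for fields that directly identify a vehicle or its owner.
IDENTIFYING_FIELDS = {"vin", "driver_id", "automaker"}


def snap(value: float, grid: float) -> float:
    """Round a coordinate to the nearest multiple of `grid` degrees."""
    return round(round(value / grid) * grid, 6)


def anonymize_scenario(record: Dict[str, Any],
                       grid_deg: float = 0.01) -> Dict[str, Any]:
    """Return a copy without direct identifiers and with a coarsened location."""
    shared = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    shared["lat"] = snap(record["lat"], grid_deg)
    shared["lon"] = snap(record["lon"], grid_deg)
    return shared


raw = {
    "vin": "HYPOTHETICAL-VIN-0001",
    "automaker": "Automaker A",
    "lat": 31.23042,
    "lon": 121.47371,
    "scenario": "pedestrian darting into the road",
}
print(anonymize_scenario(raw))
```

The original record is left unmodified, so the contributing automaker keeps its full data while only the reduced copy leaves its systems.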

If NVIDIA can become the Android of autonomous driving, it could shift the industry's technological evolution from a linear pace to exponential acceleration. This is about more than sharing technology; it is also about sharing costs. The marginal cost of collectively addressing long-tail scenarios, the industry's biggest challenge, will keep falling as the ecosystem expands.

According to Zhou Guang, CEO of DeepRoute (Yuanrong Qixing), VLA could bring a leapfrog lead and become the critical variable in the next round of competition.

Tian Shan, CTO of DeepWay, told the author that VLA is currently a hot research direction in autonomous driving and can significantly improve both the generalization and the reasoning capabilities of autonomous driving models. NVIDIA's open-sourcing of Alpamayo-R1 lets more people take part in research on this promising technology, which will help advance the development and deployment of VLA. The technology can also be applied to embodied intelligence and other physical AI scenarios.

Invisible Barriers Still Lie Ahead

However, to meet automotive-grade latency requirements, Alpamayo-R1 still needs to run on a top-tier card such as the RTX PRO 6000 Blackwell. This card's INT8 performance reaches 4,000 TOPS, about six times that of Thor.
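Taking the article's two figures at face value (4,000 TOPS for the workstation card and a gap of about six times), the implied Thor throughput follows from one division. The Python snippet below only restates that arithmetic; it is not an official Thor specification, and the variable names are illustrative.

```python
CARD_INT8_TOPS = 4000.0  # figure quoted in the article
RATIO_TO_THOR = 6.0      # "about six times", also from the article

# Implied in-vehicle compute if both quoted numbers hold.
implied_thor_tops = CARD_INT8_TOPS / RATIO_TO_THOR
print(f"Implied Thor INT8 throughput: about {implied_thor_tops:.0f} TOPS")
```

In other words, running the same model in a vehicle would require fitting it into roughly one sixth of the compute used for the quoted latency result.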

NVIDIA's business model means its open-source initiatives are ultimately designed to sell more of its hardware and full-stack solutions. Alpamayo-R1 is deeply integrated with NVIDIA's chips (such as Thor) and development platforms (such as DRIVE), which yields greater computational efficiency.

In other words, joining the NVIDIA ecosystem brings convenience but also creates a deep dependency on NVIDIA for core computing power.

Additionally, as DeepWay CTO Tian Shan pointed out, whether VLA is the best autonomous driving technology remains an open question. Alpamayo-R1's toolchain is built on NVIDIA's platforms, which is a limitation for many developers, so other technologies and computing platforms are also advancing autonomous driving.

In Tian Shan's view, most companies should focus more on applying technology in real-world scenarios, that is, on engineering implementation. Solving practical problems and achieving a commercially viable closed loop for intelligent driving as soon as possible will do more for the industry's long-term, healthy development.

Furthermore, the large-scale commercialization and implementation of L4 autonomous driving—or Robotaxi services—are closely tied to policies and regulations. The ability to operate within compliance frameworks, undergo safety assessments, and strike a balance between data utilization and privacy protection is just as important as technological capability itself.

Jensen Huang, the founder and CEO of NVIDIA, has always regarded Robotaxi as the "first commercial application of robotics technology." Rather than building a driverless taxi of its own, NVIDIA's goal has always been to provide the technological foundation that enables all players to create theirs. Now he is attempting to establish a rapidly replicable production line for this application by open-sourcing VLA.

However, whether open source can truly lower the barriers to entry and accelerate the arrival of L4 autonomous driving, ultimately unleashing the technology's full potential across broader commercial horizons, remains to be seen. The open-sourcing of Alpamayo-R1 is only the beginning of the game. More hurdles lie ahead, and it will take the market to validate its true potential. (Writing by Zhang Min; Editing by Chelsea Sun and Li Chengcheng)

© 2025 广州小舟天传媒有限公司 by 周天财经 - 粤 ICP 备 2025452169 号-1
