Scalable RISC-V CPUs for Data Center, Automotive, and Intelligent Edge

Veyron Series High-Performance Compute Processors

Roadmap

Veyron V2

Ventana’s second-generation high-performance RISC-V CPU delivers a major leap in compute capability, designed for deployment across data center, automotive, and edge applications. Veyron V2 is available as licensable IP for integration into custom SoCs or as a complete silicon platform.

Engineered as a server-class processor, Veyron V2 exceeds the demands of modern, virtualized, cloud-native workloads. Ventana's IP portfolio includes key system-level components such as a RISC-V-compliant IOMMU and supports standard AMBA interfaces, enabling seamless integration with third-party IP and acceleration blocks.

IP available now.
Silicon platforms launching in early 2026.

Modern Architecture, RVA23-Aligned

Fully compliant with the RVA23 RISC-V specification
Comparable PPA to Arm Neoverse V3 / Cortex-X4
Standard AMBA CHI.E coherent interface for SoC and chiplet integration
Co-architected with Veyron E2 for seamless vector, AI acceleration, and big-little style heterogeneous compute configurations

Extreme Performance and Power Efficiency

Optimized for high IPC and 3+ GHz core frequency
15-wide out-of-order core: fetch, decode, and execute up to 15 instructions per cycle
Balanced performance-per-watt architecture optimized to scale from hyperscale to edge environments
Advanced power gating and DVFS support for fine-grained control
Up to 32 cores per cluster with decoupled front-end and advanced branch prediction
High-performance 512-bit RVV 1.0 vector unit with INT8 and BF16 support
Integrated matrix unit delivering up to 0.5 TOPS/GHz/core (INT8)
Macro-op caching and aggressive prefetching for instruction and data streams

Advanced Cache & Cluster Architecture

1.5 MB private L2 cache per core
Shared L3 cache configurable from 1–4 MB per core (up to 128 MB per cluster)
Low-latency coherent cluster fabric
High-bandwidth shared resources optimized for multithreaded workloads

Server-Class Reliability, Virtualization, and Optimization

Full architectural virtualization support for cloud-native workloads
Comprehensive RAS:
- ECC on all caches and functional RAMs
- End-to-end data poisoning protection
- Background error scrubbing and logging
Built-in side-channel attack mitigation
Comprehensive performance profiling and tuning support

Flexible IP Integration for Custom SoCs

Clean, portable RTL – no custom macros or proprietary RAMs
Modular multi-core cluster design for high-core-count scaling
Integration-ready with standardized IP interface

Chiplet Integration & Packaging

Standardized chiplet interfaces:
- Ventana D2D multi-protocol controller
- UCIe PHY for chiplet-based system integration
Compatible with cost-effective organic packaging for volume deployment
Configurable TDP for deployment across power-performance targets
Turbo profile management with real-time power behavior control
Digital power models at both core and cluster level for dynamic scaling

Veyron V3

Key Upgrades Over Veyron V2

11+ SPECint2017 (rate=1) at up to 4.2 GHz
Enhanced RISC-V standard matrix extension support alongside RVV 1.0 and scalar compute
24 TeraFLOPS/core of FP8 matrix compute for AI/ML acceleration, or up to 4.5 PFLOPS in a 192-core chiplet-based SiP
Significantly higher IPC and performance-per-watt, driven by new microarchitecture innovations

Next-Level Microarchitecture: Wide Efficiency via Macro-Ops

Macro-op optimized design:
- Internal macro-ops encode 1–5 RISC-V instructions
- Advanced fusion engine dynamically creates optimized macro-ops from hot instruction sequences
- Acts much wider than it looks — macro-ops magnify effective decode width, backend capacity, and parallelism without physically increasing resources
- Achieves high performance and power efficiency without brute-force frontend or backend scaling
Hardware-optimized for software-transparent ILP improvements and shorter execution paths

Superscalar Execution and Predictive Throughput Architecture

16 execution pipelines and schedulers:
- 5 integer, 3 load/store, 3 scalar FP, 5 vector/matrix
200+ scheduler entries, large resource queues and buffers
Sophisticated branch and memory prediction engines:
- Multiple primary and secondary branch predictors
- Load value predictor, memory dependency and bypass prediction

Cluster Architecture and Coherency Fabric

32-core cluster design with 128MB L3 cache
Upgraded high-bandwidth, low-latency coherent fabric enables efficient cluster scaling

Compute Density and AI Performance

Unified scalar/vector/matrix execution model
Exceptional AI compute density with FP8 matrix engine
Optimized for inference acceleration at scale

Physical Design and Packaging

Frequency-optimized physical implementation:
- Semi-custom place-and-route for timing-critical logic
- Custom standard cells and SRAM macros on advanced process node
Support for UCIe-based D2D,
Optional 3D stacking and direct memory attach

Careers

We are looking to fill multiple technical R&D positions to continue the development of innovative RISC-V processors and subsystems

Join our team