Scalable RISC-V CPUs for Data Center, Automotive, and Intelligent Edge

Veyron Series High-Performance Compute Processors

Roadmap

Veyron V2

Ventana’s second-generation high-performance RISC-V CPU delivers a major leap in compute capability and is designed for deployment across data center, automotive, and edge applications. Veyron V2 is available as licensable IP for integration into custom SoCs or as a complete silicon platform.

Engineered as a server-class processor, Veyron V2 exceeds the demands of modern, virtualized, cloud-native workloads. Ventana's IP portfolio includes key system-level components such as a RISC-V-compliant IOMMU and supports standard AMBA interfaces, enabling seamless integration with third-party IP and acceleration blocks.

IP available now.
Silicon platforms launching in early 2026.

Veyron V2 CPU Cluster
Modern Architecture, RVA23-Aligned
  • Fully compliant with the RVA23 RISC-V specification
  • Comparable PPA to Arm Neoverse V3 / Cortex-X4
  • Standard AMBA CHI.E coherent interface for SoC and chiplet integration
  • Co-architected with Veyron E2 to enable vector acceleration, AI acceleration, and big-little-style heterogeneous compute configurations
Extreme Performance and Power Efficiency
  • Optimized for high IPC and 3+ GHz core frequency
  • 15-wide out-of-order core: fetch, decode, and execute up to 15 instructions per cycle
  • Balanced performance-per-watt architecture optimized to scale from hyperscale to edge environments
  • Advanced power gating and DVFS support for fine-grained control
  • Up to 32 cores per cluster with decoupled front-end and advanced branch prediction
  • High-performance 512-bit RVV 1.0 vector unit with INT8 and BF16 support (see the code sketch after this list)
  • Integrated matrix unit delivering up to 0.5 TOPS/GHz/core (INT8)
  • Macro-op caching and aggressive prefetching for instruction and data streams
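The vector unit is programmable with standard RVV 1.0 toolchains and C intrinsics; no Veyron-specific software interface is required. The following is a minimal, vector-length-agnostic sketch using generic RVV 1.0 intrinsics, not Ventana-specific code, and assumes a toolchain built with the V extension enabled (e.g. -march=rv64gcv):

    /* Minimal RVV 1.0 intrinsics sketch: y[i] += a * x[i], vector-length agnostic.
     * Generic standard RVV code, not Ventana-specific; requires a recent
     * RVV-intrinsics-capable compiler and the V extension (e.g. -march=rv64gcv). */
    #include <stddef.h>
    #include <riscv_vector.h>

    void saxpy_rvv(size_t n, float a, const float *x, float *y)
    {
        for (size_t i = 0; i < n; ) {
            size_t vl = __riscv_vsetvl_e32m8(n - i);            /* elements to process this pass */
            vfloat32m8_t vx = __riscv_vle32_v_f32m8(&x[i], vl); /* load a chunk of x */
            vfloat32m8_t vy = __riscv_vle32_v_f32m8(&y[i], vl); /* load a chunk of y */
            vy = __riscv_vfmacc_vf_f32m8(vy, a, vx, vl);        /* y += a * x */
            __riscv_vse32_v_f32m8(&y[i], vy, vl);               /* store the result */
            i += vl;
        }
    }

For scale, the matrix figure above works out to roughly 0.5 TOPS/GHz/core × 3 GHz ≈ 1.5 INT8 TOPS per core, or on the order of 48 TOPS across a fully populated 32-core cluster.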
Advanced Cache & Cluster Architecture
  • 1.5 MB private L2 cache per core
  • Shared L3 cache configurable from 1–4 MB per core (up to 128 MB per cluster)
  • Low-latency coherent cluster fabric
  • High-bandwidth shared resources optimized for multithreaded workloads
Server-Class Reliability, Virtualization, and Optimization
  • Full architectural virtualization support for cloud-native workloads
  • Comprehensive RAS (reliability, availability, and serviceability):
    • ECC on all caches and functional RAMs
    • End-to-end data poisoning protection
    • Background error scrubbing and logging
  • Built-in side-channel attack mitigation
  • Comprehensive performance profiling and tuning support
Flexible IP Integration for Custom SoCs
  • Clean, portable RTL – no custom macros or proprietary RAMs
  • Modular multi-core cluster design for high-core-count scaling
  • Integration-ready with standardized IP interface
Chiplet Integration & Packaging
  • Standardized chiplet interfaces:
    • Ventana D2D multi-protocol controller
    • UCIe PHY for chiplet-based system integration
  • Compatible with cost-effective organic packaging for volume deployment
  • Configurable TDP for deployment across power-performance targets
  • Turbo profile management with real-time power behavior control
  • Digital power models at both core and cluster level for dynamic scaling

Veyron V3

Key Upgrades Over Veyron V2
  • 11+ SPECint2017 (rate=1) at up to 4.2 GHz
  • Enhanced RISC-V standard matrix extension support alongside RVV 1.0 and scalar compute
  • 24 TeraFLOPS/core of FP8 matrix compute for AI/ML acceleration, or up to 4.5 PFLOPS in a 192-core chiplet-based SiP (see the note after this list)
  • Significantly higher IPC and performance-per-watt, driven by new microarchitecture innovations
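As a rough check of the figures above, 24 TeraFLOPS/core × 192 cores ≈ 4.6 PFLOPS of peak FP8 matrix compute, in line with the quoted 4.5 PFLOPS SiP-level figure (the small gap presumably reflects SiP-level operating points rather than per-core peak clocks).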
Next-Level Microarchitecture: Wide Efficiency via Macro-Ops
  • Macro-op optimized design:
    • Internal macro-ops encode 1–5 RISC-V instructions
    • Advanced fusion engine dynamically creates optimized macro-ops from hot instruction sequences (illustrated in the sketch after this list)
    • Acts much wider than it looks — macro-ops magnify effective decode width, backend capacity, and parallelism without physically increasing resources
    • Achieves high performance and power efficiency without brute-force frontend or backend scaling
  • Hardware-optimized for software-transparent ILP improvements and shorter execution paths
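To make the macro-op approach concrete, the sketch below pairs two ordinary C fragments with the generic RISC-V instruction idioms they typically compile to; a fusion-capable front end of the kind described above could encode each multi-instruction idiom as a single internal macro-op. The specific sequences Veyron V3 fuses are not public, so the patterns shown (indexed load, constant materialization) are illustrative assumptions rather than the actual fusion rules.

    /* Illustrative only: common RISC-V instruction idioms that a fusing,
     * macro-op-based front end could represent as one internal macro-op.
     * The actual Veyron V3 fusion patterns are not publicly documented. */

    /* Indexed load: typically compiles to something like
     *   slli t0, a1, 3    # scale the index by the element size
     *   add  t0, a0, t0   # form the effective address
     *   ld   a0, 0(t0)    # load the element
     * which a fusion engine could treat as one indexed-load macro-op. */
    long load_indexed(const long *base, long i)
    {
        return base[i];
    }

    /* 32-bit constant materialization: typically compiles to something like
     *   lui  a0, 0x12345
     *   addi a0, a0, 0x678
     * which could be fused into a single constant-generation macro-op. */
    long materialize_constant(void)
    {
        return 0x12345678;
    }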
Superscalar Execution and Predictive Throughput Architecture
  • 16 execution pipelines and schedulers:
    • 5 integer, 3 load/store, 3 scalar FP, 5 vector/matrix
  • 200+ scheduler entries, large resource queues and buffers
  • Sophisticated branch and memory prediction engines:
    • Multiple primary and secondary branch predictors
    • Load value predictor, memory dependency and bypass prediction
Cluster Architecture and Coherency Fabric
  • 32-core cluster design with 128 MB L3 cache
  • Upgraded high-bandwidth, low-latency coherent fabric enables efficient cluster scaling
Compute Density and AI Performance
  • Unified scalar/vector/matrix execution model
  • Exceptional AI compute density with FP8 matrix engine
  • Optimized for inference acceleration at scale
Physical Design and Packaging
  • Frequency-optimized physical implementation:
    • Semi-custom place-and-route for timing-critical logic
    • Custom standard cells and SRAM macros on advanced process node
  • Support for UCIe-based D2D interconnect
  • Optional 3D stacking and direct memory attach

Careers

We are looking to fill multiple technical R&D positions to continue the development of innovative RISC-V processors and subsystems.