MYRA Stack: Modern Java FFM-Based Libraries
Introducing MYRA: What I’ve Been Building

Overview

MYRA (Memory Yielded, Rapid Access) is a production-grade ecosystem of Java libraries built on the Foreign Function & Memory (FFM) API, designed for deterministic, sub-microsecond latency applications. Unlike approaches that rely on Unsafe or JNI boilerplate, MYRA leverages the standardized FFM primitives finalized in Java 22, providing memory safety and future-proof compatibility without sacrificing performance.

Design Principles

The ecosystem is built on four core principles:

The Problem FFM Solves

Performance-sensitive Java systems have historically relied on Unsafe, a powerful but unsupported internal API with no compatibility guarantees across JDK releases. The Foreign Function & Memory API provides a safe, standardized alternative for off-heap memory access and native interoperability.

MYRA is built entirely on FFM, to show that FFM is not just a replacement for Unsafe, but a foundation for a new class of infrastructure libraries that were previously impossible to build safely on the JVM.

What’s in the Box

MYRA comprises six libraries designed for vertical integration, including myra-codec for serialization and myra-transport for networking on Linux io_uring (fewer syscalls, higher throughput).

The libraries share a design philosophy: zero allocation in the hot path. If you’re processing millions of messages per second, you shouldn’t be at the mercy of GC pauses.

A key enabler is the flyweight pattern: reusable, stateless views over raw memory. Instead of deserializing into objects, myra-codec and myra-transport wrap off-heap buffers directly. No copies, no allocations, no GC pressure. Just pointer arithmetic and bounds checks.

Use Cases & Industries

MYRA is built for systems where every microsecond counts:

Any system processing high-volume, low-latency, deterministic workloads is a candidate for MYRA.

Benchmarks: Codec

Serialization is where myra-codec shines. On an order book snapshot workload (a common HFT/trading message type), here’s how it stacks up against established codecs:

Decode Throughput (ops/sec): Higher is Better
════════════════════════════════════════════════════════════════════
Myra        ████████████████████████████████████████ 4,150,079 ⭐
SBE         ████████████████████ 2,204,557
FlatBuffers █████████████████ 1,968,855
Kryo        ███████████████ 1,322,754
Avro        █████ 454,553

Encode Throughput (ops/sec): Higher is Better
════════════════════════════════════════════════════════════════════
SBE         ████████████████████████████████████████ 4,990,071
Myra        ███████████████ 1,911,781 ⭐
Kryo        ███████████ 1,342,611
FlatBuffers ████████ 1,045,843
Avro        ████ 466,816

| Codec       | Decode (ops/s) | Encode (ops/s) | vs Myra (decode) | vs Myra (encode) |
|-------------|----------------|----------------|------------------|------------------|
| Myra        | 4,150,079      | 1,911,781      | n/a              | n/a              |
| SBE         | 2,204,557      | 4,990,071      | -47%             | +161%            |
| FlatBuffers | 1,968,855      | 1,045,843      | -53%             | -45%             |
| Kryo        | 1,322,754      | 1,342,611      | -68%             | -30%             |
| Avro        | 454,553        | 466,816        | -89%             | -76%             |

Myra decode leads the pack: about 1.9x faster than SBE, 2.1x faster than FlatBuffers, and 3.1x faster than Kryo. SBE leads Myra on encode by about 2.6x, but Myra’s decode advantage makes it the better choice for read-heavy workloads (most real systems decode more than they encode).

Benchmark: order_book_snapshots workload, JMH on c6a.4xlarge, JDK 25, 5 forks × 5 iterations.

Benchmarks: Transport

For networking, myra-transport uses Linux io_uring to bypass the traditional syscall overhead. Here’s how it compares in a ping-pong latency test with realistic payloads:

Mean Latency (μs): Lower is Better
════════════════════════════════════════════════════════════════════
NIO         █████████████ 13.22 μs
MYRA_TOKEN  ███████████████████████ 28.70 μs ⭐
MYRA        ████████████████████████████ 35.12 μs
MYRA_SQPOLL █████████████████████████████ 35.88 μs
Netty       ███████████████████████████████ 39.34 μs

Throughput (ops/sec): Higher is Better
════════════════════════════════════════════════════════════════════
NIO         ████████████████████████████████████████ 75,645
MYRA_TOKEN  ██████████████████ 34,843 ⭐
MYRA        ███████████████ 28,471
MYRA_SQPOLL ██████████████ 27,873
Netty       █████████████ 25,417

| Implementation | Mean (μs) | p50 (μs) | p99 (μs) | Throughput  | Throughput vs Netty |
|----------------|-----------|----------|----------|-------------|---------------------|
| NIO (baseline) | 13.22     | 12.27    | 28.35    | 75.6K ops/s | +198%               |
| MYRA_TOKEN ⭐  | 28.70     | 26.72    | 45.76    | 34.8K ops/s | +37%                |
| MYRA           | 35.12     | 32.16    | 53.25    | 28.5K ops/s | +12%                |
| MYRA_SQPOLL    | 35.88     | 25.50    | 63.36    | 27.9K ops/s | +10%                |
| Netty          | 39.34     | 38.34    | 62.40    | 25.4K ops/s | n/a                 |

MYRA_TOKEN beats Netty by 27% on latency (28.7 μs vs 39.3 μs) and 37% on throughput. The plain NIO baseline remains the fastest in this test (13.22 μs mean). The token-based completion tracking provides the best balance of latency and consistency for io_uring-based networking.

Benchmark: RealWorldPayload ping-pong, JMH on ARM64 (AWS Graviton), JDK 25, Nov 29, 2025.

Why Java Instead of C/C++/Rust?

A common question: “If you need this kind of performance, why not just write it in C/C++/Rust?”

It’s a fair question. The short answer: developer velocity, safety, and maintainability matter more than the last 5-10% of performance.

The C/C++ Problem

Writing correct, memory-safe, high-performance C/C++ code is genuinely hard:

The Rust Problem

Rust solves memory safety, but introduces different tradeoffs:

Why Java/FFM Changes the Equation

The MYRA stack bridges the gap:

The Reality Check

C/C++/Rust shine for:

But for most systems (trading platforms, market data feeds, game servers, real-time analytics), Java + MYRA offers a pragmatic middle ground:

MYRA’s thesis is simple: most teams are over-optimized for raw speed and under-optimized for correctness and velocity. The JVM with FFM tips that balance back toward sanity.

Why I’m Building This

I’ve spent years in systems where latency matters, where 100 μs is slow and a GC pause is a production incident. Java is plenty fast for this, but the tooling hasn’t caught up to the platform’s capabilities.

FFM is the missing piece. It’s finally safe, stable, and performant enough to build real infrastructure on. MYRA is my attempt to do exactly that.

What’s Next

I’m currently in the final stretch: optimizations, cleanup, and documentation. The goal is to publicly open source the entire ecosystem by Christmas 2025.

The MYRA ecosystem will always remain free and open source. No enterprise tier, no gated features, no open-core model. Development will be sustained through open sponsorships from individuals and organizations who find value in the work.

If you’re curious about FFM, high-performance Java, or just want to see where this goes:

Follow the project: github.com/mvp-express

More soon.
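For readers who want something concrete while the libraries are still unreleased: the flyweight idea this post describes (a reusable view over raw off-heap memory instead of deserialized objects) can be sketched with nothing but the standard FFM API. This is not MYRA code. The class name `OrderView`, the 20-byte message layout, and the field names are all hypothetical, chosen only to illustrate the pattern. It assumes Java 22 or later.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class FlyweightSketch {

    // Hypothetical fixed layout for one message (an assumption, not MYRA's format):
    // bytes 0-7 order id, bytes 8-15 price in ticks, bytes 16-19 quantity.
    static final class OrderView {
        static final long SIZE = 20;

        private MemorySegment segment; // buffer currently being viewed
        private long base;             // byte offset of the message inside it

        // Re-point this view at another message: no allocation, no copy.
        OrderView wrap(MemorySegment segment, long base) {
            this.segment = segment;
            this.base = base;
            return this;
        }

        // Unaligned layouts are used because 20-byte messages are not 8-byte aligned.
        long orderId() { return segment.get(ValueLayout.JAVA_LONG_UNALIGNED, base); }
        long price()   { return segment.get(ValueLayout.JAVA_LONG_UNALIGNED, base + 8); }
        int quantity() { return segment.get(ValueLayout.JAVA_INT_UNALIGNED, base + 16); }

        OrderView orderId(long v) { segment.set(ValueLayout.JAVA_LONG_UNALIGNED, base, v); return this; }
        OrderView price(long v)   { segment.set(ValueLayout.JAVA_LONG_UNALIGNED, base + 8, v); return this; }
        OrderView quantity(int v) { segment.set(ValueLayout.JAVA_INT_UNALIGNED, base + 16, v); return this; }
    }

    // Writes three messages into one off-heap buffer, then sums their quantities
    // by sliding a single view object across the buffer.
    static long demo() {
        try (Arena arena = Arena.ofConfined()) {
            int count = 3;
            MemorySegment buffer = arena.allocate(OrderView.SIZE * count);
            OrderView view = new OrderView(); // the only object created for all messages

            for (int i = 0; i < count; i++) {
                view.wrap(buffer, i * OrderView.SIZE)
                    .orderId(1000 + i)
                    .price(50_000 + i)
                    .quantity(10 * (i + 1));
            }

            long total = 0;
            for (int i = 0; i < count; i++) {
                total += view.wrap(buffer, i * OrderView.SIZE).quantity();
            }
            return total; // 10 + 20 + 30
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 60
    }
}
```

Every accessor is an offset computation plus a bounds-checked read or write on the `MemorySegment`, and the confined `Arena` frees the buffer deterministically when the `try` block exits, so the loop itself never allocates.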