
For our next OCPTAP session, we have Ertza Warraich, a systems and networking researcher and recent Ph.D. graduate from Purdue University. Ertza will present OptiNIC, a domain-specific RDMA transport designed for large-scale distributed machine learning. His talk explores how relaxing traditional reliability and in-order delivery guarantees can dramatically reduce tail latency and improve throughput across multi-GPU, high-speed interconnects. The session will cover: • Why strict RDMA semantics…
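The snippet above only sketches the idea, so here is a minimal, hypothetical Python illustration of what relaxed transport semantics can mean for ML traffic. This is not OptiNIC's actual protocol; every name and threshold below is made up for illustration. The point it shows: a receiver that accepts gradient chunks in any order and proceeds once "enough" have arrived never stalls on retransmitting every lost chunk.

```python
"""A minimal, hypothetical sketch (not OptiNIC's actual design) of relaxed
reliability and ordering for ML: gradient chunks may arrive out of order or
be dropped, and the receiver proceeds once enough have landed, instead of
blocking on strict in-order, fully reliable delivery."""
import random

NUM_CHUNKS = 1000            # gradient tensor split into fixed-size chunks
LOSS_RATE = 0.02             # assumed loss rate tolerable for SGD-style training
COMPLETION_FRACTION = 0.97   # proceed once this fraction has arrived

def send_chunks():
    """The network delivers chunks in arbitrary order, dropping a few."""
    order = list(range(NUM_CHUNKS))
    random.shuffle(order)                    # out-of-order delivery
    for seq in order:
        if random.random() > LOSS_RATE:      # a few chunks are simply lost
            yield seq

def receive(gradient):
    received = set()
    for seq in send_chunks():
        gradient[seq] = 1.0                  # stand-in for the chunk payload
        received.add(seq)
        # Relaxed semantics: stop waiting once enough of the tensor is in,
        # rather than blocking on retransmission of every missing chunk.
        if len(received) >= COMPLETION_FRACTION * NUM_CHUNKS:
            break
    return received

grad = [0.0] * NUM_CHUNKS    # missing chunks stay zero (treated as no update)
got = receive(grad)
print(f"step completed with {len(got)}/{NUM_CHUNKS} chunks, no retransmits")
```

The trade-off this toy makes explicit is the one the talk examines: training loops that tolerate small, bounded gradient loss can trade per-chunk reliability for lower tail latency.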

SPLIDT Accepted to NSDI2026: Scalable Stateful Inference at Line Rate | Muhammad Shahbaz posted on the topic | LinkedIn
🚨 Big and humbling news! Our paper SPLIDT: Partitioned Decision Trees for Scalable Stateful Inference at Line Rate has been accepted to #NSDI2026! 🎉 In-network ML has long been caught between a rock and a hard place—accuracy or scalability. SPLIDT says: why not both? SPLIDT reimagines how decision trees operate in programmable data planes by:
• ✂️ Partitioning trees into subtrees with their own stateful features,
• 🔁 Recirculating packets to reuse registers and match-action tables (MATs)…
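To make the partition-and-recirculate idea concrete, here is a toy Python sketch of how it could work, based only on what the post describes. The subtree structure, feature names, and recirculation budget below are all assumptions for illustration, not SPLIDT's actual design.

```python
"""A toy sketch of the SPLIDT idea as described in the post (details assumed):
a decision tree is split into subtrees, each with its own small feature set,
and a packet is "recirculated" through the pipeline once per subtree,
reusing the same registers/match-action stage on every pass."""

# Hypothetical subtrees: each maps feature values to either a class label
# or the id of the next subtree to evaluate on the next recirculation pass.
SUBTREES = {
    0: lambda f: ("goto", 1) if f["pkt_count"] > 10 else ("label", "benign"),
    1: lambda f: ("label", "attack") if f["syn_rate"] > 0.8 else ("label", "benign"),
}

def classify(flow_features, max_passes=4):
    subtree = 0                      # metadata carried across recirculations
    for _ in range(max_passes):      # each iteration = one recirculation
        kind, value = SUBTREES[subtree](flow_features)
        if kind == "label":
            return value
        subtree = value              # jump to the next partition
    return "default"                 # bounded recirculation budget exceeded

print(classify({"pkt_count": 42, "syn_rate": 0.9}))   # -> attack
print(classify({"pkt_count": 3,  "syn_rate": 0.9}))   # -> benign
```

Because each pass only needs the registers and MAT entries for one subtree, the same physical stage can serve every partition, which is where the scalability win over laying out the whole tree at once would come from.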


How ML can transform transport: A new paper on RDMA and ML | Muhammad Shahbaz posted on the topic | LinkedIn
Transport is the next frontier in accelerating foundation models, and getting there means "Reimagining RDMA Through the Lens of ML"! In our upcoming paper in IEEE CAL'25, we explore how a domain-specific focus can supercharge transport for ML workloads. https://lnkd.in/eF4EaciF This work is being spearheaded by my daring and relentless students, Ertza Warraich, Ali Imran, and Annus Zulfiqar, along with our amazing collaborators, Shay Vargaftik and Sonia Fahmy!

#techcon2025 | Marilyn Rego
I’m thrilled to share that I’ll be giving a talk on “SpliDT: Partitioned Decision Trees for Scalable Stateful ML Inference at Line Rate” at #TECHCON2025, hosted by Semiconductor Research Corporation (SRC). SpliDT rethinks how machine learning can run inside programmable switches. Instead of forcing all flows to use the same fixed features, SpliDT partitions decision trees so each part uses the features it actually needs. The result: higher accuracy, 5× more features, and support for millions of flows.

Sparse workloads are everywhere—from ML to scientific computing—but scaling them across datacenter nodes hits a hard wall: network bottlenecks. | Muhammad Shahbaz
Sparse workloads are everywhere—from ML to scientific computing—but scaling them across datacenter nodes hits a hard wall: network bottlenecks. 🧱🧱🧱 Enter NetSparse: a new network architecture designed to supercharge distributed sparse computations. By pushing communication logic into NICs and switches, NetSparse slashes traffic, eliminates redundancy, and brings us dramatically closer to ideal scaling! 🚀🚀🚀 Huge congrats to Gerasimos Gerogiannis for leading the effort, and to my amazing co-authors…
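The post does not spell out the mechanism, but one kind of redundancy an in-network layer like this could remove is easy to sketch. The following Python toy (names and structure are assumptions, not NetSparse's design) shows a switch deduplicating fetches of the same sparse rows from many nodes, so only one copy of each distinct request travels upstream.

```python
"""A hypothetical sketch of in-network redundancy elimination for sparse
workloads: when many nodes request the same rows of a sparse structure, a
switch/NIC can forward one copy of each distinct request upstream and fan
the response back out to the requesters. Not NetSparse's actual mechanism."""
from collections import defaultdict

def dedup_at_switch(requests):
    """requests: list of (node_id, row_id) fetches crossing one switch."""
    wanting = defaultdict(list)          # row_id -> nodes that asked for it
    for node, row in requests:
        wanting[row].append(node)
    upstream = sorted(wanting)           # one upstream fetch per distinct row
    return upstream, wanting

reqs = [(0, 7), (1, 7), (2, 7), (1, 3), (3, 9), (0, 3)]
upstream, fanout = dedup_at_switch(reqs)
print(f"{len(reqs)} node requests -> {len(upstream)} upstream fetches")
for row in upstream:
    print(f"row {row}: replicate response to nodes {fanout[row]}")
```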

Tech Xplore
Gigaflow cache streamlines cloud traffic, with 51% higher hit rate and 90% lower misses for programmable SmartNICs
A new caching technique, Gigaflow, helps direct heavy traffic in cloud data centers driven by AI and machine learning workloads, according to a study led by University of Michigan researchers.

YouTube
Gigaflow - Pipeline-Aware Sub-Traversal Caching for Modern SmartNICs (ASPLOS 2025)
Learn about Gigaflow: a high-hit-rate, SmartNIC-native cache for virtual switches (like OVS) that expands rule-space coverage by two orders of magnitude and reduces cache misses by up to 90%. This work was presented at ASPLOS'25.
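For a feel of what "sub-traversal caching" means, here is a toy Python sketch built from the description above. The two-stage pipeline, field lists, and cache layout are assumptions for illustration, not Gigaflow's actual structure; the point is that caching per-stage sub-traversals lets entries compose, instead of caching one monolithic entry per full pipeline traversal.

```python
"""A toy sketch of pipeline-aware sub-traversal caching (structure assumed,
not Gigaflow's actual layout): each pipeline stage caches its own
sub-traversals, and entries compose at lookup time, covering far more
header-space combinations than one entry per full traversal could."""

# Hypothetical 2-stage virtual-switch pipeline; each stage reads only the
# fields listed, which is what makes its sub-results reusable across packets.
STAGES = [
    ("l2", ("dst_mac",)),
    ("acl", ("src_ip", "dst_port")),
]

cache = {stage: {} for stage, _ in STAGES}   # per-stage sub-traversal caches

def slow_path(stage, key):
    """Stand-in for a full virtual-switch table lookup for one stage."""
    return f"{stage}-action{hash(key) % 4}"

def lookup(packet):
    actions, all_hit = [], True
    for stage, fields in STAGES:
        key = tuple(packet[f] for f in fields)       # only this stage's fields
        if key not in cache[stage]:
            all_hit = False
            cache[stage][key] = slow_path(stage, key)  # fill sub-traversal
        actions.append(cache[stage][key])
    return actions, all_hit

# Two packets sharing a MAC but differing in IP still reuse the cached l2
# entry; that reuse is where the coverage gain over monolithic entries comes from.
print(lookup({"dst_mac": "aa", "src_ip": "10.0.0.1", "dst_port": 80}))
print(lookup({"dst_mac": "aa", "src_ip": "10.0.0.2", "dst_port": 443}))
```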