IPC Design Notes
Overview
This document outlines the design considerations for adding Inter-Process Communication (IPC) support to this WebNN implementation, drawing from Chromium's architecture.
Current Architecture (Single-Process)
Intermediate Representation
Format: Rust structs with JSON attributes
pub struct Operation {
    pub op_type: String,               // e.g., "conv2d"
    pub input_operands: Vec<u32>,      // operand IDs
    pub output_operand: Option<u32>,
    pub attributes: serde_json::Value, // Flexible JSON
    pub label: Option<String>,
}
Benefits:
- Simple: no code generation
- Flexible: easy to add operations
- Debuggable: human-readable JSON
- Serializable: can save/load graphs
- Cross-language: works with Python/Rust/CLI

Limitations for IPC:
- JSON parsing overhead on every access
- No structured validation at serialization boundaries
- String-based keys prone to typos
- Runtime-only validation
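
To make the "string-based keys" and "runtime-only validation" limitations concrete, here is a minimal std-only sketch. It assumes a `HashMap<String, u32>` as a stand-in for the `serde_json::Value` attribute bag, and the helper names `groups_lenient` and `groups_strict` are hypothetical, not part of the current code.

```rust
use std::collections::HashMap;

// Stand-in for the JSON attribute bag: string keys mapped to integer values.
// (The real struct uses serde_json::Value; a HashMap keeps this sketch std-only.)
type Attributes = HashMap<String, u32>;

/// Lenient lookup: a missing or misspelled key silently becomes the default.
fn groups_lenient(attrs: &Attributes) -> u32 {
    attrs.get("groups").copied().unwrap_or(1)
}

/// Strict lookup: a missing key is reported, but only once this code runs.
fn groups_strict(attrs: &Attributes) -> Result<u32, String> {
    attrs
        .get("groups")
        .copied()
        .ok_or_else(|| "conv2d: missing attribute `groups`".to_string())
}
```

If a producer writes the key as `group` instead of `groups`, the lenient lookup falls back to 1 without complaint and the strict lookup fails only at runtime. With a typed struct field, the same typo would be a compile error.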
Chromium's Architecture (Multi-Process)
Process Model
Renderer Process (JavaScript)
↓ Mojo IPC
Service Process (C++ WebNN)
↓ Platform APIs
GPU Process / ML Hardware
Intermediate Representation
Format: Mojo IDL - strongly-typed structs
Example Operation:
struct Conv2d {
    OperandId input_operand_id;
    OperandId filter_operand_id;
    OperandId? bias_operand_id; // Optional
    Padding2d padding;
    Size2d strides;
    Size2d dilations;
    uint32 groups;
    InputOperandLayout input_layout;
    Conv2dKind kind;
};
union Operation {
    Conv2d conv2d;
    ElementWiseBinary elementwise_binary;
    Reduce reduce;
    // ... 50+ operation types
};
struct GraphInfo {
    array<Operand> operands;
    array<Operation> operations; // Sorted topologically
    map<uint64, ConstantOperandData> constant_operand_data;
};
Benefits:
- Type safety at compile time
- Binary serialization for efficient IPC
- Structured validation by Mojo compiler
- Auto-generated bindings (C++/JavaScript/etc.)
- Versioned interfaces for compatibility

Drawbacks:
- Requires Mojo build system
- Less flexible - changes require IDL updates
- Browser-specific infrastructure
- More complex build process
Reference:
- Mojo interface: services/webnn/public/mojom/webnn_graph.mojom
- Implementation: services/webnn/webnn_graph_impl.{h,cc}
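
The `GraphInfo` comment above notes that operations are sorted topologically, which is something a receiving process can verify cheaply before executing a graph. The following is a rough sketch of such a check, not Chromium's actual validation code. The `Op` struct and `first_out_of_order` function are hypothetical simplifications that track only operand IDs.

```rust
use std::collections::HashSet;

// Hypothetical, simplified operation: only the operand IDs matter here.
struct Op {
    inputs: Vec<u32>,
    output: u32,
}

/// Returns the index of the first operation that reads an operand that has
/// not been produced yet, or `None` if the list is in topological order.
/// `available` holds graph inputs and constants, which are defined up front.
fn first_out_of_order(ops: &[Op], available: &[u32]) -> Option<usize> {
    let mut defined: HashSet<u32> = available.iter().copied().collect();
    for (index, op) in ops.iter().enumerate() {
        if op.inputs.iter().any(|id| !defined.contains(id)) {
            return Some(index);
        }
        defined.insert(op.output);
    }
    None
}
```

A single forward pass is enough because, in a topologically sorted list, every operand an operation reads must already be a graph input, a constant, or the output of an earlier operation.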
Design Options for IPC Support
Option 1: Cap'n Proto (Recommended)
Cap'n Proto is a schema-based binary serialization format. Like Mojo, it generates typed bindings from an IDL, but it is platform-agnostic rather than tied to Chromium.
Architecture:
# Define schema in schema/webnn.capnp
struct Conv2d {
    inputOperandId @0 :UInt32;
    filterOperandId @1 :UInt32;
    biasOperandId @2 :UInt32;  # 0 = none
    strides @3 :List(UInt32);
    dilations @4 :List(UInt32);
    pads @5 :List(UInt32);
    groups @6 :UInt32;
    inputLayout @7 :Text;
}
struct Operation {
    union {
        conv2d @0 :Conv2d;
        add @1 :ElementWiseBinary;
        # ... more operations
    }
}
struct GraphInfo {
    operands @0 :List(Operand);
    operations @1 :List(Operation);
    constantData @2 :List(ConstantOperandData);
}
Benefits:
- Zero-copy deserialization - readers access fields directly in the serialized buffer
- Rust support - maintained capnp crate
- No native runtime dependencies - the generated code and the capnp runtime crate it links against are pure Rust (the capnp schema compiler is needed only at build time)
- Versioning built-in - forward/backward compatibility
- Faster to read than protobuf - no separate parsing step
- Type safe - compile-time validation
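
To illustrate what "zero-copy" means in practice, here is a small std-only sketch of a borrowed view over a byte buffer. The fixed 12-byte layout and the `Conv2dView` type are assumptions made for this example. They are not the Cap'n Proto wire format or the API of the capnp crate, which handle pointers, segments, and defaults that this sketch ignores.

```rust
// Hypothetical fixed layout (NOT the Cap'n Proto wire format):
// bytes 0..4 input operand ID, 4..8 filter operand ID, 8..12 groups,
// all little-endian u32.
const CONV2D_SIZE: usize = 12;

/// Borrowed view over a serialized record; no fields are copied up front.
struct Conv2dView<'a> {
    bytes: &'a [u8],
}

impl<'a> Conv2dView<'a> {
    /// Only checks the length; field decoding is deferred to the accessors.
    fn new(bytes: &'a [u8]) -> Option<Self> {
        if bytes.len() >= CONV2D_SIZE {
            Some(Conv2dView { bytes })
        } else {
            None
        }
    }

    // Decodes one little-endian u32 at a fixed offset, on demand.
    fn read_u32(&self, offset: usize) -> u32 {
        let mut raw = [0u8; 4];
        raw.copy_from_slice(&self.bytes[offset..offset + 4]);
        u32::from_le_bytes(raw)
    }

    fn input_operand_id(&self) -> u32 { self.read_u32(0) }
    fn filter_operand_id(&self) -> u32 { self.read_u32(4) }
    fn groups(&self) -> u32 { self.read_u32(8) }
}
```

Constructing the view costs one length check regardless of message size, which is why the deserialization step is close to free. The cost moves to the accessors, and only for the fields that are actually read.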
Implementation Path:
1. Define Cap'n Proto schema (schema/webnn.capnp)
2. Generate Rust bindings at build time (build.rs)
3. Implement conversion: GraphInfo → Cap'n Proto → GraphInfo
4. Add IPC transport layer (Unix sockets, pipes, or TCP)
5. Keep JSON format as optional human-readable export
Gradual Migration:
- Phase 1: Add Cap'n Proto as parallel format (JSON still works)
- Phase 2: Use Cap'n Proto for internal IPC
- Phase 3: Optional - deprecate JSON for IPC (keep for debugging)
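
One detail of the conversion in step 3 deserves care: the schema sketch encodes a missing bias as `0`, while the in-process form would naturally use `Option<u32>`. A minimal sketch of that mapping follows. The function names are hypothetical, and it assumes the sentinel design from the schema comment is kept, which in turn requires that no real operand is ever assigned ID 0.

```rust
/// Wire value meaning "no bias operand", following the `# 0 = none` comment
/// in the schema sketch.
const NO_OPERAND: u32 = 0;

/// In-process to wire. Fails if a real operand uses the reserved ID 0.
fn bias_to_wire(bias: Option<u32>) -> Result<u32, String> {
    match bias {
        None => Ok(NO_OPERAND),
        Some(NO_OPERAND) => Err("operand ID 0 is reserved for `none`".to_string()),
        Some(id) => Ok(id),
    }
}

/// Wire to in-process.
fn bias_from_wire(raw: u32) -> Option<u32> {
    if raw == NO_OPERAND {
        None
    } else {
        Some(raw)
    }
}
```

If operand IDs in the existing graphs start at 0, the sentinel would collide with a valid operand. A separate presence flag or a union in the schema would avoid that collision.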
Option 2: Protocol Buffers
Uses the same Protocol Buffers tooling as the ONNX model format, applied to the graph IR.
Benefits:
- Already using protobuf for ONNX/CoreML conversion
- Well-known format
- Good tooling

Drawbacks:
- Parsing overhead (not zero-copy)
- More verbose than Cap'n Proto
- Requires protobuf runtime
Would reuse existing infrastructure:
// Already in build.rs for ONNX
prost_build::compile_protos(&["protos/webnn/graph.proto"], &["protos/"])?;
Option 3: Typed Rust Enums (No Serialization)
Replace JSON with strongly-typed Rust enums in-process.
Architecture:
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum Operation {
    Conv2d {
        input_operand_id: u32,
        filter_operand_id: u32,
        bias_operand_id: Option<u32>,
        strides: Vec<u32>,
        dilations: Vec<u32>,
        pads: Vec<u32>,
        groups: u32,
        input_layout: String,
    },
    Add {
        lhs_operand_id: u32,
        rhs_operand_id: u32,
    },
    // ... 50+ variants
}
Benefits:
- Type safety in-process
- No serialization overhead
- Exhaustive pattern matching
- Can still use Serde for JSON export

Drawbacks:
- No IPC support
- Large enum (50+ variants)
- Doesn't solve cross-process problem
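
The "exhaustive pattern matching" benefit can be shown with a short sketch. It uses a trimmed two-variant version of the enum above, with fewer fields for brevity, and a hypothetical helper `input_operands` that is not part of the current code.

```rust
// Two-variant subset of the enum above, with fewer fields, for brevity.
enum Operation {
    Conv2d {
        input_operand_id: u32,
        filter_operand_id: u32,
        bias_operand_id: Option<u32>,
    },
    Add {
        lhs_operand_id: u32,
        rhs_operand_id: u32,
    },
}

/// Collects the operand IDs an operation reads. There is no `_` arm, so
/// adding a variant without handling it here is a compile error.
fn input_operands(op: &Operation) -> Vec<u32> {
    match op {
        Operation::Conv2d { input_operand_id, filter_operand_id, bias_operand_id } => {
            let mut ids = vec![*input_operand_id, *filter_operand_id];
            ids.extend(*bias_operand_id); // appends the bias ID only if present
            ids
        }
        Operation::Add { lhs_operand_id, rhs_operand_id } => {
            vec![*lhs_operand_id, *rhs_operand_id]
        }
    }
}
```

With the current JSON attributes, the equivalent code would look up keys by name and could only discover an unhandled operation type at runtime.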
Recommended Approach: Cap'n Proto
For future IPC support, Cap'n Proto is recommended because:
- Rust integration - mature support through the capnp crate
- Zero-copy - critical for large models
- Type safety - structured validation
- No native runtime - generated code plus a pure-Rust runtime crate
- Platform agnostic - not tied to Chromium
Migration Strategy
Phase 1: Parallel Format (Backwards Compatible)
pub struct GraphInfo {
    // Current fields remain
    pub operands: Vec<Operand>,
    pub operations: Vec<Operation>, // Still uses JSON attributes
    // ...
}
impl GraphInfo {
    // New: serialize to Cap'n Proto
    pub fn to_capnp(&self) -> capnp::message::Builder<capnp::message::HeapAllocator> {
        // Convert to Cap'n Proto format
        todo!()
    }
    // New: deserialize from Cap'n Proto.
    // `message::Reader` is generic over its segment storage.
    pub fn from_capnp<S: capnp::message::ReaderSegments>(
        reader: capnp::message::Reader<S>,
    ) -> Result<Self, GraphError> {
        // Convert from Cap'n Proto format
        todo!()
    }
    // Existing JSON support unchanged
    pub fn to_json(&self) -> Result<String, GraphError> { ... }
    pub fn from_json(s: &str) -> Result<Self, GraphError> { ... }
}
Phase 2: IPC Transport Layer
// New module: src/ipc/mod.rs
pub struct GraphService {
    // Unix socket, pipe, or TCP listener
}
impl GraphService {
    pub fn serve(&self) -> Result<(), GraphError> {
        // Accept connections
        // Receive Cap'n Proto messages
        // Deserialize to GraphInfo
        // Execute operations
        // Send results back
        todo!()
    }
}
// Client side
pub struct GraphClient {
    // Connection to service
}
impl GraphClient {
    pub fn build_graph(&self, info: &GraphInfo) -> Result<GraphHandle, GraphError> {
        // Serialize to Cap'n Proto
        // Send over IPC
        // Receive handle
        todo!()
    }
    pub fn compute(&self, handle: GraphHandle, inputs: &[Tensor]) -> Result<Vec<Tensor>, GraphError> {
        // Send compute request over IPC
        // Receive results
        todo!()
    }
}
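
Whichever transport is chosen, a byte stream such as a Unix socket, pipe, or TCP connection has no message boundaries, so the service and client need a framing convention. The sketch below shows one common approach, a 4-byte length prefix. The function names and the `MAX_FRAME_LEN` value are assumptions for illustration, and the payload is treated as opaque bytes rather than a real Cap'n Proto message.

```rust
use std::io::{self, Read, Write};

// Assumed upper bound on one message; tune to the largest expected graph.
const MAX_FRAME_LEN: u32 = 64 * 1024 * 1024;

/// Writes a 4-byte little-endian length followed by the payload.
fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32; // safe: checked against MAX_FRAME_LEN above
    writer.write_all(&len.to_le_bytes())?;
    writer.write_all(payload)?;
    writer.flush()
}

/// Reads one frame, rejecting oversized lengths before allocating.
fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    reader.read_exact(&mut header)?;
    let len = u32::from_le_bytes(header);
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len as usize];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Because both functions are generic over `Read` and `Write`, the same code works with `std::os::unix::net::UnixStream`, `std::net::TcpStream`, or an in-memory `Cursor` in tests. Checking the length before allocating keeps a malformed or hostile header from forcing a large allocation.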
Phase 3: Optional JSON Deprecation
- Keep JSON for debugging and CLI tools
- Use Cap'n Proto exclusively for IPC
- Document migration path for users
Process Model Options
Option A: Separate Service Process (Chromium-like)
Client Process (Python/Rust)
↓ Cap'n Proto IPC
Service Process (Rust WebNN)
↓ Direct FFI
Backend (ONNX Runtime / CoreML / TensorRT)
Benefits:
- Isolates GPU/ML hardware failures
- Sandboxing possible
- Multiple clients can share service
- Resource pooling

Use Cases:
- Web browser integration
- Multi-tenant ML serving
- Fault isolation

Option B: Worker Thread Pool (Simpler)
Main Thread (Python/Rust)
↓ Channel/Queue
Worker Thread Pool
↓ Direct calls
Backend (ONNX Runtime / CoreML / TensorRT)
Benefits:
- Simpler than multi-process
- Lower overhead
- Shared memory (no serialization within process)

Use Cases:
- Desktop applications
- ML tools/libraries
- Lower latency critical
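
As a rough illustration of Option B, the sketch below builds a small worker pool from `std::sync::mpsc` and `std::thread`. The `WorkerPool` type and its methods are hypothetical. Jobs are plain closures here, whereas a real pool would run backend inference and return tensors.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

// A job is any closure; a real pool would run backend inference here.
type Job = Box<dyn FnOnce() + Send + 'static>;

struct WorkerPool {
    sender: Option<Sender<Job>>,
    workers: Vec<JoinHandle<()>>,
}

impl WorkerPool {
    fn new(size: usize) -> Self {
        let (sender, receiver) = channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..size)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // Hold the lock only while waiting for the next job.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        Err(_) => break, // channel closed: shut down
                    }
                })
            })
            .collect();
        WorkerPool { sender: Some(sender), workers }
    }

    fn submit<F: FnOnce() + Send + 'static>(&self, job: F) {
        self.sender.as_ref().unwrap().send(Box::new(job)).unwrap();
    }

    /// Closes the queue and waits for queued and in-flight jobs to finish.
    fn shutdown(mut self) {
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            worker.join().unwrap();
        }
    }
}
```

Jobs and their captured data move between threads by ownership transfer, so nothing is serialized. Results can be returned through a second channel captured by the closure.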
Option C: Hybrid (Flexible)
Support both in-process and IPC:
pub enum GraphExecutor {
    InProcess(DirectExecutor),  // Current implementation
    Worker(ThreadPoolExecutor), // Thread pool
    Service(IpcExecutor),       // Separate process via Cap'n Proto
}
The user chooses one executor at runtime (the three lines below are alternatives):
let executor = GraphExecutor::new_service()?; // IPC
let executor = GraphExecutor::new_worker(4)?; // 4 worker threads
let executor = GraphExecutor::new_direct()?; // Current behavior
Implementation Checklist
When adding IPC support:
- [ ] Choose serialization format (Cap'n Proto recommended)
- [ ] Define schema for all operations
  - [ ] Conv2d, ConvTranspose2d
  - [ ] Pool2d (Average, Max)
  - [ ] Normalization (Batch, Instance, Layer)
  - [ ] Element-wise operations
  - [ ] Reduction operations
  - [ ] Activation functions
  - [ ] Shape operations (Reshape, Transpose, etc.)
  - [ ] All other WebNN operations (50+ total)
- [ ] Add schema compilation to build.rs
- [ ] Implement GraphInfo ↔ Schema conversions
- [ ] Add transport layer (sockets/pipes)
- [ ] Implement service/client split
- [ ] Add authentication/security (if multi-user)
- [ ] Add resource limits and quotas
- [ ] Test serialization performance vs JSON
- [ ] Update Python bindings to support IPC mode
- [ ] Add IPC mode examples
- [ ] Document IPC setup and usage
Performance Considerations
Serialization Overhead
| Format | Serialize | Deserialize | Size | Zero-Copy |
|---|---|---|---|---|
| JSON | ~1-5ms | ~2-10ms | Large | No |
| Protobuf | ~0.5-2ms | ~1-3ms | Medium | No |
| Cap'n Proto | ~0.1-0.5ms | ~0ms | Small | Yes |
Rough, unbenchmarked estimates for a typical WebNN graph (100 ops, 1 MB of constants).
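
Since the figures in the table are estimates, they should be replaced with measurements before any format is chosen. A minimal timing helper could look like the sketch below. `time_encode` is a hypothetical name, and the closure stands in for whichever serializer is under test. A dedicated benchmarking harness would give more reliable numbers than this simple loop.

```rust
use std::time::{Duration, Instant};

/// Calls `encode` `iterations` times and returns (mean time per call,
/// total bytes produced). Summing the output sizes keeps the work observable.
fn time_encode<F: FnMut() -> Vec<u8>>(iterations: u32, mut encode: F) -> (Duration, usize) {
    assert!(iterations > 0, "need at least one iteration");
    let mut total_bytes = 0;
    let start = Instant::now();
    for _ in 0..iterations {
        total_bytes += encode().len();
    }
    (start.elapsed() / iterations, total_bytes)
}
```

Running the same graph through each serializer with this helper yields both the time and the size columns of the table. Builds should use `--release`, since debug builds can distort the comparison.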
When to Use IPC
IPC is beneficial when:
- Need process isolation (security/stability)
- Multiple clients sharing resources
- Different privilege levels required
- Large model sizes (reduces memory copies)

In-process is better when:
- Single user application
- Low latency critical (<1ms)
- Simple deployment requirements
- Development/debugging
Security Considerations
If implementing IPC for multi-user scenarios:
- Authentication
  - Token-based client authentication
  - Per-client resource quotas
- Sandboxing
  - Run service with minimal privileges
  - Use seccomp/pledge to restrict syscalls
- Validation
  - Validate all inputs at service boundary
  - Enforce memory limits on graphs
  - Rate limit requests
- Constant Data
  - Validate constant operand sizes
  - Prevent memory exhaustion attacks
  - Consider shared memory for large constants
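
The constant-data checks can be sketched as follows. The `ConstantDesc` struct, the `validate_constants` function, and the 512 MiB budget are assumptions chosen for illustration. A real service would derive the element size from the operand's data type.

```rust
// Assumed budget for all constants in one graph; pick per deployment.
const MAX_CONSTANT_BYTES: usize = 512 * 1024 * 1024;

// Hypothetical description of one constant as received from a client.
struct ConstantDesc {
    shape: Vec<u32>,
    element_size: usize, // bytes per element, e.g. 4 for float32
    byte_len: usize,     // length of the payload actually received
}

/// Byte length implied by a shape and element size, or `None` on overflow.
fn expected_len(shape: &[u32], element_size: usize) -> Option<usize> {
    shape
        .iter()
        .try_fold(element_size, |acc, &dim| acc.checked_mul(dim as usize))
}

/// Checks every constant against its declared shape and the total budget.
/// Returns the total number of constant bytes on success.
fn validate_constants(constants: &[ConstantDesc]) -> Result<usize, String> {
    let mut total: usize = 0;
    for (index, constant) in constants.iter().enumerate() {
        let expected = expected_len(&constant.shape, constant.element_size)
            .ok_or_else(|| format!("constant {}: size overflows", index))?;
        if expected != constant.byte_len {
            return Err(format!(
                "constant {}: expected {} bytes, got {}",
                index, expected, constant.byte_len
            ));
        }
        total = total
            .checked_add(expected)
            .filter(|sum| *sum <= MAX_CONSTANT_BYTES)
            .ok_or_else(|| format!("constants exceed {} bytes", MAX_CONSTANT_BYTES))?;
    }
    Ok(total)
}
```

Checked arithmetic matters here because a client-supplied shape can otherwise overflow the size computation and slip past a naive comparison.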
References
- Chromium WebNN Implementation:
  - Mojo Interface: https://chromium.googlesource.com/chromium/src/+/lkgr/services/webnn/public/mojom/webnn_graph.mojom
  - Graph Implementation: https://chromium.googlesource.com/chromium/src/+/lkgr/services/webnn/webnn_graph_impl.h
  - Builder Implementation: https://chromium.googlesource.com/chromium/src/+/lkgr/services/webnn/webnn_graph_builder_impl.h
- Cap'n Proto:
  - Official Site: https://capnproto.org/
  - Rust Crate: https://crates.io/crates/capnp
  - Schema Language: https://capnproto.org/language.html
- W3C WebNN Specification:
  - Main Spec: https://www.w3.org/TR/webnn/
  - Device Selection: https://github.com/webmachinelearning/webnn/blob/main/device-selection-explainer.md
Future Work
- Benchmark serialization formats (JSON vs Protobuf vs Cap'n Proto)
- Design Cap'n Proto schema for WebNN operations
- Implement parallel format support (keep JSON, add Cap'n Proto)
- Add IPC transport layer (Unix sockets for POSIX, named pipes for Windows)
- Update Python bindings to support IPC mode
- Add service/client examples
- Document migration path for users
- Consider WebAssembly integration (WASI sockets)
Status
Current: Single-process with JSON attributes (adequate for current use cases)
Future: Multi-process with Cap'n Proto when IPC becomes necessary
This document will be updated as IPC requirements become clearer.