WebNN API Specification Reference

Source: https://www.w3.org/TR/webnn/
Status: W3C Candidate Recommendation Draft (December 3, 2025)
Local Copy: Saved for offline reference and easy parsing

Overview

The Web Neural Network API (WebNN) defines a dedicated low-level API for neural network inference hardware acceleration. It provides hardware-agnostic access to ML acceleration capabilities across CPU, GPU, and dedicated ML accelerators.

Core Interfaces

ML

Entry point for creating ML contexts.

MLContext

Global execution state managing device resources and graph compilation.

MLGraphBuilder

Constructs computational graphs using operator methods.

MLOperand

Represents data flowing through the graph (inputs, constants, intermediate values, outputs).

MLGraph

Compiled, immutable representation of the computational graph.

MLTensor

Runtime data binding for graph execution.

Reduction Operations

Reduction operations reduce input tensor dimensions by applying a reduction function across specified axes.

Common Parameters (MLReduceOptions)

dictionary MLReduceOptions : MLOperatorOptions {
  sequence<[EnforceRange] unsigned long> axes;
  boolean keepDimensions = false;
};

Parameters:
- axes: Array of dimension indices to reduce. If not specified, reduces all dimensions.
- keepDimensions: If true, retains reduced dimensions with size 1. Default is false.

reduceSum()

Reduces the input tensor by summing elements along specified axes.

Formula: output = Σ input[i] for i in reduced dimensions

Signature:

MLOperand reduceSum(MLOperand input, optional MLReduceOptions options = {});

ONNX Mapping: ReduceSum
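
To make the roles of axes and keepDimensions concrete, here is a minimal reference sketch in plain Python. It is not WebNN code: the function name `reduce_sum`, the row-major flat-list layout, and the `(values, shape)` return value are assumptions chosen for illustration only.

```python
from itertools import product
from math import prod


def reduce_sum(data, shape, axes=None, keep_dimensions=False):
    """Sum a row-major flat list `data` of the given `shape` over `axes`.

    Hypothetical helper for illustration; returns (flat_values, out_shape).
    """
    rank = len(shape)
    # Assumption: axes=None models "axes not specified" (reduce everything).
    axes = list(range(rank)) if axes is None else list(axes)
    kept = [d for d in range(rank) if d not in axes]

    # Row-major strides: the last dimension varies fastest.
    strides = [prod(shape[d + 1:]) for d in range(rank)]

    out = []
    # One output element per combination of indices in the kept dimensions.
    for kept_idx in product(*(range(shape[d]) for d in kept)):
        base = sum(i * strides[d] for i, d in zip(kept_idx, kept))
        total = 0.0
        # Accumulate over every combination of indices in the reduced dimensions.
        for red_idx in product(*(range(shape[d]) for d in axes)):
            offset = sum(i * strides[d] for i, d in zip(red_idx, axes))
            total += data[base + offset]
        out.append(total)

    if keep_dimensions:
        out_shape = [1 if d in axes else shape[d] for d in range(rank)]
    else:
        out_shape = [shape[d] for d in kept]
    return out, out_shape


# The 2x3 matrix [[1, 2, 3], [4, 5, 6]] stored row-major.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(reduce_sum(data, [2, 3], axes=[1]))                        # ([6.0, 15.0], [2])
print(reduce_sum(data, [2, 3], axes=[0], keep_dimensions=True))  # ([5.0, 7.0, 9.0], [1, 3])
print(reduce_sum(data, [2, 3]))                                  # ([21.0], [])
```

The other reductions in this section follow the same traversal and differ only in the accumulator (max, min, product, and so on).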

reduceMean()

Reduces the input tensor by computing the arithmetic mean along specified axes.

Formula: output = (Σ input[i]) / n where n is the number of elements reduced

Signature:

MLOperand reduceMean(MLOperand input, optional MLReduceOptions options = {});

ONNX Mapping: ReduceMean
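
The divisor n in the formula is the product of the sizes of the reduced dimensions, not the total element count of the tensor (unless every dimension is reduced). A small sketch, assuming a hypothetical helper `reduced_count` and a nested-list matrix:

```python
from math import prod


def reduced_count(shape, axes=None):
    """Number of input elements folded into each output element.

    Hypothetical helper; axes=None models "axes not specified".
    """
    axes = range(len(shape)) if axes is None else axes
    return prod(shape[a] for a in axes)


# Mean over axis 1 of a 2x3 matrix: each row is averaged, so n = 3.
matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 9.0]]
n = reduced_count([2, 3], axes=[1])
row_means = [sum(row) / n for row in matrix]
print(n, row_means)                           # 3 [2.0, 6.0]

# For shape [2, 3, 4] reduced over axes [1, 2], n = 3 * 4 = 12.
print(reduced_count([2, 3, 4], axes=[1, 2]))  # 12
```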

reduceMax()

Reduces the input tensor by computing the maximum value along specified axes.

Formula: output = max(input[i]) for i in reduced dimensions

Signature:

MLOperand reduceMax(MLOperand input, optional MLReduceOptions options = {});

ONNX Mapping: ReduceMax

reduceMin()

Reduces the input tensor by computing the minimum value along specified axes.

Formula: output = min(input[i]) for i in reduced dimensions

Signature:

MLOperand reduceMin(MLOperand input, optional MLReduceOptions options = {});

ONNX Mapping: ReduceMin

reduceProduct()

Reduces the input tensor by computing the product of elements along specified axes.

Formula: output = Π input[i] for i in reduced dimensions

Signature:

MLOperand reduceProduct(MLOperand input, optional MLReduceOptions options = {});

ONNX Mapping: ReduceProd

reduceL1()

Reduces the input tensor by computing the L1 norm (sum of absolute values) along specified axes.

Formula: output = Σ |input[i]| for i in reduced dimensions

Signature:

MLOperand reduceL1(MLOperand input, optional MLReduceOptions options = {});

ONNX Mapping: ReduceL1

reduceL2()

Reduces the input tensor by computing the L2 norm (Euclidean norm) along specified axes.

Formula: output = sqrt(Σ input[i]²) for i in reduced dimensions

Signature:

MLOperand reduceL2(MLOperand input, optional MLReduceOptions options = {});

ONNX Mapping: ReduceL2

reduceLogSum()

Reduces the input tensor by computing the natural logarithm of the sum along specified axes.

Formula: output = log(Σ input[i]) for i in reduced dimensions

Signature:

MLOperand reduceLogSum(MLOperand input, optional MLReduceOptions options = {});

ONNX Mapping: ReduceLogSum

reduceLogSumExp()

Reduces the input tensor by computing the log of the sum of exponentials along specified axes.

Formula: output = log(Σ exp(input[i])) for i in reduced dimensions

Signature:

MLOperand reduceLogSumExp(MLOperand input, optional MLReduceOptions options = {});

ONNX Mapping: ReduceLogSumExp

reduceSumSquare()

Reduces the input tensor by computing the sum of squares along specified axes.

Formula: output = Σ input[i]² for i in reduced dimensions

Signature:

MLOperand reduceSumSquare(MLOperand input, optional MLReduceOptions options = {});

ONNX Mapping: ReduceSumSquare
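
Several of the formulas above are closely related. The sketch below evaluates them on a single 1-D list (that is, one reduced slice) to show the relationships; the Python function names are illustrative stand-ins, not WebNN bindings.

```python
import math


def reduce_l1(xs):
    """Sum of absolute values."""
    return sum(abs(x) for x in xs)


def reduce_sum_square(xs):
    """Sum of squares."""
    return sum(x * x for x in xs)


def reduce_l2(xs):
    """Euclidean norm: the square root of the sum of squares."""
    return math.sqrt(reduce_sum_square(xs))


def reduce_product(xs):
    """Product of all elements."""
    return math.prod(xs)


def reduce_log_sum(xs):
    """Natural log of the plain sum (the sum must be positive)."""
    return math.log(sum(xs))


xs = [3.0, -4.0]
print(reduce_l1(xs))          # 7.0
print(reduce_sum_square(xs))  # 25.0
print(reduce_l2(xs))          # 5.0
print(reduce_product(xs))     # -12.0
print(reduce_log_sum([1.0, 3.0]))  # log(4)
```

Note that reduceL2 equals the square root of reduceSumSquare on the same slice, and that reduceLogSum is only defined where the sum is positive.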

Shape Inference for Reduction Operations

Input shape: [d0, d1, d2, ..., dn]

If keepDimensions = false:
- Output shape removes the reduced dimensions
- Example: [2, 3, 4] with axes=[1] → [2, 4]

If keepDimensions = true:
- Output shape keeps reduced dimensions with size 1
- Example: [2, 3, 4] with axes=[1] and keepDimensions=true → [2, 1, 4]

If axes is not specified:
- Reduces all dimensions
- Output is a scalar (rank-0 tensor) with keepDimensions=false
- Output is [1, 1, ..., 1] with keepDimensions=true

If axes is an empty array:
- No dimensions are reduced; the output shape is the same as the input shape
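
The shape rules can be sketched as a small function. The name `reduced_shape` is hypothetical, `axes=None` stands for "axes not specified", and the range and uniqueness checks are assumptions of this sketch rather than a transcription of the spec's validation steps.

```python
def reduced_shape(input_shape, axes=None, keep_dimensions=False):
    """Return the output shape of a reduction over `axes` (illustrative only)."""
    rank = len(input_shape)
    # Not specified: reduce every dimension.
    axes = list(range(rank)) if axes is None else list(axes)

    # Assumed validation: each axis must index an existing dimension, once.
    for a in axes:
        if not 0 <= a < rank:
            raise ValueError(f"axis {a} is out of range for rank {rank}")
    if len(set(axes)) != len(axes):
        raise ValueError("axes must be unique")

    if keep_dimensions:
        # Reduced dimensions stay in place with size 1.
        return [1 if d in axes else size for d, size in enumerate(input_shape)]
    # Reduced dimensions are dropped.
    return [size for d, size in enumerate(input_shape) if d not in axes]


print(reduced_shape([2, 3, 4], axes=[1]))                        # [2, 4]
print(reduced_shape([2, 3, 4], axes=[1], keep_dimensions=True))  # [2, 1, 4]
print(reduced_shape([2, 3, 4]))                                  # []
print(reduced_shape([2, 3, 4], keep_dimensions=True))            # [1, 1, 1]
```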

Implementation Notes

Excluded Operations

localResponseNormalization
- NOT part of WebNN spec as of 2025-12-07
- Decision: Use decomposition in higher layers (e.g., ONNX Runtime's WebNN EP)
- Reason: Rarity in modern models, awkward backend differences
- Source: W3C WebML WG meeting notes (2024-10-31)

Data Type Support

Reduction operations typically support:
- float32 (required)
- float16 (optional)
- int32 (optional, for min/max operations)
- int8/uint8 (optional, for min/max operations)

Numerical Stability

reduceLogSumExp can be evaluated with the log-sum-exp trick, which avoids overflow in exp() by subtracting the maximum before exponentiating:

output = log(Σ exp(input[i]))
       = max_val + log(Σ exp(input[i] - max_val))
where max_val = max(input[i]) for i in reduced dimensions.
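
The identity above can be checked with a short sketch. The function names are illustrative, and Python floats are float64, so the naive form overflows only for inputs above roughly 709; in float32 the same failure already appears near 88.7.

```python
import math


def log_sum_exp_naive(xs):
    """Direct evaluation: exp() overflows for large inputs."""
    return math.log(sum(math.exp(x) for x in xs))


def log_sum_exp_stable(xs):
    """Shift by the maximum so every exponent is <= 0 before calling exp()."""
    max_val = max(xs)
    return max_val + math.log(sum(math.exp(x - max_val) for x in xs))


# For moderate inputs both forms agree.
small = [0.0, 1.0, 2.0]
print(math.isclose(log_sum_exp_naive(small), log_sum_exp_stable(small)))  # True

# For large inputs only the shifted form succeeds.
large = [1000.0, 1000.0]
print(log_sum_exp_stable(large))  # 1000 + log(2)
try:
    log_sum_exp_naive(large)
except OverflowError as err:
    print("naive form failed:", err)
```

This sketch does not handle slices that contain only negative infinity; a production implementation would need to treat that case separately.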

Additional Operations

For a complete list of all WebNN operations, see:
- Official spec: https://www.w3.org/TR/webnn/
- Implementation status: https://webmachinelearning.github.io/webnn-status/


Last Updated: 2025-12-07
Spec Version: W3C Candidate Recommendation Draft (2025-12-03)