API Reference¶

Complete reference for the WebNN Python API.

Module: `webnn`¶

The main module exports all public classes and types.

import webnn

Class: `ML`¶

Entry point for the WebNN API. Provides methods to create execution contexts.

Constructor¶

ml = webnn.ML()

Creates a new ML namespace instance.

Methods¶

`create_context(accelerated=True, power_preference="default")`¶

Creates a new execution context following the W3C WebNN Device Selection spec.

Parameters:

accelerated (bool): Request GPU/NPU acceleration. Default: True
    True: Platform selects GPU or NPU if available
    False: CPU-only execution
power_preference (str): Power/performance hint. Options: "default", "high-performance", "low-power". Default: "default"
    "low-power": Prefers NPU over GPU (Neural Engine on Apple Silicon)
    "high-performance": Prefers GPU over NPU
    "default": Platform decides (typically GPU > NPU > CPU)

Returns: MLContext

Example:

ml = webnn.ML()

# Request acceleration (default)
context = ml.create_context(accelerated=True, power_preference="default")
print(f"Accelerated: {context.accelerated}")  # Check actual capability

# CPU-only execution
context = ml.create_context(accelerated=False)

Note: Per the WebNN Device Selection Explainer, accelerated is a hint. The platform autonomously selects the actual device based on availability and runtime conditions.
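The hint semantics described above can be sketched in plain Python. This is an illustration only: `pick_device`, `PREFERENCE_ORDER`, and the `available` set are hypothetical names, not part of the `webnn` module, and a real platform may also weigh runtime conditions that this sketch ignores.

```python
# Hypothetical illustration of the documented hint semantics; not library code.
PREFERENCE_ORDER = {
    "default": ["gpu", "npu", "cpu"],           # typically GPU > NPU > CPU
    "high-performance": ["gpu", "npu", "cpu"],  # prefers GPU over NPU
    "low-power": ["npu", "gpu", "cpu"],         # prefers NPU over GPU
}


def pick_device(accelerated, power_preference, available):
    """Return the first device in preference order that is available."""
    if power_preference not in PREFERENCE_ORDER:
        raise ValueError(f"unknown power_preference: {power_preference!r}")
    if not accelerated:
        return "cpu"  # accelerated=False means CPU-only execution
    for device in PREFERENCE_ORDER[power_preference]:
        if device in available:
            return device
    return "cpu"  # assumed fallback when nothing else is available


print(pick_device(True, "low-power", {"cpu", "gpu", "npu"}))         # npu
print(pick_device(True, "high-performance", {"cpu", "gpu", "npu"}))  # gpu
print(pick_device(True, "default", {"cpu"}))                         # cpu
```

Because the request is only a hint, checking `context.accelerated` after creation remains the way to learn what the platform can actually provide.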

Class: `MLContext`¶

Represents an execution context for neural network operations.

Properties¶

`accelerated` (bool, read-only)¶

Indicates if GPU/NPU acceleration is available for this context.

True: Platform can provide GPU or NPU resources
False: Only CPU execution available

This represents platform capability, not a guarantee of specific device allocation.

`power_preference` (str, read-only)¶

The power preference hint for this context.

Methods¶

`create_graph_builder()`¶

Creates a new graph builder for constructing computational graphs.

Returns: MLGraphBuilder

Example:

builder = context.create_graph_builder()

`compute(graph, inputs, outputs=None)`¶

Executes the graph with given inputs (placeholder implementation).

Parameters:

graph (MLGraph): The compiled graph to execute
inputs (dict): Dictionary mapping input names to NumPy arrays
outputs (dict, optional): Pre-allocated output arrays

Returns: dict - Dictionary mapping output names to result NumPy arrays

Example:

import numpy as np

results = context.compute(graph, {
    "input": np.array([[1, 2, 3]], dtype=np.float32)
})

`convert_to_onnx(graph, output_path)`¶

Converts the graph to ONNX format and saves it to a file.

Parameters:

graph (MLGraph): The graph to convert
output_path (str): Path where the ONNX model will be saved

Example:

context.convert_to_onnx(graph, "model.onnx")

`convert_to_coreml(graph, output_path)`¶

Converts the graph to CoreML format (macOS only).

Parameters:

graph (MLGraph): The graph to convert
output_path (str): Path where the CoreML model will be saved

Note: Only available on macOS. Only a limited set of operations is supported (add, matmul).

Example:

context.convert_to_coreml(graph, "model.mlmodel")

`create_tensor(shape, data_type, readable=True, writable=True, exportable_to_gpu=False)`¶

Creates an MLTensor for explicit tensor management.

Follows the W3C WebNN MLTensor Explainer.

Parameters:

shape (list[int]): Shape of the tensor
data_type (str): Data type (e.g., "float32")
readable (bool): If True, tensor data can be read back to CPU. Default: True
writable (bool): If True, tensor data can be written from CPU. Default: True
exportable_to_gpu (bool): If True, tensor can be exported for use as GPU texture. Default: False

Returns: MLTensor

Example:

# Create default tensor (readable and writable)
tensor = context.create_tensor([2, 3], "float32")

# Create read-only tensor
ro_tensor = context.create_tensor([2, 3], "float32", readable=True, writable=False)

# Create write-only tensor
wo_tensor = context.create_tensor([2, 3], "float32", readable=False, writable=True)

# Create GPU-exportable tensor
gpu_tensor = context.create_tensor([2, 3], "float32", exportable_to_gpu=True)

`read_tensor(tensor)`¶

Reads data from an MLTensor into a NumPy array.

Parameters:

tensor (MLTensor): The tensor to read from (must have readable=True)

Returns: numpy.ndarray

Raises:

RuntimeError: If tensor is not readable or has been destroyed

Example:

tensor = context.create_tensor([2, 3], "float32")
result = context.read_tensor(tensor)

`write_tensor(tensor, data)`¶

Writes data from a NumPy array into an MLTensor.

Parameters:

tensor (MLTensor): The tensor to write to (must have writable=True)
data (numpy.ndarray): Data to write

Raises:

RuntimeError: If tensor is not writable or has been destroyed
ValueError: If data shape doesn't match tensor shape

Example:

tensor = context.create_tensor([2, 3], "float32")
data = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)
context.write_tensor(tensor, data)

`dispatch(graph, inputs, outputs)`¶

Dispatches graph execution asynchronously with MLTensor inputs/outputs.

Follows the timeline model of the W3C WebNN MLTensor Explainer.

Parameters:

graph (MLGraph): The compiled graph to execute
inputs (dict): Dictionary mapping input names to MLTensor objects
outputs (dict): Dictionary mapping output names to MLTensor objects

Returns: None (results are written to output tensors)

Example:

# Create tensors
input_tensor = context.create_tensor([2, 3], "float32")
output_tensor = context.create_tensor([2, 3], "float32")

# Write input data
input_data = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)
context.write_tensor(input_tensor, input_data)

# Dispatch execution
context.dispatch(graph, {"x": input_tensor}, {"output": output_tensor})

# Read results
result = context.read_tensor(output_tensor)

Class: `MLTensor`¶

Represents an opaque typed tensor for explicit resource management.

Follows the W3C WebNN MLTensor Explainer.

Properties¶

`shape` (list[int], read-only)¶

The shape of the tensor.

`data_type` (str, read-only)¶

The data type of the tensor.

`size` (int, read-only)¶

The total number of elements in the tensor.

`readable` (bool, read-only)¶

Whether tensor data can be read back to CPU.

`writable` (bool, read-only)¶

Whether tensor data can be written from CPU.

`exportable_to_gpu` (bool, read-only)¶

Whether tensor can be exported for use as GPU texture.

Methods¶

`destroy()`¶

Explicitly destroys the tensor and releases its resources.

After calling destroy(), the tensor cannot be used for any operations.

Raises:

RuntimeError: If tensor is already destroyed

Example:

tensor = context.create_tensor([2, 3], "float32")
# ... use tensor ...
tensor.destroy()  # Explicit cleanup

Class: `MLGraphBuilder`¶

Builder for constructing computational graphs using a declarative API.

Input/Constant Operations¶

`input(name, shape, data_type="float32")`¶

Creates an input operand.

Parameters:

name (str): Name of the input
shape (list[int]): Shape of the tensor
data_type (str): Data type. Options: "float32", "float16", "int32", "uint32", "int8", "uint8"

Returns: MLOperand

Example:

x = builder.input("x", [1, 3, 224, 224], "float32")

`constant(value, shape=None, data_type=None)`¶

Creates a constant operand from a NumPy array or Python list.

Parameters:

value (array-like): NumPy array or Python list
shape (list[int], optional): Shape override
data_type (str, optional): Data type override

Returns: MLOperand

Example:

import numpy as np

weights = builder.constant(np.random.randn(784, 10).astype('float32'))
bias = builder.constant(np.zeros(10, dtype='float32'))

Binary Operations¶

All binary operations take two operands and return a new operand.

`add(a, b)`¶

Element-wise addition: a + b

`sub(a, b)`¶

Element-wise subtraction: a - b

`mul(a, b)`¶

Element-wise multiplication: a * b

`div(a, b)`¶

Element-wise division: a / b

`matmul(a, b)`¶

Matrix multiplication: a @ b

Example:

x = builder.input("x", [2, 3], "float32")
y = builder.input("y", [2, 3], "float32")

sum_result = builder.add(x, y)
product = builder.mul(x, y)
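To contrast the element-wise operations with `matmul`, here is a small pure-Python reference that does not use `webnn`. The helper names `elementwise_mul` and `matmul_2d` are hypothetical, and the shape rule (inner dimensions must match, no broadcasting) is the usual convention rather than something this reference states.

```python
def elementwise_mul(a, b):
    """Element-wise product of two equally shaped 2D lists (a * b)."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matmul_2d(a, b):
    """Matrix product of a [M, K] list and a [K, N] list (a @ b)."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


a = [[1, 2, 3], [4, 5, 6]]      # shape [2, 3]
b = [[1, 0], [0, 1], [1, 1]]    # shape [3, 2]

print(elementwise_mul(a, a))    # [[1, 4, 9], [16, 25, 36]]
print(matmul_2d(a, b))          # [[4, 5], [10, 11]]
```

Under that convention, the two `[2, 3]` inputs in the example above are valid operands for `add` and `mul`, whereas `matmul` would need a second operand shaped like `[3, N]`.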

Convolution Operations¶

`conv2d(input, filter, strides=None, dilations=None, pads=None, groups=None, input_layout=None, filter_layout=None)`¶

2D convolution operation for neural networks.

Parameters:

input (MLOperand): Input tensor (4D: batch, channels, height, width or batch, height, width, channels)
filter (MLOperand): Filter/kernel weights (4D constant tensor)
strides (list[int], optional): Stride along each spatial axis. Default: [1, 1]
dilations (list[int], optional): Dilation along each spatial axis. Default: [1, 1]
pads (list[int], optional): Padding [begin_height, begin_width, end_height, end_width]. Default: [0, 0, 0, 0]
groups (int, optional): Number of groups for grouped/depthwise convolution. Default: 1
input_layout (str, optional): Input tensor layout, either "nchw" (channels-first) or "nhwc" (channels-last). Default: "nchw"
filter_layout (str, optional): Filter tensor layout: "oihw", "hwio", "ohwi", or "ihwo". Default: "oihw"

Returns: MLOperand with output tensor

Shape Inference:

For NCHW input [N, C_in, H_in, W_in] and OIHW filter [C_out, C_in/groups, K_h, K_w]:

output_h = floor((H_in + pad_begin_h + pad_end_h - dilation_h * (K_h - 1) - 1) / stride_h) + 1
output_w = floor((W_in + pad_begin_w + pad_end_w - dilation_w * (K_w - 1) - 1) / stride_w) + 1
output_shape = [N, C_out, output_h, output_w]
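The shape rule can be checked without building a graph. The helper below is a hypothetical, pure-Python sketch (`conv2d_output_shape` is not a `webnn` function). It treats the division as floor division, which the stride-2 example below relies on to reach an output size of 16.

```python
def conv2d_output_shape(input_shape, filter_shape, strides=(1, 1),
                        dilations=(1, 1), pads=(0, 0, 0, 0)):
    """Output shape for an NCHW input and an OIHW filter."""
    n, _, h_in, w_in = input_shape
    c_out, _, k_h, k_w = filter_shape
    pad_begin_h, pad_begin_w, pad_end_h, pad_end_w = pads

    def out_size(size, pad_begin, pad_end, kernel, dilation, stride):
        # Floor division: partial windows at the edge are dropped.
        return (size + pad_begin + pad_end - dilation * (kernel - 1) - 1) // stride + 1

    out_h = out_size(h_in, pad_begin_h, pad_end_h, k_h, dilations[0], strides[0])
    out_w = out_size(w_in, pad_begin_w, pad_end_w, k_w, dilations[1], strides[1])
    return [n, c_out, out_h, out_w]


# Standard convolution: stride 2, padding 1
print(conv2d_output_shape([1, 3, 32, 32], [64, 3, 3, 3],
                          strides=(2, 2), pads=(1, 1, 1, 1)))   # [1, 64, 16, 16]

# Depthwise convolution: padding 1
print(conv2d_output_shape([1, 32, 28, 28], [32, 1, 3, 3],
                          pads=(1, 1, 1, 1)))                   # [1, 32, 28, 28]

# Dilated convolution: dilation 2, padding 2
print(conv2d_output_shape([1, 3, 32, 32], [64, 3, 3, 3],
                          dilations=(2, 2), pads=(2, 2, 2, 2)))  # [1, 64, 32, 32]
```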

Example: Standard Convolution

# Input: [batch=1, channels=3, height=32, width=32] (RGB image)
input_op = builder.input("input", [1, 3, 32, 32], "float32")

# Filter: [out_channels=64, in_channels=3, height=3, width=3]
filter_weights = np.random.randn(64, 3, 3, 3).astype(np.float32)
filter_op = builder.constant(filter_weights)

# Apply conv2d with stride=2 and padding=1
output = builder.conv2d(
    input_op,
    filter_op,
    strides=[2, 2],
    pads=[1, 1, 1, 1]
)
# Output shape: [1, 64, 16, 16]

Example: Depthwise Convolution

# Depthwise convolution: each input channel is convolved separately
input_op = builder.input("input", [1, 32, 28, 28], "float32")

# Filter: [out_channels=32, in_channels=1, height=3, width=3]
# groups=32 means 32 separate 1-channel convolutions
filter_weights = np.random.randn(32, 1, 3, 3).astype(np.float32)
filter_op = builder.constant(filter_weights)

output = builder.conv2d(
    input_op,
    filter_op,
    pads=[1, 1, 1, 1],
    groups=32  # Depthwise: groups = input channels
)
# Output shape: [1, 32, 28, 28]

Example: Dilated Convolution

# Dilated (atrous) convolution increases receptive field
input_op = builder.input("input", [1, 3, 32, 32], "float32")
filter_weights = np.random.randn(64, 3, 3, 3).astype(np.float32)
filter_op = builder.constant(filter_weights)

output = builder.conv2d(
    input_op,
    filter_op,
    dilations=[2, 2],  # Dilation factor of 2
    pads=[2, 2, 2, 2]  # Larger padding for dilated kernels
)
# Effective kernel size: (3 - 1) * 2 + 1 = 5, i.e. 5x5
# Output shape: [1, 64, 32, 32]

Example: NHWC Layout (Channels-Last)

# Input in NHWC format: [batch, height, width, channels]
input_op = builder.input("input", [1, 32, 32, 3], "float32")
filter_weights = np.random.randn(64, 3, 3, 3).astype(np.float32)
filter_op = builder.constant(filter_weights)

output = builder.conv2d(
    input_op,
    filter_op,
    input_layout="nhwc",  # Channels-last input
    pads=[1, 1, 1, 1]
)
# Output shape: [1, 32, 32, 64] (also NHWC)

`conv_transpose2d(input, filter, strides=None, dilations=None, pads=None, output_padding=None, output_sizes=None, groups=None, input_layout=None, filter_layout=None)`¶

2D transposed convolution (deconvolution) operation for upsampling.

Parameters:

input (MLOperand): Input tensor (4D)
filter (MLOperand): Filter weights (4D constant tensor)
strides (list[int], optional): Stride along each spatial axis. Default: [1, 1]
dilations (list[int], optional): Dilation along each spatial axis. Default: [1, 1]
pads (list[int], optional): Padding. Default: [0, 0, 0, 0]
output_padding (list[int], optional): Additional output padding. Default: [0, 0]
output_sizes (list[int], optional): Explicit output spatial dimensions. Default: None (computed)
groups (int, optional): Number of groups. Default: 1
input_layout (str, optional): "nchw" or "nhwc". Default: "nchw"
filter_layout (str, optional): Filter layout. Default: "oihw"

Returns: MLOperand with upsampled output tensor

Shape Inference:

For NCHW input [N, C_in, H_in, W_in] and OIHW filter [C_in, C_out/groups, K_h, K_w], with effective_kernel = (K - 1) * dilation + 1 along each spatial axis:

output_h = (H_in - 1) * stride_h + effective_kernel_h - pad_begin_h - pad_end_h + output_pad_h
output_w = (W_in - 1) * stride_w + effective_kernel_w - pad_begin_w - pad_end_w + output_pad_w
output_shape = [N, C_out, output_h, output_w]
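A pure-Python sketch of the per-axis calculation follows. `conv_transpose2d_output_size` is a hypothetical helper, and it assumes the effective kernel size is `(K - 1) * dilation + 1`, the same definition the pooling operations use for their effective window size.

```python
def conv_transpose2d_output_size(size, kernel, stride=1, dilation=1,
                                 pad_begin=0, pad_end=0, output_pad=0):
    """Output size along one spatial axis of a transposed convolution."""
    effective_kernel = (kernel - 1) * dilation + 1  # assumed definition
    return (size - 1) * stride + effective_kernel - pad_begin - pad_end + output_pad


# 14 -> 29 with a 3x3 kernel and stride 2
print(conv_transpose2d_output_size(14, 3, stride=2))                          # 29

# output_padding of 1 adds one more row/column
print(conv_transpose2d_output_size(14, 3, stride=2, output_pad=1))            # 30

# padding of 1 on each side removes two
print(conv_transpose2d_output_size(14, 3, stride=2, pad_begin=1, pad_end=1))  # 27
```

The last case shows why `output_sizes` exists: with `pads=[1, 1, 1, 1]` the formula alone yields 27, one short of the 28 requested explicitly in the third example below.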

Example: Basic Upsampling

# Upsample 14x14 to 29x29 with stride=2
input_op = builder.input("input", [1, 64, 14, 14], "float32")
filter_weights = np.random.randn(64, 32, 3, 3).astype(np.float32)
filter_op = builder.constant(filter_weights)

output = builder.conv_transpose2d(input_op, filter_op, strides=[2, 2])
# Output shape: [1, 32, 29, 29]

Example: With Output Padding

# Use output_padding to control exact output size
input_op = builder.input("input", [1, 64, 14, 14], "float32")
filter_weights = np.random.randn(64, 32, 3, 3).astype(np.float32)
filter_op = builder.constant(filter_weights)

output = builder.conv_transpose2d(
    input_op,
    filter_op,
    strides=[2, 2],
    output_padding=[1, 1]
)
# Output shape: [1, 32, 30, 30]

Example: Explicit Output Sizes

# Specify exact output dimensions
input_op = builder.input("input", [1, 64, 14, 14], "float32")
filter_weights = np.random.randn(64, 32, 3, 3).astype(np.float32)
filter_op = builder.constant(filter_weights)

output = builder.conv_transpose2d(
    input_op,
    filter_op,
    strides=[2, 2],
    pads=[1, 1, 1, 1],
    output_sizes=[28, 28]
)
# Output shape: [1, 32, 28, 28]

Pooling Operations¶

`average_pool2d(input, window_dimensions=None, strides=None, dilations=None, pads=None, layout=None)`¶

2D average pooling operation for downsampling by computing the average of values in a pooling window.

Parameters:

input (MLOperand): Input tensor (4D)
window_dimensions (list[int], optional): Pooling window size [height, width]. Default: [1, 1]
strides (list[int], optional): Stride along each spatial axis. Default: [1, 1]
dilations (list[int], optional): Dilation along each spatial axis. Default: [1, 1]
pads (list[int], optional): Padding [begin_height, begin_width, end_height, end_width]. Default: [0, 0, 0, 0]
layout (str, optional): "nchw" or "nhwc". Default: "nchw"

Returns: MLOperand - Output tensor after pooling

Shape Inference:

For each spatial dimension:

output_size = floor((input_size + pad_begin + pad_end - effective_window_size) / stride) + 1

where effective_window_size = (window_size - 1) * dilation + 1
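The formula translates directly into a few lines of Python. `pool2d_output_size` is a hypothetical helper written for illustration, not part of `webnn`. The calls reproduce the output sizes quoted in the pooling examples in this section.

```python
def pool2d_output_size(input_size, window_size, stride=1, dilation=1,
                       pad_begin=0, pad_end=0):
    """Output size along one spatial axis of a 2D pooling operation."""
    effective_window_size = (window_size - 1) * dilation + 1
    # Floor division implements floor(...) for the non-negative case.
    return (input_size + pad_begin + pad_end - effective_window_size) // stride + 1


print(pool2d_output_size(28, 2, stride=2))                          # 14
print(pool2d_output_size(28, 3, stride=2, pad_begin=1, pad_end=1))  # 14
print(pool2d_output_size(14, 2, stride=1))                          # 13
```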

Example: Basic Average Pooling

# Input: [1, 64, 28, 28]
input_op = builder.input("input", [1, 64, 28, 28], "float32")

# Apply 2x2 average pooling with stride 2
output = builder.average_pool2d(
    input_op,
    window_dimensions=[2, 2],
    strides=[2, 2]
)
# Output shape: [1, 64, 14, 14]

Example: Average Pooling with Padding

input_op = builder.input("input", [1, 64, 28, 28], "float32")

output = builder.average_pool2d(
    input_op,
    window_dimensions=[3, 3],
    strides=[2, 2],
    pads=[1, 1, 1, 1]  # Padding on all sides
)
# Output shape: [1, 64, 14, 14]

Example: NHWC Layout

# Input in NHWC format: [batch, height, width, channels]
input_op = builder.input("input", [1, 28, 28, 64], "float32")

output = builder.average_pool2d(
    input_op,
    window_dimensions=[2, 2],
    strides=[2, 2],
    layout="nhwc"
)
# Output shape: [1, 14, 14, 64] (also NHWC)

`max_pool2d(input, window_dimensions=None, strides=None, dilations=None, pads=None, layout=None)`¶

2D max pooling operation for downsampling by taking the maximum value in a pooling window.

Parameters:

input (MLOperand): Input tensor (4D)
window_dimensions (list[int], optional): Pooling window size [height, width]. Default: [1, 1]
strides (list[int], optional): Stride along each spatial axis. Default: [1, 1]
dilations (list[int], optional): Dilation along each spatial axis. Default: [1, 1]
pads (list[int], optional): Padding [begin_height, begin_width, end_height, end_width]. Default: [0, 0, 0, 0]
layout (str, optional): "nchw" or "nhwc". Default: "nchw"

Returns: MLOperand - Output tensor after pooling

Shape Inference:

Same as average_pool2d - for each spatial dimension:

output_size = floor((input_size + pad_begin + pad_end - effective_window_size) / stride) + 1

Example: Basic Max Pooling

# Input: [1, 64, 28, 28]
input_op = builder.input("input", [1, 64, 28, 28], "float32")

# Apply 2x2 max pooling with stride 2
output = builder.max_pool2d(
    input_op,
    window_dimensions=[2, 2],
    strides=[2, 2]
)
# Output shape: [1, 64, 14, 14]

Example: Overlapping Max Pooling

input_op = builder.input("input", [1, 32, 14, 14], "float32")

# Window size 2x2, stride 1x1 (overlapping windows)
output = builder.max_pool2d(
    input_op,
    window_dimensions=[2, 2],
    strides=[1, 1]
)
# Output shape: [1, 32, 13, 13]

Example: Max Pooling with Padding

input_op = builder.input("input", [1, 64, 28, 28], "float32")

output = builder.max_pool2d(
    input_op,
    window_dimensions=[3, 3],
    strides=[2, 2],
    pads=[1, 1, 1, 1]
)
# Output shape: [1, 64, 14, 14]

`global_average_pool(input, layout=None)`¶

Global average pooling operation that reduces spatial dimensions to 1x1 by averaging over all spatial locations.

Parameters:

input (MLOperand): Input tensor (4D)
layout (str, optional): "nchw" or "nhwc". Default: "nchw"

Returns: MLOperand - Output tensor with spatial dimensions 1x1

Shape Inference:

NCHW: [N, C, H, W] → [N, C, 1, 1]
NHWC: [N, H, W, C] → [N, 1, 1, C]
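To make the reduction concrete, the sketch below averages each channel of a tiny NCHW tensor held in nested Python lists. `global_average_pool_nchw` is a hypothetical reference helper, not the library implementation.

```python
def global_average_pool_nchw(tensor):
    """Reduce nested lists shaped [N][C][H][W] to [N][C][1][1] by averaging."""
    result = []
    for instance in tensor:
        pooled_channels = []
        for channel in instance:
            values = [v for row in channel for v in row]
            pooled_channels.append([[sum(values) / len(values)]])
        result.append(pooled_channels)
    return result


# Shape [1, 2, 2, 2]: one instance, two channels, 2x2 spatial plane
x = [[[[1.0, 2.0], [3.0, 4.0]],
      [[10.0, 20.0], [30.0, 40.0]]]]

pooled = global_average_pool_nchw(x)
print(pooled)  # [[[[2.5]], [[25.0]]]]
```

Each channel collapses to a single value while the batch and channel dimensions are kept, which matches the `[N, C, 1, 1]` output shape.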

Example: Basic Global Average Pooling

# Input: [1, 64, 28, 28]
input_op = builder.input("input", [1, 64, 28, 28], "float32")

# Global average pool reduces spatial dimensions to 1x1
output = builder.global_average_pool(input_op)
# Output shape: [1, 64, 1, 1]

Example: For Classification (Typical ResNet-style)

# After last conv layer: [1, 2048, 7, 7]
features = builder.input("features", [1, 2048, 7, 7], "float32")

# Global average pooling instead of flatten
pooled = builder.global_average_pool(features)
# Output shape: [1, 2048, 1, 1]

# Reshape for fully connected layer
flattened = builder.reshape(pooled, [1, 2048])

Example: NHWC Layout

# Input in NHWC: [1, 28, 28, 64]
input_op = builder.input("input", [1, 28, 28, 64], "float32")

output = builder.global_average_pool(input_op, layout="nhwc")
# Output shape: [1, 1, 1, 64]

`global_max_pool(input, layout=None)`¶

Global max pooling operation that reduces spatial dimensions to 1x1 by taking the maximum value over all spatial locations.

Parameters:

input (MLOperand): Input tensor (4D)
layout (str, optional): "nchw" or "nhwc". Default: "nchw"

Returns: MLOperand - Output tensor with spatial dimensions 1x1

Shape Inference:

Same as global_average_pool:

NCHW: [N, C, H, W] → [N, C, 1, 1]
NHWC: [N, H, W, C] → [N, 1, 1, C]

Example: Basic Global Max Pooling

# Input: [2, 128, 7, 7]
input_op = builder.input("input", [2, 128, 7, 7], "float32")

# Global max pool reduces spatial dimensions to 1x1
output = builder.global_max_pool(input_op)
# Output shape: [2, 128, 1, 1]

Example: Multi-scale Feature Extraction

# Extract features at different scales
input_op = builder.input("input", [1, 512, 14, 14], "float32")

# Global max pooling captures strongest activations
max_pooled = builder.global_max_pool(input_op)
# Output shape: [1, 512, 1, 1]

# Global average pooling captures average response
avg_pooled = builder.global_average_pool(input_op)
# Output shape: [1, 512, 1, 1]

# Can concatenate both for richer representation

Normalization Operations¶

Normalization operations standardize activations to improve training stability and model performance.

`batch_normalization(input, mean, variance, scale=None, bias=None, epsilon=1e-5, axis=1)`¶

Batch normalization operation that normalizes the input across the batch dimension using pre-computed mean and variance statistics.

Parameters:

input (MLOperand): Input tensor to normalize
mean (MLOperand): Pre-computed mean values (1D tensor, size = channels)
variance (MLOperand): Pre-computed variance values (1D tensor, size = channels)
scale (MLOperand, optional): Learnable scale parameter (gamma)
bias (MLOperand, optional): Learnable bias parameter (beta)
epsilon (float, optional): Small constant for numerical stability. Default: 1e-5
axis (int, optional): Feature axis along which normalization occurs. Default: 1

Returns: MLOperand - Normalized output tensor (same shape as input)

Shape Inference:

Output shape = Input shape (preserves dimensions)

Formula:

y = scale * ((x - mean) / sqrt(variance + epsilon)) + bias
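The formula can be evaluated by hand for a single channel. In the sketch below, `batch_norm_value` is a hypothetical helper. Treating an omitted `scale` as 1.0 and an omitted `bias` as 0.0 is an assumption, since this reference does not state the defaults' effect.

```python
import math


def batch_norm_value(x, mean, variance, scale=1.0, bias=0.0, epsilon=1e-5):
    """Apply the batch normalization formula to one scalar value."""
    return scale * ((x - mean) / math.sqrt(variance + epsilon)) + bias


# One channel with pre-computed mean 2.0 and variance 4.0
values = [0.0, 2.0, 4.0]
normalized = [batch_norm_value(v, mean=2.0, variance=4.0) for v in values]
print(normalized)  # approximately [-1.0, 0.0, 1.0]

# With scale (gamma) and bias (beta)
shifted = [batch_norm_value(v, 2.0, 4.0, scale=2.0, bias=1.0) for v in values]
print(shifted)     # approximately [-1.0, 1.0, 3.0]
```

The results are slightly smaller in magnitude than the exact values because `epsilon` is added to the variance before the square root.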

Example: Basic Batch Normalization

# Input: [2, 64, 28, 28] (batch=2, channels=64, height=28, width=28)
input_op = builder.input("input", [2, 64, 28, 28], "float32")
mean = builder.input("mean", [64], "float32")
variance = builder.input("variance", [64], "float32")

# Apply batch normalization
output = builder.batch_normalization(input_op, mean, variance)
# Output shape: [2, 64, 28, 28]

Example: With Learnable Parameters

# Include learned scale (gamma) and bias (beta) parameters
input_op = builder.input("input", [4, 128, 14, 14], "float32")
mean = builder.input("mean", [128], "float32")
variance = builder.input("variance", [128], "float32")
scale = builder.input("scale", [128], "float32")  # gamma
bias = builder.input("bias", [128], "float32")    # beta

output = builder.batch_normalization(
    input_op, mean, variance,
    scale=scale, bias=bias,
    epsilon=1e-5
)

Example: Custom Epsilon for Numerical Stability

# Use larger epsilon for very small variance values
input_op = builder.input("input", [1, 256, 7, 7], "float32")
mean = builder.input("mean", [256], "float32")
variance = builder.input("variance", [256], "float32")

output = builder.batch_normalization(
    input_op, mean, variance,
    epsilon=1e-3  # Larger epsilon for stability
)

`instance_normalization(input, scale=None, bias=None, epsilon=1e-5, layout="nchw")`¶

Instance normalization operation that normalizes each instance in a batch independently across spatial dimensions. Commonly used in style transfer and image generation tasks.

Parameters:

input (MLOperand): Input tensor to normalize (typically 4D: [N, C, H, W])
scale (MLOperand, optional): Learnable scale parameter (1D tensor, size = channels)
bias (MLOperand, optional): Learnable bias parameter (1D tensor, size = channels)
epsilon (float, optional): Small constant for numerical stability. Default: 1e-5
layout (str, optional): "nchw" or "nhwc". Default: "nchw"

Returns: MLOperand - Normalized output tensor (same shape as input)

Shape Inference:

Output shape = Input shape (preserves dimensions)

Formula:

For each instance i and channel c:
  y[i,c] = scale[c] * ((x[i,c] - mean[i,c]) / sqrt(variance[i,c] + epsilon)) + bias[c]
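Unlike batch normalization, the statistics here are computed from the input itself, once per instance and channel. The following pure-Python sketch shows that for an NCHW tensor in nested lists. `instance_norm_nchw` is a hypothetical helper, and both the population variance and the identity defaults for `scale` and `bias` are assumptions.

```python
import math


def instance_norm_nchw(tensor, scale=None, bias=None, epsilon=1e-5):
    """Normalize nested lists shaped [N][C][H][W] per instance and channel."""
    output = []
    for instance in tensor:
        out_instance = []
        for c, plane in enumerate(instance):
            values = [v for row in plane for v in row]
            mean = sum(values) / len(values)
            # Population variance over the spatial plane (assumed).
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            gamma = 1.0 if scale is None else scale[c]  # assumed default
            beta = 0.0 if bias is None else bias[c]     # assumed default
            denom = math.sqrt(variance + epsilon)
            out_instance.append(
                [[gamma * ((v - mean) / denom) + beta for v in row] for row in plane]
            )
        output.append(out_instance)
    return output


x = [[[[1.0, 2.0], [3.0, 4.0]]]]  # shape [1, 1, 2, 2]
y = instance_norm_nchw(x)
flat = [v for row in y[0][0] for v in row]
print(flat)  # four values centred on zero
```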

Example: Basic Instance Normalization

# Input: [2, 64, 28, 28]
input_op = builder.input("input", [2, 64, 28, 28], "float32")

# Apply instance normalization (computes stats per instance)
output = builder.instance_normalization(input_op)
# Output shape: [2, 64, 28, 28]

Example: With Scale and Bias (For Style Transfer)

# Instance norm with learnable parameters
input_op = builder.input("input", [1, 32, 256, 256], "float32")
scale = builder.input("scale", [32], "float32")
bias = builder.input("bias", [32], "float32")

output = builder.instance_normalization(
    input_op,
    scale=scale,
    bias=bias,
    epsilon=1e-5
)

Example: NHWC Layout

# Use NHWC layout (channels-last)
input_op = builder.input("input", [2, 28, 28, 64], "float32")

output = builder.instance_normalization(input_op, layout="nhwc")
# Output shape: [2, 28, 28, 64]

`layer_normalization(input, scale=None, bias=None, epsilon=1e-5, axes=None)`¶

Layer normalization operation that normalizes across feature dimensions within each example. Fundamental for transformer architectures and modern language models.

Parameters:

input (MLOperand): Input tensor to normalize
scale (MLOperand, optional): Learnable scale parameter (gamma)
bias (MLOperand, optional): Learnable bias parameter (beta)
epsilon (float, optional): Small constant for numerical stability. Default: 1e-5
axes (list[int], optional): Dimensions over which to compute normalization statistics. Default: [-1] (last dimension)

Returns: MLOperand - Normalized output tensor (same shape as input)

Shape Inference:

Output shape = Input shape (preserves dimensions)

Formula:

y = scale * ((x - mean(x, axes)) / sqrt(variance(x, axes) + epsilon)) + bias
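For the default `axes=[-1]` case on a 2D input, the formula amounts to normalizing each row independently. The sketch below is a hypothetical pure-Python reference (`layer_norm_last_axis` is not a `webnn` function) and assumes the variance is the population variance over the normalized axis.

```python
import math


def layer_norm_last_axis(rows, scale=None, bias=None, epsilon=1e-5):
    """Normalize each row of a 2D list over its last dimension (axes=[-1])."""
    output = []
    for row in rows:
        mean = sum(row) / len(row)
        variance = sum((v - mean) ** 2 for v in row) / len(row)
        denom = math.sqrt(variance + epsilon)
        normalized = [(v - mean) / denom for v in row]
        if scale is not None:  # per-feature gamma
            normalized = [s * v for s, v in zip(scale, normalized)]
        if bias is not None:   # per-feature beta
            normalized = [v + b for v, b in zip(normalized, bias)]
        output.append(normalized)
    return output


x = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]  # shape [2, 3]
y = layer_norm_last_axis(x)
print(y)  # both rows normalize to nearly the same values
```

The two rows differ by a factor of ten yet produce almost identical outputs, which illustrates that each example is normalized on its own, independent of the rest of the batch.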

Example: Basic Layer Normalization (2D)

# Input: [2, 512] (batch=2, features=512) - typical for transformers
input_op = builder.input("input", [2, 512], "float32")

# Normalize over last dimension (features)
output = builder.layer_normalization(input_op)
# Output shape: [2, 512]

Example: With Scale and Bias (Transformer Block)

# Layer norm with learnable parameters
input_op = builder.input("input", [4, 768], "float32")
scale = builder.input("scale", [768], "float32")  # gamma
bias = builder.input("bias", [768], "float32")    # beta

output = builder.layer_normalization(
    input_op,
    scale=scale,
    bias=bias,
    epsilon=1e-12  # Common in transformers
)

Example: 3D Input (Sequence Data)

# Input: [batch, sequence_length, features]
input_op = builder.input("input", [2, 10, 512], "float32")

# Normalize over last dimension (feature dimension)
output = builder.layer_normalization(input_op, axes=[-1])
# Output shape: [2, 10, 512]

Example: Multiple Axes Normalization

# Normalize over multiple dimensions
input_op = builder.input("input", [2, 8, 256], "float32")

# Normalize over last two dimensions
output = builder.layer_normalization(input_op, axes=[-2, -1])
# Output shape: [2, 8, 256]

Example: Vision Transformer (ViT) Style

# Typical ViT layer normalization setup
# Input: [batch, num_patches, embedding_dim]
input_op = builder.input("patches", [1, 196, 768], "float32")
scale = builder.input("ln_scale", [768], "float32")
bias = builder.input("ln_bias", [768], "float32")

# Normalize over embedding dimension
normalized = builder.layer_normalization(
    input_op,
    scale=scale,
    bias=bias,
    axes=[-1],
    epsilon=1e-6
)
# Output shape: [1, 196, 768]

Unary Operations¶

All unary operations take one operand and return a new operand.

`relu(x)`¶

Rectified Linear Unit activation: max(0, x)

`sigmoid(x)`¶

Sigmoid activation: 1 / (1 + exp(-x))

`tanh(x)`¶

Hyperbolic tangent activation

`softmax(x)`¶

Softmax activation (normalizes to probability distribution)

Example:

x = builder.input("x", [1, 10], "float32")

relu_out = builder.relu(x)
sigmoid_out = builder.sigmoid(x)
tanh_out = builder.tanh(x)
softmax_out = builder.softmax(x)
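For reference, the four activations can be written out with the standard library. These helpers are hypothetical and operate on plain Python numbers and lists, not on `MLOperand` objects. The softmax sketch normalizes a single 1D list, because this reference does not say which axis the builder's `softmax` uses.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    # Subtracting the maximum avoids overflow without changing the result.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


print(relu(-2.0), relu(3.0))   # 0.0 3.0
print(sigmoid(0.0))            # 0.5
print(math.tanh(0.0))          # 0.0
probs = softmax([1.0, 2.0, 3.0])
print(probs)                   # three probabilities that sum to 1
```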

Shape Operations¶

`reshape(x, new_shape)`¶

Reshapes a tensor to a new shape.

Parameters:

x (MLOperand): Input operand
new_shape (list[int]): New shape

Returns: MLOperand

Example:

x = builder.input("x", [1, 784], "float32")
reshaped = builder.reshape(x, [1, 28, 28, 1])
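A reshape only rearranges dimensions, so a quick sanity check is to compare element counts before building the graph. The helper below is a hypothetical sketch. It assumes, as in most tensor libraries, that the total number of elements must be preserved, and it makes no claim about which exception the builder raises on a mismatch.

```python
from math import prod


def can_reshape(old_shape, new_shape):
    """Return True when both shapes describe the same number of elements."""
    return prod(old_shape) == prod(new_shape)


print(can_reshape([1, 784], [1, 28, 28, 1]))    # True  (784 == 28 * 28)
print(can_reshape([1, 2048, 1, 1], [1, 2048]))  # True
print(can_reshape([2, 3], [4]))                 # False (6 != 4)
```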

Graph Building¶

`build(outputs)`¶

Compiles the graph and returns an immutable MLGraph.

Parameters:

outputs (dict): Dictionary mapping output names to MLOperand objects

Returns: MLGraph

Example:

x = builder.input("x", [2, 3], "float32")
y = builder.relu(x)

graph = builder.build({"output": y})

Class: `MLOperand`¶

Represents a tensor operand in the computational graph.

Properties¶

`data_type` (str, read-only)¶

The data type of the operand.

`shape` (list[int], read-only)¶

The shape of the operand.

`name` (str | None, read-only)¶

The name of the operand (if any).

Example:

x = builder.input("x", [2, 3], "float32")

print(x.data_type)  # "float32"
print(x.shape)      # [2, 3]
print(x.name)       # "x"

Class: `MLGraph`¶

Represents a compiled, immutable computational graph.

Properties¶

`operand_count` (int, read-only)¶

The number of operands in the graph.

`operation_count` (int, read-only)¶

The number of operations in the graph.

Methods¶

`get_input_names()`¶

Returns the names of all input operands.

Returns: list[str]

`get_output_names()`¶

Returns the names of all output operands.

Returns: list[str]

Example:

graph = builder.build({"output": y})

print(f"Operands: {graph.operand_count}")
print(f"Operations: {graph.operation_count}")
print(f"Inputs: {graph.get_input_names()}")
print(f"Outputs: {graph.get_output_names()}")

Data Types¶

Supported data types:

Type	Description	Bytes per element
`"float32"`	32-bit floating point	4
`"float16"`	16-bit floating point	2
`"int32"`	32-bit signed integer	4
`"uint32"`	32-bit unsigned integer	4
`"int8"`	8-bit signed integer	1
`"uint8"`	8-bit unsigned integer	1
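The table can be used to estimate how much memory a tensor's data occupies. `BYTES_PER_ELEMENT` and `tensor_byte_size` are hypothetical names introduced for this sketch. Real allocations may include padding or alignment that the element count does not capture.

```python
from math import prod

# Bytes per element, copied from the table above.
BYTES_PER_ELEMENT = {
    "float32": 4,
    "float16": 2,
    "int32": 4,
    "uint32": 4,
    "int8": 1,
    "uint8": 1,
}


def tensor_byte_size(shape, data_type):
    """Number of bytes needed for the raw data of a tensor."""
    return prod(shape) * BYTES_PER_ELEMENT[data_type]


print(tensor_byte_size([2, 3], "float32"))            # 24
print(tensor_byte_size([1, 3, 224, 224], "float16"))  # 301056
print(tensor_byte_size([1, 3, 224, 224], "uint8"))    # 150528
```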

Error Handling¶

All operations can raise Python exceptions:

try:
    graph = builder.build({"output": invalid_operand})
except ValueError as e:
    print(f"Graph validation failed: {e}")

try:
    context.convert_to_onnx(graph, "/invalid/path.onnx")
except IOError as e:
    print(f"Failed to write file: {e}")

try:
    context.convert_to_coreml(graph, "model.mlmodel")
except RuntimeError as e:
    print(f"Conversion failed: {e}")

Common exceptions:

ValueError: Invalid graph structure or parameters
IOError: File I/O errors
RuntimeError: Conversion or execution failures
