Tony

Introduction to the Matter Protocol

A comprehensive introduction to the Matter smart home protocol covering its data model, commissioning flow, comparison to Xiaomi MIoT SPEC, and the new camera support in Matter 1.5.

Tech, Computer Networking · 9 min read

Matter is a smart home interoperability protocol developed by the CSA (Connectivity Standards Alliance). It is an IP-based application-layer protocol designed to solve the long-standing fragmentation problem in the smart home industry — devices from different brands and ecosystems cannot communicate with each other.

Matter was formerly known as Project CHIP (Connected Home over IP), jointly launched by industry giants including Apple, Google, Amazon, and Samsung in 2019. Matter 1.0 was officially released in October 2022.

| Feature | Description |
| --- | --- |
| Interoperability | A single device can be controlled simultaneously by multiple ecosystems such as Apple Home, Google Home, and Alexa |
| Local | Device communication occurs within the local area network, independent of the cloud, reducing latency and privacy risks |
| Security | Certificate-based device authentication with encryption for all communications |
| Easy Setup | Commissioning via QR code scanning or NFC |
| Multi-Admin | The same device can be managed by multiple control ecosystems simultaneously |

Matter runs on top of IP networks and supports the following underlying transports:

  • Wi-Fi: Suitable for high-bandwidth devices (cameras, displays)
  • Thread: Low-power mesh network, suitable for battery-powered devices such as sensors and door locks
  • Ethernet: Wired connection, suitable for bridges, hubs, etc.
  • BLE (Bluetooth Low Energy): Used only during device commissioning
┌─────────────────────────────────────────┐
│            Application Layer            │
│                (Matter)                 │
├─────────────────────────────────────────┤
│           Transport (TCP/UDP)           │
├─────────────────────────────────────────┤
│             Network (IPv6)              │
├──────────┬──────────┬───────────────────┤
│  Wi-Fi   │  Thread  │     Ethernet      │
│ (802.11) │(802.15.4)│                   │
└──────────┴──────────┴───────────────────┘
     BLE Used Only for Commissioning

Matter uses a hierarchical data model to describe device capabilities — this is key to understanding the entire protocol:

Node → Endpoint → Cluster → Attribute / Command / Event

| Concept | Description |
| --- | --- |
| Node | An addressable entity in the network, corresponding to a physical device, with a unique Node ID |
| Endpoint | A functional unit within a node. Endpoint 0 is fixed as the Root Node, Endpoints 1+ are application functions |
| Cluster | The smallest modular unit of functionality, containing attributes, commands, and events. Divided into Server (provides data) and Client (consumes data) |
| Attribute | State data within a Cluster, supporting Read / Write / Subscribe |
| Command | Invocable actions within a Cluster (e.g., On, Off, Toggle) |
| Event | Device-initiated status change notifications |

Device Type defines the minimum set of Clusters an Endpoint must implement. For example, the “Dimmable Light” device type requires implementing the OnOff Cluster and the LevelControl Cluster.
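The "minimum set of Clusters" rule can be sketched as a simple set difference. This is a minimal illustration, not the specification's conformance logic: the `REQUIRED_SERVER_CLUSTERS` table and the `missing_clusters` helper are hypothetical names, and the table lists only the two clusters named above.

```python
# Hypothetical lookup table: device type name -> clusters an endpoint must expose.
# Only the two clusters named in the text are listed here (an assumption for
# illustration; this is not a complete requirement list).
REQUIRED_SERVER_CLUSTERS = {
    "Dimmable Light": {"OnOff", "LevelControl"},
}


def missing_clusters(device_type: str, implemented: set[str]) -> set[str]:
    """Return the required clusters that the endpoint does not implement."""
    return REQUIRED_SERVER_CLUSTERS[device_type] - implemented


# An endpoint that claims to be a Dimmable Light but lacks LevelControl.
print(missing_clusters("Dimmable Light", {"OnOff", "Descriptor"}))  # {'LevelControl'}
```

An empty result means the endpoint satisfies the minimum set; extra clusters beyond the minimum are allowed.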

Physical Device (Node ID: 0x0001)
├── Endpoint 0 (Root Node)
│   ├── Basic Information Cluster
│   │   └── Attributes: VendorName, ProductName, SoftwareVersion...
│   ├── Network Commissioning Cluster
│   └── Descriptor Cluster
└── Endpoint 1 (Dimmable Light)
    ├── OnOff Cluster (server)
    │   ├── Attributes: OnOff (bool)
    │   └── Commands: On, Off, Toggle
    ├── Level Control Cluster (server)
    │   ├── Attributes: CurrentLevel (uint8)
    │   └── Commands: MoveToLevel, Step
    └── Descriptor Cluster
        └── Attributes: DeviceTypeList, ServerList, ClientList
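The hierarchy in the tree above can be modeled with a few nested containers. The sketch below is only an in-memory illustration of the Node → Endpoint → Cluster → Attribute addressing idea; the class names, the `read` helper, and the sample values (such as `"ExampleVendor"` and a level of 128) are assumptions, not part of any Matter SDK.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    attributes: dict[str, object] = field(default_factory=dict)
    commands: tuple[str, ...] = ()


@dataclass
class Endpoint:
    endpoint_id: int
    device_type: str
    clusters: dict[str, Cluster] = field(default_factory=dict)


@dataclass
class Node:
    node_id: int
    endpoints: dict[int, Endpoint] = field(default_factory=dict)

    def read(self, endpoint_id: int, cluster: str, attribute: str) -> object:
        """Resolve an attribute by its (endpoint, cluster, attribute) path."""
        return self.endpoints[endpoint_id].clusters[cluster].attributes[attribute]


# Hypothetical sample data mirroring the dimmable light tree.
light = Node(
    node_id=0x0001,
    endpoints={
        0: Endpoint(0, "Root Node", {
            "BasicInformation": Cluster("BasicInformation", {"VendorName": "ExampleVendor"}),
        }),
        1: Endpoint(1, "Dimmable Light", {
            "OnOff": Cluster("OnOff", {"OnOff": False}, ("On", "Off", "Toggle")),
            "LevelControl": Cluster("LevelControl", {"CurrentLevel": 128}, ("MoveToLevel", "Step")),
        }),
    },
)

print(light.read(1, "LevelControl", "CurrentLevel"))  # 128
```

The point of the path-style lookup is that a controller never addresses "the light" as a whole; it always names an endpoint, then a cluster, then an attribute or command.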

Xiaomi’s MIoT SPEC is the device description protocol for the Mi Home ecosystem, defined at iot.mi.com. Both share similar design philosophies but differ greatly in positioning and openness.

| Dimension | Matter | MIoT SPEC V3 |
| --- | --- | --- |
| Hierarchy | Node → Endpoint → Cluster → Attribute/Command/Event | Device → Module → Service → Property/Action/Event |
| Device Identification | Device Type ID (numeric) | URN: urn:miot-spec-v2:device:<type>:<vendor>:<version> |
| Grouping Layer | Endpoint (functional partitioning for multi-function devices) | Module (hardware module partitioning) |
| Functional Unit | Cluster (e.g., OnOff, LevelControl) | Service (e.g., light, fan) |
| State Value | Attribute (supports Read/Write/Subscribe) | Property (supports Read/Write/Notify) |
| Operation | Command | Action |
| Notification | Event (log-style with timestamp) | Event (report-style) |

Matter                MIoT SPEC V3
─────────────────────────────────────────────
Node           ←→     Device
Endpoint       ←→     Module
Cluster        ←→     Service
Attribute      ←→     Property
Command        ←→     Action
Event          ←→     Event
Device Type    ←→     Device URN Type

| Dimension | Matter | MIoT SPEC |
| --- | --- | --- |
| Standards Body | CSA (Apple, Google, Amazon, et al.) | Xiaomi |
| Openness | Open standard, any manufacturer can implement | Closed ecosystem, requires Xiaomi platform access |
| Cross-Ecosystem | Natively supports multi-ecosystem control | Mi Home / HomeKit only (partial devices) |
| Communication | Local IP networking (Wi-Fi/Thread) | Bluetooth mesh / Wi-Fi / ZigBee, requires gateway |
| Cloud Dependency | Local-first, cloud-independent | Many features depend on Xiaomi Cloud |
| Certification | Unified CSA certification | Xiaomi proprietary certification |
| Device Scale | ~3000+ certified products as of 2025 | Thousands of Mi Home ecosystem products |

A fan light is a typical multi-function device with both fan and light capabilities, making it an excellent example to demonstrate the role of the grouping layer (Endpoint / Module).

Matter (Fan Light):

A physical device exposes both light and fan capabilities through two Endpoints:

Fan Light Node
├── Endpoint 0 (Root Node)
│   └── Basic Information, Descriptor...
├── Endpoint 1 (Dimmable Light)
│   ├── Device Type: Dimmable Light (0x0101)
│   ├── OnOff Cluster
│   │   ├── Attribute: OnOff (bool)
│   │   └── Commands: On, Off, Toggle
│   └── LevelControl Cluster
│       ├── Attribute: CurrentLevel (0-254)
│       └── Commands: MoveToLevel
└── Endpoint 2 (Fan)
    ├── Device Type: Fan (0x002B)
    ├── FanControl Cluster
    │   ├── Attribute: FanMode (Off/Low/Medium/High/Auto)
    │   ├── Attribute: PercentCurrent (0-100)
    │   └── Commands: Step
    └── OnOff Cluster
        └── Commands: On, Off, Toggle

MIoT SPEC V3 (Fan Light):

V3 uses two Modules to describe the light module and fan module respectively:

{
  "type": "urn:miot-spec-v2:device:fan-light:0000A012:yeelink-fancl1:1",
  "modules": [
    {
      "iid": 1,
      "type": "urn:miot-spec-v2:module:light-module",
      "services": [
        {
          "type": "urn:miot-spec-v2:service:light",
          "properties": [
            {"type": "urn:miot-spec-v2:property:on", "format": "bool", "access": ["read","write","notify"]},
            {"type": "urn:miot-spec-v2:property:brightness", "format": "uint8", "value-range": [1,100,1]}
          ]
        }
      ]
    },
    {
      "iid": 2,
      "type": "urn:miot-spec-v2:module:fan-module",
      "services": [
        {
          "type": "urn:miot-spec-v2:service:fan",
          "properties": [
            {"type": "urn:miot-spec-v2:property:on", "format": "bool", "access": ["read","write","notify"]},
            {"type": "urn:miot-spec-v2:property:fan-level", "format": "uint8", "value-range": [1,5,1]},
            {"type": "urn:miot-spec-v2:property:mode", "format": "uint8", "value-list": [
              {"value": 0, "description": "Normal"},
              {"value": 1, "description": "Natural Wind"}
            ]}
          ]
        }
      ]
    }
  ]
}

From this example, it is clear that Matter’s Endpoint and MIoT SPEC V3’s Module solve the same problem — partitioning the different capabilities of a multi-function physical device into independent namespaces, allowing the controller to address and operate them separately.
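To make the "independent namespaces" point concrete, the sketch below walks a trimmed copy of the fan light description and lists each property by its (module, service, property) path. The helper names `short_name` and `property_paths` are hypothetical, and the embedded JSON keeps only the fields needed for the walk.

```python
import json

# Trimmed copy of the fan light description: only module, service,
# and property types are kept.
SPEC = json.loads("""
{
  "modules": [
    {"iid": 1, "type": "urn:miot-spec-v2:module:light-module", "services": [
      {"type": "urn:miot-spec-v2:service:light", "properties": [
        {"type": "urn:miot-spec-v2:property:on"},
        {"type": "urn:miot-spec-v2:property:brightness"}
      ]}
    ]},
    {"iid": 2, "type": "urn:miot-spec-v2:module:fan-module", "services": [
      {"type": "urn:miot-spec-v2:service:fan", "properties": [
        {"type": "urn:miot-spec-v2:property:on"},
        {"type": "urn:miot-spec-v2:property:fan-level"},
        {"type": "urn:miot-spec-v2:property:mode"}
      ]}
    ]}
  ]
}
""")


def short_name(urn: str) -> str:
    """Take the last colon-separated field of a URN, e.g. 'on'."""
    return urn.rsplit(":", 1)[-1]


def property_paths(spec: dict) -> list[tuple[int, str, str]]:
    """List every property as (module iid, service name, property name)."""
    paths = []
    for module in spec["modules"]:
        for service in module["services"]:
            for prop in service["properties"]:
                paths.append((module["iid"], short_name(service["type"]), short_name(prop["type"])))
    return paths


for path in property_paths(SPEC):
    print(path)
```

Both modules expose a property named `on`, yet the paths `(1, 'light', 'on')` and `(2, 'fan', 'on')` stay distinct. The same holds for the two OnOff Clusters on Endpoint 1 and Endpoint 2 in the Matter version.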

The two designs are very similar: both follow a “functional grouping + attributes/commands” pattern. V3’s newly added Module layer serves the same purpose as Matter’s Endpoint. It adds a grouping abstraction between the device and its functional units to describe the internal structure of multi-function devices (such as an air conditioner’s “compressor module” and “display module”).

The main difference is that Matter is a cross-ecosystem open standard, while MIoT SPEC serves Xiaomi’s closed ecosystem.


| Version | Release Date | Key Additions |
| --- | --- | --- |
| 1.0 | 2022.10 | Basic device types: lights, switches, outlets, door locks, thermostats, window coverings, etc. |
| 1.1 | 2023.05 | ICD (Intermittently Connected Devices) support, scene management |
| 1.2 | 2023.10 | Home appliances: refrigerators, washing machines, robot vacuum cleaners, etc. |
| 1.3 | 2024.05 | Energy management, EV chargers, microwave ovens, ovens, etc. |
| 1.4 | 2024.11 | Enhanced device composition, extended ICD |
| 1.5 | 2025.11 | Cameras, closures, enhanced energy management |

Cameras have long been the most anticipated device type for Matter. Matter 1.5 (released November 2025) finally introduces full support for cameras. This is a complex device type — unlike simple state-controlled devices like light bulbs or switches, cameras involve real-time audio/video streaming, codec negotiation, privacy controls, and many other capabilities.

Matter 1.5 adds three core Clusters for cameras: Camera AV Stream Management, WebRTC Transport Provider, and Push AV Stream Transport.

The Camera AV Stream Management Cluster describes the camera’s audio/video capabilities and manages stream resources. It is the core of the camera device type.

Key Attributes:

| Attribute | Description |
| --- | --- |
| MaxConcurrentVideoEncoders | Maximum number of concurrent video encoders the device can run |
| MaxEncodedPixelRate | Maximum encoded pixel rate (constrains the upper limit of resolution × frame rate) |
| VideoSensorParams | Image sensor parameters (physical resolution, maximum frame rate, etc.) |
| NightVisionCapable | Whether night vision is supported |
| SupportedSnapshotParams | Supported snapshot parameters (resolution, encoding format) |
| HDRModeEnabled | Whether HDR mode is enabled |
| RateDistortionTradeOffPoints | Rate-distortion trade-off points to help the controller select appropriate encoding parameters |
| MicrophoneCapabilities | Microphone capabilities (sample rate, number of channels, encoding format) |
| SpeakerCapabilities | Speaker capabilities (for two-way talk) |
| TwoWayTalkSupport | Two-way talk support type (half-duplex / full-duplex) |
| AllocatedVideoStreams | List of currently allocated video streams |
| AllocatedAudioStreams | List of currently allocated audio streams |
| AllocatedSnapshotStreams | List of currently allocated snapshot streams |

Key Commands:

| Command | Description |
| --- | --- |
| VideoStreamAllocate | Allocate a video stream, specifying encoding format, resolution, frame rate, bitrate, etc. |
| VideoStreamDeallocate | Release a video stream |
| AudioStreamAllocate | Allocate an audio stream |
| AudioStreamDeallocate | Release an audio stream |
| SnapshotStreamAllocate | Allocate a snapshot stream (for capturing images) |
| SnapshotStreamDeallocate | Release a snapshot stream |
| CaptureSnapshot | Trigger a snapshot capture |
| SetStreamPriorities | Set priorities for multiple streams (low-priority streams are degraded when resources are insufficient) |

This design embodies the “capability description + resource management” philosophy: the device first informs the controller of its capability boundaries through Attributes, and the controller then applies for specific stream resources through Commands based on its needs.
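As a rough illustration of that philosophy, the sketch below checks a video stream request against two of the advertised limits before allocating. It is not the cluster's actual admission logic: the class and function names are hypothetical, and treating MaxEncodedPixelRate as a budget shared by all allocated streams is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class CameraCapabilities:
    max_concurrent_video_encoders: int
    max_encoded_pixel_rate: int  # pixels per second (assumed shared by all streams)


@dataclass
class VideoStreamRequest:
    width: int
    height: int
    frame_rate: int

    @property
    def pixel_rate(self) -> int:
        """Resolution multiplied by frame rate, in pixels per second."""
        return self.width * self.height * self.frame_rate


def can_allocate(caps: CameraCapabilities,
                 allocated: list[VideoStreamRequest],
                 request: VideoStreamRequest) -> bool:
    """Accept the request only if both capability limits still hold afterwards."""
    if len(allocated) >= caps.max_concurrent_video_encoders:
        return False
    used = sum(stream.pixel_rate for stream in allocated)
    return used + request.pixel_rate <= caps.max_encoded_pixel_rate


# Hypothetical camera: two encoders, 80 million encoded pixels per second.
caps = CameraCapabilities(max_concurrent_video_encoders=2, max_encoded_pixel_rate=80_000_000)
main_stream = VideoStreamRequest(1920, 1080, 30)  # 62,208,000 px/s
sub_stream = VideoStreamRequest(1280, 720, 15)    # 13,824,000 px/s

print(can_allocate(caps, [], main_stream))                                         # True
print(can_allocate(caps, [main_stream], sub_stream))                               # True
print(can_allocate(caps, [main_stream, sub_stream], VideoStreamRequest(640, 360, 15)))  # False
```

The third request fails on the encoder count even though its pixel rate would fit, which is why a controller would read both attributes before issuing VideoStreamAllocate.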

The WebRTC Transport Provider Cluster handles WebRTC session signaling exchange and serves as the channel establishment layer for real-time video transport.

Design Concept: The Matter protocol itself acts as the signaling channel, carrying WebRTC’s SDP Offer/Answer and ICE Candidate exchange, without the need for an additional signaling server. The actual audio/video data is transmitted peer-to-peer via WebRTC media channels, bypassing the Matter message layer.

Attributes:

| Attribute | Description |
| --- | --- |
| CurrentSessions | List of currently active WebRTC sessions |
| MaxSessions | Maximum number of concurrent sessions |

Commands:

| Command | Description |
| --- | --- |
| SolicitOffer | Controller requests the camera to initiate an SDP Offer (camera-led negotiation) |
| ProvideOffer | Controller sends an SDP Offer to the camera (controller-led negotiation) |
| ProvideAnswer | Respond with an SDP Answer to complete media negotiation |
| ProvideICECandidate | Exchange ICE candidate addresses for NAT traversal and connection establishment |
| EndSession | Terminate a WebRTC session |

WebRTC Session Establishment Flow:

┌───────────┐                                ┌──────────┐
│Controller │                                │  Camera  │
│(Phone/Hub)│                                │          │
└─────┬─────┘                                └─────┬────┘
      │                                            │
      │ 1. ProvideOffer (SDP Offer)                │
      │───────────────────────────────────────────►│
      │                                            │
      │ 2. ProvideAnswer (SDP Answer)              │
      │◄───────────────────────────────────────────│
      │                                            │
      │ 3. ProvideICECandidate (bidirectional)     │
      │◄──────────────────────────────────────────►│
      │                                            │
      │ 4. ICE Connectivity Check & DTLS Handshake │
      │◄══════════════════════════════════════════►│
      │                                            │
      │ 5. SRTP Audio/Video Stream (P2P Direct)    │
      │◄══════════════════════════════════════════►│
      │                                            │

Steps 1-3 are carried over the Matter message layer (signaling), while steps 4-5 are direct WebRTC media-layer communication.
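The ordering of the controller-led flow in the diagram can be sketched as a small state machine. This is an illustration only: the state names are invented, `MediaConnected` is a hypothetical marker standing in for the ICE/DTLS/SRTP steps (it is not a Matter command), and real sessions also involve session IDs, timeouts, and the SolicitOffer path.

```python
from enum import Enum, auto


class SessionState(Enum):
    IDLE = auto()
    OFFER_SENT = auto()
    ANSWERED = auto()
    CONNECTED = auto()
    ENDED = auto()


# Allowed (state, message) pairs for the controller-led flow.
TRANSITIONS = {
    (SessionState.IDLE, "ProvideOffer"): SessionState.OFFER_SENT,
    (SessionState.OFFER_SENT, "ProvideAnswer"): SessionState.ANSWERED,
    (SessionState.ANSWERED, "ProvideICECandidate"): SessionState.ANSWERED,
    # Hypothetical marker for the media-layer steps (ICE check, DTLS, SRTP).
    (SessionState.ANSWERED, "MediaConnected"): SessionState.CONNECTED,
    (SessionState.CONNECTED, "EndSession"): SessionState.ENDED,
}


def advance(state: SessionState, message: str) -> SessionState:
    """Return the next state, or raise if the message is out of order."""
    try:
        return TRANSITIONS[(state, message)]
    except KeyError:
        raise ValueError(f"{message} is not valid in state {state.name}") from None


def run(messages: list[str]) -> SessionState:
    """Feed a message sequence through the state machine from IDLE."""
    state = SessionState.IDLE
    for message in messages:
        state = advance(state, message)
    return state


flow = ["ProvideOffer", "ProvideAnswer", "ProvideICECandidate",
        "ProvideICECandidate", "MediaConnected"]
print(run(flow).name)  # CONNECTED
```

Only the first three message types would travel over Matter; once the session reaches `CONNECTED`, media flows outside the Matter message layer.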

The Push AV Stream Transport Cluster lets the camera actively push video streams to specified receivers. It suits the following scenarios:

  • Continuous Recording: Pushing video streams to an NVR (Network Video Recorder) or cloud storage
  • Event Triggering: Actively pushing video to a Hub or phone upon detecting motion/sound
  • Multiple Receivers: Pushing the same camera’s video stream to multiple destinations simultaneously

Commands:

| Command | Description |
| --- | --- |
| AllocatePushTransport | Register a push target, specifying transport protocol and destination address |
| DeallocatePushTransport | Remove a push target |
| FindTransport | Find a registered push transport |
| SetTransportStatus | Enable / pause push |

The difference from WebRTC Transport Provider: WebRTC is a “pull” model (the controller requests a view on demand), while Push AV Stream is a “push” model (the camera sends proactively). They are complementary, covering typical camera usage scenarios.
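The push model amounts to the camera keeping a table of registered destinations and sending to whichever ones are currently enabled. The sketch below mirrors the four commands with an in-memory registry; the class names, the integer connection IDs, and the example URLs are all hypothetical and do not reflect the cluster's actual command fields.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class PushTransport:
    connection_id: int
    url: str
    active: bool = True


class PushTransportRegistry:
    """Illustrative stand-in for the camera's table of push targets."""

    def __init__(self) -> None:
        self._next_id = count(1)
        self._transports: dict[int, PushTransport] = {}

    def allocate(self, url: str) -> int:
        """Register a push target and return its connection ID."""
        connection_id = next(self._next_id)
        self._transports[connection_id] = PushTransport(connection_id, url)
        return connection_id

    def deallocate(self, connection_id: int) -> None:
        """Remove a push target."""
        del self._transports[connection_id]

    def find(self, connection_id: int) -> Optional[PushTransport]:
        """Look up a registered push target, or None if it is unknown."""
        return self._transports.get(connection_id)

    def set_status(self, connection_id: int, active: bool) -> None:
        """Enable or pause pushing to one target."""
        self._transports[connection_id].active = active

    def active_targets(self) -> list[str]:
        """Destinations the camera would currently push to."""
        return [t.url for t in self._transports.values() if t.active]


registry = PushTransportRegistry()
nvr = registry.allocate("https://nvr.example.invalid/ingest")      # hypothetical URL
cloud = registry.allocate("https://cloud.example.invalid/ingest")  # hypothetical URL
registry.set_status(cloud, active=False)
print(registry.active_targets())  # ['https://nvr.example.invalid/ingest']
```

Pausing a target keeps its registration, so a later `set_status(cloud, active=True)` resumes pushing without allocating again.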

Matter chose WebRTC over RTSP, proprietary protocols, and other solutions for video transport, for the following considerations:

| Dimension | WebRTC | RTSP | Proprietary Protocol |
| --- | --- | --- | --- |
| Latency | Very low (<500ms, typically <200ms) | Moderate (1-3s) | Implementation-dependent |
| NAT Traversal | Native support (ICE/STUN/TURN) | Requires additional handling | Requires additional handling |
| Encryption | Mandatory (DTLS + SRTP) | Optional (RTSPS) | Uncertain |
| Ecosystem Maturity | Native browser/mobile support | Requires dedicated player | Requires dedicated SDK |
| Bidirectional Communication | Native support | Primarily unidirectional | Implementation-dependent |
| Codec Flexibility | H.264/H.265/VP8/VP9/AV1 | Mostly H.264/H.265 | Implementation-dependent |

Core advantages summary:

  1. Zero additional infrastructure: Signaling reuses the Matter channel, media is P2P direct — no need to deploy STUN/TURN servers (in LAN scenarios)
  2. End-to-end security: Mandatory encryption aligns with Matter’s security design philosophy
  3. Broad compatibility: Controllers (phones, smart speakers) can connect without additional SDKs

Camera Node
├── Endpoint 0 (Root Node)
│   ├── Basic Information Cluster
│   ├── Network Commissioning Cluster
│   └── Descriptor Cluster
└── Endpoint 1 (Camera)
    ├── Camera AV Stream Management Cluster
    │   ├── Attributes:
    │   │   ├── MaxConcurrentVideoEncoders (uint8)
    │   │   ├── MaxEncodedPixelRate (uint32)
    │   │   ├── VideoSensorParams (struct)
    │   │   ├── NightVisionCapable (bool)
    │   │   ├── HDRModeEnabled (bool)
    │   │   ├── SupportedSnapshotParams (list)
    │   │   ├── MicrophoneCapabilities (struct)
    │   │   ├── SpeakerCapabilities (struct)
    │   │   ├── TwoWayTalkSupport (enum: None/HalfDuplex/FullDuplex)
    │   │   ├── AllocatedVideoStreams (list)
    │   │   ├── AllocatedAudioStreams (list)
    │   │   └── AllocatedSnapshotStreams (list)
    │   └── Commands:
    │       ├── VideoStreamAllocate / Deallocate
    │       ├── AudioStreamAllocate / Deallocate
    │       ├── SnapshotStreamAllocate / Deallocate
    │       ├── CaptureSnapshot
    │       └── SetStreamPriorities
    ├── WebRTC Transport Provider Cluster
    │   ├── Attributes:
    │   │   ├── CurrentSessions (list)
    │   │   └── MaxSessions (uint8)
    │   └── Commands:
    │       ├── SolicitOffer
    │       ├── ProvideOffer / ProvideAnswer
    │       ├── ProvideICECandidate
    │       └── EndSession
    └── Push AV Stream Transport Cluster
        └── Commands:
            ├── AllocatePushTransport
            ├── DeallocatePushTransport
            ├── FindTransport
            └── SetTransportStatus

Beyond cameras, Matter 1.5 also introduces:

  • Closures: Garage doors, blinds, and other devices requiring precise position control and safety status feedback
  • Enhanced Energy Management: More granular power monitoring, smart scheduling, and demand response capabilities

The process of a device joining a Matter network is called Commissioning, with the typical flow as follows:

┌──────────┐         BLE/SoftAP          ┌──────────────┐
│          │ ◄─────────────────────────► │              │
│   New    │     1. Discovery & PASE     │ Commissioner │
│  Device  │        Secure Channel       │ (Phone App)  │
└────┬─────┘                             └──────┬───────┘
     │                                          │
     │ 2. Device Attestation (DAC Cert Chain)   │
     │◄─────────────────────────────────────────│
     │                                          │
     │ 3. Network Config (Wi-Fi/Thread Creds)   │
     │◄─────────────────────────────────────────│
     │                                          │
     │ 4. Establish CASE Secure Session         │
     │◄─────────────────────────────────────────│
     │                                          │
     │ 5. Complete, Device Joins Fabric         │
     │◄─────────────────────────────────────────│

Key security mechanisms:

  • PASE (Passcode-Authenticated Session Establishment): Establishes an initial secure channel based on the device’s setup code
  • CASE (Certificate-Authenticated Session Establishment): Establishes a runtime secure channel based on certificates
  • DAC (Device Attestation Certificate): Device attestation certificate ensuring the device comes from a legitimate manufacturer
  • Fabric: A trust domain within a Matter network; devices within the same Fabric trust each other
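The five numbered stages of the commissioning diagram run strictly in order: PASE protects the early exchange, and CASE only becomes possible once the device has been attested and is on the network. A minimal sketch of that ordering follows; the step identifiers and the `next_step` helper are hypothetical labels for the stages, not names from the specification.

```python
from typing import Optional

# Hypothetical labels for the five stages shown in the commissioning diagram.
COMMISSIONING_STEPS = (
    "discovery_and_pase",
    "device_attestation",
    "network_config",
    "case_session",
    "joined_fabric",
)


def next_step(completed: list[str]) -> Optional[str]:
    """Return the next stage to run, or None when commissioning is complete.

    Raises ValueError if the completed stages are not a prefix of the
    expected order (for example, CASE attempted before attestation).
    """
    if tuple(completed) != COMMISSIONING_STEPS[:len(completed)]:
        raise ValueError("commissioning stages completed out of order")
    if len(completed) == len(COMMISSIONING_STEPS):
        return None
    return COMMISSIONING_STEPS[len(completed)]


print(next_step([]))                                            # discovery_and_pase
print(next_step(["discovery_and_pase", "device_attestation"]))  # network_config
```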

The Matter protocol aims to establish a true “universal language” for smart homes. Its data model (Cluster/Attribute/Command) shares the same design philosophy as MIoT SPEC (Service/Property/Action), but Matter’s openness and multi-ecosystem support give it the potential to become an industry standard.

The camera support introduced in Matter 1.5 adopts the WebRTC technology stack, balancing low latency, security, and interoperability — an important milestone in protocol maturity. As more device types are added and the ecosystem matures, Matter has the potential to truly realize the vision of “buy any brand’s device, control with any platform.”

