Introduction to the Matter Protocol
A comprehensive introduction to the Matter smart home protocol covering its data model, commissioning flow, comparison to Xiaomi MIoT SPEC, and the new camera support in Matter 1.5.
Matter is a smart home interoperability protocol developed by the CSA (Connectivity Standards Alliance). It is an IP-based application-layer protocol designed to solve the long-standing fragmentation problem in the smart home industry, where devices from different brands and ecosystems cannot communicate with each other.
Matter was formerly known as Project CHIP (Connected Home over IP), announced in December 2019 by Amazon, Apple, Google, and the Zigbee Alliance (renamed the CSA in 2021), with Samsung SmartThings among the participating member companies. Matter 1.0 was officially released in October 2022.
| Feature | Description |
|---|---|
| Interoperability | Devices from different manufacturers work with any Matter-compatible ecosystem, such as Apple Home, Google Home, and Alexa |
| Local | Device communication occurs within the local area network, independent of the cloud, reducing latency and privacy risks |
| Security | Certificate-based device authentication with encryption for all communications |
| Easy Setup | Commissioning via QR code scanning or NFC |
| Multi-Admin | The same device can be managed by multiple control ecosystems simultaneously |
Matter runs on top of IP networks and supports the following underlying transports:
- Wi-Fi: Suitable for high-bandwidth devices (cameras, displays)
- Thread: Low-power mesh network, suitable for battery-powered devices such as sensors and door locks
- Ethernet: Wired connection, suitable for bridges, hubs, etc.
- BLE (Bluetooth Low Energy): Used only during device commissioning

```text
┌─────────────────────────────────────────┐
│            Application Layer            │
│                (Matter)                 │
├─────────────────────────────────────────┤
│           Transport (TCP/UDP)           │
├─────────────────────────────────────────┤
│             Network (IPv6)              │
├──────────┬──────────┬───────────────────┤
│  Wi-Fi   │  Thread  │     Ethernet      │
│ (802.11) │(802.15.4)│                   │
└──────────┴──────────┴───────────────────┘
      BLE Used Only for Commissioning
```

Matter uses a hierarchical data model to describe device capabilities; this is key to understanding the entire protocol:

```text
Node → Endpoint → Cluster → Attribute / Command / Event
```

| Concept | Description |
|---|---|
| Node | An addressable entity in the network, usually corresponding to a physical device, with a Node ID that is unique within its Fabric |
| Endpoint | A functional unit within a node. Endpoint 0 is fixed as the Root Node, Endpoints 1+ are application functions |
| Cluster | The smallest modular unit of functionality, containing attributes, commands, and events. Divided into Server (provides data) and Client (consumes data) |
| Attribute | State data within a Cluster, supporting Read / Write / Subscribe |
| Command | Invocable actions within a Cluster (e.g., On, Off, Toggle) |
| Event | Device-initiated status change notifications |
A Device Type defines the minimum set of Clusters an Endpoint must implement. For example, the “Dimmable Light” device type requires the OnOff Cluster and the LevelControl Cluster, among others.
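The hierarchy and the device-type rule can be sketched in a few lines of Python. This is a minimal illustration, not the Matter SDK: the class names, the `REQUIRED_CLUSTERS` table (which lists only the two clusters named above), and the `missing_clusters` helper are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    attributes: dict = field(default_factory=dict)  # attribute name -> value
    commands: dict = field(default_factory=dict)    # command name -> function(cluster)

    def invoke(self, command: str) -> None:
        """Run a command; commands typically change attribute values."""
        self.commands[command](self)


@dataclass
class Endpoint:
    endpoint_id: int
    device_type: str
    clusters: dict = field(default_factory=dict)    # cluster name -> Cluster


@dataclass
class Node:
    node_id: int
    endpoints: dict = field(default_factory=dict)   # endpoint id -> Endpoint

    def read(self, endpoint_id: int, cluster: str, attribute: str):
        """Address one attribute by its (endpoint, cluster, attribute) path."""
        return self.endpoints[endpoint_id].clusters[cluster].attributes[attribute]


# Illustrative subset: only the two clusters named in the text.
REQUIRED_CLUSTERS = {"Dimmable Light": {"OnOff", "LevelControl"}}


def missing_clusters(endpoint: Endpoint) -> set:
    """Return required clusters that the endpoint does not implement."""
    required = REQUIRED_CLUSTERS.get(endpoint.device_type, set())
    return required - set(endpoint.clusters)


def toggle(cluster: Cluster) -> None:
    cluster.attributes["OnOff"] = not cluster.attributes["OnOff"]


on_off = Cluster("OnOff", {"OnOff": False}, {"Toggle": toggle})
level = Cluster("LevelControl", {"CurrentLevel": 128})
light = Endpoint(1, "Dimmable Light", {"OnOff": on_off, "LevelControl": level})
node = Node(0x0001, {1: light})

node.endpoints[1].clusters["OnOff"].invoke("Toggle")
print(node.read(1, "OnOff", "OnOff"))   # True
print(missing_clusters(light))          # set()
```

Reading an attribute always goes through the full path (endpoint, cluster, attribute), which is why two endpoints can each carry their own OnOff Cluster without conflict.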

```text
Physical Device (Node ID: 0x0001)
├── Endpoint 0 (Root Node)
│   ├── Basic Information Cluster
│   │   └── Attributes: VendorName, ProductName, SoftwareVersion...
│   ├── Network Commissioning Cluster
│   └── Descriptor Cluster
│
└── Endpoint 1 (Dimmable Light)
    ├── OnOff Cluster (server)
    │   ├── Attributes: OnOff (bool)
    │   └── Commands: On, Off, Toggle
    ├── Level Control Cluster (server)
    │   ├── Attributes: CurrentLevel (uint8)
    │   └── Commands: MoveToLevel, Step
    └── Descriptor Cluster
        └── Attributes: DeviceTypeList, ServerList, ClientList
```

Xiaomi’s MIoT SPEC is the device description protocol for the Mi Home ecosystem, defined at iot.mi.com. Matter and MIoT SPEC share a similar design philosophy but differ greatly in positioning and openness.
| Dimension | Matter | MIoT SPEC V3 |
|---|---|---|
| Hierarchy | Node → Endpoint → Cluster → Attribute/Command/Event | Device → Module → Service → Property/Action/Event |
| Device Identification | Device Type ID (numeric) | URN: urn:miot-spec-v2:device:<type>:<vendor>:<version> |
| Grouping Layer | Endpoint (functional partitioning for multi-function devices) | Module (hardware module partitioning) |
| Functional Unit | Cluster (e.g., OnOff, LevelControl) | Service (e.g., light, fan) |
| State Value | Attribute (supports Read/Write/Subscribe) | Property (supports Read/Write/Notify) |
| Operation | Command | Action |
| Notification | Event (log-style with timestamp) | Event (report-style) |

```text
Matter              MIoT SPEC V3
─────────────────────────────────────
Node         ←→     Device
Endpoint     ←→     Module
Cluster      ←→     Service
Attribute    ←→     Property
Command      ←→     Action
Event        ←→     Event
Device Type  ←→     Device URN Type
```

| Dimension | Matter | MIoT SPEC |
|---|---|---|
| Standards Body | CSA (Apple, Google, Amazon, et al.) | Xiaomi |
| Openness | Open standard, any manufacturer can implement | Closed ecosystem, requires Xiaomi platform access |
| Cross-Ecosystem | Natively supports multi-ecosystem control | Mi Home only, plus HomeKit on some devices |
| Communication | Local IP networking (Wi-Fi/Thread/Ethernet) | Bluetooth mesh, Wi-Fi, or Zigbee; Bluetooth mesh and Zigbee devices require a gateway |
| Cloud Dependency | Local-first, cloud-independent | Many features depend on Xiaomi Cloud |
| Certification | Unified CSA certification | Xiaomi proprietary certification |
| Device Scale | More than 3,000 certified products (as of 2025) | Thousands of Mi Home ecosystem products |
A fan light is a typical multi-function device with both fan and light capabilities, making it an excellent example to demonstrate the role of the grouping layer (Endpoint / Module).
Matter (Fan Light):
A physical device exposes both light and fan capabilities through two Endpoints:

```text
Fan Light Node
├── Endpoint 0 (Root Node)
│   └── Basic Information, Descriptor...
│
├── Endpoint 1 (Dimmable Light)
│   ├── Device Type: Dimmable Light (0x0101)
│   ├── OnOff Cluster
│   │   ├── Attribute: OnOff (bool)
│   │   └── Commands: On, Off, Toggle
│   └── LevelControl Cluster
│       ├── Attribute: CurrentLevel (0-254)
│       └── Commands: MoveToLevel
│
└── Endpoint 2 (Fan)
    ├── Device Type: Fan (0x002B)
    ├── FanControl Cluster
    │   ├── Attribute: FanMode (Off/Low/Medium/High/Auto)
    │   ├── Attribute: PercentCurrent (0-100)
    │   └── Commands: Step
    └── OnOff Cluster
        └── Commands: On, Off, Toggle
```

MIoT SPEC V3 (Fan Light):
V3 uses two Modules to describe the light module and fan module respectively:

```json
{
  "type": "urn:miot-spec-v2:device:fan-light:0000A012:yeelink-fancl1:1",
  "modules": [
    {
      "iid": 1,
      "type": "urn:miot-spec-v2:module:light-module",
      "services": [
        {
          "type": "urn:miot-spec-v2:service:light",
          "properties": [
            {"type": "urn:miot-spec-v2:property:on", "format": "bool", "access": ["read", "write", "notify"]},
            {"type": "urn:miot-spec-v2:property:brightness", "format": "uint8", "value-range": [1, 100, 1]}
          ]
        }
      ]
    },
    {
      "iid": 2,
      "type": "urn:miot-spec-v2:module:fan-module",
      "services": [
        {
          "type": "urn:miot-spec-v2:service:fan",
          "properties": [
            {"type": "urn:miot-spec-v2:property:on", "format": "bool", "access": ["read", "write", "notify"]},
            {"type": "urn:miot-spec-v2:property:fan-level", "format": "uint8", "value-range": [1, 5, 1]},
            {"type": "urn:miot-spec-v2:property:mode", "format": "uint8", "value-list": [
              {"value": 0, "description": "Normal"},
              {"value": 1, "description": "Natural Wind"}
            ]}
          ]
        }
      ]
    }
  ]
}
```

From this example, it is clear that Matter’s Endpoint and MIoT SPEC V3’s Module solve the same problem: partitioning the different capabilities of a multi-function physical device into independent namespaces, allowing the controller to address and operate them separately.
The two design approaches are very similar: both follow a “functional grouping + attributes/commands” pattern. V3’s newly added Module layer serves the same purpose as Matter’s Endpoint. It adds a grouping abstraction between the device and its functional units to describe the internal structure of multi-function devices (such as an air conditioner’s “compressor module” and “display module”).
The main difference is that Matter is a cross-ecosystem open standard, while MIoT SPEC serves Xiaomi’s closed ecosystem.
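The namespace partitioning described above can be made concrete by flattening a V3-style description into addressable paths. The sketch below is only illustrative: the JSON is a trimmed copy of the fan light example, its shape is taken from this article and not from a verified MIoT schema, and `short_name` and `property_paths` are hypothetical helpers.

```python
import json

# Trimmed copy of the fan light description shown earlier (shape assumed).
SPEC_JSON = """
{
  "type": "urn:miot-spec-v2:device:fan-light:0000A012:yeelink-fancl1:1",
  "modules": [
    {"iid": 1, "type": "urn:miot-spec-v2:module:light-module",
     "services": [{"type": "urn:miot-spec-v2:service:light",
                   "properties": [{"type": "urn:miot-spec-v2:property:on"},
                                  {"type": "urn:miot-spec-v2:property:brightness"}]}]},
    {"iid": 2, "type": "urn:miot-spec-v2:module:fan-module",
     "services": [{"type": "urn:miot-spec-v2:service:fan",
                   "properties": [{"type": "urn:miot-spec-v2:property:on"},
                                  {"type": "urn:miot-spec-v2:property:fan-level"}]}]}
  ]
}
"""


def short_name(urn: str) -> str:
    """Return the last URN field, e.g. 'on' from 'urn:miot-spec-v2:property:on'."""
    return urn.split(":")[-1]


def property_paths(spec: dict) -> list:
    """List every property as a (module iid, service, property) path."""
    paths = []
    for module in spec["modules"]:
        for service in module["services"]:
            for prop in service["properties"]:
                paths.append((module["iid"],
                              short_name(service["type"]),
                              short_name(prop["type"])))
    return paths


paths = property_paths(json.loads(SPEC_JSON))
for path in paths:
    print(path)
# (1, 'light', 'on')
# (1, 'light', 'brightness')
# (2, 'fan', 'on')
# (2, 'fan', 'fan-level')
```

The two `on` properties end up with different paths, which is the same effect Matter achieves by placing an OnOff Cluster on Endpoint 1 and another on Endpoint 2.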
| Version | Release Date | Key Additions |
|---|---|---|
| 1.0 | 2022.10 | Basic device types: lights, switches, outlets, door locks, thermostats, window coverings, etc. |
| 1.1 | 2023.05 | Maintenance release: specification fixes and improved ICD (Intermittently Connected Devices) support |
| 1.2 | 2023.10 | Home appliances: refrigerators, washing machines, robot vacuum cleaners, etc. |
| 1.3 | 2024.05 | Energy management, EV chargers, microwave ovens, ovens, scene management, etc. |
| 1.4 | 2024.11 | Enhanced device composition, extended ICD |
| 1.5 | 2025.11 | Cameras, closures, enhanced energy management |
Cameras have long been one of the most anticipated device types for Matter, and Matter 1.5 (released November 2025) introduces support for them. This is a complex device type: unlike simple state-controlled devices such as light bulbs or switches, cameras involve real-time audio/video streaming, codec negotiation, privacy controls, and many other capabilities.
Matter 1.5 adds three core Clusters for cameras: Camera AV Stream Management, WebRTC Transport Provider, and Push AV Stream Transport.
The Camera AV Stream Management Cluster describes the camera’s audio/video capabilities and manages stream resources. It is the core of the camera device type.
Key Attributes:
| Attribute | Description |
|---|---|
| MaxConcurrentVideoEncoders | Maximum number of concurrent video encoders the device can run |
| MaxEncodedPixelRate | Maximum encoded pixel rate (constrains the upper limit of resolution × frame rate) |
| VideoSensorParams | Image sensor parameters (physical resolution, maximum frame rate, etc.) |
| NightVisionCapable | Whether night vision is supported |
| SupportedSnapshotParams | Supported snapshot parameters (resolution, encoding format) |
| HDRModeEnabled | Whether HDR mode is enabled |
| RateDistortionTradeOffPoints | Rate-distortion trade-off points to help the controller select appropriate encoding parameters |
| MicrophoneCapabilities | Microphone capabilities (sample rate, number of channels, encoding format) |
| SpeakerCapabilities | Speaker capabilities (for two-way talk) |
| TwoWayTalkSupport | Two-way talk support type (half-duplex / full-duplex) |
| AllocatedVideoStreams | List of currently allocated video streams |
| AllocatedAudioStreams | List of currently allocated audio streams |
| AllocatedSnapshotStreams | List of currently allocated snapshot streams |
Key Commands:
| Command | Description |
|---|---|
| VideoStreamAllocate | Allocate a video stream, specifying encoding format, resolution, frame rate, bitrate, etc. |
| VideoStreamDeallocate | Release a video stream |
| AudioStreamAllocate | Allocate an audio stream |
| AudioStreamDeallocate | Release an audio stream |
| SnapshotStreamAllocate | Allocate a snapshot stream (for capturing images) |
| SnapshotStreamDeallocate | Release a snapshot stream |
| CaptureSnapshot | Trigger a snapshot capture |
| SetStreamPriorities | Set priorities for multiple streams (low-priority streams are degraded when resources are insufficient) |
This design embodies the “capability description + resource management” philosophy: the device first informs the controller of its capability boundaries through Attributes, and the controller then applies for specific stream resources through Commands based on its needs.
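The “capability description + resource management” idea can be sketched as a toy camera model that rejects allocations exceeding its advertised limits. This is a simplified illustration under stated assumptions: `CameraModel` and its method names are hypothetical, and treating MaxEncodedPixelRate as a shared budget of width × height × frame rate across all streams is an assumption, not a statement of how the specification accounts for it.

```python
from dataclasses import dataclass, field


@dataclass
class CameraModel:
    max_concurrent_video_encoders: int
    max_encoded_pixel_rate: int                     # pixels per second, all streams
    allocated: dict = field(default_factory=dict)   # stream id -> pixel rate
    next_stream_id: int = 1

    def used_pixel_rate(self) -> int:
        return sum(self.allocated.values())

    def video_stream_allocate(self, width: int, height: int, fps: int) -> int:
        """Allocate a stream if both advertised limits still allow it."""
        rate = width * height * fps
        if len(self.allocated) >= self.max_concurrent_video_encoders:
            raise RuntimeError("no free video encoder")
        if self.used_pixel_rate() + rate > self.max_encoded_pixel_rate:
            raise RuntimeError("pixel-rate budget exceeded")
        stream_id = self.next_stream_id
        self.next_stream_id += 1
        self.allocated[stream_id] = rate
        return stream_id

    def video_stream_deallocate(self, stream_id: int) -> None:
        del self.allocated[stream_id]


camera = CameraModel(max_concurrent_video_encoders=2,
                     max_encoded_pixel_rate=1920 * 1080 * 30)
main = camera.video_stream_allocate(1920, 1080, 30)   # uses the whole budget
try:
    camera.video_stream_allocate(640, 360, 15)
except RuntimeError as err:
    print(err)                                         # pixel-rate budget exceeded
camera.video_stream_deallocate(main)
print(camera.video_stream_allocate(1280, 720, 30))     # 2
```

A controller following this pattern would read the limits first, pick a resolution and frame rate that fit, and only then issue the allocate command.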
The WebRTC Transport Provider Cluster handles WebRTC session signaling exchange and serves as the channel establishment layer for real-time video transport.
Design Concept: The Matter protocol itself acts as the signaling channel, carrying WebRTC’s SDP Offer/Answer and ICE Candidate exchange, without the need for an additional signaling server. The actual audio/video data is transmitted peer-to-peer via WebRTC media channels, bypassing the Matter message layer.
Attributes:
| Attribute | Description |
|---|---|
| CurrentSessions | List of currently active WebRTC sessions |
| MaxSessions | Maximum number of concurrent sessions |
Commands:
| Command | Description |
|---|---|
| SolicitOffer | Controller requests the camera to initiate an SDP Offer (camera-led negotiation) |
| ProvideOffer | Controller sends an SDP Offer to the camera (controller-led negotiation) |
| ProvideAnswer | Respond with an SDP Answer to complete media negotiation |
| ProvideICECandidate | Exchange ICE candidate addresses for NAT traversal and connection establishment |
| EndSession | Terminate a WebRTC session |
WebRTC Session Establishment Flow:

```text
┌───────────┐                                  ┌──────────┐
│Controller │                                  │  Camera  │
│(Phone/Hub)│                                  │          │
└─────┬─────┘                                  └─────┬────┘
      │                                              │
      │  1. ProvideOffer (SDP Offer)                 │
      │─────────────────────────────────────────────►│
      │                                              │
      │  2. ProvideAnswer (SDP Answer)               │
      │◄─────────────────────────────────────────────│
      │                                              │
      │  3. ProvideICECandidate (bidirectional)      │
      │◄────────────────────────────────────────────►│
      │                                              │
      │  4. ICE Connectivity Check & DTLS Handshake  │
      │◄════════════════════════════════════════════►│
      │                                              │
      │  5. SRTP Audio/Video Stream (P2P Direct)     │
      │◄════════════════════════════════════════════►│
      │                                              │
```

Steps 1-3 are transmitted via the Matter message layer (signaling), while steps 4-5 are direct WebRTC media layer communication.
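The ordering of the five steps, and the split between the Matter channel and the WebRTC media layer, can be sketched as a small state machine. This is a hypothetical model for illustration: `SignalingSession`, its states, and the placeholder SDP and candidate strings are assumptions, and no real WebRTC or Matter library is involved.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    OFFER_SENT = auto()
    ANSWERED = auto()
    CONNECTED = auto()


class SignalingSession:
    """Tracks one session and records which channel carried each step."""

    def __init__(self):
        self.state = State.IDLE
        self.candidates = []
        self.log = []              # (channel, step) pairs in order

    def _expect(self, state: State) -> None:
        if self.state is not state:
            raise RuntimeError(f"unexpected call in state {self.state.name}")

    def provide_offer(self, sdp_offer: str) -> None:         # step 1
        self._expect(State.IDLE)
        self.log.append(("matter", "ProvideOffer"))
        self.state = State.OFFER_SENT

    def provide_answer(self, sdp_answer: str) -> None:       # step 2
        self._expect(State.OFFER_SENT)
        self.log.append(("matter", "ProvideAnswer"))
        self.state = State.ANSWERED

    def provide_ice_candidate(self, candidate: str) -> None:  # step 3
        self._expect(State.ANSWERED)
        self.log.append(("matter", "ProvideICECandidate"))
        self.candidates.append(candidate)

    def media_connected(self) -> None:                        # steps 4-5
        self._expect(State.ANSWERED)
        if not self.candidates:
            raise RuntimeError("no ICE candidates exchanged")
        self.log.append(("webrtc", "ICE check, DTLS handshake, SRTP media"))
        self.state = State.CONNECTED


session = SignalingSession()
session.provide_offer("<sdp offer>")
session.provide_answer("<sdp answer>")
session.provide_ice_candidate("<ice candidate>")
session.media_connected()
print([channel for channel, _ in session.log])
# ['matter', 'matter', 'matter', 'webrtc']
```

Calling the methods out of order raises an error, which mirrors the fact that media negotiation has to finish before the peer-to-peer connection can carry audio or video.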
The Push AV Stream Transport Cluster lets the camera actively push video streams to specified receivers. It suits the following scenarios:
- Continuous Recording: Pushing video streams to an NVR (Network Video Recorder) or cloud storage
- Event Triggering: Actively pushing video to a Hub or phone upon detecting motion/sound
- Multiple Receivers: Pushing the same camera’s video stream to multiple destinations simultaneously
Commands:
| Command | Description |
|---|---|
| AllocatePushTransport | Register a push target, specifying transport protocol and destination address |
| DeallocatePushTransport | Remove a push target |
| FindTransport | Find a registered push transport |
| SetTransportStatus | Enable / pause push |
The difference from WebRTC Transport Provider: WebRTC is a “pull” model (the controller requests a view on demand), while Push AV Stream is a “push” model (the camera sends proactively). They are complementary, covering typical camera usage scenarios.
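The push model can be illustrated with a toy registry that mirrors the four commands above. Everything here is a sketch under assumptions: `PushTransportRegistry`, the `trigger` field, the `destinations_for` helper, and the example URLs are hypothetical and do not reflect the cluster's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class PushTransport:
    connection_id: int
    destination: str        # hypothetical URL of an NVR, hub, or cloud ingest
    trigger: str            # "continuous" or an event name such as "motion"
    active: bool = True


class PushTransportRegistry:
    """Toy stand-in for the camera side of the push cluster."""

    def __init__(self):
        self._transports = {}
        self._next_id = 1

    def allocate_push_transport(self, destination: str, trigger: str) -> int:
        connection_id = self._next_id
        self._next_id += 1
        self._transports[connection_id] = PushTransport(
            connection_id, destination, trigger)
        return connection_id

    def deallocate_push_transport(self, connection_id: int) -> None:
        del self._transports[connection_id]

    def find_transport(self, connection_id: int) -> PushTransport:
        return self._transports[connection_id]

    def set_transport_status(self, connection_id: int, active: bool) -> None:
        self._transports[connection_id].active = active

    def destinations_for(self, event: str) -> list:
        """Destinations the camera would push to when `event` occurs."""
        return [t.destination for t in self._transports.values()
                if t.active and t.trigger in ("continuous", event)]


registry = PushTransportRegistry()
nvr = registry.allocate_push_transport("https://nvr.example/ingest", "continuous")
hub = registry.allocate_push_transport("https://hub.example/clips", "motion")

print(registry.destinations_for("motion"))
# ['https://nvr.example/ingest', 'https://hub.example/clips']
registry.set_transport_status(nvr, False)    # pause continuous recording
print(registry.destinations_for("motion"))
# ['https://hub.example/clips']
```

The camera decides when to send, based on the registered targets; the controller only registers, pauses, or removes them.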
Matter chose WebRTC over RTSP, proprietary protocols, and other solutions for video transport, for the following reasons:
| Dimension | WebRTC | RTSP | Proprietary Protocol |
|---|---|---|---|
| Latency | Very low (<500ms, typically <200ms) | Moderate (1~3s) | Implementation-dependent |
| NAT Traversal | Native support (ICE/STUN/TURN) | Requires additional handling | Requires additional handling |
| Encryption | Mandatory (DTLS + SRTP) | Optional (RTSPS) | Uncertain |
| Ecosystem Maturity | Native browser/mobile support | Requires dedicated player | Requires dedicated SDK |
| Bidirectional Communication | Native support | Primarily unidirectional | Implementation-dependent |
| Codec Flexibility | H.264/H.265/VP8/VP9/AV1 | Mostly H.264/H.265 | Implementation-dependent |
Core advantages summary:
- Zero additional infrastructure: Signaling reuses the Matter channel, media is P2P direct — no need to deploy STUN/TURN servers (in LAN scenarios)
- End-to-end security: Mandatory encryption aligns with Matter’s security design philosophy
- Broad compatibility: Controllers (phones, smart speakers) can connect without additional SDKs

```text
Camera Node
├── Endpoint 0 (Root Node)
│   ├── Basic Information Cluster
│   ├── Network Commissioning Cluster
│   └── Descriptor Cluster
│
└── Endpoint 1 (Camera)
    ├── Camera AV Stream Management Cluster
    │   ├── Attributes:
    │   │   ├── MaxConcurrentVideoEncoders (uint8)
    │   │   ├── MaxEncodedPixelRate (uint32)
    │   │   ├── VideoSensorParams (struct)
    │   │   ├── NightVisionCapable (bool)
    │   │   ├── HDRModeEnabled (bool)
    │   │   ├── SupportedSnapshotParams (list)
    │   │   ├── MicrophoneCapabilities (struct)
    │   │   ├── SpeakerCapabilities (struct)
    │   │   ├── TwoWayTalkSupport (enum: None/HalfDuplex/FullDuplex)
    │   │   ├── AllocatedVideoStreams (list)
    │   │   ├── AllocatedAudioStreams (list)
    │   │   └── AllocatedSnapshotStreams (list)
    │   └── Commands:
    │       ├── VideoStreamAllocate / Deallocate
    │       ├── AudioStreamAllocate / Deallocate
    │       ├── SnapshotStreamAllocate / Deallocate
    │       ├── CaptureSnapshot
    │       └── SetStreamPriorities
    │
    ├── WebRTC Transport Provider Cluster
    │   ├── Attributes:
    │   │   ├── CurrentSessions (list)
    │   │   └── MaxSessions (uint8)
    │   └── Commands:
    │       ├── SolicitOffer
    │       ├── ProvideOffer / ProvideAnswer
    │       ├── ProvideICECandidate
    │       └── EndSession
    │
    └── Push AV Stream Transport Cluster
        └── Commands:
            ├── AllocatePushTransport
            ├── DeallocatePushTransport
            ├── FindTransport
            └── SetTransportStatus
```

Beyond cameras, Matter 1.5 also introduces:
- Closures: Garage doors, blinds, and other devices requiring precise position control and safety status feedback
- Enhanced Energy Management: More granular power monitoring, smart scheduling, and demand response capabilities
The process of a device joining a Matter network is called Commissioning, with the typical flow as follows:

```text
┌──────────┐         BLE/SoftAP          ┌──────────────┐
│          │ ◄─────────────────────────► │              │
│   New    │     1. Discovery & PASE     │ Commissioner │
│  Device  │        Secure Channel       │ (Phone App)  │
└──────────┘                             └──────────────┘
     │                                          │
     │  2. Device Attestation (DAC Cert Chain)  │
     │◄─────────────────────────────────────────│
     │                                          │
     │  3. Network Config (Wi-Fi/Thread Creds)  │
     │◄─────────────────────────────────────────│
     │                                          │
     │  4. Establish CASE Secure Session        │
     │◄─────────────────────────────────────────│
     │                                          │
     │  5. Complete, Device Joins Fabric        │
     │◄─────────────────────────────────────────│
```

Key security mechanisms:
- PASE (Passcode-Authenticated Session Establishment): Establishes an initial secure channel based on the device’s setup code
- CASE (Certificate-Authenticated Session Establishment): Establishes a runtime secure channel based on certificates
- DAC (Device Attestation Certificate): A per-device certificate, verified through its certificate chain, that shows the device comes from a legitimate manufacturer
- Fabric: A trust domain within a Matter network; devices within the same Fabric trust each other
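The way each commissioning step gates the next can be sketched as a staged session object. This is a toy model, not an implementation of the protocol: `CommissioningSession` and `Stage` are hypothetical names, the setup code is a placeholder, and the direct passcode comparison only stands in for PASE, which in reality proves knowledge of the code cryptographically without sending it.

```python
from enum import IntEnum


class Stage(IntEnum):
    DISCOVERED = 0
    PASE_ESTABLISHED = 1      # step 1
    ATTESTED = 2              # step 2
    NETWORK_CONFIGURED = 3    # step 3
    CASE_ESTABLISHED = 4      # step 4
    COMMISSIONED = 5          # step 5


class CommissioningSession:
    def __init__(self, setup_code: str):
        self._setup_code = setup_code
        self.stage = Stage.DISCOVERED
        self.fabric = None

    def _advance(self, expected: Stage, nxt: Stage) -> None:
        """Move to the next stage only from the stage directly before it."""
        if self.stage is not expected:
            raise RuntimeError(f"cannot reach {nxt.name} from {self.stage.name}")
        self.stage = nxt

    def establish_pase(self, entered_code: str) -> None:
        # Stand-in for PASE: a plain comparison replaces the real exchange.
        if entered_code != self._setup_code:
            raise PermissionError("wrong setup code")
        self._advance(Stage.DISCOVERED, Stage.PASE_ESTABLISHED)

    def attest(self, dac_chain_valid: bool) -> None:
        if not dac_chain_valid:
            raise PermissionError("device attestation failed")
        self._advance(Stage.PASE_ESTABLISHED, Stage.ATTESTED)

    def configure_network(self) -> None:
        self._advance(Stage.ATTESTED, Stage.NETWORK_CONFIGURED)

    def establish_case(self) -> None:
        self._advance(Stage.NETWORK_CONFIGURED, Stage.CASE_ESTABLISHED)

    def complete(self, fabric: str) -> None:
        self._advance(Stage.CASE_ESTABLISHED, Stage.COMMISSIONED)
        self.fabric = fabric


session = CommissioningSession(setup_code="20202021")   # placeholder code
session.establish_pase("20202021")
session.attest(dac_chain_valid=True)
session.configure_network()
session.establish_case()
session.complete(fabric="home-fabric")
print(session.stage.name, session.fabric)   # COMMISSIONED home-fabric
```

Skipping a step, for example configuring the network before attestation succeeds, raises an error in this model, reflecting that network credentials are only handed to a device after it has been attested.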
The Matter protocol aims to establish a true “universal language” for smart homes. Its data model (Cluster/Attribute/Command) shares the same design philosophy as MIoT SPEC (Service/Property/Action), but Matter’s openness and multi-ecosystem support give it the potential to become an industry standard.
The camera support introduced in Matter 1.5 adopts the WebRTC technology stack, balancing low latency, security, and interoperability — an important milestone in protocol maturity. As more device types are added and the ecosystem matures, Matter has the potential to truly realize the vision of “buy any brand’s device, control with any platform.”