Audio/Video Calls
WebRTC 1:1 audio/video calls with end-to-end encryption via Insertable Streams. SDK is transport-agnostic for TURN: any TURN provider returning standard iceServers JSON works (Cloudflare Realtime TURN, Twilio NTS, Metered, self-hosted coturn). The relay server's /api/v1/calls/ice-config is the single injection point — clients never hard-code TURN.
Initialize Call Module
The call module needs your Ed25519 identity keys (for signing outgoing signaling) and your aliasId (for envelope routing). Initialize after authentication:
import { loadIdentity, deriveIdentity } from '@daomessage_sdk/sdk'
const stored = await loadIdentity()
const identity = deriveIdentity(stored.mnemonic)
client.initCalls({
  signingPrivKey: identity.signingKey.privateKey,
  signingPubKey: identity.signingKey.publicKey,
  myAliasId: stored.aliasId,
  // alwaysRelay defaults to false since 1.0.11:
  //   false → iceTransportPolicy = 'all' (host/srflx/relay candidates all allowed).
  //           Direct P2P incurs no TURN bandwidth cost, but the two peers can see each other's public IP.
  //   true  → iceTransportPolicy = 'relay' (forces the TURN relay and hides IPs).
  //           Paid privacy mode; audio is billed at roughly 100 MB of bandwidth per hour.
  // alwaysRelay: true,
})
Make a Call
// Audio call
await client.calls.call('u87654321', { audio: true, video: false })
// Video call
await client.calls.call('u87654321', { audio: true, video: true })
Receive a Call
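// Since 1.0.12 the callback also receives isVideo, so the UI can pick the audio or video ringing screen
client.calls.onIncomingCall = (fromAlias, isVideo) => {
  showIncomingCallDialog(fromAlias, isVideo)
}
// User accepts
await client.calls.answer()
// User rejects
client.calls.reject()
isVideo is derived automatically from the m=video line of the offer SDP and does not depend on what the caller declares. If the peer's offer includes a video track, isVideo is true; for an audio-only offer it is false.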
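The m=video rule described above can be illustrated with a small sketch. The helper name offerHasVideo and the sample SDP strings are hypothetical and are not part of the SDK's public API; the SDK's actual detection logic may differ in detail.

```typescript
// Hypothetical helper, not an SDK export: returns true when the offer SDP
// contains at least one media line that starts with "m=video ".
function offerHasVideo(sdp: string): boolean {
  // SDP lines may be separated by CRLF or LF, so split on both.
  const lines = sdp.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].indexOf('m=video ') === 0) return true;
  }
  return false;
}

// Minimal illustrative SDP fragments (not complete, valid offers).
const audioOnlyOffer = 'v=0\r\nm=audio 9 UDP/TLS/RTP/SAVPF 111\r\n';
const videoOffer = audioOnlyOffer + 'm=video 9 UDP/TLS/RTP/SAVPF 96\r\n';

console.log(offerHasVideo(audioOnlyOffer), offerHasVideo(videoOffer));
```

Checking only the start of each line avoids a false positive when the text m=video appears inside an attribute value.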
Subscribe to State / Streams (Reactive API, Recommended)
Since 1.0.11, observable subscriptions are recommended because they avoid the timing race of assigning refs in React / Vue:
// Subscribe to the call state
const stateSub = client.calls.observeState().subscribe(state => {
  // 'idle' | 'calling' | 'ringing' | 'connecting' | 'connected'
  // | 'hangup' | 'rejected' | 'ended'
  updateCallUI(state)
})
// Subscribe to the local / remote streams
const localSub = client.calls.observeLocalStream().subscribe(stream => {
  if (stream && localVideoRef.current) localVideoRef.current.srcObject = stream
})
const remoteSub = client.calls.observeRemoteStream().subscribe(stream => {
  if (stream && remoteVideoRef.current) remoteVideoRef.current.srcObject = stream
})
// Unsubscribe
stateSub.unsubscribe()
localSub.unsubscribe()
remoteSub.unsubscribe()
Why observables instead of the onLocalStream callback: getUserMedia inside answer() resolves very quickly (<100 ms). If the UI assigns a callback such as mod.onLocalStream = (s) => ref.current.srcObject = s, the React video element may not be mounted yet when the stream arrives, so ref.current is null, the stream is silently dropped, and the local preview stays blank. The observable is a hot stream: a subscriber receives the current value whenever it subscribes, so a ref that is re-attached on a React re-render is synchronized immediately.
The callback-style API (onLocalStream / onRemoteStream) is kept for backward compatibility but is not recommended.
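To make the "subscribe late, still get the current value" behaviour concrete, here is a minimal sketch of a stream that replays its latest value to new subscribers. CurrentValueStream is a hypothetical stand-in written for illustration; it is not the SDK's observable implementation, and strings stand in for MediaStream objects.

```typescript
type Listener<T> = (value: T) => void;

// Hypothetical stand-in for the SDK observable: it stores the latest value
// and replays it to every new subscriber.
class CurrentValueStream<T> {
  private current: T;
  private listeners: Listener<T>[] = [];

  constructor(initial: T) {
    this.current = initial;
  }

  next(value: T): void {
    this.current = value;
    this.listeners.slice().forEach((listener) => listener(value));
  }

  subscribe(listener: Listener<T>): { unsubscribe: () => void } {
    this.listeners.push(listener);
    listener(this.current); // replay the latest value to a late subscriber
    return {
      unsubscribe: () => {
        this.listeners = this.listeners.filter((l) => l !== listener);
      },
    };
  }
}

const localStream$ = new CurrentValueStream<string | null>(null);
localStream$.next('stream-A'); // the stream arrives before the UI has mounted
const seen: (string | null)[] = [];
const sub = localStream$.subscribe((s) => { seen.push(s); }); // the UI mounts later
sub.unsubscribe();
localStream$.next('stream-B'); // not delivered after unsubscribe
```

With a plain callback the value emitted before the UI mounted would be lost; here the late subscriber still receives it.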
Hang Up
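client.calls.hangup()
Error Handling
client.calls.onError = (err) => {
  console.error('[Calls]', err.name, err.message)
}
Common errors:
- NotAllowedError: the user denied microphone / camera permission
- getUserMedia timeout after 6000ms: timeout guard added in 1.0.3+. In some Android Chrome scenarios gUM neither resolves nor rejects; the SDK automatically falls back to audio-only and throws if that also fails
- answer() called while already answering: the re-entrancy lock added in 1.0.10+ was triggered; the UI should disable the answer button to prevent repeated taps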
Call Flow
Caller Relay Server Callee
│ │ │
│── call_offer (SDP+sig) ──▶│── call_offer ──────────────▶│
│ │ │
│◀── call_answer (SDP) ─────│◀── call_answer (SDP+sig) ──│
│ │ │
│◀──────── ICE candidates ──│◀── ICE candidates ─────────│
│── ICE candidates ────────▶│── ICE candidates ──────────▶│
│ │ │
│◀═══════ WebRTC P2P (or TURN relay) ═══════════════════▶│
The relay server only forwards frames blindly: it neither decrypts payloads nor parses SDP, which is what preserves E2EE.
ICE Configuration (GET /api/v1/calls/ice-config)
Response contract
The response is compatible with the standard WebRTC RTCConfiguration.iceServers format. Any backend that implements this contract can replace the relay-server's default TURN provider:
{
  "ttl": 600,
  "ice_transport_policy": "all",
  "ice_servers": [
    { "urls": ["stun:turn.example.com:3478"] },
    {
      "urls": [
        "turn:turn.example.com:3478?transport=udp",
        "turn:turn.example.com:3478?transport=tcp",
        "turns:turn.example.com:5349?transport=tcp",
        "turns:turn.example.com:443?transport=tcp"
      ],
      "username": "<ephemeral>",
      "credential": "<ephemeral>"
    }
  ]
}
- ttl: credential lifetime in seconds; the SDK uses it for local caching
- ice_transport_policy: "all" (P2P with TURN as fallback) or "relay" (force TURN)
- Each ice_servers[*] entry uses the standard WebRTC structure, so clients do not need to know which TURN provider is behind it
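As a rough sketch of how a client could consume this contract, the following maps the documented response onto an RTCConfiguration-shaped object and checks cache freshness against ttl. The names toPeerConfig and isCacheFresh are hypothetical, and the rule that alwaysRelay overrides the server's policy is an assumption made for illustration, not confirmed SDK behaviour.

```typescript
// Shape of the documented /api/v1/calls/ice-config response.
interface IceServerEntry {
  urls: string[];
  username?: string;
  credential?: string;
}
interface IceConfigResponse {
  ttl: number;
  ice_transport_policy: 'all' | 'relay';
  ice_servers: IceServerEntry[];
}

// Local mirror of the relevant RTCConfiguration fields, so the sketch
// compiles without DOM typings.
interface PeerConfig {
  iceServers: IceServerEntry[];
  iceTransportPolicy: 'all' | 'relay';
}

// Hypothetical mapper. Assumption: alwaysRelay forces 'relay' regardless of
// the policy returned by the server.
function toPeerConfig(resp: IceConfigResponse, alwaysRelay: boolean): PeerConfig {
  return {
    iceServers: resp.ice_servers.map((s) => ({ ...s, urls: s.urls.slice() })),
    iceTransportPolicy: alwaysRelay ? 'relay' : resp.ice_transport_policy,
  };
}

// Hypothetical cache check: reuse credentials only while they are inside ttl.
function isCacheFresh(fetchedAtMs: number, ttlSeconds: number, nowMs: number): boolean {
  return nowMs - fetchedAtMs < ttlSeconds * 1000;
}

const sampleIceResponse: IceConfigResponse = {
  ttl: 600,
  ice_transport_policy: 'all',
  ice_servers: [
    { urls: ['stun:turn.example.com:3478'] },
    { urls: ['turn:turn.example.com:3478?transport=udp'], username: 'u', credential: 'c' },
  ],
};
const samplePeerConfig = toPeerConfig(sampleIceResponse, false);
```

In a browser, an object of this shape is what would be passed to the RTCPeerConnection constructor.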
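Recommended backend: Cloudflare Realtime TURN
Since 1.0.11 the official reference implementation uses Cloudflare. If you self-host the relay, this is the easiest path:
# 1. Enable Realtime in the Cloudflare Dashboard
#    https://dash.cloudflare.com/?to=/:account/realtime/turn-servers
# 2. Create a TURN Key and note the Key ID and API Token
# 3. Set the following in the relay-server .env:
CF_TURN_KEY_ID=xxxxxxxxxxxxxxxxxxxx
CF_TURN_API_TOKEN=yyyyyyyyyyyyyyyyyyyyy
The relay-server's HandleICEConfig first tries the CF API to obtain iceServers and caches the result for 9 minutes. Pricing: $0.05/GB outbound; a 1-hour audio call costs ≈ $0.003 (under 0.02 CNY). There are 330+ anycast locations worldwide; users in China are routed to Hong Kong / Singapore nodes with 50-150 ms latency.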
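Other supported backends (TURN_HOST kept for compatibility)
If CF_TURN_KEY_ID is not set in the relay-server .env, the server automatically falls back to:
- The TURN_HOST + TURN_SECRET environment variables (self-hosted coturn with HMAC-SHA1 ephemeral credentials)
- Public STUN (direct P2P only; development / test mode)
Pitfalls of self-hosted coturn (for reference; not recommended for production):
- denied-peer-ip=172.16.0.0-172.31.255.255 also blocks the default AWS VPC range and causes CREATE_PERMISSION 403; an allowed-peer-ip allowlist does not fix it
- The EC2 network interface has a private IP (172.31.x.x), so relay-ip=18.142.189.254 fails with EADDRNOTAVAIL; advertise the public IP with --external-ip only
- When both peers call from the same WiFi, coturn treats "peer IP = own IP" as a loop and rejects it; this requires allow-loopback-peers + cli-password
Cloudflare TURN sidesteps all of these pitfalls.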
E2EE for Calls
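By default the SDK encrypts audio and video frames with Insertable Streams (WebRTC Encoded Transform):
// Applied automatically by CallModule; no manual setup needed
// Each encoded frame is encrypted with AES-256-GCM before leaving the device
Even if the TURN server is leaked or hijacked, an observer sees only encrypted RTP and cannot recover the audio or video.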
Signaling Signature (crypto_v=2, 2026-04 hardening)
All call_* frames are Ed25519-signed and AES-GCM-encrypted. Plaintext signaling (historical crypto_v=1) is no longer accepted.
Automatic pipeline on send:
- Attach _ts (current time) + _nonce (16 random bytes)
- Ed25519 sign the full payload using sender's identity private key
- AES-GCM encrypt the signed blob with the ECDH session key
- Outer envelope carries only route fields (type, to, from, call_id, crypto_v:2)
Automatic pipeline on receive:
- Decrypt with the session key (must exist; no plaintext fallback)
- Verify Ed25519 signature using peer's identity public key
- Check |now - _ts| < 60s (replay window)
- Check _nonce not seen in last 5 minutes (replay cache)
- Check inner.from === envelope.from, inner.call_id === envelope.call_id
Any failure → frame silently dropped. Defends against MITM SDP injection, signaling replay, and envelope tampering by a compromised relay.
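The last three receive-side checks (timestamp window, nonce cache, envelope binding) can be sketched as follows. This assumes decryption and signature verification have already succeeded; the ReplayGuard class, the millisecond unit for _ts, and the string encoding of _nonce are assumptions for illustration, not the SDK's actual internals.

```typescript
interface Envelope { from: string; call_id: string }
interface InnerSignal { from: string; call_id: string; _ts: number; _nonce: string }

const REPLAY_WINDOW_MS = 60000;   // |now - _ts| must be under 60 s
const NONCE_TTL_MS = 5 * 60000;   // nonces are remembered for 5 minutes

// Hypothetical guard: accept() returns false when the frame should be dropped.
class ReplayGuard {
  private seen = new Map<string, number>(); // nonce -> time it was accepted (ms)

  accept(envelope: Envelope, inner: InnerSignal, nowMs: number): boolean {
    // Forget nonces that are older than the cache lifetime.
    this.seen.forEach((acceptedAt, nonce) => {
      if (nowMs - acceptedAt >= NONCE_TTL_MS) this.seen.delete(nonce);
    });
    if (Math.abs(nowMs - inner._ts) >= REPLAY_WINDOW_MS) return false; // stale or future-dated
    if (this.seen.has(inner._nonce)) return false;                      // replayed nonce
    if (inner.from !== envelope.from || inner.call_id !== envelope.call_id) return false; // tampered envelope
    this.seen.set(inner._nonce, nowMs);
    return true;
  }
}

const guard = new ReplayGuard();
const routeEnvelope: Envelope = { from: 'u1', call_id: 'c1' };
const signedFrame: InnerSignal = { from: 'u1', call_id: 'c1', _ts: 1000000, _nonce: 'aa' };
const firstDelivery = guard.accept(routeEnvelope, signedFrame, 1000500);
const replayedDelivery = guard.accept(routeEnvelope, signedFrame, 1000600);
```

Keeping the nonce cache longer than the timestamp window means a replayed frame is rejected by at least one of the two checks.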
Important Notes
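- Call signaling runs over WebSocket and shares the channel with IM messages; no extra port is opened
- alwaysRelay defaults to false (behavior change in 1.0.11+); to force TURN for privacy mode, pass true explicitly at construction
- The callback-style onLocalStream / onRemoteStream API has a React ref timing race; prefer the subscription-style observeLocalStream / observeRemoteStream API
- Double-tap protection: since 1.0.10 the SDK's answer() holds an internal _answering lock; the UI should also disable the button
- Cross-platform ICE candidates (1.0.19+): the Android SDK sends call_ice with the three fields candidate=string + sdp_mid + sdp_mline, and PWA SDK 1.0.19+ normalizes both formats automatically; Android↔PWA calls require both SDKs to be ≥ 1.0.19
- Canonical signaling signature (1.0.20+): signSignal / verifySignal now use canonical JSON with recursively sorted keys (the previous JSON.stringify(_, keys.sort()) sorted only top-level keys, so signaling with nested objects, such as call_ice, failed verification across platforms); Android↔PWA calls require both SDKs to be ≥ 1.0.20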
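Recursively sorted canonical JSON, as mentioned in the last note, can be sketched like this. The function canonicalJson is a hypothetical illustration; the exact byte format used by signSignal / verifySignal (for example number or Unicode formatting) is not specified here and may differ.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serializes a value with object keys sorted at every depth.
// Array order is preserved because it is semantically meaningful.
function canonicalJson(value: Json): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) {
    return '[' + value.map((item) => canonicalJson(item)).join(',') + ']';
  }
  const obj = value as { [key: string]: Json };
  const keys = Object.keys(obj).sort();
  return '{' + keys.map((k) => JSON.stringify(k) + ':' + canonicalJson(obj[k])).join(',') + '}';
}

// Two payloads with the same content but different key order at both levels.
const signalA: Json = { type: 'call_ice', candidate: { sdp_mid: '0', candidate: 'x' } };
const signalB: Json = { candidate: { candidate: 'x', sdp_mid: '0' }, type: 'call_ice' };
const sameBytes = canonicalJson(signalA) === canonicalJson(signalB);
```

Because both sides serialize to the same string regardless of insertion order, a signature computed on one platform verifies on the other.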
Native Android integration notes (WebRTC for Android)
The Android SDK does not yet wrap a CallModule class, so the app layer has to manage the PeerConnection itself with org.webrtc / stream-webrtc-android. The two pitfalls below are known traps; follow them exactly when coding with AI assistance (vibe coding), otherwise the call connects but is unusable:
1. Configure AudioManager routing (otherwise there is no sound after the call connects)
WebRTC does not switch the system audio route automatically. Android defaults to AudioManager.mode = MODE_NORMAL, so the remote audio track is routed to the wrong path, which shows up as "call established but completely silent".
<!-- AndroidManifest.xml: required, just like RECORD_AUDIO -->
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
// Switch to IN_COMMUNICATION mode when the state becomes PeerConnection.PeerConnectionState.CONNECTED
override fun onConnectionChange(state: PeerConnection.PeerConnectionState?) {
    if (state == PeerConnection.PeerConnectionState.CONNECTED) {
        val am = context.getSystemService(Context.AUDIO_SERVICE) as AudioManager
        am.mode = AudioManager.MODE_IN_COMMUNICATION
        am.isMicrophoneMute = false
        am.isSpeakerphoneOn = true // speaker by default; set to false for earpiece mode
    }
}
// Restore MODE_NORMAL on hangup / teardown, otherwise the system ringtone and media volume are affected
fun teardownPeer() {
    val am = context.getSystemService(Context.AUDIO_SERVICE) as AudioManager
    am.mode = AudioManager.MODE_NORMAL
    am.isSpeakerphoneOn = false
    // ...
}
2. On call_offer, cache the SDP and switch to the INCOMING state
Since PWA SDK 1.0.20, the caller sends call_offer directly (no separate call_invite). If the Android callee only caches the SDP without updating the UI state (for example _state.value = INCOMING), the result is "the PWA has dialed, but Android shows no reaction and does not ring".
"call_offer" -> {
    val sdpStr = frame["sdp"] as? String ?: return
    if (peerConnection != null) {
        peerConnection.setRemoteDescription(...)
    } else {
        // The PC does not exist before the user answers, so cache the SDP first
        pendingOfferSdp = sdpStr
        // Key step: switch to INCOMING so the UI layer renders the ringing screen
        if (state == State.IDLE) {
            val isVideo = sdpStr.contains("\nm=video ") || sdpStr.startsWith("m=video ")
            _info.value = CallInfo(callId, from, isCaller = false,
                mode = if (isVideo) Mode.VIDEO else Mode.AUDIO)
            _state.value = State.INCOMING
        }
    }
}
Reference implementation: template-app-android/.../call/CallManager.kt (compatible with SDK 1.0.20).
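Frame E2EE details (2026-04 P3.9)
- Each direction keeps its own counter; IV = baseIV ⊕ counter_le_8B
- Frame format: counter(8B big-endian) || AES-GCM(ciphertext||tag)
- The receiver derives the IV from the counter read from the frame header and does not trust any explicit IV carried by the sender
- A counter repeated within a 2048-frame window → the frame is dropped (replay protection)
- When cumulative encryption under a single key reaches 80% of 2^24 frames or 16 GiB, the Worker calls postMessage({type:'rekey-needed'}) to tell the call layer to renegotiate the key; at the 100% threshold it refuses to encrypt further frames, preventing reuse of the same AES-GCM (key, IV) pair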
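A minimal sketch of the IV derivation and the rekey thresholds listed above. Several details are assumptions: a 12-byte AES-GCM IV, the counter XORed as 8 little-endian bytes into the last 8 bytes of baseIV (the notes do not say which bytes are used), and the helper names deriveFrameIv and keyUsageState, which are hypothetical.

```typescript
// Assumption: 12-byte IV, counter XORed little-endian into bytes 4..11.
function deriveFrameIv(baseIv: Uint8Array, counter: number): Uint8Array {
  if (baseIv.length !== 12) throw new Error('baseIv must be 12 bytes');
  if (!Number.isSafeInteger(counter) || counter < 0) throw new Error('counter must be a non-negative safe integer');
  const iv = new Uint8Array(baseIv); // copy, so the base IV is never mutated
  let rest = counter;
  for (let i = 0; i < 8; i++) {
    iv[4 + i] ^= rest % 256;       // least significant byte first
    rest = Math.floor(rest / 256);
  }
  return iv;
}

const MAX_FRAMES_PER_KEY = Math.pow(2, 24);
const MAX_BYTES_PER_KEY = 16 * Math.pow(2, 30); // 16 GiB

type KeyUsage = 'ok' | 'rekey-needed' | 'refuse';

// Whichever limit is closer to exhaustion decides the state.
function keyUsageState(framesEncrypted: number, bytesEncrypted: number): KeyUsage {
  const used = Math.max(framesEncrypted / MAX_FRAMES_PER_KEY, bytesEncrypted / MAX_BYTES_PER_KEY);
  if (used >= 1) return 'refuse';          // never encrypt past the hard limit
  if (used >= 0.8) return 'rekey-needed';  // ask the call layer to renegotiate
  return 'ok';
}
```

Deriving the IV from a strictly increasing counter is what guarantees that no (key, IV) pair repeats as long as the counter never wraps under one key.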
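Video call troubleshooting guide (added 2026-04-27)
Four things to confirm before troubleshooting (otherwise you will almost certainly go down the wrong path)
Before reporting "video calls do not work", first rule out subjective misjudgment:
- Provide a screenshot or screen recording. A text description such as "I cannot see the other side" is too subjective; it could mean:
  - A genuinely black screen (the SurfaceView was not attached)
  - The camera is pointed at a dark object (the picture is being displayed, but the content is very dark)
  - The camera is physically pointed at the wrong thing (for example at an image on a computer screen, so you assume it is not working)
- Provide WebRTC stats. Open chrome://webrtc-internals/ in a new tab and check outbound-rtp (kind=video) and inbound-rtp (kind=video):
  - bytesSent / framesEncoded keep increasing → your side is sending
  - bytesReceived / framesDecoded keep increasing → you are receiving what the peer sends
  - The numbers do not increase → only then is it a real problem
- SDK 1.0.27+ ships a built-in Diag dump. Once the call is connected, the console prints one 📊 [Diag] line every 2 s; pasting 5 to 10 of them is far more precise than a text description
- Android logcat. Look at the org.webrtc.Logging tag:
  - SurfaceEglRenderer: Reporting first rendered frame. = the first frame has been rendered
  - EglRenderer: Frames received: N. Dropped: 0. Rendered: N. Render fps: X = rendering continuously
  - If both lines appear, the Android side is already displaying the peer's video and the problem is not on Android
When any of these data sources shows "sending / receiving / rendering", stop changing the SDK; the problem is the physical camera position or the UI layout.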
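Pre-launch video call checklist
| Item | How to check | Expected result |
|---|---|---|
| Caller SDP m=video is a=sendrecv | Check the m-lines on the 🔴 STEP 6.5 console line | Must contain a=sendrecv (automatic in SDK 1.0.26+) |
| Callee SDP m=video is a=sendrecv | Check the m-lines on the 🟢 ANSWER STEP 3.5 console line | Must contain a=sendrecv |
| getUserMedia(video) failure surfaces an error | Deny camera permission; the UI should show an alert | The UI shows the error (SDK 1.0.25+ no longer downgrades silently) |
| Caller answer timeout has a fallback | Unplug the network; after 15 s the UI should end the call automatically | onError fires with "对方未应答" ("peer did not answer") (SDK 1.0.24+) |
| ICE failure self-heals via restart | NAT change / one-sided TURN timeout | restartIce runs once and connectionState returns to connecting (SDK 1.0.24+) |
| Both ends have outbound + inbound data | 1.0.27 Diag dump or webrtc-internals | bytes/frames keep increasing in both directions |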
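Known symptoms that are not SDK bugs
Changing the SDK version does not help in the following cases, so do not modify the SDK:
- ❌ The PWA camera is physically pointed at something dark / black / OBS software, etc.: the "black screen" you see is a real video frame, not a rendering failure
- ❌ Android shows what the PWA camera is actually capturing, but you expected something else: the remote picture is whatever the remote camera currently points at, not a UI screenshot
- ❌ The local preview window showing your own face is normal: it is the callee previewing its own camera, not the remote side
- ❌ Yourself in the small window and a black / dark main view: the main view is showing the peer, whose camera view is simply very dark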
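Objective signals of a real bug
Only when at least one of the following holds is it a real bug in the SDK / signaling / rendering layer:
- 📛 outbound-rtp video bytesSent stays at 0 forever (the sender is dead)
- 📛 inbound-rtp video bytesReceived stays at 0 forever (no RTP is received)
- 📛 Android logcat never shows SurfaceEglRenderer: Reporting first rendered frame (rendering never started)
- 📛 PeerConnection state never reaches connected (ICE failed)
- 📛 The PWA shows neither the camera permission prompt nor an error dialog (SDK 1.0.25+ always shows one of the two)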
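The first two signals can be checked mechanically by comparing two stats samples taken a few seconds apart. The sketch below is a hypothetical helper, not an SDK function; in a browser the numbers would come from the outbound-rtp and inbound-rtp (kind=video) reports returned by RTCPeerConnection.getStats().

```typescript
// One snapshot of the video counters; field names follow the WebRTC stats
// reports, but the interface itself is defined locally for this sketch.
interface VideoRtpSample {
  bytesSent: number;      // from outbound-rtp (kind=video)
  bytesReceived: number;  // from inbound-rtp (kind=video)
}

// Returns a list of objective problems; an empty list means media is
// flowing in both directions.
function diagnoseVideo(prev: VideoRtpSample, curr: VideoRtpSample): string[] {
  const problems: string[] = [];
  if (curr.bytesSent <= prev.bytesSent) {
    problems.push('outbound video bytes are not growing (sender stalled)');
  }
  if (curr.bytesReceived <= prev.bytesReceived) {
    problems.push('inbound video bytes are not growing (no RTP received)');
  }
  return problems;
}

const healthy = diagnoseVideo(
  { bytesSent: 100, bytesReceived: 100 },
  { bytesSent: 5000, bytesReceived: 7000 },
);
const deadSender = diagnoseVideo(
  { bytesSent: 0, bytesReceived: 100 },
  { bytesSent: 0, bytesReceived: 7000 },
);
```

If the list is empty while the picture still looks black, the cause is more likely the camera view or the UI layout than the media path.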
Troubleshooting
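Versions 1.0.3 through 1.0.12 added the following diagnostic logs (console.error level, kept in production bundles):
| Prefix | Meaning | Where it appears |
|---|---|---|
| 🔥 [App] | App-layer button clicks and onClick events | ChatWindow / CallScreen |
| 🟠 [App] | mod.call() call boundary | Button onClick |
| 🔴 [CallModule] STEP 1~8 | Caller call() progress | SDK |
| 🟢 [CallModule] RCVD STEP 1~8 | Callee handleOffer() progress | SDK |
| 🟢 [CallModule] ANSWER STEP 0~6 | Callee answer() progress | SDK |
| 🟡 [CallModule] | iceServers config, iceGatheringState | SDK |
| 🔵 [CallModule] | handleSignal call_id flow | SDK |
Usage: when a call fails, capture the full console output on the caller and the callee separately; the failing STEP maps directly to a line in the SDK code. The errorText of onicecandidateerror gives the STUN/TURN-level failure reason.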
Call Lifecycle & Media Quality (1.0.29+)
1.0.29 adds two SDK-level guarantees, and 1.0.30 hardens timeout cancellation, so that calls behave consistently across platforms:
1.0.30 addition: timeout cancellation notifies the peer
When the caller's answer timeout (15s) fires, the SDK now actively sends call_hangup to the peer before cleanup, rather than only cleaning up locally. This fixes the "ghost call" issue. Previously, when the PWA silently aborted on timeout, the Android user could still tap "Accept" later and walk through the full answer flow; the SDK would negotiate for 19s before ICE FAILED kicked in, so what the user perceived as a call with a running timer was actually 19s of "Establishing secure channel...".
When the callee receives this call_hangup while still in ringing state, the incoming-call UI dismisses immediately, eliminating ghost-call windows.
Lifecycle: automatic call termination
Beyond the explicit call_hangup / call_reject signaling, the SDK now self-terminates calls in two scenarios where the signal might never arrive (peer App killed by OS, network down, etc):
| Trigger | Timeout | Action |
|---|---|---|
| RTCPeerConnection.connectionState === 'disconnected' persists | 8 seconds | cleanup('ended'), which fires state_change → 'ended' |
| Inbound RTP bytesReceived / framesDecoded no growth | 15 seconds | cleanup('ended'), same as above |
These timeouts must match the Android SDK (8000ms / 15000ms) to avoid timeout drift across platforms.
The SDK also clears the disconnected timer on 'connected' / 'connecting' so it doesn't fight ICE restart.
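The two timers and the reset on 'connected' / 'connecting' can be sketched as a small watchdog with injected timestamps. CallWatchdog and its method names are hypothetical; the SDK's internal implementation is not shown in this document and may be structured differently.

```typescript
const DISCONNECTED_TIMEOUT_MS = 8000;
const RTP_STALL_TIMEOUT_MS = 15000;

// Hypothetical watchdog: feed it connection states and inbound byte counts,
// then poll shouldEnd() to decide whether to run cleanup('ended').
class CallWatchdog {
  private disconnectedSince: number | null = null;
  private lastBytes = -1;
  private lastGrowthAt: number | null = null;

  onConnectionState(state: string, nowMs: number): void {
    if (state === 'disconnected') {
      if (this.disconnectedSince === null) this.disconnectedSince = nowMs;
    } else if (state === 'connected' || state === 'connecting') {
      this.disconnectedSince = null; // do not fight an ICE restart
    }
  }

  onInboundBytes(bytesReceived: number, nowMs: number): void {
    // Only growth resets the stall timer; the first sample always counts.
    if (bytesReceived > this.lastBytes) {
      this.lastBytes = bytesReceived;
      this.lastGrowthAt = nowMs;
    }
  }

  shouldEnd(nowMs: number): boolean {
    const disconnectedTooLong = this.disconnectedSince !== null &&
      nowMs - this.disconnectedSince >= DISCONNECTED_TIMEOUT_MS;
    const rtpStalled = this.lastGrowthAt !== null &&
      nowMs - this.lastGrowthAt >= RTP_STALL_TIMEOUT_MS;
    return disconnectedTooLong || rtpStalled;
  }
}
```

Passing the clock in as a parameter keeps the logic deterministic, which makes it straightforward to verify that both platforms use the same 8000 ms / 15000 ms values.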
Media quality: 720p30 + 1.5 Mbps default
Video calls now default to:
| Property | Value |
|---|---|
| Capture resolution | 1280×720 (ideal, browser may downgrade) |
| Capture framerate | 30 fps (ideal) |
| Max send bitrate | 1,500,000 bps (set via RTCRtpSender.setParameters after connect) |
| Degradation pref. | 'maintain-framerate' (drop resolution before framerate) |
| Codec preference | (not in this release) — SDK does not call setCodecPreferences; codec is decided by browser/Android default negotiation. Planned for a separate future release. |
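For orientation, the defaults in the table correspond roughly to the following capture constraints and sender parameters. This is a sketch under assumptions: the constant and function names are hypothetical, the interfaces are local mirrors of the WebRTC dictionaries, and the SDK applies its own values internally rather than exposing anything like this.

```typescript
// Local mirrors of the WebRTC dictionaries, so the sketch compiles without DOM typings.
interface EncodingParams { maxBitrate?: number }
interface SendParams {
  encodings: EncodingParams[];
  degradationPreference?: string;
}

// Capture side: "ideal" values let the browser pick something lower if needed.
const VIDEO_CAPTURE_CONSTRAINTS = {
  width: { ideal: 1280 },
  height: { ideal: 720 },
  frameRate: { ideal: 30 },
};

const MAX_SEND_BITRATE_BPS = 1500000;

// Returns a copy of the sender parameters with the documented defaults applied.
// The number of encodings is left unchanged.
function withVideoDefaults(params: SendParams): SendParams {
  return {
    ...params,
    degradationPreference: 'maintain-framerate',
    encodings: params.encodings.map((e) => ({ ...e, maxBitrate: MAX_SEND_BITRATE_BPS })),
  };
}

const tunedParams = withVideoDefaults({ encodings: [{}] });
```

In a browser, the input would come from RTCRtpSender.getParameters() and the result would be handed to RTCRtpSender.setParameters() once the connection is established.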
What you need to do
Nothing. The SDK handles all of the above internally. Verify by:
- Make a video call between PWA ↔ Android.
- After ~10 seconds, open the browser console and look for 📊 [Diag #N] lines.
- Expected: frame=1280x720, fps>=24, qLimit=none (or bandwidth on slow networks).
If you see qLimit=cpu it means the device is encoder-bottlenecked — the SDK's degradationPreference: 'maintain-framerate' will already be downgrading resolution to keep 30 fps, but worst-case clients may fall to ~480p.
Why these constants are not configurable
If a user app sets maxBitrate to 3 Mbps but the peer SDK still has 1.5 Mbps, WebRTC's bandwidth estimator picks the lower value — meaning your config is silently ignored. Same for the disconnected timeout: if PWA is 8s and Android is 30s, Bug 1 (state desync) reappears in one direction.
The constants are intentionally internal so apps can't drift them. Tune via SDK PR if needed.