Various apps and services on modern smartphones listen through the microphone constantly (e.g. Siri or Google Assistant listening for a wake word, or the "Now Playing" feature on Pixel phones). To address users' privacy concerns, most of these services promise to process only the relevant parts of the recorded audio (e.g. the voice command following the wake word, or the audio signatures of songs detected by "Now Playing").
How can users be sure that other captured sounds, such as private conversations, are not processed and transcribed locally on the device and then sent to the provider's servers as encrypted text or audio signatures? Through compression, timing, obfuscation and encryption, a service could make such behaviour hard or even impossible to detect via traffic analysis.
My question is: Do users ultimately have to rely on trust, or are there any effective ways to verify the privacy-related promises these services make?
I’m grateful for any ideas and insights you can share on this!