Watermark Stress Test
This is an educational demonstration of state-of-the-art audio watermark performance under codec processing. Upload any (speech) audio file to test watermark performance before and after processing with a low-bitrate neural codec [1].
For this demo, we use the AudioSeal [2] watermark, which is well documented, open source, and provides state-of-the-art localized detection performance. Both the watermark and the codec operate at a 16 kHz sample rate, so they only touch content below the 8 kHz Nyquist limit; all frequencies above 8 kHz are left unaltered. To ensure consistent watermark performance, we normalize audio to -16 dB LUFS and downmix to mono prior to embedding.
[1] https://github.com/jasonppy/VoiceCraft
[2] https://github.com/facebookresearch/audioseal
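The preprocessing described above (downmix to mono, then level normalization before embedding) can be sketched roughly as follows. This is a minimal illustration and not the demo's actual code: the function names are hypothetical, and plain RMS level is used as a stand-in for a true LUFS measurement, which would additionally require the K-weighting filter and gating defined in ITU-R BS.1770.

```python
import numpy as np

SAMPLE_RATE = 16_000      # sample rate of the watermark and codec, per the text
TARGET_LEVEL_DB = -16.0   # target level from the text; ASSUMPTION: treated as RMS dBFS, not true LUFS


def downmix_to_mono(audio: np.ndarray) -> np.ndarray:
    """Average the channels of a (channels, samples) array; 1-D audio passes through."""
    if audio.ndim == 1:
        return audio
    return audio.mean(axis=0)


def rms_db(audio: np.ndarray) -> float:
    """RMS level in dB relative to full scale (stand-in for a BS.1770 loudness meter)."""
    rms = np.sqrt(np.mean(np.square(audio)))
    return float(20.0 * np.log10(max(rms, 1e-12)))  # floor avoids log(0) on silence


def normalize_level(audio: np.ndarray, target_db: float = TARGET_LEVEL_DB) -> np.ndarray:
    """Apply one global gain so the signal's RMS level matches target_db."""
    gain_db = target_db - rms_db(audio)
    return audio * (10.0 ** (gain_db / 20.0))


# Usage: a synthetic one-second stereo tone stands in for an uploaded file.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
tone = np.sin(2 * np.pi * 440.0 * t)
stereo = np.stack([0.5 * tone, 0.25 * tone])

mono = normalize_level(downmix_to_mono(stereo))

# A 16 kHz signal can only represent content up to half its sample rate.
nyquist_hz = SAMPLE_RATE / 2
```

A single global gain keeps the relative dynamics of the recording intact, which is why level normalization is usually applied this way rather than per frame. For a real LUFS target, the `rms_db` step would be replaced by a loudness meter that implements BS.1770.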
The citation info for our corresponding paper is:
@inproceedings{deepwatermarksareshallow,
author = {Patrick O'Reilly and Zeyu Jin and Jiaqi Su and Bryan Pardo},
title = {Deep Audio Watermarks are Shallow: Limitations of Post-Hoc Watermarking Techniques for Speech},
booktitle = {ICLR Workshop on GenAI Watermarking},
year = {2025}
}
For the VoiceCraft codec:
@article{voicecraft,
author={Puyuan Peng and Po-Yao Huang and Daniel Li and Abdelrahman Mohamed and David Harwath},
year={2024},
title={VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild},
journal={arXiv preprint arXiv:2403.16973v1},
}
And for the AudioSeal watermark:
@article{audioseal,
title={Proactive Detection of Voice Cloning with Localized Watermarking},
author={San Roman, Robin and Fernandez, Pierre and Elsahar, Hady and D{\'e}fossez, Alexandre and Furon, Teddy and Tran, Tuan},
journal={International Conference on Machine Learning (ICML)},
year={2024}
}