A group of researchers has revolutionized secure communication by developing an algorithm that hides sensitive information so effectively that it’s impossible to detect if anything is hidden. Led by Oxford University in close collaboration with Carnegie Mellon University, the team predicts that the method could soon be widely used in digital communication between people, including social networks and private messages. In particular, the ability to send completely secure information can empower vulnerable groups such as dissidents, investigative journalists and humanitarian workers.
The algorithm applies to a setting called steganography: the practice of hiding sensitive information inside harmless content. Steganography differs from cryptography in that the sensitive information is concealed in a way that hides the very fact that something is being hidden. An example would be hiding a Shakespeare poem inside an AI-generated image of a cat.
Despite being studied for over 25 years, current steganography approaches often lack adequate security, meaning people using these techniques are at risk of being detected. This is because previous steganography algorithms slightly changed the distribution of harmless content.
To overcome this, the research team used recent breakthroughs in information theory, specifically minimum entropy coupling, which allows two distributions of data to be joined so that their mutual information is maximized while the individual distributions are preserved. As a result, with the new algorithm there is no statistical difference between the distribution of harmless content and the distribution of content that encodes sensitive information.
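The idea of joining two distributions while preserving each one can be illustrated with a small sketch. The code below builds a joint distribution (a coupling) whose row and column sums equal the two original distributions, using a simple greedy heuristic that repeatedly pairs the largest remaining probability masses. This is an illustrative approximation under stated assumptions, not the researchers' implementation: the function names `greedy_coupling` and `entropy` are hypothetical, and the two example distributions (a uniform secret bit and a three-token content distribution) are made up for the demonstration.

```python
import math


def greedy_coupling(p, q, eps=1e-12):
    """Greedy heuristic for a low-entropy coupling of distributions p and q.

    Returns a matrix `joint` whose row sums equal p and whose column sums
    equal q. It is a heuristic, so the result is not guaranteed to be the
    true minimum entropy coupling.
    """
    p, q = list(p), list(q)
    joint = [[0.0] * len(q) for _ in p]
    while True:
        # Pick the largest remaining mass on each side.
        i = max(range(len(p)), key=p.__getitem__)
        j = max(range(len(q)), key=q.__getitem__)
        mass = min(p[i], q[j])
        if mass <= eps:
            break
        # Assign as much mass as possible to the pair (i, j).
        joint[i][j] += mass
        p[i] -= mass
        q[j] -= mass
    return joint


def entropy(probs):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    return -sum(x * math.log2(x) for x in probs if x > 0)


# Hypothetical example: a uniform secret bit and a 3-token content distribution.
message = [0.5, 0.5]
content = [0.5, 0.25, 0.25]
joint = greedy_coupling(message, content)

row_sums = [sum(row) for row in joint]
col_sums = [sum(col) for col in zip(*joint)]
joint_entropy = entropy([x for row in joint for x in row])
mutual_info = entropy(message) + entropy(content) - joint_entropy

print(joint)
print(row_sums, col_sums, mutual_info)
```

In this toy case the column sums still match the content distribution exactly, so the emitted tokens are distributed just as they would be without a hidden message, while the mutual information of one bit means the receiver can recover the secret bit from the token alone.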
The algorithm was tested using several types of models that automatically generate content, such as GPT-2, an open-source language model, and WaveRNN, a text-to-speech converter. In addition to being completely secure, the new algorithm showed up to 40% higher encoding efficiency than previous steganography methods across a variety of applications, allowing more information to be hidden in a given amount of data. This may make steganography an attractive method even when complete security is not required, owing to its benefits for data compression and storage.
The research team has applied for a patent for the algorithm, but plans to issue free licenses to third parties for responsible non-commercial use. This includes academic and humanitarian use, as well as trusted third-party security reviews. The researchers have published the study as a preprint on arXiv and have also posted an efficient implementation of their method on GitHub. They will also present the new algorithm at the International Conference on Learning Representations (ICLR), a leading AI conference, in May 2023.
AI-generated content is increasingly used in normal human communication, thanks to products like ChatGPT, Snapchat’s AI stickers, and TikTok’s video filters. As a result, steganography may become more common as the mere existence of AI-generated content ceases to arouse suspicion.
Co-author Dr Christian Schroeder de Witt (University of Oxford, Department of Engineering) said: “Our method can be applied to any software that automatically generates content, such as probabilistic video filters or meme generators. This could be very valuable, for example, for journalists and aid workers in countries where encryption is illegal. However, users still need to take precautions, as any encryption method can be vulnerable to side-channel attacks, such as detecting a steganography program on a user’s phone.”
Co-author Samuel Sokota (Carnegie Mellon University, Department of Machine Learning) said: “The paper’s main contribution is to demonstrate the deep link between a problem called minimum entropy coupling and completely secure steganography. Using this link, we introduce a new family of steganography algorithms with perfect security guarantees.”