In a paper titled Few-Shot Adversarial Learning of Realistic Neural Talking Head Models, researchers at the Samsung AI Center in Moscow and the Skolkovo Institute of Science and Technology have revealed how realistic fake videos can be created from just a few source images. While the capability to create convincing deepfakes is not entirely new, the paper demonstrates that doing so is easier and quicker than previously imagined.
Currently, generating a realistic deepfake requires a large dataset of pictures to train the model. However, the algorithm developed by the Samsung researchers can create convincing animated portraits from a small dataset of just 1 to 32 images. They are able to do this by training their model on “landmark” facial features, which include the eyes, the shape of the mouth, and the length and shape of the nose bridge.
To demonstrate their work, the researchers created living portraits of Mona Lisa, Albert Einstein, Fyodor Dostoyevsky, Marilyn Monroe, and others using only one source image for each. The 1-shot results already look impressive, but the realism can be further improved by using multiple training shots, as shown in the video below.
The paper says the technology has “practical applications for telepresence, including (video conferencing) and multi-player games, as well as special effects industry.” While that sounds exciting, realistic fake videos also have the potential to destroy individuals and communities. Even if the companies investing in this space have no nefarious goals, there is no guarantee that their technology will be used only for its intended purposes.