Can you really trust what you see? Jade Norton explores the technology behind deepfakes, which can produce highly realistic fake videos and are fuelling the rise of misinformation across the internet.
Misinformation has been spreading across our screens. The rise of deepfake videos, and their increasing resolution, have created an environment where what we see in front of us cannot always be relied on to be true. The human eye cannot always tell real from fake, but computer algorithms can detect doctoring and the artefacts left behind by video editing. The 2021 winner of the BT Young Scientist & Technology Exhibition, Greg Tarr, is the latest, and one of the youngest, developers to create a highly sophisticated algorithm that allows even the most professionally manufactured deepfakes to be detected.
The complexity of deepfakes lies in their use of neural networks to synthesise a fake video of one individual based on footage of another. Deepfake software was first made publicly available in 2018 and uses neural networks, in particular generative adversarial networks, to replace the face of one individual in a video with synthesised images of another’s face. Neural networks are loosely modelled on the human brain: they learn from examples how the features of one image correspond to those of another, building up a digital map between the two. The algorithms behind deepfake technology use both spatial and temporal information to learn this mapping from one person to another.
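To make the "adversarial" part concrete, here is a minimal sketch of the two competing objectives in a generative adversarial network. It assumes the textbook cross-entropy formulation rather than the code of any particular deepfake tool, and the function names are invented for illustration. The discriminator is rewarded for scoring real images near 1 and fakes near 0, while the generator is rewarded when its fakes are scored as real.

```python
import math


def bce(prediction, target):
    """Binary cross-entropy for a single probability."""
    eps = 1e-12  # keeps log() away from zero
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(score_real, score_fake):
    # The discriminator wants real images scored 1 and fakes scored 0.
    return bce(score_real, 1.0) + bce(score_fake, 0.0)


def generator_loss(score_fake):
    # The generator wants its fake to be scored as real (1).
    return bce(score_fake, 1.0)
```

In training, the two networks take turns lowering their own loss, so every improvement in the discriminator pushes the generator to produce more convincing fakes.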
Deepfake models are trained to reconstruct a video frame by frame from a given source image. The algorithm is usually trained on a large collection of videos, from which pairs of frames are extracted and fed into the model. The video is reconstructed by comparing key points between pairs of images, and the model uses what it learned in training to map the motion between those points.
The model outputs a dense motion field, which describes how every point in the image moves from the source image to the target frame. This is assisted by an occlusion mask, which marks the parts of the image that cannot be recovered by moving pixels from the source and must instead be filled in by the generator. Once the model has enough information, it feeds into a video generator that warps the source image to resemble the driving video and outputs a video known as a deepfake.
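The warping step can be illustrated with a small sketch. This is not the code of any real deepfake generator: it assumes a greyscale image stored as nested lists, a motion field given as per-pixel offsets, and a constant `fill` value standing in for whatever a trained generator would paint into hidden regions. All names here are hypothetical.

```python
def warp_with_occlusion(source, motion, occlusion, fill=0.0):
    """Backward-warp a greyscale image.

    source:    H x W list of pixel intensities
    motion:    H x W list of (dy, dx) offsets; output pixel (y, x) is read
               from source pixel (y + dy, x + dx)
    occlusion: H x W list of weights in [0, 1]; 1 keeps the warped pixel,
               0 marks a region the generator would have to fill in
    fill:      stand-in for the generator's filled-in value (assumption)
    """
    height, width = len(source), len(source[0])
    output = []
    for y in range(height):
        row = []
        for x in range(width):
            dy, dx = motion[y][x]
            # Nearest-neighbour sampling, clamped to the image border.
            sy = min(max(int(round(y + dy)), 0), height - 1)
            sx = min(max(int(round(x + dx)), 0), width - 1)
            weight = occlusion[y][x]
            row.append(weight * source[sy][sx] + (1.0 - weight) * fill)
        output.append(row)
    return output
```

A real system predicts the motion field and the mask with neural networks and blends learned features rather than raw pixels, but the principle is the same: move what can be moved, and generate what cannot.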
The algorithm created by Greg Tarr is capable of detecting an altered video and is believed to be one of the most effective at this time. It aims to achieve a higher detection accuracy than other software, alongside improved speed and efficiency. He used the recent, controversial Channel 4 deepfake of the Queen to test his 150,000 lines of code, and the results gave a 94.3% confidence that the video was, in fact, a deepfake. Technology like this could help prevent the spread of misinformation and could be deployed to filter out deepfake content across the internet.
There is currently no specific legislation governing deepfake activity. Some social media platforms have taken action by blocking the sharing of deepfake content, but because deepfakes are relatively easy to produce, they remain easily accessible. Deepfakes have not yet been widely seen in mainstream media, but they have been used in multiple political elections, with deepfakes of well-known figures being released. The main effects of these videos are seen in the pornography industry, where images of female celebrities are frequently mapped onto models against the wishes of both. This violation is not covered by specific legislation in the UK or US but may fall under the new Irish laws against the unlawful distribution of images.
The creation of a fully synthesised video also allows for audio skins, or “voice clones”, to be created. This technology can be used to create almost identical copies of the voices of well-known figures and have them say anything the programmer wants to their audience.
The accessibility of deepfake technology is what makes it particularly worrying. A convincing video can be processed within a couple of hours on a desktop with a good graphics card, although many of the videos currently available online were created by individuals with advanced coding knowledge. A standard desktop may take days or even weeks, but with time and some basic knowledge on your hands, producing a deepfake can be relatively easy.
Another algorithm for identifying deepfake videos was created by researchers from the University at Albany. They found that eye blinking can be used as a marker for identifying fake videos. This is because the training data rarely captures a blink: a photograph has an average exposure time of about 1/30 of a second, while a person blinks only once every 2-10 seconds. The pictures that the deepfake is built from are therefore highly unlikely to show someone blinking, so a lack of blinking is a tell-tale sign of a deepfake video.
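The blinking cue can be sketched as a simple rate check. This is not the researchers' detector; it is a simplified illustration that assumes some other tool has already produced a hypothetical per-frame "eye openness" score between 0 (closed) and 1 (open). The 0.2 threshold and the 30 frames per second are likewise assumptions, while the 10-second limit comes from the blink interval quoted above.

```python
def count_blinks(eye_openness, closed_below=0.2):
    """Count open-to-closed transitions in a per-frame openness signal."""
    blinks = 0
    was_closed = False
    for value in eye_openness:
        is_closed = value < closed_below
        if is_closed and not was_closed:
            blinks += 1
        was_closed = is_closed
    return blinks


def looks_suspicious(eye_openness, fps=30.0, max_gap_seconds=10.0):
    """Flag a clip with fewer blinks than one per max_gap_seconds."""
    duration = len(eye_openness) / fps
    # Whole number of blinks we would expect at the slowest natural rate.
    expected_minimum = int(duration // max_gap_seconds)
    return count_blinks(eye_openness) < expected_minimum
```

A 30-second clip in which the eyes never close would be flagged, while the same clip with a blink every five seconds would pass. Clips shorter than ten seconds are never flagged, since a real person might simply not blink in that time.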
Raising awareness of these videos is just as important as creating algorithms to detect them. Humans are highly attuned to other human behaviour, and although AI may be able to identify the subtleties hiding in plain sight, we are often able to sense that something is wrong in the bigger picture. It is important to be aware that what you are watching may not be as truthful as it appears, and to be careful when making judgements about online content, as its creator may have made it with malicious intent.