The rapid advancement of artificial intelligence has fundamentally transformed the landscape of digital communication and deception. In an era where synthetic media is cheap, fast, and highly realistic, the traditional cognitive heuristic of “seeing is believing” has collapsed. Understanding how to recognise deepfakes is no longer just a technical specialty — it is a vital safeguard for individuals, organisations, and investigators tasked with verifying digital evidence.
The Collapse of “Seeing is Believing” in the AI Era
For generations, visual and auditory evidence served as the bedrock of human trust and legal proof. We naturally trusted photographs, video recordings, and voice messages as faithful records of reality. However, the rise of sophisticated machine learning models — specifically Generative Adversarial Networks (GANs) and diffusion models — has permanently eroded this epistemic foundation. AI-enabled deception has become a major societal challenge, blurring the boundaries between authentic human interaction and synthetic mimicry.
This transition presents a profound psychological shift. When the sensory inputs we rely on to navigate relationships, commerce, and security can be seamlessly fabricated, the risk of cognitive capture increases. As cybercriminals and hostile actors exploit these vulnerabilities, developing a systematic, evidence-informed approach to media verification is critical. At The Centre for Forensic Neuroscience, we operate at the intersection of behavioural science and digital deception, analysing how these technologies impact human decision-making and security.
What Is a Deepfake? Understanding the Synthetic Spectrum
A deepfake is a piece of synthetic media where a person’s likeness, voice, or actions have been replaced or generated using artificial intelligence. This technology does not merely edit existing content; it constructs entirely new data points that simulate human behaviour. To build a robust defence, we must understand the main categories of synthetic media fraud:
- Face-Swapped Video: Replacing one person’s face with another in an existing video while preserving the original head movements and expressions. This is often used in political manipulation, harassment, and targeted social engineering.
- AI-Generated Talking Heads: Creating a completely synthetic avatar from a static image and a text script. These systems generate lip movements, facial expressions, and blinks to match a computer-generated voice.
- Voice Cloning Scams: Training a text-to-speech model on a short sample of a target’s real voice (often less than 30 seconds of public audio) to generate highly realistic, customised speech. This has fuelled a surge in phone-based impersonation fraud.
- Synthetic Images: Generating photorealistic portraits of non-existent people, commonly used to create deceptive social media profiles, fake testimonials, and sockpuppet accounts.
- Real-Time Impersonation: Using AI software during live video conferences or streams to alter the speaker’s face and voice on the fly, allowing fraudsters to bypass live visual checks.
Why Deepfakes Are Dangerous: The Mechanics of Trust Hijacking
Deepfakes represent a step change in social engineering because they exploit deep-seated cognitive shortcuts. Humans are predisposed to treat faces and voices as primary indicators of identity and emotion. When an AI mimics these signals, it performs a form of cognitive trust hijacking. The dangers extend across several domains:
- Social Engineering & Emotional Manipulation: By impersonating a loved one, a colleague, or an authority figure, attackers bypass normal rational suspicion. They tap directly into the victim’s emotional responses, inducing panic, fear, or protective instincts.
- Authority Exploitation: Cybercriminals impersonate high-level executives, law enforcement officers, or government officials to demand immediate action, bypassing standard organisational controls.
- Financial Fraud: Voice cloning and synthetic video are used to authorise fraudulent transactions, alter payment details, or steal corporate assets under the guise of legitimate business operations.
- Reputational Attacks: Deploying fabricated media to defame corporate leaders, political candidates, or private individuals. Even if a deepfake is subsequently debunked, the initial emotional impact and cognitive bias often linger.
Notable Deepfake Scams: Real-World Case Analyses
To understand the practical threat, we must examine real-world instances where organisations and individuals were deceived by synthetic media:
The Arup Executive Video Conference Fraud
In early 2024, the multinational engineering firm Arup lost HK$200 million (approximately £20 million) in a sophisticated multi-person deepfake scam in Hong Kong. An employee was invited to a video call with what appeared to be the company’s Chief Financial Officer and several colleagues. In reality, every other participant on the call was a pre-recorded deepfake avatar. The employee, believing they were receiving direct instructions from their executive team, executed several large bank transfers.
Executive Voice Cloning Scams
In another high-profile case, a bank manager in the UAE was deceived by a cloned voice of a company director he recognised. The manager, having previously spoken with the director, trusted the cloned voice on the phone and authorised transactions worth $35 million. This illustrates how voice cloning scams can defeat verification that relies on recognising a familiar voice.
Investment Scams Using Fake Celebrity Videos
Fraudsters regularly use AI-generated videos of well-known public figures, financial commentators, and tech entrepreneurs to endorse fraudulent high-yield investment schemes. These videos are widely distributed on social media platforms, leveraging the familiar-face bias to trap retail investors.
Romance and Sextortion Scams
On an individual level, deepfakes are increasingly used in romance scams to maintain a deceptive online relationship without ever meeting in person. In safeguarding and forensic contexts, we also see the growth of AI-assisted sextortion, where victims’ faces are superimposed onto explicit material to blackmail them.
Why Humans Struggle to Detect Deepfakes: Cognitive Vulnerabilities
Why are we so easily misled by synthetic media? The answer lies in human psychology and cognitive processing. From a cyberpsychology perspective, several factors limit our detection capabilities:
- Cognitive Trust Mechanisms: Our brains default to truth. In daily life, assuming that others are who they claim to be is a necessary social lubricant. Suspending this default trust requires active, effortful cognitive processing.
- Emotional Realism: High-stress scenarios (such as an urgent call about a kidnapped child or a corporate crisis) engage the brain’s threat-response systems, including the amygdala, and impair analytical thinking. Under emotional stress, we overlook technical glitches in the media.
- Familiar-Face Bias: We process familiar faces through specialised neural regions, notably the fusiform face area. When we see a face we recognise, our brain focuses on recognition rather than verification, making us less likely to notice subtle digital anomalies.
- Mobile-Device Viewing Limitations: Most media is viewed on small screens, often on the move, with glare, low brightness, or poor audio quality. These physical limitations mask the visual artefacts that would be obvious on a high-definition monitor in a lab.
- Urgency Tactics: Scammers deliberately introduce urgency (e.g., “this transaction must happen in the next 10 minutes”), which prevents the victim from stepping back to critically analyse the video or voice quality.
The Visual Deepfake Detection Checklist: A Forensic Observer’s Guide
When conducting a physical or digital review of a suspected video, investigators and managers should apply a systematic inspection protocol. The following visual checklist details the common technical limitations of AI generation models, explaining why they occur and how to spot them:
1. Unnatural Blinking Patterns
Why it occurs: Early deepfake models were trained on datasets of static images where the subjects’ eyes were open. Consequently, they did not learn the physiological mechanics of blinking. While modern models have improved, they still struggle with the duration, frequency, and synchrony of natural human blinking.
How to spot it: Observe the subject for a full minute. A typical human blinks 15 to 20 times per minute, with each blink lasting a fraction of a second. Deepfakes may blink too rarely, blink too frequently, or perform incomplete half-blinks. Look for eyes that remain static or glassy during speech.
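For reviewers who log blinks by hand while watching a clip, the rate check described above can be sketched in a few lines of Python. The function names and the thresholds of 8 and 30 blinks per minute are illustrative assumptions, set deliberately wider than the typical 15 to 20 range to allow for individual variation; they are not validated forensic cut-offs.

```python
from typing import Sequence

# Illustrative thresholds (assumptions), set wider than the typical
# 15 to 20 blinks per minute to allow for individual variation.
LOW_BLINKS_PER_MIN = 8.0
HIGH_BLINKS_PER_MIN = 30.0


def blinks_per_minute(blink_times_s: Sequence[float], clip_length_s: float) -> float:
    """Convert hand-logged blink timestamps (in seconds) into a per-minute rate."""
    if clip_length_s <= 0:
        raise ValueError("clip_length_s must be positive")
    return len(blink_times_s) * 60.0 / clip_length_s


def blink_rate_flag(blink_times_s: Sequence[float], clip_length_s: float) -> str:
    """Label the observed rate as 'low', 'high', or 'typical'."""
    rate = blinks_per_minute(blink_times_s, clip_length_s)
    if rate < LOW_BLINKS_PER_MIN:
        return "low"
    if rate > HIGH_BLINKS_PER_MIN:
        return "high"
    return "typical"


if __name__ == "__main__":
    observed = [4.2, 19.8, 41.5]  # three blinks logged in a 60-second clip
    print(blinks_per_minute(observed, 60.0), blink_rate_flag(observed, 60.0))
```

A "low" or "high" label is only a prompt for closer inspection; people blink less when reading from a screen and more when anxious.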
2. Lip-Sync and Phoneme Mismatch
Why it occurs: Matching the physical shape of the mouth (visemes) to the phonetic sounds (phonemes) of spoken language is computationally intensive. AI models often generate the audio track and then attempt to morph the mouth region to fit, leading to temporal drift.
How to spot it: Focus on consonants such as B, M, and P, which require complete closure of the lips, and F and V, which require contact between the teeth and the lower lip. In deepfakes, these sounds are often heard before or after the lips physically touch, or the mouth may remain slightly open while producing closed-mouth consonants.
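One way to make this check less subjective is to step through the clip and log, for each closed-lip consonant, when the sound is heard and when the lips visibly meet. The sketch below turns those paired timestamps into offsets. The function names and the 80 ms tolerance are assumptions chosen for illustration, not an established standard.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple

# (time the consonant is heard, time the lips visibly close), both in seconds
Event = Tuple[float, float]


def closure_offsets_ms(events: Sequence[Event]) -> List[float]:
    """Offset per event in milliseconds; positive means the lips close after the sound."""
    return [round((visual_s - audio_s) * 1000.0, 1) for audio_s, visual_s in events]


def sync_summary(events: Sequence[Event], tolerance_ms: float = 80.0) -> Dict[str, float]:
    """Summarise how far the visible lip closures drift from the audio."""
    if not events:
        raise ValueError("at least one event is required")
    offsets = closure_offsets_ms(events)
    return {
        "mean_offset_ms": round(mean(offsets), 1),
        "max_abs_offset_ms": max(abs(o) for o in offsets),
        # tolerance_ms is an assumed threshold, not a forensic standard
        "events_outside_tolerance": sum(1 for o in offsets if abs(o) > tolerance_ms),
    }


if __name__ == "__main__":
    logged = [(1.00, 1.02), (2.50, 2.70), (4.00, 3.85)]
    print(sync_summary(logged))
```

Offsets that wander in both directions within one clip are more telling than a constant delay, which often comes from ordinary streaming latency.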
3. Facial Boundary and Edge Distortion
Why it occurs: Most video deepfakes are created by superimposing a synthetic face onto a real actor’s head. The boundary where the synthetic mask meets the original skin is a common point of failure for blending algorithms.
How to spot it: Scan the perimeter of the face — specifically the jawline, temples, and hairline. Look for subtle blurring, pixelation, doubling of edges, or flickering lines when the subject turns their head or passes a hand in front of their face.
4. Lighting and Shadow Inconsistencies
Why it occurs: AI models generate faces pixel-by-pixel based on statistical probability, often without a true 3D understanding of physics. They struggle to align the lighting direction on the synthetic face with the ambient lighting of the background environment.
How to spot it: Compare the reflection in the subject’s pupils (the specular highlight) with the apparent light sources in the scene. If the room light is coming from the left, but the highlight in the eyes indicates a light source from the right, the face may have been synthesised or composited. Check whether shadows cast by the nose or chin align with the environment.
5. Over-Smoothed or “Plastic” Skin
Why it occurs: Generative models tend to average out fine details like skin pores, wrinkles, scars, and blemishes, resulting in a hyper-smooth texture.
How to spot it: Look for a lack of natural skin micro-texture. Authentic human skin has subtle variations in colour, sweat glisten, and fine lines. A face that looks airbrushed, plastic, or uniformly matte, especially when compared to a more textured neck or background, is highly suspicious.
6. Robotic or Rigid Head Movement
Why it occurs: AI face-swapping algorithms map a synthetic face onto a target actor, but errors in matching the tilt and rotation of the underlying skull can misalign the two, leading to rigid, mechanical motion.
How to spot it: Note if the face seems to slide slightly across the skull when the person turns their head. Watch for a “bobblehead” effect, where the head moves in a repetitive, smooth arc that lacks the natural micro-tremors and jerky corrections of human muscular movement.
7. Audio-Visual Mismatch and Voice Incongruence
Why it occurs: Audio cloning software and video generation tools are typically run as separate pipelines, leading to synchronisation errors and acoustic mismatches.
How to spot it: Listen to the background noise. Is the room tone of the audio consistent with the visual environment? For example, if the video shows a person in a busy outdoor setting, but the voice has the echoing reverb of an empty room, the audio may have been generated or replaced. Listen for sudden drops in audio quality or artificial clicks between sentences.
8. Background and Environmental Distortions
Why it occurs: Real-time deepfake filters must process frames in milliseconds. When the subject moves, the rendering algorithm may accidentally warp or distort the pixels in the immediate background.
How to spot it: Focus your gaze just outside the subject’s silhouette. Look for warping, bending of straight lines (like doorframes or bookshelves), or static noise that appears only when the person moves their head or shoulders.
9. Hand and Finger Anomalies
Why it occurs: Hands are highly articulated structures with complex self-occlusion (fingers hiding behind other fingers). AI models struggle to map these spatial relationships, resulting in anatomical errors.
How to spot it: When the subject raises their hands to gesture, count the fingers. Look for merged fingers, impossible joint angles, hands that appear to blend into clothing, or fingernails rendered on the wrong side of the digit.
10. Emotional and Expression Incongruence
Why it occurs: Micro-expressions (subtle, transient facial movements lasting a fraction of a second) reflect genuine psychological states. AI models can generate macro-movements (like a smile or a frown) but struggle to replicate the complex, fleeting coordination of facial muscles associated with authentic emotion.
How to spot it: Apply a behavioural analysis lens. Does the subject’s emotional expression match the content of their words? For example, does the mouth smile while the eyes remain cold and unengaged? Look for a “dead eyes” appearance where the upper face fails to participate in the expression.
11. Flickering and Edge Artefacts
Why it occurs: Deepfake algorithms struggle to render fine detail on thin structures like eyeglasses, earrings, teeth, and hair strands.
How to spot it: Watch for flickering or disappearing details. Do the frames of the subject’s glasses disappear for a split second when they turn? Do their teeth look like a single, solid white block rather than individual teeth? Does hair look like a solid mass rather than individual strands, or does it pixelate around the edges?
12. Temporal Inconsistencies and Jumps
Why it occurs: Standard deepfake generators process video frame-by-frame. Maintaining temporal coherence, that is, ensuring that a detail rendered in frame 1 stays consistent in frame 2 and frame 100, is a major challenge for generative models.
How to spot it: Play the video at half-speed (0.5x). Look for rapid, frame-to-frame shifts in skin tone, sudden changes in the shape of the iris, or fleeting glitches where the original actor’s face briefly “shows through” the digital mask during rapid movement.
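If per-frame measurements of the face region are available (for example, mean brightness values exported from a video tool), sudden frame-to-frame shifts can be surfaced automatically rather than by eye. The following is a minimal sketch under that assumption. The function name `sudden_jumps`, the ratio of 5, and the `min_step` floor are hypothetical choices, not settings taken from any particular detector.

```python
from statistics import median
from typing import List, Sequence


def sudden_jumps(
    per_frame_values: Sequence[float],
    ratio: float = 5.0,
    min_step: float = 1.0,
) -> List[int]:
    """Return indices of frames whose change from the previous frame is unusually large.

    A step counts as unusual when it exceeds `ratio` times the median step
    for the clip. `min_step` stops near-static clips from flagging tiny changes.
    """
    steps = [abs(b - a) for a, b in zip(per_frame_values, per_frame_values[1:])]
    if not steps:
        return []
    typical = max(median(steps), min_step)
    return [i + 1 for i, step in enumerate(steps) if step > ratio * typical]


if __name__ == "__main__":
    # Mean face-region brightness per frame, with a one-frame glitch at index 4.
    brightness = [120, 121, 120, 122, 140, 121, 120]
    print(sudden_jumps(brightness))
```

A one-frame glitch is reported twice, once when the value jumps and once when it returns, so flagged indices tend to arrive in adjacent pairs. Scene cuts and lighting changes produce the same signature and need to be ruled out by viewing the flagged frames.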
Investigator Note on Verification
Never rely on a single visual indicator to confirm a deepfake. Digital artefacts can sometimes occur due to network compression, low bandwidth, or camera sensor noise. Look for a cluster of anomalies — combining visual glitches with behavioural irregularities — to build a high-probability assessment of synthetic media fraud.
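The "cluster, not single indicator" principle can be expressed as a simple tally. The sketch below is illustrative only: the indicator names, the grouping into visual and behavioural sets, and the thresholds for each level are all assumptions made for the example, not a validated scoring scheme.

```python
from typing import Iterable

# Hypothetical indicator labels, loosely following the checklist in this article.
VISUAL = frozenset(
    {"blinking", "lip_sync", "edge_blur", "lighting", "plastic_skin", "temporal_jump"}
)
BEHAVIOURAL = frozenset({"urgency", "secrecy", "bypass_procedure", "unusual_payment"})


def assess(observed: Iterable[str]) -> str:
    """Grade concern by how many indicators cluster, and across which categories."""
    seen = set(observed)
    unknown = seen - VISUAL - BEHAVIOURAL
    if unknown:
        raise ValueError(f"unrecognised indicators: {sorted(unknown)}")
    visual = len(seen & VISUAL)
    behavioural = len(seen & BEHAVIOURAL)
    total = visual + behavioural
    if total >= 3 and visual >= 1 and behavioural >= 1:
        return "high"      # several anomalies spanning both categories
    if total >= 2:
        return "moderate"  # more than one anomaly: verify before acting
    if total == 1:
        return "low"       # a single anomaly may be compression or sensor noise
    return "none"


if __name__ == "__main__":
    print(assess(["lip_sync", "edge_blur", "urgency"]))
```

Requiring anomalies from both categories before reporting "high" reflects the point that compression can mimic a visual glitch but does not explain pressure to act in secret.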
Behavioural Indicators of Deepfake Scams: The Psychology of the Con
While technical detectors are essential, the most reliable line of defence is often behavioural. Deepfake scams are social engineering attacks at their core; they rely on psychological pressure to force compliance. Investigators and employees should watch for the following behavioural red flags:
- Extreme Urgency: The impersonated caller or video contact insists that immediate action is required (e.g., “the contract will lapse if the wire transfer isn’t completed in 15 minutes”). This is designed to prevent rational verification.
- Enforced Secrecy: The target is instructed not to discuss the request with other team members or managers (e.g., “this is a highly confidential acquisition, do not mention it to anyone”). This isolates the victim from organisational sanity checks.
- Bypassing Standard Procedures: The caller asks the employee to ignore established verification steps (e.g., “I know we usually require a dual-signature form, but I am in a board meeting and need you to execute this now on my sole authority”).
- High-Pressure Tactics: Aggressive or coercive language, invoking disciplinary action or professional ruin if the demand is not met.
- Cryptocurrency or Alternative Payment Requests: Asking for funds to be transferred via irreversible, hard-to-trace channels like Bitcoin wallets, digital gift cards, or unfamiliar third-party payment processors.
Organisational Protection: Building an Anti-Deepfake Culture
To protect businesses against synthetic media fraud, relying on staff intuition is insufficient. Organisations must implement structured protocols that assume digital channels are compromised:
- Multi-Step Verification: Establish a rigid rule that any financial transaction or sensitive data transfer requested via video or voice call must be confirmed via a separate, pre-established channel (e.g., an internal chat message, a call to a known number, or a physical signature).
- Challenge-Response Systems: During a suspicious call, ask the caller a question that a deepfake system cannot easily answer in real time. For example, ask them to recall a specific, obscure shared memory, or ask them to perform an unexpected physical action (e.g., “hold your hand in front of your face and wave”, which can disrupt real-time face-rendering models).
- Independent Verification Channels: Never use the contact details provided in the suspicious message. Use the company directory to contact the individual directly.
- Voice Footprint Reduction: Limit the public availability of high-quality audio recordings of senior executives (such as podcast appearances, public speeches, or media interviews). Scammers use these public samples to train their voice-cloning models.
- Regular Awareness Training: Educate staff on the existence and capabilities of generative AI scams. Conduct mock deepfake phishing exercises to test organisational resilience.
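The multi-step and independent-channel rules above can be written down as an explicit policy check, which is useful when building them into an approval workflow. This is a minimal sketch: the `Confirmation` fields, the channel labels, and the `TRUSTED_SOURCES` set are hypothetical names invented for the example.

```python
from dataclasses import dataclass

# Assumed labels for contact details that did not come from the request itself.
TRUSTED_SOURCES = frozenset({"company_directory", "known_number"})


@dataclass(frozen=True)
class Confirmation:
    request_channel: str  # channel the request arrived on, e.g. "video_call"
    confirm_channel: str  # channel used to confirm it, e.g. "phone"
    contact_source: str   # where the confirming contact details came from


def is_out_of_band(confirmation: Confirmation) -> bool:
    """True only if confirmation used a different channel and independent contact details."""
    different_channel = confirmation.confirm_channel != confirmation.request_channel
    independent_contact = confirmation.contact_source in TRUSTED_SOURCES
    return different_channel and independent_contact


if __name__ == "__main__":
    check = Confirmation("video_call", "phone", "company_directory")
    print(is_out_of_band(check))
```

Both conditions are required: calling back on a number supplied in the suspicious message changes the channel but still routes the check through the attacker.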
Can Deepfakes Be Reliably Detected? The Evidentiary Arms Race
A common question in legal and corporate investigations is whether deepfakes can be definitively identified. The short answer is that detection is a continuous, evolving arms race. As detection algorithms improve, generative AI models adapt to bypass them. Key realities of detection include:
- AI Detection Tool Limitations: Automated deepfake detectors (which search for pixel patterns or noise distributions) can achieve high accuracy in controlled test environments. However, they frequently fail when applied to compressed, low-quality social media videos or real-time streams.
- Forensic Analysis: Professional digital forensics involves checking file metadata, examining the noise patterns of the camera sensor (PRNU analysis), and using specialised software to detect inconsistencies in the audio waveform. This is a slow, specialised process that cannot be done during a live call.
- Contextual Assessment: Because technical tools are not infallible, a forensic psychologist or investigator must look at the whole picture. We must evaluate the context of the communication, the behavioural pressure applied, and the plausibility of the request. Behavioural analysis remains a cornerstone of deception detection.
Practical Consumer Safety Advice: What to Do in Suspicious Scenarios
If you receive a suspicious video call, voice message, or social media request, apply the S.V.S.C.C. framework developed by digital safety experts:
The Consumer Safety Protocol
- STOP: Do not react immediately. Scammers rely on panic to bypass your critical thinking. Take a deep breath.
- VERIFY: Reach out to the person using a trusted, independent method. Call their known phone number or send a message on an established platform.
- SLOW DOWN: If the caller is rushing you, treat the urgency itself as a major red flag. Legitimate businesses and loved ones will understand if you need to double-check.
- CHECK CONTEXT: Ask yourself: Does this request make sense? Would my CEO really ask me to buy iTunes gift cards? Would my relative really demand money via cryptocurrency?
- CONSULT OTHERS: Talk to a colleague, family member, or trusted friend. An outside perspective is often highly effective at spotting the absurdity of a scam.
The Future of Deepfakes: The Next Wave of AI-Enabled Deception
The threat landscape is changing rapidly. In the coming years, we expect to see:
- Hyper-Personalised Scams: Attackers will compile public data from social media, professional profiles, and leaked databases to create tailored deepfake videos or audio clips containing specific personal details.
- Automated Fraud Systems: AI agents will be deployed to conduct interactive, real-time voice and video scams at scale, generating custom responses based on the victim’s reactions.
- Synthetic Evidence: The introduction of fabricated video and audio evidence in civil and criminal litigation, forcing courts to implement stricter verification standards for digital proof.
- Real-Time Video Conference Hijacking: As processing power increases, real-time face-swapping filters may become very difficult to distinguish from genuine video, even on standard video calls.
Final Thoughts: A Trust Problem in a Synthetic World
Ultimately, deepfakes are not just a technical problem — they are a trust problem. When we can no longer trust our eyes and ears, the fabric of digital interaction begins to fray. The solution is not to live in constant paranoia, but to transition from a default-trust mindset to a verification-culture mindset. Verification must replace intuition.
At The Centre for Forensic Neuroscience, we believe that behavioural awareness and systematic verification are our most powerful tools against AI-enabled deception. By understanding the psychological triggers used by scammers and the physical limitations of generative software, we can protect our organisations, our families, and our legal systems from synthetic fraud.
Quick Reference: The Anti-Deepfake Verification Checklist
Print or save this checklist as a quick reference guide for your team or family members:
| Category | What to Look For | Verification Action |
|---|---|---|
| Visual Indicators | Unnatural blinking, lip-sync mismatch, edge blur, plastic skin, lighting inconsistencies. | Watch the face at 0.5x speed; look for alignment glitches when head turns. |
| Behavioural Indicators | Extreme urgency, secrecy demands, request to bypass process, unusual payment types. | Treat urgency as a warning; decline to bypass internal safety protocols. |
| Verification Actions | Real-time challenge, out-of-band contact, peer consult. | Ask the caller to perform a physical action or contact them on a known number. |
Dr Keith Ashcroft is a Chartered Psychologist and Principal Forensic Examiner at The Centre for Forensic Neuroscience. The Centre provides expert consultations in cyberpsychology, investigative psychology, and polygraph examinations for corporate, legal, and private clients. If you require forensic advice on deception detection, synthetic media fraud, or employee security awareness, please contact us for a confidential consultation.