Deepfake Detector is a research-oriented FastAPI app and CLI for analyzing videos with two detection pipelines:
- MesoNet for face-level artifact detection
- Temp-D3 for temporal anomaly detection across frames
The repository is set up to be open-source friendly: application code is included, but pretrained weights and datasets are not redistributed.
This repository includes:
- application code in `main.py`, `run_models.py`, `templates/`, and `static/`
- vendored or adapted research code in `MesoNet/` and `temp-d3/`
- small sample assets and supporting project files
This repository intentionally does not include:
- pretrained MesoNet weights
- datasets
- downloaded videos
- cached external model downloads
If you use local weights for experimentation, keep them untracked under MesoNet/weights/.
- The original code in this repository is licensed under the MIT License. See LICENSE.
- Vendored research code and other third-party material are not automatically relicensed under MIT. See THIRD_PARTY.md.
- Only add or redistribute weights, datasets, or source media if you have clear rights to do so.
For a YouTube URL submitted through the web app:
- `POST /api/analyze` validates the URL and creates an in-memory job.
- `yt-dlp` downloads the source video into `downloads/`.
- `run_models.py` runs MesoNet when compatible local weights are available and runs Temp-D3.
- `main.py` parses the model outputs, applies a sigmoid to the raw Temp-D3 score, and combines available scores.
- The backend returns a final `real` or `fake` verdict and deletes the temporary video when cleanup succeeds.
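The URL validation step could look roughly like the sketch below. This is not taken from `main.py`: the function name `is_youtube_url` and the `ALLOWED_HOSTS` allow-list are hypothetical, and the real app may accept more (or fewer) URL shapes.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical allow-list; the actual backend may accept other hosts.
ALLOWED_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(url: str) -> bool:
    """Return True if the URL looks like a YouTube watch link or short link."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    if host not in ALLOWED_HOSTS:
        return False
    if host == "youtu.be":
        # Short links carry the video id in the path: https://youtu.be/<id>
        return len(parsed.path.strip("/")) > 0
    # Watch links carry the video id in the query string: /watch?v=<id>
    return parsed.path == "/watch" and bool(parse_qs(parsed.query).get("v"))
```

Rejecting a URL before a job is created keeps obviously invalid input from reaching the download step.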
MesoNet is optional in practice. If no local weights are installed, the system falls back to Temp-D3 only.
| Model | Purpose | Output |
|---|---|---|
| MesoNet (Meso4) | Detects face-level forgery artifacts | Score from 0.0 to 1.0, where higher is more fake-like |
| Temp-D3 (XCLIP-16) | Detects temporal anomalies in video frames | Raw anomaly score, passed through a sigmoid to give a score from 0.0 to 1.0, where higher is more fake-like |

    temp_d3_score = sigmoid(temp_d3_raw_score)
    combined_score = mean([temp_d3_score, mesonet_score])  # when MesoNet produced a score
    combined_score = temp_d3_score                         # when MesoNet is unavailable or unusable
    verdict = "fake" if combined_score >= 0.5 else "real"

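The scoring rule above can be written as a small runnable function. This is a minimal sketch of the stated formula, not the code in `main.py`; the names `sigmoid` and `combine_scores` are illustrative.

```python
import math
from typing import Optional, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function mapping any real number into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def combine_scores(
    temp_d3_raw_score: float, mesonet_score: Optional[float] = None
) -> Tuple[float, str]:
    """Normalize the Temp-D3 score, average with MesoNet if present, and pick a verdict."""
    temp_d3_score = sigmoid(temp_d3_raw_score)
    if mesonet_score is None:
        # MesoNet unavailable or unusable: fall back to Temp-D3 alone.
        combined = temp_d3_score
    else:
        combined = (temp_d3_score + mesonet_score) / 2.0
    verdict = "fake" if combined >= 0.5 else "real"
    return combined, verdict
```

Note that a raw Temp-D3 score of exactly 0 maps to 0.5, which the `>=` threshold classifies as `fake` when MesoNet is absent.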
Deepfake_Detector/
├── main.py
├── run_models.py
├── requirements_web.txt
├── templates/
├── static/
├── MesoNet/
├── temp-d3/
├── LICENSE
├── THIRD_PARTY.md
Recommended baseline:
- Python 3.11
- `pip`
- `ffmpeg`
- `cmake` and a working compiler toolchain for `dlib` and `face_recognition`
- internet access on first Temp-D3 run so the encoder can be cached locally
Optional:
- a CUDA-capable GPU for faster inference
- local MesoNet weights if you want the face-based detector enabled
macOS:

    brew install ffmpeg cmake

Ubuntu / Debian:

    sudo apt update
    sudo apt install -y ffmpeg cmake build-essential

Some environments may also need OpenCV runtime packages such as `libgl1`.

    git clone https://github.com/Ishaan2011/Deepfake-detector.git
    cd Deepfake-detector
    python3.11 -m venv .venv
    source .venv/bin/activate

On Windows, activate the environment with:

    .venv\Scripts\activate

Then install the dependencies:

    pip install -r requirements_web.txt

If you need a specific CUDA build of PyTorch, install the matching `torch` and `torchvision` packages first, then run:

    pip install -r temp-d3/requirements.txt
    pip install tensorflow keras face_recognition scipy imageio imageio-ffmpeg

- Temp-D3 may download pretrained encoder weights on first run.
- `downloads/` is created automatically by `main.py`.
- If you have rights to use compatible MesoNet weights, place them under `MesoNet/weights/`.

For offline runs, pre-cache the model referenced by temp-d3/models/D3_model.py, which currently defaults to microsoft/xclip-base-patch16.

    uvicorn main:app --reload --host 0.0.0.0 --port 8000

Then open http://localhost:8000 and submit a YouTube URL.

    python run_models.py /path/to/video.mp4 --verbose-status

Example output:

    MesoNet not used (No faces found or error)
    Temp-D3 Score: 3.4512 (Higher value ~ more likely Fake/Anomaly)

If local MesoNet weights are installed, the CLI will print a MesoNet Score as well.
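Output in this shape can be parsed with a regular expression. The sketch below is illustrative rather than the parser used in `main.py`: the `Temp-D3 Score:` line follows the example output above, while the exact `MesoNet Score:` line format is an assumption, and `parse_scores` is a hypothetical name.

```python
import re
from typing import Dict, Optional

# Matches lines such as "Temp-D3 Score: 3.4512 (...)".
# The "MesoNet Score: <number>" form is assumed, not confirmed by the source.
_SCORE_RE = re.compile(
    r"^(MesoNet|Temp-D3) Score:\s*(-?\d+(?:\.\d+)?)", re.MULTILINE
)


def parse_scores(output: str) -> Dict[str, Optional[float]]:
    """Extract model scores from CLI output; a missing score stays None."""
    scores: Dict[str, Optional[float]] = {"MesoNet": None, "Temp-D3": None}
    for name, value in _SCORE_RE.findall(output):
        scores[name] = float(value)
    return scores
```

A `None` value for MesoNet corresponds to the fallback case where only the Temp-D3 score is used.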
The root route (`/`) serves the browser UI.

`POST /api/analyze` starts a YouTube analysis request.
Request:

    {
      "youtube_url": "https://www.youtube.com/watch?v=..."
    }

Response:

    {
      "job_id": "abc123",
      "status_endpoint": "/api/analyze/abc123"
    }

The path given in `status_endpoint` returns the current job snapshot, including status, logs, progress, and the final result when complete.
Example completed response:

    {
      "job_id": "abc123",
      "youtube_url": "https://www.youtube.com/watch?v=...",
      "status": "completed",
      "phase": "Completed",
      "download_percent": 100.0,
      "result": {
        "video_file": "abcd1234.mp4",
        "mesonet_score": 0.82,
        "temp_d3_raw_score": 3.45,
        "temp_d3_score": 0.97,
        "overall_verdict": "fake",
        "processing_seconds": 42.1
      },
      "error": null
    }

Observed backend states:

- `queued`
- `started`
- `downloading`
- `running_models`
- `running_mesonet`
- `running_temp_d3`
- `completed`
- `failed`
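A client can submit a URL and then poll the status endpoint until the job finishes. The sketch below uses only the standard library and rests on a few assumptions: the server runs at `http://localhost:8000`, the status endpoint answers plain GET requests, and `completed` and `failed` are the only terminal states. The helper names are hypothetical.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

BASE_URL = "http://localhost:8000"  # assumed local development address
TERMINAL_STATES = {"completed", "failed"}  # assumed terminal states


def submit(youtube_url: str) -> Dict[str, Any]:
    """POST the URL to /api/analyze and return the JSON response."""
    body = json.dumps({"youtube_url": youtube_url}).encode("utf-8")
    request = urllib.request.Request(
        BASE_URL + "/api/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_status(status_endpoint: str) -> Dict[str, Any]:
    """GET the job snapshot from the path returned as status_endpoint."""
    with urllib.request.urlopen(BASE_URL + status_endpoint) as response:
        return json.load(response)


def wait_for_result(
    status_endpoint: str,
    fetch: Callable[[str], Dict[str, Any]] = fetch_status,
    interval: float = 2.0,
    max_polls: int = 300,
) -> Dict[str, Any]:
    """Poll until the job reaches a terminal state or max_polls is exhausted."""
    for _ in range(max_polls):
        snapshot = fetch(status_endpoint)
        if snapshot.get("status") in TERMINAL_STATES:
            return snapshot
        time.sleep(interval)
    raise TimeoutError(f"job at {status_endpoint} did not finish in time")


# Usage against a running server:
#   job = submit("https://www.youtube.com/watch?v=...")
#   final = wait_for_result(job["status_endpoint"])
```

The `fetch` parameter exists so the polling loop can be exercised without a running server.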
- Jobs are stored in memory and processed in background threads.
- Request history is lost on server restart.
- Downloaded videos are stored temporarily in `downloads/` and deleted when cleanup succeeds.
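An in-memory job store driven by background threads can be sketched as follows. This illustrates the general pattern only and is not the implementation in `main.py`; every name here (`create_job`, `update_job`, `get_job`, `start_job`) is hypothetical, and the sketch uses only a subset of the states listed above.

```python
import threading
import uuid
from typing import Any, Callable, Dict, Optional

# Process-local storage: everything here is lost when the server restarts.
_jobs: Dict[str, Dict[str, Any]] = {}
_jobs_lock = threading.Lock()


def create_job(youtube_url: str) -> str:
    """Register a new job in the queued state and return its id."""
    job_id = uuid.uuid4().hex
    with _jobs_lock:
        _jobs[job_id] = {
            "job_id": job_id,
            "youtube_url": youtube_url,
            "status": "queued",
            "result": None,
            "error": None,
        }
    return job_id


def update_job(job_id: str, **fields: Any) -> None:
    """Update fields of an existing job under the lock."""
    with _jobs_lock:
        _jobs[job_id].update(fields)


def get_job(job_id: str) -> Optional[Dict[str, Any]]:
    """Return a copy of the job snapshot, or None if the id is unknown."""
    with _jobs_lock:
        job = _jobs.get(job_id)
        return dict(job) if job is not None else None


def start_job(job_id: str, work: Callable[[], Dict[str, Any]]) -> threading.Thread:
    """Run work() in a background thread and record its outcome on the job."""

    def runner() -> None:
        update_job(job_id, status="started")
        try:
            update_job(job_id, status="completed", result=work())
        except Exception as exc:
            update_job(job_id, status="failed", error=str(exc))

    thread = threading.Thread(target=runner, daemon=True)
    thread.start()
    return thread
```

The lock keeps status reads consistent with updates from the worker thread, and returning a copy from `get_job` prevents callers from mutating shared state.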

    pip install -r requirements_web.txt

Install `ffmpeg` and make sure it is available on your `PATH`.

If `dlib` or `face_recognition` fails to build, install `cmake` and build tools first, then retry. Python 3.11 is usually the smoothest option.
The first run may download encoder weights from Hugging Face. Later runs should use the local cache.
If the output reports that MesoNet was not used, that usually means one of two things:
- no compatible local MesoNet weights are installed
- the pipeline could not extract enough usable faces from the video
In either case, the backend falls back to the normalized Temp-D3 score alone.
- The browser workflow accepts YouTube URLs only.
- Jobs are stored in memory and run in background threads.
- Request history is lost on server restart.
- There is no persistent database, auth layer, rate limiting, or worker queue.
- Accuracy depends heavily on compression, lighting, motion, and domain shift.
- MesoNet is face-dependent and may fail on videos without stable detectable faces.
- This repository mixes app code with vendored research components.
- There is currently no automated test suite or CI pipeline in the repository.
This project should not be treated as forensic proof or as a sole decision-making system.
- Use results as a signal, not final proof.
- Verify suspicious content with multiple methods and human review.
- Do not use this tool as the sole basis for legal, safety-critical, or reputational decisions.
- Respect the licenses and usage terms of upstream models, weights, datasets, and source media.
Focused pull requests are easiest to review. If you change model orchestration, keep the API flow and CLI behavior aligned so the web app and local runner stay consistent.
This project builds on or incorporates ideas from:
- MesoNet: Afchar et al., "MesoNet: a Compact Facial Video Forgery Detection Network"
- Temp-D3: Zheng et al., "D3: Training-Free AI-Generated Video Detection Using Second-Order Features"