Text this: Deepfake Detection Using Multimodal CLIP-Based SigLIP-2 Vision Transformers.