A high-resolution and interpretable sound source localization network with a physics-informed perceptual loss function for enhanced localization accuracy.