Text this: Human-like cognitive generalization for large models via mental representation-guided supervision.