Text this: Semantically Supervised SeDINO Encoder for Visual–Language–Action Model.