Text this: A cosine similarity-based token subsampling method for vision transformers in cloud computing.