Text this: Brief application notes on vision transformers (ViTs) and convolutional neural networks (CNNs) in medical imaging.