A multi-instance learning framework combining CNN and transformer for noninvasive coronary heart disease diagnosis using scleral images.