Text this: Multimodal heterogeneous graph entity-level fusion for named entity recognition with multi-granularity visual guidance.