Advancing the Multimodal Language Acquisition Framework through Collaborative Dialogue