Modality-specific and hierarchical feature learning for RGB-D hand-held object recognition.