Whom do we prefer to learn from in observational reinforcement learning?