Adaptive Preference Learning With Bandit Feedback: Information Filtering, Dueling Bandits and Incentivizing Exploration