Fine Tuning Large Language Models Using Reinforcement Learning from Human Feedback

Speakers:

Prince Tyagi

Date:

Tuesday, November 19, 2024

Time:

11:35 am

Track:

Tech/Deep Dives

Room:

Forum 8

Summary:

RLHF is a critical method in NLP in which models learn from human interactions, generating more accurate and contextually appropriate responses. In this session, Prince will explore what RLHF is. He will start with the underlying principles, cover the various techniques used, and then walk through a real-world example, including data collection, reward design, and model evaluation, to illustrate its effectiveness. He will close by discussing its limitations and how it can be adapted to different use cases and domains.
