Human Evaluation Overview
Human evaluation is a method in which humans directly assess AI responses, enabling intuitive verification of a model's response quality.
1. Manual Evaluation
This feature lets evaluators manually assess the quality of AI responses against predefined evaluation criteria (rubrics). Systematic, consistent standards keep the process objective while still capturing human judgment.
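
For illustration, here is a minimal sketch of what a rubric-based manual evaluation record might look like. The class and field names (`RubricCriterion`, `ManualEvaluation`, `max_score`) are assumptions for this example, not the platform's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of rubric-based manual evaluation;
# names are illustrative, not the platform's real data model.

@dataclass
class RubricCriterion:
    name: str           # e.g. "Accuracy", "Fluency"
    description: str    # what the evaluator should check
    max_score: int = 5  # upper bound of the rating scale

@dataclass
class ManualEvaluation:
    response_id: str
    scores: dict[str, int] = field(default_factory=dict)  # criterion name -> score
    comment: str = ""

    def score(self, criterion: RubricCriterion, value: int) -> None:
        # Clamp to the rubric's scale so every evaluator uses the same range.
        self.scores[criterion.name] = max(0, min(value, criterion.max_score))

# Example: one evaluator scoring a single response against one criterion.
accuracy = RubricCriterion("Accuracy", "Is the answer factually correct?")
evaluation = ManualEvaluation(response_id="resp-001")
evaluation.score(accuracy, 4)
```

Keeping the scale on the criterion, rather than on each score, is what makes the ratings comparable across evaluators.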
2. Interactive Evaluation
An interactive evaluation system in which you send queries directly to an AI model and evaluate its responses in real time. Evaluators can immediately rate each response as Good/Bad and write a Ground Truth (GT) for it.
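
The sketch below shows one plausible shape for a single interactive-evaluation turn: send a query, capture the response, then attach a Good/Bad rating and an optional GT. The `evaluate_turn` function and all field names are assumptions for this example, not the platform's real API.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical sketch of an interactive evaluation turn;
# evaluate_turn and these field names are assumptions.

class Rating(Enum):
    GOOD = "good"
    BAD = "bad"

@dataclass
class InteractiveTurn:
    query: str
    response: str
    rating: Rating | None = None       # set by the evaluator after reading
    ground_truth: str | None = None    # evaluator-written GT, typically for Bad responses

def evaluate_turn(query: str, model_call) -> InteractiveTurn:
    """Send a query to the model and return a turn ready for rating."""
    return InteractiveTurn(query=query, response=model_call(query))

# Example: rate a wrong response as Bad and record the expected answer (GT).
turn = evaluate_turn("What is 2 + 2?", lambda q: "5")
turn.rating = Rating.BAD
turn.ground_truth = "4"
```

Recording the GT alongside the Bad rating is what lets the collected turns later serve as corrected reference data, not just pass/fail labels.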