Dataset Overview

The Datasets page serves as a repository for managing reusable test sets in your Evaluation Projects. A Dataset is a collection of data designed to validate specific evaluation criteria.

To consistently measure quality when comparing different AI systems (e.g., models or system prompts) against certain criteria, you should reuse the same Dataset. You can create Datasets by manually preparing and uploading files, or by using an LLM to generate Query/Response pairs.

Dataset Components

  • Context Set: Reference documents that serve as the evaluation criteria.
  • Query Set: A collection of questions to be posed to the AI model.
  • Response Set: The AI model's responses to the queries.
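To make the relationship between the three components concrete, here is a minimal Python sketch. The `Dataset` class, its field names, and the sample data are hypothetical and chosen only for illustration; they are not the product's actual schema. The sketch assumes each query has exactly one response.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical in-memory model of a Dataset (not the product's schema)."""
    name: str
    context_set: list[str] = field(default_factory=list)   # reference documents
    query_set: list[str] = field(default_factory=list)     # questions for the AI model
    response_set: list[str] = field(default_factory=list)  # model answers, one per query

    def pairs(self) -> list[tuple[str, str]]:
        """Return (query, response) pairs; assumes one response per query."""
        if len(self.query_set) != len(self.response_set):
            raise ValueError("each query needs exactly one response")
        return list(zip(self.query_set, self.response_set))


# Made-up sample data for illustration only.
dataset = Dataset(
    name="refund-policy-v1",
    context_set=["Refunds are accepted within 30 days of purchase."],
    query_set=["How long do customers have to request a refund?"],
    response_set=["Customers can request a refund within 30 days."],
)
print(dataset.pairs())
```

Keeping the Context Set separate from the Query/Response pairs mirrors the idea that the same reference documents and questions can be reused while only the responses change between the AI systems being compared.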


Dataset Workflow

Context Set   →  Query Set           →  Response Set        →  Evaluation           →  Result Analysis
    ↓                ↓                      ↓                      ↓
Prepare Docs  →  Generate Questions  →  Generate Responses  →  Measure Performance

Getting Started

Datasets can be prepared in two ways: AI-driven generation and manual file upload.

Step 1: Create a Context Set

Upload the reference documents. 👉 Context Set Guide

Step 2: Create a Query Set

Write questions yourself or automatically generate them based on the context. 👉 Query Set Guide

Step 3: Create a Response Set

Collect AI model responses or upload existing results. 👉 Response Set Guide

Tip: Once a Dataset is prepared, you can load it into any evaluation task.


Key Features

  • Reusable: A Dataset created once can be used in multiple evaluation projects.
  • Version Control: Unlinked Query Sets can be modified, with support for tracking change history.
  • Flexible Structure: Automatic mapping of data columns to fit the evaluation framework.

Supported File Formats

  • CSV
  • XLSX
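For manual upload, a CSV file can be produced with Python's standard `csv` module. The sketch below is only an illustration: the column names `query` and `response` and the sample rows are assumptions, so check them against the column mapping your project actually expects.

```python
import csv
import io

# Hypothetical column names; adjust to the mapping your project expects.
FIELDNAMES = ["query", "response"]

# Made-up sample rows; the response may be left empty if it will be generated later.
rows = [
    {"query": "How long do customers have to request a refund?",
     "response": "Customers can request a refund within 30 days."},
    {"query": "Is a receipt required?", "response": ""},
]


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(rows)
print(csv_text)
```

To produce a file for upload, write `csv_text` to a path opened with `newline=""` and `encoding="utf-8"`, so that line endings and non-ASCII characters are preserved as written.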

⚡ Quick Tips

  1. Prepare the Context Set (source documents) first, because it forms the basis for the questions.
  2. The higher the quality of the Context, the better the Query Set you can create.
  3. Always review and refine automatically generated Query Sets.
  4. Use clear and specific names for your Datasets to make them easier to manage.
  5. Add special fields as needed - you don't have to fill in every field.
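Part of reviewing an automatically generated Query Set can be scripted before the manual pass. The helper below is a hypothetical sketch, not a product feature: it flags only empty queries and exact duplicates (ignoring case and extra whitespace), and it does not judge whether a question is actually good.

```python
def flag_queries(queries):
    """Return (index, reason) pairs for queries that need manual review."""
    seen = set()
    flagged = []
    for index, query in enumerate(queries):
        # Normalize case and whitespace so trivial variants count as duplicates.
        normalized = " ".join(query.lower().split())
        if not normalized:
            flagged.append((index, "empty"))
        elif normalized in seen:
            flagged.append((index, "duplicate"))
        seen.add(normalized)
    return flagged


# Made-up sample of generated queries for illustration.
generated = [
    "What is the refund window?",
    "what is the  refund window?",
    "   ",
    "Is a receipt required?",
]
print(flag_queries(generated))  # → [(1, 'duplicate'), (2, 'empty')]
```

The flagged indices point back into the original list, so the corresponding rows can be fixed or removed before the Query Set is used in an evaluation.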