Skip to main content

Benchmark Dataset

Overview

Benchmark Dataset is a read-only page for viewing seed data classified by Risk Taxonomy.

  • Use it to track/verify seeds used in evaluations, or
  • Check the coverage scope based on specific Taxonomy criteria.

Usage Flow

  • Select Dataset → View Seeds

Screen Layout


1. Menu Navigation

① Click Red Teaming > Benchmark Dataset from the left navigation bar (LNB).


② Screen Layout — The Risk Taxonomy Architecture and Dataset table are displayed.


1) Risk Taxonomy Architecture Panel

The left panel defines and describes the Risk Taxonomy classification system. You can view the entire Risk Taxonomy structure at a glance.


2) Dataset Table

The complete list of currently available Datasets is displayed.


Clicking a row in each Dataset displays detailed information about that row.


📂 Dataset List Table Columns
ColumnDescription
IDDataset unique identifier
Dataset NameDataset name
DescriptionDataset description
Risk TaxonomyConnected Risk Taxonomy tags
Seed CountNumber of Seeds included

4) Dataset Detail

The Dataset detail page allows you to view Seeds by Risk Taxonomy.


  • ← Back button: Return to list
  • Dataset Name / Description: Dataset basic information

Risk Taxonomy

The top tabs represent Seed groups by Risk Taxonomy included in the Dataset.

  • The number in parentheses next to each tab name indicates the Seed count for that Risk Taxonomy.
  • Selecting a tab displays the list of Seeds belonging to that Risk Taxonomy.

Basic Information Panel

The right side displays the current Dataset's ID and the included Risk Taxonomies.


5) Seed Detail (Data Detail)

Clicking a Seed row opens the Data Detail panel on the right.


  • ID: Seed unique number
  • Seed Query: Seed content (full text)
  • Metadata: Additional meta information (if available)

Click the X button at the top right of the panel to close it.


For more details on terminology, data hierarchy, the meaning of each field, and how they connect to Red Teaming evaluations, refer to Reference > Benchmark Dataset.