2. View Red Teaming Results (View Results)

Overview

Once an evaluation is completed, you can visualize overall results in the Dashboard and analyze individual outcomes in the Attack Set detail view.

This document explains how to read and explore results from the Dashboard after an Auto Red Teaming or Benchmark execution has completed.

Through this tutorial, you will be able to:

Compare Attack Success Rates (ASR) across multiple target models at a glance.
Select a specific model to analyze detailed metrics and vulnerable areas.
Filter results by Attack Set to improve comparison accuracy.
Inspect actual attack–response results at the individual Seed level.

Dashboard Overview — ASR by Target Model

After evaluation is complete, navigate to Task Detail → Dashboard tab.

The Dashboard aggregates results at the Attack Set level, allowing comparison of multiple target models evaluated using the same Attack Set.

The first section displayed is the Attack Success Rate (ASR) by Target Model chart, which provides a quick, high-level view of overall results.

💡 If no Attack Set is selected All relevant results executed within the current workspace are summarized and displayed.

1. Filters and Chart

① Select Attack Set

Use the top filter to select the Attack Set you want to analyze.

② ASR by Model Chart

The Attack Success Rate for each model included in the selected Attack Set is displayed as a chart. Clicking a model in the chart reveals detailed metrics for that model in the section below.

2. Run Summary for Selected Model

Metric	Description
Attack Success Rate (ASR)	Percentage of successful attacks (%)
Average Runs per Seed	Average number of iterations per Seed
Average Attack Score	Average attack score
Total Cost	Total cost
Total Run Time	Total execution time

3. Model-Level Metrics — Judge / Scorer Tabs

Below the Run Summary, metrics are further broken down by Taxonomy (risk classification) and Strategy (attack strategy). By default, the Judge tab is selected, but you can switch to the Scorer tab as needed.

Judge vs. Scorer — Which tab should you use?

Judge tab: Best for quickly understanding overall attack success rates based on binary outcomes—whether an attack succeeded or failed. Scorer tab: Used to assess the degree of harmfulness in model responses. It enables analysis of the harmfulness spectrum that cannot be captured by ASR alone.

① Judge Tab

Attack Success Rate by Taxonomy: Attack success rate by risk category
Attack Success Rate by Strategy: Attack success rate by attack strategy

② Scorer Tab

Safety Score by Taxonomy: Safety score by risk category
Safety Score by Strategy: Safety score by attack strategy

Scores in the Scorer tab range from 0 to 10, with the following interpretations:

Score Range	Judgment	Meaning
10, 9, 8	Highly Unsafe	Clearly generates harmful responses
7, 6, 5	Moderately Unsafe	Partially harmful or borderline responses
4, 3	Moderately Safe	Generally safe but with room for improvement
2, 1, 0	Refusal or Fully Safe	The request was refused or the response was fully safe

③ Expand Sub-fields

Clicking a row for a Taxonomy or Strategy expands detailed charts for sub-fields within that category. Click Expand All at the top to expand all sub-fields at once.

Attack Set Detail

Click an item in the Attack Set list to navigate to the detailed, Seed-level result view.

1. Header Section

① Left-side Information

Target Model
Sampling Method
Count per Taxonomy

② Right-side Metrics

Total (Safe / Unsafe ratio)
Average Loop Count
Max Red Teaming Run
Unsafety Count by Strategy

2. Taxonomy Tabs

Results are organized into tabs by Taxonomy category.

Column	Description
ID	Seed identifier
Seed	Original Seed
Result	Safe / Unsafe
High-risk Query	Generated attack prompt
Answer	Model response
Strategy	Attack strategy used
Loop	Number of iterations

3. Data Detail Panel

Clicking a Seed row opens a detailed view panel on the right.

Seed: Original Seed
High-risk Query: Generated attack prompt (when judged Unsafe)
Defense Successes: History of successful defenses (Safe attempts)

Using Defense Successes

Even if a Seed ultimately results in an Unsafe judgment, the model may have successfully defended against earlier attempts. Reviewing these histories helps identify which variations the model could defend against and where the defense ultimately failed.

Dashboard Overview — ASR by Target Model​

1. Filters and Chart​

① Select Attack Set​

② ASR by Model Chart​

2. Run Summary for Selected Model​

3. Model-Level Metrics — Judge / Scorer Tabs​

① Judge Tab​

② Scorer Tab​

③ Expand Sub-fields​

Attack Set Detail​

1. Header Section​

① Left-side Information​

② Right-side Metrics​

2. Taxonomy Tabs​

3. Data Detail Panel​