Running Evaluation

The Running Evaluation page explains how to run automated evaluations
by selecting a target model and an evaluation model based on the dataset you created.

Depending on the type of uploaded data, the evaluation interface may differ slightly.
This guide focuses on the common process across all cases.

Run Auto-Evaluation: Execute automatic evaluation using selected target and evaluation models
Monitor Progress: Check real-time progress and status of the evaluation
View Results & Duration: After completion, review evaluation results and time taken

① Start Evaluation

Once dataset creation is complete, click the Evaluate Dataset button to begin evaluation.

② Select Evaluation Models

Click the Start Evaluation button to open the model selection modal.
Choose the Target Model and Evaluation Model you want to use.
The number and type of selectable models depend on the Upload Type.

Multiple Model Selection

Upload Type: Query Generation, Query Upload
You can select multiple Target Models and Evaluation Models.
Use the + button to add or remove models.
Target Models must be connected to valid APIs capable of generating responses.
Evaluate and compare multiple model combinations in one run.

Single Model Selection

Upload Type: Query + Response Upload
You can select only one Target Model and one Evaluation Model.
Even if a Target Model is not connected to a live API, it can still be used for evaluation if registered by name only (e.g., Human Analysis Model 2018).

Model Selection Rules by Evaluation Flow

Upload Type	Target Model (Answer Generation)	Evaluation Model
Query Generation, Query Upload	Multiple selections allowed - Must be connected to a valid API	Multiple selections allowed
Query + Response Upload	Single selection only - Models registered by name only are also allowed (e.g., Human Evaluation)	Single selection only

③ Check Evaluation Progress

Once evaluation starts, you can monitor detailed progress and retry any failed items.

Use View Detail to check progress status, start time, duration, metrics, number of records, and dataset status.
If an error occurs, the cause is shown and you can retry the evaluation for that entry.

Click the **View Detail** button on the progress bar or the **Target Model box** to check detailed progress

In case of errors, you can identify the cause and retry the evaluation

① Start Evaluation​

② Select Evaluation Models​

③ Check Evaluation Progress​

① Start Evaluation

② Select Evaluation Models

③ Check Evaluation Progress