Running Evaluation

This page explains how to run automated evaluations on the dataset you created
by selecting a target model and an evaluation model.

Depending on the type of uploaded data, the evaluation interface may differ slightly.
This guide covers the process common to all cases.

  • Run Auto-Evaluation: Execute automatic evaluation using selected target and evaluation models
  • Monitor Progress: Check real-time progress and status of the evaluation
  • View Results & Duration: After completion, review the evaluation results and the time taken
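The steps above are performed in the web UI. If the platform also exposes an HTTP API, a scripted run might follow the same three-step outline as the Python sketch below; note that the base URL, endpoint paths, and field names here are assumptions for illustration, not documented interfaces:

    import time
    import requests

    EVAL_API_URL = "https://example.com/api"  # hypothetical base URL

    def run_evaluation(dataset_id, target_models, evaluation_models):
        # 1) Run Auto-Evaluation: start a run with the selected models
        #    (hypothetical endpoint and payload).
        resp = requests.post(
            f"{EVAL_API_URL}/evaluations",
            json={
                "dataset_id": dataset_id,
                "target_models": target_models,
                "evaluation_models": evaluation_models,
            },
        )
        resp.raise_for_status()
        run_id = resp.json()["id"]

        # 2) Monitor Progress: poll until the run reaches a terminal state.
        while True:
            status = requests.get(f"{EVAL_API_URL}/evaluations/{run_id}").json()
            if status["state"] in ("completed", "failed"):
                break
            time.sleep(5)

        # 3) View Results & Duration: fetch the results and time taken.
        return requests.get(f"{EVAL_API_URL}/evaluations/{run_id}/results").json()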



① Start Evaluation

Once dataset creation is complete, click the Evaluate Dataset button to begin evaluation.




② Select Evaluation Models

Click the Start Evaluation button to open the model selection modal.
Choose the Target Model and Evaluation Model you want to use.
The number and type of selectable models depend on the Upload Type.

Multiple Model Selection
  • Upload Type: Query Generation, Query Upload
  • You can select multiple Target Models and Evaluation Models.
  • Use the + button to add or remove models.
  • Target Models must be connected to valid APIs capable of generating responses.
  • Evaluate and compare multiple model combinations in one run.
Single Model Selection
  • Upload Type: Query + Response Upload
  • You can select only one Target Model and one Evaluation Model.
  • Even if a Target Model is not connected to a live API, it can still be used for evaluation if registered by name only (e.g., Human Analysis Model 2018).

Model Selection Rules by Evaluation Flow

Upload Type: Query Generation, Query Upload
  • Target Model (Answer Generation): multiple selections allowed; must be connected to a valid API
  • Evaluation Model: multiple selections allowed

Upload Type: Query + Response Upload
  • Target Model (Answer Generation): single selection only; models registered by name only are also allowed (e.g., Human Evaluation)
  • Evaluation Model: single selection only
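These rules are simple enough to express as validation logic. The sketch below is illustrative only; the upload-type strings and the Model fields are assumptions, not a documented schema:

    from dataclasses import dataclass

    # Upload types that allow multiple model selections (assumed identifiers).
    MULTI_SELECT_TYPES = {"query_generation", "query_upload"}

    @dataclass
    class Model:
        name: str
        has_api: bool  # True if the model is connected to a valid API

    def validate_selection(upload_type, target_models, evaluation_models):
        if upload_type in MULTI_SELECT_TYPES:
            # Multiple selections allowed, but every Target Model must be able
            # to generate responses through a valid API connection.
            for m in target_models:
                if not m.has_api:
                    raise ValueError(f"Target Model {m.name} must be connected to a valid API")
        else:
            # Query + Response Upload: exactly one of each. Name-only models
            # (e.g., Human Evaluation) are fine here because responses were
            # uploaded rather than generated.
            if len(target_models) != 1 or len(evaluation_models) != 1:
                raise ValueError("Select exactly one Target Model and one Evaluation Model")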


③ Check Evaluation Progress

Once evaluation starts, you can monitor detailed progress and retry any failed items.

  • Use View Detail to check progress status, start time, duration, metrics, number of records, and dataset status.
  • If an error occurs, the cause is shown and you can retry the evaluation for that entry.

Click the View Detail button on the progress bar or the Target Model box to check detailed progress.



In case of errors, you can identify the cause and retry the evaluation.
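Retrying failed items programmatically might look like the sketch below, using the same hypothetical API as earlier (the /items and /retry endpoints and the status and error_cause fields are assumptions):

    import requests

    def retry_failed_items(base_url, run_id):
        # List the run's items and retry any that errored (hypothetical endpoints).
        items = requests.get(f"{base_url}/evaluations/{run_id}/items").json()
        for item in items:
            if item["status"] == "error":
                print(f"Retrying item {item['id']}: {item['error_cause']}")
                requests.post(f"{base_url}/evaluations/{run_id}/items/{item['id']}/retry")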