How to Run RAG Checker

✅ RAG Checker Overview

RAG Checker automatically evaluates a RAG system's factual accuracy (Factuality) and retrieval–generation performance
by comparing the Expected Response (ER) with the Target Response (TR).

Once you upload a dataset and create an evaluation task,
the system automatically calculates key metrics such as Precision, Recall, Faithfulness, and Hallucination,
enabling you to quantitatively assess the model's factual consistency and context utilization.
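
These claim-level metrics are typically ratios over per-claim entailment checks: the response is decomposed into atomic claims, each claim is verified, and the metrics are computed from the verdicts. The sketch below illustrates that arithmetic; the exact definitions used here (especially Hallucination) are common formulations from the RAG-evaluation literature and are an assumption, not necessarily this product's internal formulas.

```python
# Minimal sketch of claim-level RAG metrics. ASSUMPTION: these are common
# formulations from the RAG-evaluation literature; the product's exact
# definitions may differ.

def rag_metrics(
    tr_claims_correct: list[bool],    # each Target Response claim entailed by the ER?
    er_claims_covered: list[bool],    # each Expected Response claim entailed by the TR?
    tr_claims_grounded: list[bool],   # each TR claim entailed by the retrieved context?
) -> dict[str, float]:
    n_tr, n_er = len(tr_claims_correct), len(er_claims_covered)

    precision = sum(tr_claims_correct) / n_tr if n_tr else 0.0
    recall = sum(er_claims_covered) / n_er if n_er else 0.0
    # Faithfulness: share of TR claims supported by the retrieved passages.
    faithfulness = sum(tr_claims_grounded) / n_tr if n_tr else 0.0
    # Hallucination: share of TR claims that are both wrong and ungrounded.
    hallucination = (
        sum(not c and not g for c, g in zip(tr_claims_correct, tr_claims_grounded)) / n_tr
        if n_tr else 0.0
    )
    return {
        "precision": precision,
        "recall": recall,
        "faithfulness": faithfulness,
        "hallucination": hallucination,
    }

# Example: 3 TR claims (2 correct, 2 grounded), 2 ER claims (1 covered).
print(rag_metrics([True, True, False], [True, False], [True, True, False]))
# -> precision ≈ 0.67, recall = 0.5, faithfulness ≈ 0.67, hallucination ≈ 0.33
```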


Step 1. Prepare the Dataset

Before running the evaluation, prepare a dataset that includes the following columns:

📂 View Dataset Upload Guide

  • query – The user's input question.
  • expected_response – The Expected Response (ER): the reference or ground-truth answer for the query.
  • response – The Target Response (TR) generated by the RAG system.
  • retrieved_context1 – The document or passage retrieved by the model for answer generation. If multiple contexts are retrieved, add sequential columns such as retrieved_context2, retrieved_context3, and so on.

  • Supported file types: .csv, .xlsx
  • Required columns: query, expected_response, response, retrieved_context1
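
For reference, a minimal dataset in this shape can be assembled with a few lines of pandas. This is an illustrative sketch: the example row and the file name rag_eval_dataset.csv are placeholders, and any .csv or .xlsx file containing the required columns will work.

```python
# Minimal sketch: assemble a dataset with the required columns and save it
# as CSV. The example row and the file name "rag_eval_dataset.csv" are
# placeholders; any .csv or .xlsx file with these columns will work.
import pandas as pd

rows = [
    {
        "query": "When was the Eiffel Tower completed?",
        "expected_response": "The Eiffel Tower was completed in 1889.",
        "response": "It was finished in 1889 for the World's Fair.",
        "retrieved_context1": (
            "The Eiffel Tower, built for the 1889 Exposition Universelle, "
            "was completed in March 1889."
        ),
        # Optional: add retrieved_context2, retrieved_context3, ... as needed.
        "retrieved_context2": "Gustave Eiffel's firm designed and built the tower.",
    },
]

pd.DataFrame(rows).to_csv("rag_eval_dataset.csv", index=False)
# For Excel instead: pd.DataFrame(rows).to_excel("rag_eval_dataset.xlsx", index=False)
```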

Step 2. Create a RAG Checker Task

  1. In the left navigation panel, open RAG Checker.
  2. Click + Add Task in the upper-right corner.
  3. In the dialog box, fill in the following information:
    • Task Name – The name of the evaluation task.
    • Description – A brief summary of the task.
    • Target Model – The RAG system or LLM to be evaluated.
  4. Click [Create] to create the task.

Once the task is created, proceed to create an Evaluation Set.


Step 3. Create and Run an Evaluation Set

  1. Open the task detail page.
  2. Go to the [Evaluation Set] tab and click + New Eval Set.
  3. Configure the following settings:
    • Decomposition Model – Breaks responses into verifiable claims.
    • Entailment Model – Determines whether each claim is logically supported by the retrieved context.
    • Set Name / Description – Specify a name and optional description for the evaluation set.
  4. Select the dataset for evaluation.
  5. Click [Start Evaluation] to begin the evaluation.

Once started, RAG Checker automatically performs the following steps (a conceptual sketch follows the list):

  • Decomposes the model response into verifiable claims.
  • Determines whether each claim is entailed by the retrieved passages.
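
For intuition, the two stages can be pictured as a decompose-then-entail loop. This is a conceptual illustration only: call_llm is a hypothetical stand-in you would wire to your own LLM client, and the prompts and default model names are placeholders, not RAG Checker's actual internals (the product runs these stages with the Decomposition and Entailment models you selected in Step 3).

```python
# Conceptual sketch of the decompose -> entail pipeline.
# ASSUMPTION: `call_llm` is a hypothetical stand-in for your LLM client;
# the prompts and default model names below are illustrative placeholders.

def call_llm(model: str, prompt: str) -> str:
    """Placeholder LLM call; connect this to the provider of your choice."""
    raise NotImplementedError("wire this to an actual LLM client")

def decompose(response: str, model: str = "decomposition-model") -> list[str]:
    """Stage 1: split a response into short, independently verifiable claims."""
    prompt = (
        "Break the following answer into atomic factual claims, one per line:\n\n"
        + response
    )
    output = call_llm(model, prompt)
    return [line.strip("- ").strip() for line in output.splitlines() if line.strip()]

def is_entailed(claim: str, contexts: list[str], model: str = "entailment-model") -> bool:
    """Stage 2: ask whether the retrieved passages logically support the claim."""
    prompt = (
        "Context:\n" + "\n---\n".join(contexts)
        + f"\n\nClaim: {claim}\n"
        + "Is the claim fully supported by the context? Answer yes or no."
    )
    return call_llm(model, prompt).strip().lower().startswith("yes")

def check_row(response: str, contexts: list[str]) -> dict[str, bool]:
    """Map each claim to an entailment verdict for one dataset row."""
    return {claim: is_entailed(claim, contexts) for claim in decompose(response)}
```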


Once the evaluation is complete, proceed to the next step to review and analyze the results.