Evaluate Text Result

The Evaluate Text Result node evaluates a text output against criteria you specify, using Griptape's Eval Engine. It is useful for validating AI-generated content, checking factual accuracy, or assessing the quality of text outputs.

Inputs

Examples (Property): Choose from preset examples or create your own evaluation
- Options:
  - Choose a preset..
  - Paraphrase
  - Factual
  - Analogy
Input (Input/Property): The input text to be evaluated
- Supports multiline text input
Expected Output (Input/Property): The expected or reference output text
- Single line text input
Actual Output (Input/Property): The actual output text to be evaluated
- Single line text input
Criteria (Input/Property): The evaluation criteria to use
- Supports multiline text input
- Example: "Does the output accurately paraphrase the input without losing meaning?"
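
The four text inputs can be pictured as one bundle that is handed to the evaluation engine. The sketch below is illustrative only: `EvalInputs`, `engine_kwargs`, and `evaluate_with_griptape` are hypothetical names, not part of the node, and the `EvalEngine` call reflects an assumption about how Griptape's engine is typically used (criteria set on the engine, texts passed to `evaluate`). That function needs `griptape` installed and a configured LLM driver, so it is defined here but not called.

```python
from dataclasses import dataclass


@dataclass
class EvalInputs:
    """Hypothetical bundle of the node's four text inputs."""
    input: str
    expected_output: str
    actual_output: str
    criteria: str

    def engine_kwargs(self) -> dict:
        """Texts passed to evaluate(); criteria is configured on the engine itself."""
        return {
            "input": self.input,
            "expected_output": self.expected_output,
            "actual_output": self.actual_output,
        }


def evaluate_with_griptape(inputs: EvalInputs) -> tuple:
    """Assumed EvalEngine usage; requires griptape and a configured LLM driver."""
    # Imported lazily so the rest of the sketch loads without griptape installed.
    from griptape.engines import EvalEngine

    engine = EvalEngine(criteria=inputs.criteria)
    score, reason = engine.evaluate(**inputs.engine_kwargs())
    return score, reason


inputs = EvalInputs(
    input="The capital of France is Paris.",
    expected_output="Paris is the capital city of France.",
    actual_output="France's capital is Paris.",
    criteria="Is the output factually correct based on the input?",
)
print(inputs.engine_kwargs())
```

Keeping the criteria separate from the three texts mirrors the node's layout: the criteria describe how to judge, while the texts are what gets judged.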

Outputs

Score (Output): A float value between 0 and 1 representing the evaluation score
- 1.0 indicates a perfect match with the criteria
- 0.0 indicates a complete mismatch
Reason (Output): A detailed explanation of the evaluation result
- Provides feedback on why the score was given
- Explains any discrepancies found
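
Downstream nodes or scripts often want a pass/fail decision rather than a raw score. A minimal sketch, assuming the score arrives as a plain float: `check_score` and `passes` are hypothetical helpers, and the 0.7 threshold is an arbitrary illustration, not a value defined by the node.

```python
def check_score(score: float) -> float:
    """Return the score unchanged, or raise if it falls outside [0, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score {score} is outside [0, 1]")
    return score


def passes(score: float, threshold: float = 0.7) -> bool:
    """True when a valid score meets the (hypothetical) pass threshold."""
    return check_score(score) >= threshold


print(passes(1.0))  # True
print(passes(0.0))  # False
```

When a check fails, the Reason output is the place to look for why; the threshold only decides whether to act on it.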

Example Usage

Paraphrase Evaluation

Input: "The quick brown fox jumps over the lazy dog."
Expected Output: "A swift brown fox leaps above a sleeping dog."
Actual Output: "A fast fox jumps over a dog that's not awake."
Criteria: "Does the output accurately paraphrase the input without losing meaning?"

Factual Evaluation

Input: "The capital of France is Paris."
Expected Output: "Paris is the capital city of France."
Actual Output: "France's capital is Paris."
Criteria: "Is the output factually correct based on the input?"

Analogy Evaluation

Input: "A bird is to sky as a fish is to ______."
Expected Output: "water"
Actual Output: "concrete"
Criteria: "Does the output correctly complete the analogy?"
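
The three examples above can also be written as a lookup table, which is roughly what choosing a preset amounts to: one name fills in all four fields. In this sketch the values are copied from this page (the node's built-in presets may differ), `PRESETS` and `run_preset` are hypothetical names, and `exact_match_stub` is a stand-in evaluator that only compares strings. It is not an LLM judgement and does not reproduce what the Eval Engine would return.

```python
from typing import Callable, Dict, Tuple

# Values copied from the examples on this page; the node's own presets may differ.
PRESETS: Dict[str, Dict[str, str]] = {
    "Paraphrase": {
        "input": "The quick brown fox jumps over the lazy dog.",
        "expected_output": "A swift brown fox leaps above a sleeping dog.",
        "actual_output": "A fast fox jumps over a dog that's not awake.",
        "criteria": "Does the output accurately paraphrase the input without losing meaning?",
    },
    "Factual": {
        "input": "The capital of France is Paris.",
        "expected_output": "Paris is the capital city of France.",
        "actual_output": "France's capital is Paris.",
        "criteria": "Is the output factually correct based on the input?",
    },
    "Analogy": {
        "input": "A bird is to sky as a fish is to ______.",
        "expected_output": "water",
        "actual_output": "concrete",
        "criteria": "Does the output correctly complete the analogy?",
    },
}

Evaluator = Callable[..., Tuple[float, str]]


def run_preset(name: str, evaluator: Evaluator) -> Tuple[float, str]:
    """Look up a preset by name and pass its four fields to the evaluator."""
    return evaluator(**PRESETS[name])


def exact_match_stub(input: str, expected_output: str,
                     actual_output: str, criteria: str) -> Tuple[float, str]:
    """Stand-in evaluator: a case-insensitive string comparison, nothing more."""
    same = expected_output.strip().lower() == actual_output.strip().lower()
    return (1.0, "outputs are identical") if same else (0.0, "outputs differ")


score, reason = run_preset("Analogy", exact_match_stub)
print(score, reason)  # 0.0 outputs differ
```

The stub would also score the Paraphrase and Factual examples as 0.0, because their wording differs even though the meaning is preserved. Cases like these are why the evaluation is criteria-based rather than a string comparison.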

Notes

- The node uses Griptape's Eval Engine to perform the evaluation.
- The evaluation is based on the provided criteria.
- The score is normalized between 0 and 1.
- The reason provides detailed feedback about the evaluation.
- You can use preset examples or create your own custom evaluations.