PersonaPersistBench

Evaluation framework scoring identity persistence across turns and models.

Overview

No standard benchmark measures whether a persona holds across turns and models, or under adversarial pressure. CharacterBench (AAAI 2025) evaluates character customization across 11 dimensions but assumes a single model and does not test identity persistence through model swaps. Existing commercial metrics (CSAT, deflection, resolution rate) measure outcomes, not coherence.

PersonaPersistBench closes this gap. Five metrics run inside the decision loop, scored every turn.

Metrics

ICS: Identity Coherence

Embedding cosine similarity between response and persona definition. Measures whether the AI stayed in character.

VCS: Voice Consistency

Signature phrase presence, forbidden pattern absence. Measures whether the AI sounds like itself.

MCS: Memory Continuity

Fact extraction and downstream recall verification. Measures whether the AI remembers what it learned.

CFS: Context Fidelity

Gap between stored and applied context. Catches "knows but ignores."

DR: Drift Rate

EWMA of ICS across turns. Separates noise from systematic identity decline.

Identity Coherence (ICS)

Every RICE definition is pre-computed into a high-dimensional embedding. Every model response is embedded and scored against that vector in real time.

Range        Interpretation
> 0.85       Strong alignment
0.70-0.85    Acceptable, minor drift
< 0.70       Triggers regeneration

Automatic recovery

When ICS drops below 0.70, the system automatically regenerates the response with reinforced persona context. The user never sees the failed response.
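The ICS scoring and regeneration trigger described above can be sketched as follows. This is a minimal illustration, not the framework's actual API: `embed()` is assumed to exist, and the band thresholds are taken from the table above.

```python
import numpy as np

ICS_REGEN_THRESHOLD = 0.70  # below this, the response is regenerated

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def score_ics(response_embedding: np.ndarray,
              persona_embedding: np.ndarray) -> tuple[float, str]:
    """Score a response against the pre-computed persona embedding
    and map the score onto the bands from the ICS table."""
    ics = cosine_similarity(response_embedding, persona_embedding)
    if ics > 0.85:
        band = "strong"
    elif ics >= ICS_REGEN_THRESHOLD:
        band = "acceptable"
    else:
        band = "regenerate"  # caller re-prompts with reinforced persona context
    return ics, band
```

In a real loop, a "regenerate" band would trigger a second model call with the persona definition re-injected, before anything is shown to the user.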

Voice Consistency (VCS)

Pattern matching runs against the RICE Communication layer every turn:

Match Type                    Score Impact
Signature phrase present      +0.1
Forbidden pattern detected    -0.2
Stylistic rule compliance     +0.05
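A minimal sketch of this scoring table, assuming signature phrases are plain strings, forbidden patterns are regexes, and stylistic rules are predicates on the response text (all illustrative, not the framework's actual data model):

```python
import re
from typing import Callable, Iterable

def score_vcs(response: str,
              signature_phrases: Iterable[str],
              forbidden_patterns: Iterable[str],
              style_rules: Iterable[Callable[[str], bool]]) -> float:
    """Apply the VCS table: +0.1 per signature phrase present,
    -0.2 per forbidden pattern detected, +0.05 per satisfied rule."""
    score = 0.0
    lowered = response.lower()
    for phrase in signature_phrases:
        if phrase.lower() in lowered:
            score += 0.1
    for pattern in forbidden_patterns:
        if re.search(pattern, response, re.IGNORECASE):
            score -= 0.2
    for rule in style_rules:
        if rule(response):
            score += 0.05
    return score
```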

Memory Continuity (MCS)

Facts are extracted from each turn and stored. MCS verifies downstream recall: did the system use a fact the user shared three turns ago? Binary per fact, aggregated per session.
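One plausible aggregation of the binary-per-fact check, sketched with a naive substring match (a production system would likely use semantic matching; the function name and shapes are assumptions):

```python
from typing import Iterable

def score_mcs(stored_facts: Iterable[str],
              later_responses: Iterable[str]) -> float:
    """Binary per fact: 1 if the fact surfaces in any downstream
    response, 0 otherwise. Aggregate as the fraction recalled."""
    facts = list(stored_facts)
    responses = [r.lower() for r in later_responses]
    if not facts:
        return 1.0  # nothing to recall counts as perfect continuity
    recalled = sum(
        any(fact.lower() in r for r in responses)
        for fact in facts
    )
    return recalled / len(facts)
```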

Context Fidelity (CFS)

Measures the gap between stored and applied context. Catches the case where the system injects context but the model ignores it: "knows but doesn't use."
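The "knows but doesn't use" gap can be sketched as the fraction of injected context items that actually surface in the response. As with MCS, the substring match here is a stand-in for whatever matching the framework really uses:

```python
from typing import Iterable

def score_cfs(injected_context: Iterable[str], response: str) -> float:
    """Fraction of context items the system injected that the model
    actually applied in its response. 1.0 means no stored/applied gap."""
    items = list(injected_context)
    if not items:
        return 1.0  # no injected context, no gap to measure
    lowered = response.lower()
    applied = sum(item.lower() in lowered for item in items)
    return applied / len(items)
```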

Drift Rate (DR)

Uses Exponentially Weighted Moving Average (EWMA) to separate noise from systematic decline. This catches gradual persona erosion that single-turn scoring would miss.

Range        Interpretation
< 0.10       Stable identity
0.10-0.15    Minor drift, acceptable
> 0.15       Systematic decline, intervention needed
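One way to formulate this (a sketch; the source does not give the exact formula or smoothing factor) is to smooth the per-turn ICS series with an EWMA and report drift as the drop from the opening score to the current smoothed value, so a single noisy turn is damped rather than flagged:

```python
from typing import Sequence

def drift_rate(ics_scores: Sequence[float], alpha: float = 0.3) -> float:
    """EWMA-smoothed decline in ICS across turns. alpha weights the
    newest turn; lower alpha means heavier smoothing. Illustrative
    formulation, not the framework's exact definition."""
    ewma = ics_scores[0]
    for score in ics_scores[1:]:
        ewma = alpha * score + (1 - alpha) * ewma
    # Sustained decline shows up as a gap between where the identity
    # started and where the smoothed trajectory now sits.
    return max(0.0, ics_scores[0] - ewma)
```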

Success Criteria

Condition                             Threshold
ICS across all model switches         > 0.70
VCS regardless of underlying model    Consistent
DR across full conversation           < 0.15
Adversarial resistance                Robust to prompt injection

Test Scenarios

PersonaPersistBench evaluates across three dimensions:

  1. Single session: sustained identity over 20+ turns
  2. Multi-turn with model switches: multiple LLMs within one conversation
  3. Adversarial pressure: character break attempts, prompt injection, role confusion

References

  • Li, K. et al. (2024). Measuring and Controlling Instruction (In)Stability in Language Model Dialogs. COLM 2024.
  • Choi, J. et al. (2024). Examining Identity Drift in Conversations of LLM Agents.
  • Zhou, J. et al. (2025). CharacterBench: Benchmarking Character Customization of Large Language Models. AAAI 2025.