From Transcript to Interaction Preference: Extraction Pipeline

Problem

We have structured transcripts: {speaker, addressee, raw_text, stage_direction, scene}. We need to derive each character’s PA interaction preferences under our IPaS taxonomy of 2 contexts × 14 active attributes (see PA_Interaction_Preference). The core challenge: a character’s speaking style ≠ their preference for how a PA should communicate with them.

Core Principle: Production ≠ Reception

  • Production (how a character speaks) provides a prior via similarity-attraction — it suggests but does not confirm preference
  • Reception (how a character reacts to others’ communication styles) confirms or overrides the prior
  • When production and reception diverge (asymmetry), the PA preference follows reception
  • Example: Sheldon acts autonomously but demands to be consulted → PA setting = Suggest, not Autonomous
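
The precedence rule above can be sketched in a few lines. This is a minimal illustration, not the logic in extract.py: the names Resolution and resolve_preference are hypothetical, and picking the most frequent reception-implied setting is an assumption made here for simplicity.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Resolution:
    setting: Optional[str]  # final PA setting for the cell
    asymmetry: bool         # True when reception overrode the production prior


def resolve_preference(production_prior: Optional[str],
                       reception_observations: Sequence[str]) -> Resolution:
    """Apply 'reception overrides production' to one (attribute, context) cell."""
    # No reception evidence: the production prior stands, unconfirmed.
    if not reception_observations:
        return Resolution(production_prior, False)
    # Assumption: the most frequent reception-implied setting wins.
    dominant = Counter(reception_observations).most_common(1)[0][0]
    diverges = production_prior is not None and dominant != production_prior
    return Resolution(dominant, diverges)


# The Sheldon example: acts autonomously, but reacts as someone who wants to be consulted.
sheldon = resolve_preference("Autonomous", ["Suggest", "Suggest", "Autonomous"])
print(sheldon)
```

With no reception observations the function returns the production prior with asymmetry set to False, which corresponds to a prior that is suggested but not confirmed.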

Pipeline: Two-Pass LLM Extraction

Script: data/extractor/extract.py (all LLM calls via OpenRouter)

Raw transcript JSONL
       │
       ▼
[Pass 1 — Gemini Flash]
  Filter scenes → classify context (work / personal)
  Tag attributes → extract evidence quotes
       │
       ▼
  pass1.jsonl  (one row per scene)
       │
       ▼
[Pass 2 — Claude Sonnet]
  Per (attribute, context) pair:
  synthesize all evidence → setting + confidence
       │
       ▼
  pass2.json  (2 × 14 active matrix, raw)
       │
       ▼
[HITL Gate]
  High confidence → auto-accept (extractor_high)
  Below High → human review → accept / reject / override
       │
       ▼
  preferences.yaml  (2 × 14 active cells, with provenance)
       │
       ▼
[Anonymize]
  Character name → User A / B / ...
  Remove identifying notes
       │
       ▼
  Simulator-ready persona YAML

Pass 1 — Filter + Classify (Gemini Flash)

For each scene where the target character appears:

  1. Filter: Does this scene contain IPaS attribute evidence? If no → skip
  2. Classify context: work (professional tasks) or personal (personal life and social interactions)
  3. Tag attributes: Which of 14 active attributes have observable evidence?
  4. Extract evidence: Quote key dialogue, label as production / reception / explicit_statement

Reception signal validity:

  • VALID: Explicit confusion/discomfort with a register, engagement change in response to style (not content), verbal meta-commentary, accommodation shifts
  • INVALID: Plot-driven topic changes, content disagreement, scripted dramatic reactions

Output: {character}_{seasons}_pass1.jsonl — one JSON row per scene
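
One possible shape for a Pass 1 row is sketched below. The field names (scene, context, evidence, attribute, quote, kind) and the scene ID format are assumptions for illustration; only the three evidence labels, the two contexts, and the filename pattern come from this document.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

EVIDENCE_KINDS = {"production", "reception", "explicit_statement"}
CONTEXTS = {"work", "personal"}


@dataclass
class Evidence:
    attribute: str  # one of the 14 active attributes, e.g. "tone_formality"
    quote: str      # quoted dialogue from the scene
    kind: str       # production / reception / explicit_statement

    def __post_init__(self):
        if self.kind not in EVIDENCE_KINDS:
            raise ValueError(f"unknown evidence kind: {self.kind}")


@dataclass
class Pass1Row:
    scene: str      # hypothetical scene identifier
    context: str    # work / personal
    evidence: List[Evidence] = field(default_factory=list)

    def __post_init__(self):
        if self.context not in CONTEXTS:
            raise ValueError(f"unknown context: {self.context}")

    def to_json_line(self) -> str:
        # One JSON object per line, as expected for a .jsonl file.
        return json.dumps(asdict(self), ensure_ascii=False)


def pass1_filename(character: str, seasons: str) -> str:
    return f"{character}_{seasons}_pass1.jsonl"


row = Pass1Row(
    scene="S01E01-scene-03",
    context="personal",
    evidence=[Evidence("tone_formality", "<quoted dialogue>", "reception")],
)
print(pass1_filename("sheldon", "S01-S03"))
print(row.to_json_line())
```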

Typical yield: ~90–95% of scenes with target character are relevant; most scenes are personal context for sitcom characters (social interactions dominate early seasons).


Pass 2 — Synthesize Settings (Claude Sonnet)

For each (attribute, context) pair with evidence, the Pass 2 model:

  1. Receives all Pass 1 evidence fragments for that cell
  2. Applies three-layer determination:
    • Statistical/frequency patterns → direction (who is more X than whom)
    • Qualitative production patterns → anchor to a setting
    • Reception evidence → confirms or overrides
  3. Outputs: setting, confidence, evidence_summary, production_evidence[], reception_evidence[], asymmetry note

Confidence levels:

Level       | Meaning
High        | Production + reception evidence agree across multiple scenes
Medium-High | Strong production + some reception, or strong reception but few scenes
Medium      | Production evidence only, or ambiguous reception
Low         | Insufficient or contradictory evidence

Output: {character}_{seasons}_pass2.json — nested {context: {attribute: cell}}


HITL Gate (Hard Rule)

Only High-confidence cells auto-write to the final YAML. Everything below High requires human review. No exceptions.

Human reviews non-High cells with the evidence summary → decides: accept as-is / adjust setting / reject (leave empty).

Sources in final YAML:

  • extractor_high — auto-accepted High confidence
  • hitl_override — human changed or overrode the setting
  • mbti_accepted — MBTI/personality projection for attributes without transcript proxy
  • seed — manually authored for attributes with no evidence path
  • no_preference — confirmed no strong preference (not absence of evidence)
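
The hard rule can be expressed as a small routing step over the nested Pass 2 output. This is a sketch under assumptions: Confidence, route_cells, and the review-queue format are hypothetical names, and the human decision step is deliberately left out.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class Confidence(IntEnum):
    LOW = 0
    MEDIUM = 1
    MEDIUM_HIGH = 2
    HIGH = 3

    @classmethod
    def parse(cls, label: str) -> "Confidence":
        # "Medium-High" -> MEDIUM_HIGH
        return cls[label.strip().upper().replace("-", "_")]


def route_cells(pass2: Dict[str, Dict[str, dict]]):
    """Split Pass 2 cells into auto-accepted cells and a human review queue."""
    accepted: Dict[str, Dict[str, dict]] = {}
    review_queue: List[Tuple[str, str]] = []
    for context, cells in pass2.items():
        for attribute, cell in cells.items():
            if Confidence.parse(cell["confidence"]) == Confidence.HIGH:
                # Only High-confidence cells are written without review.
                accepted.setdefault(context, {})[attribute] = {
                    **cell, "source": "extractor_high"}
            else:
                review_queue.append((context, attribute))
    return accepted, review_queue


pass2 = {
    "work": {
        "tone_formality": {"setting": "Formal", "confidence": "High"},
        "verbosity": {"setting": "Detailed", "confidence": "Medium-High"},
    },
    "personal": {
        "tone_formality": {"setting": "Casual", "confidence": "Medium"},
    },
}
accepted, review_queue = route_cells(pass2)
print(accepted)
print(review_queue)
```

Keeping the gate as a single equality check against HIGH, rather than a tunable threshold, mirrors the "no exceptions" wording.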

Output: 2 × 14 Active Preference Matrix

character: User A
matrix:
  work:
    tone_formality:
      setting: Formal
      source: extractor_high
      confidence: High
      notes: null
    ...  # 14 active attributes
  personal:
    tone_formality:
      setting: Casual
      source: extractor_high
      confidence: High
      notes: null
    ...  # 14 active attributes

Each cell carries full provenance. Empty cells (null setting) mean insufficient evidence — acceptable for sparse contexts.
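
A quick way to see how sparse a context is, and to check that every filled cell carries a source, is sketched below. The helper names are hypothetical, and the matrix is assumed to be already loaded into a plain dict with the layout shown above.

```python
from typing import Dict, List, Tuple

N_ACTIVE = 14  # active attributes per context


def filled_cells(matrix: Dict[str, Dict[str, dict]]) -> Dict[str, Tuple[int, int]]:
    """Return {context: (cells with a non-null setting, N_ACTIVE)}."""
    return {
        context: (sum(1 for c in cells.values() if c.get("setting") is not None),
                  N_ACTIVE)
        for context, cells in matrix.items()
    }


def missing_provenance(matrix: Dict[str, Dict[str, dict]]) -> List[Tuple[str, str]]:
    """List (context, attribute) pairs that have a setting but no source."""
    return [
        (context, attribute)
        for context, cells in matrix.items()
        for attribute, cell in cells.items()
        if cell.get("setting") is not None and not cell.get("source")
    ]


matrix = {
    "work": {
        "tone_formality": {"setting": "Formal", "source": "extractor_high",
                           "confidence": "High", "notes": None},
        "verbosity": {"setting": None, "source": None,
                      "confidence": None, "notes": None},
    },
    "personal": {
        "tone_formality": {"setting": "Casual", "source": "extractor_high",
                           "confidence": "High", "notes": None},
    },
}
print(filled_cells(matrix))
print(missing_provenance(matrix))
```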


Attribute Coverage Notes

11 attributes with reliable transcript proxies

Dim 1 — Expression Style: tone_formality, verbosity, emotional_engagement, guidance_level
Dim 2 — Disclosure: reasoning_visibility, uncertainty_expression
Dim 3 — Initiative: autonomy_level, proactive_outreach, task_expansion
Dim 4 — Information Flow: information_elicitation, topic_management

3 active attributes with weaker/indirect transcript evidence

process_visibility, solution_breadth, capability_boundary

These describe PA-specific behaviors with limited direct analogues in character-to-character dialogue. Cells without High-confidence evidence fall to HITL or mbti_accepted / seed. memory_privacy was removed from the active preference set on 2026-05-17 and should not be extracted into active persona matrices.

Corpus limitation: work context is sparse for sitcom characters

TV sitcoms are social by construction. Work scenes are uncommon, and most “work” scenes still involve social dynamics rather than task-focused PA interactions. Typical yield for sitcom characters: personal ≈ 85–95% of relevant scenes, work ≈ 5–15%. This is acceptable — it means the work context matrix cells will have lower coverage, but the personal context cells will be well-evidenced.


Anonymization

Before the persona YAML is used in the simulator:

  1. character field → User A (or other anonymous ID)
  2. source field in YAML points to extraction data, not character name
  3. Simulator and harness never see the original character identity

The anonymization is shallow (name replacement) — the preference settings themselves are character-derived and carry the behavioral fingerprint. The benchmark evaluates whether a PA can learn that fingerprint, not whether it knows the source character.
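
The shallow anonymization step might look like the following sketch. The function names are hypothetical, and clearing every notes field is a conservative assumption; the actual pipeline may remove only notes that identify the character.

```python
import copy
import string


def alias_for(index: int) -> str:
    """0 -> 'User A', 1 -> 'User B', ... (supports up to 26 personas)."""
    return f"User {string.ascii_uppercase[index]}"


def anonymize(persona: dict, alias: str) -> dict:
    """Return a copy with the character name replaced and notes cleared."""
    out = copy.deepcopy(persona)  # leave the reviewed YAML data untouched
    out["character"] = alias
    for cells in out["matrix"].values():
        for cell in cells.values():
            # Assumption: drop all notes rather than trying to detect identifying ones.
            cell["notes"] = None
    return out


persona = {
    "character": "Sheldon Cooper",
    "matrix": {
        "work": {
            "tone_formality": {"setting": "Formal", "source": "extractor_high",
                               "confidence": "High",
                               "notes": "mentions the character by name"},
        },
    },
}
anon = anonymize(persona, alias_for(0))
print(anon["character"])
```

Settings, sources, and confidence values pass through unchanged, which is what makes the anonymization shallow.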


Synthesis Pass (if needed)

When a single (attribute, context) cell has too many evidence segments for one Pass 2 call:

  1. Split evidence by season (or groups of 2–3 seasons)
  2. Run Pass 2 on each chunk → per-chunk setting + confidence
  3. Run a synthesis call with all per-chunk results → final setting

The synthesis call uses the same Pass 2 system prompt but receives summarized per-chunk results instead of raw evidence.
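
The three steps can be sketched as below. The LLM calls are passed in as plain callables (run_pass2 and run_synthesis are hypothetical stand-ins), and both the integer season field and the max_items limit are assumptions for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def chunk_by_season(evidence: List[dict], seasons_per_chunk: int = 2) -> List[List[dict]]:
    """Group evidence items into consecutive season blocks (1-2, 3-4, ...)."""
    chunks: Dict[int, List[dict]] = defaultdict(list)
    for item in evidence:
        chunks[(item["season"] - 1) // seasons_per_chunk].append(item)
    return [chunks[key] for key in sorted(chunks)]


def synthesize_cell(evidence: List[dict],
                    run_pass2: Callable[[List[dict]], dict],
                    run_synthesis: Callable[[List[dict]], dict],
                    max_items: int = 50,
                    seasons_per_chunk: int = 2) -> dict:
    """Run Pass 2 directly, or per chunk followed by one synthesis call."""
    if len(evidence) <= max_items:
        return run_pass2(evidence)
    partials = [run_pass2(chunk)
                for chunk in chunk_by_season(evidence, seasons_per_chunk)]
    return run_synthesis(partials)


# Stand-in callables so the control flow can be exercised without an LLM.
def fake_pass2(items: List[dict]) -> dict:
    return {"n": len(items)}


def fake_synthesis(partials: List[dict]) -> dict:
    return {"n": sum(p["n"] for p in partials), "chunks": len(partials)}


evidence = [{"season": s, "quote": "<quoted dialogue>"} for s in (1, 1, 2, 3, 4)]
print(synthesize_cell(evidence, fake_pass2, fake_synthesis, max_items=3))
```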


MBTI Cross-Validation

MBTI serves as an independent cross-validation signal alongside transcript evidence. It is not a replacement for transcript evidence — it is a second path to the same IPaS settings, used to:

  • Confirm non-High cells (transcript evidence + MBTI agree → stronger basis to accept)
  • Fill empty cells where transcript has no proxy (source: mbti_accepted)

Why MBTI over other trait taxonomies: MBTI’s 4 binary axes map cleanly onto IPaS attributes; its type labels carry high semantic density and are freely available as fan-community canon annotations for most TV characters; and the binary structure avoids needing continuous-value thresholds that a 10-persona sample has no statistical power to calibrate.

MBTI → IPaS Projection Table (v1)

Most attributes are driven by one primary MBTI axis with secondary modifiers; rows marked as canon or weak have no reliable axis. This table is a working draft; the Sheldon (INTJ) pilot was the first validation pass.

IPaS attribute          | Primary axis    | + value          | − value                  | Modifiers
verbosity               | N/S             | N → Detailed     | S → Moderate             | T +1 toward Detailed; E +1; I −1
emotional_engagement    | T/F             | T → Task-focused | F → Relationship-focused | E +1 toward visible
tone_formality          | T/F             | T → Consultative | F → Casual               | J +1 toward Formal
autonomy_level          | J/P × E/I       | -                | -                        | needs trust-level modifier; fall back to canon
process_visibility      | J/P             | J → Bookend      | P → Silent               | N +1 toward Full narration
information_elicitation | J/P             | J → Structured   | P → Iterative            | T +1 toward Structured
topic_management        | J/P             | J → Organize     | P → Follow user’s flow   | N +1 toward Follow user’s flow
reasoning_visibility    | N/T             | N+T → Show       | S+F → Summarize/Hide     | J +1 toward Show
solution_breadth        | N/S             | N → High         | S → Low                  | P +1 toward High; J −1
task_expansion          | N/S             | N → High         | S → Low                  | E +1 higher
proactive_outreach      | weak canon only | -                | -                        | Do not map directly from E/I; social initiation is not the same as wanting PA reminders, check-ins, or after-task follow-up.
guidance_level          | (canon)         | -                | -                        | projection weak; fall back to canon
capability_boundary     | weak canon only | -                | -                        | Do not map directly from T/F; this is a failure/limit recovery preference: agent-side workaround vs diagnosis plus control handoff back to the user.
uncertainty_expression  | -               | -                | -                        | no strong MBTI axis; infer from high-conf anchor cells
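
The primary-axis column of the table can be encoded as data, as in the sketch below. It covers only the eight rows with a single primary axis and ignores the modifiers, because the table does not define the ordinal scale that "+1" steps along; PRIMARY_AXIS and project_base are hypothetical names.

```python
from typing import Dict

# Rows with one primary axis; canon-only and multi-axis rows are left out on purpose.
PRIMARY_AXIS: Dict[str, Dict[str, str]] = {
    "verbosity": {"N": "Detailed", "S": "Moderate"},
    "emotional_engagement": {"T": "Task-focused", "F": "Relationship-focused"},
    "tone_formality": {"T": "Consultative", "F": "Casual"},
    "process_visibility": {"J": "Bookend", "P": "Silent"},
    "information_elicitation": {"J": "Structured", "P": "Iterative"},
    "topic_management": {"J": "Organize", "P": "Follow user's flow"},
    "solution_breadth": {"N": "High", "S": "Low"},
    "task_expansion": {"N": "High", "S": "Low"},
}


def project_base(mbti: str) -> Dict[str, str]:
    """Return the unmodified primary-axis prediction for each covered attribute."""
    letters = set(mbti.upper())
    if len(mbti) != 4 or not letters <= set("EINSTFJP"):
        raise ValueError(f"not a four-letter MBTI type: {mbti!r}")
    prediction = {}
    for attribute, mapping in PRIMARY_AXIS.items():
        for letter, setting in mapping.items():
            if letter in letters:
                prediction[attribute] = setting
    return prediction


intj = project_base("INTJ")
print(intj)
```

Debated types such as "ISFJ / INFP" would need to be resolved to a single type before projection; this sketch rejects them.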

Character MBTI Annotations

Character          | MBTI        | Notes
Sheldon Cooper     | INTJ        | High consensus
Leonard Hofstadter | ISFJ / INFP | Debated; needs canon check
Penny              | ESFP        | High consensus
Raj Koothrappali   | INFP / ENFP | Season-dependent (selective mutism → I; alcohol/later seasons → E)
Michael Scott      | ENFP        | High consensus
Dwight Schrute     | ISTJ        | High consensus
Richard Hendricks  | INFP / INTP | Research context leans INTP
Gilfoyle           | INTJ / ISTP | Debated
Jared Dunn         | ENFJ        | High consensus

Trait Synthesis Pipeline

When transcript evidence is sparse, MBTI provides a structured path to fill and validate non-High cells:

High-conf anchor cells
    │
    ▼
[Invert] Anchor cells → best-fit MBTI type
    (check internal consistency first — if anchors split across MBTI types,
     invert per-context separately or fall back to seed)
    │
    ▼
    [Project] MBTI type + projection table → predicted settings for all 14 active attributes
    │
    ├── Empty cells → MBTI prediction → HITL review  (source: mbti_accepted)
    │
    └── Non-High cells → compare MBTI prediction vs extractor output
             ├── Agree  → stronger basis to accept
             └── Disagree → flag for closer HITL scrutiny
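
The branch at the bottom of the diagram can be sketched as a per-cell triage function. The returned labels and the function name are hypothetical; the sketch assumes each extractor cell is a dict with setting and confidence keys, and that the MBTI prediction is a setting string or None.

```python
from typing import Optional


def triage_cell(cell: Optional[dict], mbti_prediction: Optional[str]) -> str:
    """Decide what happens to one cell after MBTI projection."""
    setting = cell.get("setting") if cell else None
    if setting is None:
        # Empty cell: the MBTI prediction goes to HITL review (source: mbti_accepted).
        return "mbti_to_hitl" if mbti_prediction is not None else "leave_empty"
    if cell["confidence"] == "High":
        # High-confidence cells were already auto-accepted by the gate.
        return "auto_accepted"
    if mbti_prediction is None:
        return "hitl_review"
    # Non-High cell: agreement strengthens the case, disagreement flags it.
    return "agree" if setting == mbti_prediction else "disagree"


print(triage_cell({"setting": "Detailed", "confidence": "Medium"}, "Detailed"))
print(triage_cell({"setting": "Casual", "confidence": "Medium-High"}, "Consultative"))
print(triage_cell(None, "Bookend"))
```

In every non-High branch a human still makes the final call; the labels only prioritise the review.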

Context Window Budget

Approximate costs per character for S01–S03 extraction:

Stage                         | Model         | Approx. cost
Pass 1 (S01–S03, ~300 scenes) | Gemini Flash  | ~$0.04
Pass 2 (28 active cells)      | Claude Sonnet | ~$0.50–1.00
Total per character (S01–S03) |               | ~$1–2

Scaling to full S01–S10: ~$3–6 per character, ~$30–60 for 10 characters.


Implementation Status

Component | Status
Transcript data — TBBT S1-S10, parsed JSONL (data/transcripts/tbbt/) | ✅ Done
extract.py — two-pass extractor with OpenRouter | ✅ Done
Rubrics — Pattern_to_Preference_Rubrics (14 active attributes; historical Memory & Privacy deprecated) | ✅ Done
IPaS taxonomy — 2 contexts × 14 active attributes — PA_Interaction_Preference | ✅ Done (updated 2026-05-17: memory_privacy deprecated)
User A (Sheldon S01-S03): Pass 1 + Pass 2 + HITL → 2×15 YAML | ✅ Done
Penny (S01-S03): Pass 1 + Pass 2 + HITL decisions done; runtime preference + identity anonymized as User B; User_B_World_Design.md created; generator reads per-persona world design | ✅ Ready for session scripting
Additional personas | ❌ Not started

Open Questions

  • How many personas are needed for benchmark validity? Current plan: 5–10 from different shows/archetypes
  • Should later seasons be weighted more heavily (character development)? Currently using S01-S03 as baseline
  • Minimum scene count per (attribute, context) cell for reliable High-confidence attribution?
  • Cross-character validation: do Pass 2 settings for the same attribute cluster meaningfully across characters with known personality differences?