Yuyan (Yolanda) Chen · Research Platform

Models Live in the World

What capabilities does a model need to live in our world and create real value?
Experts that give evidence-backed answers to human questions in a social world.
Collaborators that adapt with humans on evolving tasks and uncover implicit needs.
Partners that guide humans in reframing their stories and accelerate scientific discovery.
34 Publications
23 First-Author
49 Patents Filed
700+ Citations

Project Ecosystem

Four worlds. One research agenda.

Each world is a distinct operating context for AI. Projects, benchmarks, and systems live inside worlds as sub-nodes — not as a flat list.

[Diagram: "Models Live in the World" as the central hub, linked to 🎯 Objective World, 💬 Subjective World, 🧬 Scientific World, 🌍 Physical World (next frontier), and ⚙️ Tools & Infrastructure. Capability flow: Platform & Data → Reliable Reasoning → Social Intelligence → Real-World Impact.]

Research Framework

Capability units, not feature lists

Research is organized as a progression — basic capability units enable developmental ones, which together make models genuinely useful in the world.

Basic Capabilities · Everyday Life
What models must do reliably
  • Accurate question understanding and answering: complex modifications, implicit intent, multi-hop reasoning
  • 🔍 Traceable evidence for responses: hallucination detection, attribution reasoning, evidence distillation
  • 🤝 Simple social and emotional interaction: emotion benchmarks, humor generation, empathy modeling
Developmental Capabilities · Everyday Life + Science
What models must grow into
  • 💡 Proactive questions and intent clarification: incomplete information, preference modeling, long-term memory
  • 🔁 Reflection and learning from failures: reinforcement learning, test-time scaling, adaptive updating
  • 🧠 Complex emotions and social norms: socially grounded personas, multi-agent collaboration, implicit understanding
Basic capabilities enable and reinforce developmental ones

All Projects

Research, systems, and tools organized by world

Each world holds research papers, deployed systems, datasets, and future directions as sub-nodes.

🎯
Objective World
Reliable Reasoning & Evidence
Models that understand questions accurately, give traceable answers, and diagnose their own hallucinations.
Research
  • Hallucination Detection & Mitigation (CIKM '23 · AAAI '25)
    RelD for detection; Differentiated Penalty Decoding for mitigation; latent-state attribution for diagnosis.
  • Complex Question Understanding (TKDE '23)
    XMQAs: dataset and methods for robust QA under complex modified question expressions.
  • Video QA with Evidence Grounding (ACL '25)
    VQAGuider: multimodal LLM guidance for complex video question answering with structured evidence.
Systems
  • ProAdvisor (Live)
    Knowledge-to-action domain agent for founders and PMs: grounded, structured advice rather than generic suggestions.
  • EventActionPortfolio (Prototype)
    Event → portfolio exposure → historical statistics → tiered action options with explicit exit conditions.
Future
  • RL-based Adaptive QA (Planned)
    Reinforcement learning and test-time scaling for robust answering under distribution shift and incomplete context.
💬
Subjective World
Social Cognition & Supportive Interaction
Models that feel the room — empathy, humor, implicit toxicity, emotional companionship, and social roleplay.
Research & Datasets
  • EmotionQueen (ACL '24)
    Empathy evaluation benchmark; most LLMs remain below 50% on emotional understanding tasks.
  • Talk Funny! (AAAI '24)
    Large-scale humor response dataset with Chain-of-Humor interpretation and multi-task learning.
  • HOTVCOM (ACL '24 Findings)
    Generating buzzworthy video comments via RL with informativeness, relevance, and engagement rewards.
  • PolitePoison (Released)
    Real-world implicit toxicity dialogue dataset: polite surface, harmful intent.
  • MoodTrace (Released)
    Longitudinal emotion dialogue benchmark scaffold for tracking emotional arcs across sessions.
Systems
Future
  • Long-Term Memory & Preference Modeling (Planned)
    Models that remember across sessions, infer evolving preferences, and adapt their support style over time.
  • Multi-Agent Social Simulation (Planned)
    Coordinated agent collaboration with social norms, conflict resolution, and emergent group dynamics.
🧬
Scientific World
Biomedical & Genomic Intelligence
From epigenomics pipelines to clinical trial agents — transferable representations for real scientific discovery.
Research
  • DNA Foundation Model
    Multi-omics genomics encoder with masked reconstruction and multi-head projectors. Strong gains on signal-level tasks vs. baselines.
Systems
  • DeltaMap (Released)
    Paired scRNA-seq pipeline that constructs immune-state evidence representations from longitudinal delta signals.
  • CUT&Tag Agent (Released)
    AI-powered epigenomic pipeline assistant that converts CUT&Tag tutorials into interactive agentic workflows.
  • CTIAgent (Released)
    Active evidence acquisition agent for clinical trial intelligence that extracts and ranks relevant trials from patient context.
  • ME/CFS Knowledge Agent (Active)
    Evidence-grounded RAG system for ME/CFS clinical guidance. Normalized BM25 retrieval, FastAPI + SSE streaming.
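As a rough illustration of the "normalized BM25 retrieval" mentioned for the ME/CFS Knowledge Agent, here is a minimal sketch in plain Python. It is not the agent's actual code: the function names `bm25_scores` and `normalize`, the whitespace tokenizer, the parameter defaults, and the choice of min-max scaling as the normalization step are all assumptions made for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document (whitespace tokens, lowercased)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def normalize(scores):
    """Min-max scale to [0, 1] (assumed normalization); equal scores map to 0.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Hypothetical mini-corpus, for illustration only.
docs = [
    "fatigue after exertion",
    "sleep hygiene advice",
    "post exertional malaise and fatigue",
]
ranked = normalize(bm25_scores("fatigue exertion", docs))
```

Scaling scores into a fixed range makes it easier to apply a single relevance threshold across queries, which is one plausible reason to normalize before passing passages to the generator.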
Future
  • Multi-Modal Bio Foundation Models (Planned)
    Cross-modality generalization across genomics, pathology imaging, and clinical text.
  • Wet-Lab Validation Loop (Planned)
    Closing the loop between computational prediction and experimental confirmation in real labs.
🌍
Physical World — Next Frontier
Embodied & Grounded Interaction
The final frontier: models that perceive, act, and learn from physical environments. Research direction in progress.
Planned Directions
  • Embodied AI Agent (Future)
    Models that navigate, manipulate, and reason about physical environments through sensor-action feedback loops.
  • Environment Grounding (Future)
    Connecting language understanding to real-world physical state, bridging the objective and physical worlds.
  • Avatar & Screen Interaction (Future)
    Persistent embodied personas that interact through screens, cameras, and real-time dialogue streams.
⚙️
Tools & Infrastructure
The layer everything runs on
Open-source tools, data quality gates, security infrastructure, and GPU monitoring that support every world above.
  • GateKeeper (Released)
    Cryptographic access control for AI agents: opaque object IDs instead of real filesystem paths, so no path traversal.
  • mmqlint (Released)
    Lightweight traceable quality gate for LLM and VLM training and inference datasets. Catches data issues before they become model issues.
  • GPU Watchdog (Released)
    HPC-friendly GPU usage watcher with email alerts. Designed for research cluster environments with minimal setup.
  • 3D↔2D Toolkit (Released)
    Generic 3D↔2D slicing and multiview fusion pipeline for multi-view data pairing in vision research.
  • Evaluation & Benchmark Hub (Planned)
    Unified evaluation infrastructure across all four worlds: QA, social, scientific, and embodied tasks.
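The opaque-ID idea behind GateKeeper can be sketched in a few lines of Python. This is an illustrative sketch only, not GateKeeper's API: the class `OpaqueFileRegistry` and its methods are hypothetical names, and it assumes Python 3.9+ for `Path.is_relative_to`.

```python
import secrets
from pathlib import Path


class OpaqueFileRegistry:
    """Hypothetical registry: agents see random IDs, never filesystem paths."""

    def __init__(self, root):
        self._root = Path(root).resolve()
        self._paths = {}  # opaque ID -> resolved path inside the root

    def register(self, relative_path):
        """Admit a file under the root and return an unguessable ID for it."""
        path = (self._root / relative_path).resolve()
        if not path.is_relative_to(self._root):
            raise PermissionError("path escapes the registry root")
        object_id = secrets.token_urlsafe(16)
        self._paths[object_id] = path
        return object_id

    def read(self, object_id):
        """Look up by ID only; anything not issued by register() is rejected."""
        path = self._paths.get(object_id)
        if path is None:
            raise KeyError("unknown object ID")
        return path.read_text()
```

Because the agent only ever supplies an ID that is looked up in a table, a string such as `../secrets.txt` is simply an unknown key rather than a path to be resolved, which is what rules out traversal in this sketch.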

Timeline

From reliable answering to real-world impact

Now — Active
Systems in the wild
  • DeepSupport: 5-mode live companion ecosystem
  • VerbalValue: live-commerce AI host (CVPR '25)
  • MischiefClub: cathartic venting system
  • DeltaMap, CUT&Tag Agent, CTIAgent: bio tools
  • ProAdvisor: knowledge-to-action advisor
  • GateKeeper, mmqlint, GPU Watchdog: released tools
Next — Prototypes Live
Closing the loop
  • CompanySim: multi-agent org simulation
  • EventActionPortfolio: decision pipeline
  • ME/CFS Agent: public demo + documentation
  • World detail pages + open collaboration entries
  • Student sub-tasks per world
Mid-Term — Lab Research
Developmental capabilities
  • Long-term memory and preference modeling
  • RL and test-time scaling for robust QA
  • Multi-agent collaboration with social norms
  • Multi-modal bio foundation models
  • OOD generalization across scientific domains
Long-Term — Physical World
Embodied AI frontier
  • Embodied agents that perceive and act
  • Sensor-action feedback loops
  • Physical world grounding for language models
  • Research studio → formal lab direction

Open Collaboration

Who this is built for

For Students & Early Researchers
  • System building and benchmarking
  • Dataset creation and annotation
  • Paper experiments and ablations
  • Frontend, demo, and eval work
For Domain Experts
  • Clinical AI: patient/clinician feedback
  • Behavioral science: emotion & social cognition
  • Finance: portfolio event validation
  • Live commerce: real-session evaluation
For Engineers
  • API and deployment infrastructure
  • Frontend interfaces and demo builds
  • Production hardening of prototypes
  • Cross-world integration
For Researchers
  • QA, hallucination, & reasoning
  • Emotion, empathy, & social NLP
  • RAG and evidence-grounded AI
  • Biomedical AI and genomics
For Product Partners
  • Prototype trial access
  • Real-world requirement co-design
  • Future product direction discussion

Let's build something

PhD @ Fudan · Postdoc @ Houston Methodist & Weill Cornell · Microsoft Research Asia · TikTok