SychoBench

AI Sycophancy Evaluation Dashboard

A Next.js application for analyzing and visualizing AI model sycophancy patterns, providing researchers and practitioners with detailed insights into model behavior across diverse prompt scenarios.

Features

Results Dashboard

Interactive Sycophancy Index Chart: Bar chart visualization with risk-based color coding
Behavioral Quadrant Analysis: Scatter plot showing sycophancy vs stability patterns
Elasticity Quadrant Analysis: Advanced analysis of stance responsiveness and topic consistency
Per-Model Metrics Table: Comprehensive model statistics with toggle functionality

Prompt Explorer

Detailed Prompt Analysis: Drill down into individual prompts and responses
Model Comparison: Side-by-side analysis of how different models respond to the same prompt
Interactive Search & Filtering: Find prompts by topic, persona, or content
Timeline Navigation: Intuitive prompt selection with visual indicators
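
As a rough illustration of the search and filtering described above, the following TypeScript sketch matches a query against a prompt's topic, persona, and text. The `PromptEntry` shape and its field names are assumptions made for this example, not the application's actual data model.

```typescript
// Hypothetical prompt shape; field names are assumed for illustration only.
interface PromptEntry {
  id: string;
  topic: string;
  persona: string;
  text: string;
}

// Return prompts whose topic, persona, or text contains the query
// (case-insensitive). An empty query matches everything.
function filterPrompts(prompts: PromptEntry[], query: string): PromptEntry[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return prompts;
  return prompts.filter((p) =>
    [p.topic, p.persona, p.text].some(
      (field) => field.toLowerCase().indexOf(needle) !== -1
    )
  );
}
```

A plain substring match keeps the sketch dependency-free; a real implementation might add debouncing or fuzzy matching.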

Methodology Documentation

Comprehensive Framework: Detailed explanation of evaluation methodology
Technical Appendix: Mathematical foundations and validation approaches
Interactive Table of Contents: Easy navigation through documentation sections

Design & UX

Responsive Design: Optimized for desktop, tablet, and mobile devices
Dark/Light Theme: Automatic theme switching based on user preferences
Professional UI: Clean, modern interface with consistent design system
Full-Width Layouts: Optimized for modern wide screens

Getting Started

Prerequisites

Node.js 18+
npm, yarn, or pnpm

Installation

Clone the repository

git clone <repository-url>
cd SychoBench

Install dependencies

npm install
# or
yarn install
# or
pnpm install

Run the development server
```
npm run dev
# or
yarn dev
# or
pnpm dev
```
Open your browser and navigate to http://localhost:3000

Project Structure

SychoBench/
├── app/                          # Next.js 14 App Router
│   ├── page.tsx                  # Home page
│   ├── methodology/              # Methodology documentation
│   ├── results/                  # Results dashboard
│   ├── prompt-explorer/          # Prompt analysis tool
│   └── globals.css               # Global styles
├── public/
│   └── data/                     # JSON data files
│       ├── responses_with_scores.json
│       └── stance_elasticity_metrics.json
└── components/                   # Reusable React components

Data Sources

`responses_with_scores.json`

Main evaluation dataset containing:

Model responses to prompts
Sycophancy scores and metrics
Prompt metadata (topic, persona, stance)
Behavioral classifications
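
To make the contents listed above more concrete, here is a minimal TypeScript sketch of what one record might look like and how per-model averages could be derived from it. The `ScoredResponse` interface, its field names, and the assumption that scores lie in [0, 1] are all hypothetical; check the JSON file itself for the real schema.

```typescript
// Hypothetical record shape: these field names are assumptions for
// illustration, not the actual schema of responses_with_scores.json.
interface ScoredResponse {
  model: string;
  promptId: string;
  topic: string;
  persona: string;
  stance: string;
  response: string;
  sycophancyScore: number; // assumed to lie in [0, 1]
}

// Average the (assumed) sycophancy score for each model.
function meanScoreByModel(rows: ScoredResponse[]): Record<string, number> {
  const sums: Record<string, number> = {};
  const counts: Record<string, number> = {};
  for (const row of rows) {
    sums[row.model] = (sums[row.model] || 0) + row.sycophancyScore;
    counts[row.model] = (counts[row.model] || 0) + 1;
  }
  const means: Record<string, number> = {};
  for (const model of Object.keys(sums)) {
    means[model] = sums[model] / counts[model];
  }
  return means;
}
```

Because Next.js serves files under `public/` from the site root, a client component could plausibly load the rows with `fetch("/data/responses_with_scores.json")` before passing them to a helper like this.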

`stance_elasticity_metrics.json`

Advanced elasticity analysis data:

Stance responsiveness variability
Topic dispersion measurements
Behavioral stability metrics
Quadrant classifications

Technical Stack

Framework: Next.js 14 with App Router
Language: TypeScript
Styling: Tailwind CSS
Charts: Chart.js with react-chartjs-2
Icons: Heroicons
Deployment: Vercel-ready

Key Components

Risk Assessment

Low Risk: Models with minimal sycophantic behavior (< 30%)
Moderate Risk: Models showing some concerning patterns (30-50%)
High Risk: Models with significant sycophancy issues (> 50%)
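
The thresholds above can be expressed as a small classifier. This is a sketch only: the function name, the `RiskLevel` type, and the color values are assumptions for illustration, and the input is assumed to be a sycophancy rate expressed as a percentage. The actual logic lives in app/results/page.tsx and may differ.

```typescript
type RiskLevel = "low" | "moderate" | "high";

// Map a sycophancy percentage (0-100) to a risk level using the documented
// thresholds: below 30 is low, 30 through 50 is moderate, above 50 is high.
function classifyRisk(sycophancyPercent: number): RiskLevel {
  if (sycophancyPercent < 30) return "low";
  if (sycophancyPercent <= 50) return "moderate";
  return "high";
}

// Hypothetical palette for risk-based color coding; the colors used by the
// real dashboard are not specified here.
const RISK_COLORS: Record<RiskLevel, string> = {
  low: "#16a34a",
  moderate: "#f59e0b",
  high: "#dc2626",
};

function riskColor(sycophancyPercent: number): string {
  return RISK_COLORS[classifyRisk(sycophancyPercent)];
}
```

An array of such colors, one per model, could be passed as the `backgroundColor` of a Chart.js bar dataset.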

Behavioral Archetypes

Most Stable: Consistent responses across topics and tones
Stance-Responsive: Influenced by user confidence levels
Topic-Dependent: Varies by subject matter
Highly Variable: Unpredictable across multiple dimensions
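
One plausible reading of these archetypes is a two-axis quadrant split: how much a model's behavior varies with user stance, and how much it varies across topics. The sketch below assumes that reading. The parameter names and the cutoffs are hypothetical (a cutoff could, for example, be the median across models); the Methodology page is the authority on how the quadrants are actually assigned.

```typescript
type Archetype =
  | "Most Stable"
  | "Stance-Responsive"
  | "Topic-Dependent"
  | "Highly Variable";

// Assign a quadrant from two variability measures and their cutoffs.
// All four inputs are assumed quantities, not fields from the real data.
function classifyArchetype(
  stanceVariability: number,
  topicDispersion: number,
  stanceCutoff: number,
  topicCutoff: number
): Archetype {
  const stanceHigh = stanceVariability > stanceCutoff;
  const topicHigh = topicDispersion > topicCutoff;
  if (stanceHigh && topicHigh) return "Highly Variable";
  if (stanceHigh) return "Stance-Responsive";
  if (topicHigh) return "Topic-Dependent";
  return "Most Stable";
}
```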

Configuration

Environment Variables

No environment variables required for basic functionality.

Customization

Modify risk thresholds in app/results/page.tsx
Update styling in app/globals.css
Add new data sources in public/data/

Usage Examples

Analyzing Model Performance

Visit the Results page for a high-level overview
Use the Prompt Explorer for detailed, prompt-level analysis
Consult the Methodology page to understand how the metrics are defined

Comparing Models

Select models in the interactive charts
View detailed metrics in the expandable table
Explore specific prompt responses

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

ramdhanhdy

Acknowledgments

Built with Next.js and modern web technologies
Inspired by the need for transparent AI evaluation
Designed for researchers, practitioners, and AI safety professionals

SychoBench: Making AI sycophancy patterns visible and actionable.
