Empathy Quotient Test - Evaluating empathy and social understanding
The Empathy Quotient (EQ) is a psychological self-assessment questionnaire developed by Simon Baron-Cohen and Sally Wheelwright in 2004. It measures empathy in adults, specifically "the ability to tune into how someone else is feeling, or what they might be thinking." The test was originally designed for autistic adults aged 16+ with IQ ≥80, though it is now widely used in research and self-assessment.
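The EQ consists of 60 statements: 40 empathy items and 20 filler items. Each empathy item scores 0, 1, or 2 points, which gives the 0-80 range, and filler items do not count toward the score. Items are scored according to their direction, which limits the effect of agreeing or disagreeing with everything: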
✓ Positive Items (Pro-Empathy)
Statements where agreement indicates empathy:
Example: "I can easily tell if someone wants to enter a conversation"
↻ Negative Items (Reverse-Scored)
Statements where disagreement indicates empathy:
Example: "I find it hard to know what to do in a social situation"
This bidirectional scoring ensures that participants can't simply agree or disagree with everything to artificially inflate their score. The mix of positive and negative items requires thoughtful, honest responses.
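The bidirectional scoring described above can be sketched in a few lines of Python. This is an illustrative sketch, not the official scoring key: it assumes the commonly described four-point response scale and 0/1/2 point scheme, and the item numbers and the `item_kinds` mapping are hypothetical placeholders rather than the real EQ item key.

```python
# Minimal sketch of direction-aware scoring (assumed 0/1/2 scheme, hypothetical item key).

# Points for a pro-empathy item: agreement earns points.
POSITIVE_POINTS = {
    "strongly agree": 2,
    "slightly agree": 1,
    "slightly disagree": 0,
    "strongly disagree": 0,
}

# Points for a reverse-scored item: disagreement earns points.
NEGATIVE_POINTS = {
    "strongly disagree": 2,
    "slightly disagree": 1,
    "slightly agree": 0,
    "strongly agree": 0,
}


def score_item(response: str, kind: str) -> int:
    """Score one statement; kind is 'positive', 'negative', or 'filler'."""
    if kind == "filler":
        return 0  # filler items never contribute to the total
    if kind == "positive":
        return POSITIVE_POINTS[response]
    if kind == "negative":
        return NEGATIVE_POINTS[response]
    raise ValueError(f"unknown item kind: {kind!r}")


def score_questionnaire(responses: dict[int, str], item_kinds: dict[int, str]) -> int:
    """Sum the points over every item listed in item_kinds."""
    return sum(score_item(responses[item], kind) for item, kind in item_kinds.items())


# Toy key with two items of each direction plus one filler (maximum 8 points).
toy_kinds = {1: "positive", 2: "positive", 3: "negative", 4: "negative", 5: "filler"}

agree_with_everything = {item: "strongly agree" for item in toy_kinds}
print(score_questionnaire(agree_with_everything, toy_kinds))  # 4 of 8
```

In this toy key, agreeing with every statement earns only half of the available points, because the reverse-scored items award nothing for agreement.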
Score range: 0-80 points total
Average male: 42 out of 80
Average female: 47 out of 80
Lower empathy (may indicate challenges with emotional recognition or social communication)
Average range for general population
Above average empathy
Very high empathy (significant strength in understanding others' emotions)
The EQ benchmark evaluates how well language models can reason about social and emotional scenarios described in text. Unlike vision-based tests, this measures a model's ability to process verbal descriptions of social situations and predict appropriate empathetic responses.
Key capabilities tested:
While AI models don't experience genuine emotions, strong performance on the EQ suggests sophisticated natural language understanding and social reasoning capabilities. These are essential for conversational AI, mental health chatbots, customer support systems, and any application requiring nuanced interpretation of human social dynamics.
Reference: Baron-Cohen, S., & Wheelwright, S. (2004). The empathy quotient: An investigation of adults with Asperger syndrome or high functioning autism, and normal sex differences. Journal of Autism and Developmental Disorders, 34(2), 163-175.
Learn more at Embrace Autism - Empathy Quotient
| Rank | Model | Score (out of 80) | |
|---|---|---|---|
| #1 | claude-3.7-sonnet | 45/80 56.3% | |
| #2 | grok-4-fast | 44/80 55.0% | |
| #3 | gemini-2.5-pro | 42/80 52.5% | |
| #3 | gpt-4.1-mini | 42/80 52.5% | |
| #3 | qwen3-vl-8b-instruct | 42/80 52.5% | |
| #3 | grok-4 | 42/80 52.5% | |
| #7 | claude-sonnet-4 | 38/80 47.5% | |
| #7 | mistral-medium-3.1 | 38/80 47.5% | |
| #7 | gpt-4.1-nano | 38/80 47.5% | |
| #10 | mistral-small-3.2-24b-instruct | 36/80 45.0% | |
| #10 | qwen3-vl-235b-a22b-instruct | 36/80 45.0% | |
| #12 | claude-sonnet-4.5 | 35/80 43.8% | |
| #12 | qwen3-vl-30b-a3b-thinking | 35/80 43.8% | |
| #14 | gemini-2.0-flash-001 | 34/80 42.5% | |
| #14 | gpt-4o-mini | 34/80 42.5% | |
| #16 | nova-lite-v1 | 33/80 41.3% | |
| #16 | nova-pro-v1 | 33/80 41.3% | |
| #16 | claude-haiku-4.5 | 33/80 41.3% | |
| #19 | claude-opus-4 | 32/80 40.0% | |
| #19 | gpt-4.1 | 32/80 40.0% | |
| #21 | claude-opus-4.1 | 31/80 38.8% | |
| #21 | gpt-5-pro | 31/80 38.8% | |
| #21 | gpt-5 | 31/80 38.8% | |
| #24 | gpt-5-mini | 30/80 37.5% | |
| #24 | qwen3-vl-8b-thinking | 30/80 37.5% | |
| #26 | mistral-small-3.1-24b-instruct | 29/80 36.3% | |
| #27 | claude-3.5-haiku | 28/80 35.0% | |
| #27 | gemini-2.5-flash | 28/80 35.0% | |
| #29 | gpt-5-nano | 27/80 33.8% | |
| #30 | gemini-2.5-flash-lite | 25/80 31.3% | |
| #30 | qwen3-vl-30b-a3b-instruct | 25/80 31.3% | |
| #32 | llama-4-maverick | 23/80 28.8% | |
| #33 | llama-4-scout | 16/80 20.0% |
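
The rank and percentage columns in the table can be reproduced with a short sketch. It assumes standard competition ranking (tied scores share a rank and the next rank skips ahead, as in #3, #3, #3, #3, #7) and percentages rounded half-up to one decimal place. The function name `rank_with_ties` is a hypothetical helper, not part of the benchmark's own tooling.

```python
from decimal import Decimal, ROUND_HALF_UP


def rank_with_ties(scores: dict[str, int], max_score: int = 80):
    """Return (rank, model, score, percent) rows using competition ranking."""
    # Highest score first; sorted() is stable, so tied models keep input order.
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    rows = []
    rank = 0
    previous_score = None
    for position, (model, score) in enumerate(ordered, start=1):
        if score != previous_score:
            rank = position  # a new score takes its 1-based position as rank
            previous_score = score
        # Decimal avoids binary floating-point surprises when rounding ties.
        percent = (Decimal(score) * 100 / Decimal(max_score)).quantize(
            Decimal("0.1"), rounding=ROUND_HALF_UP
        )
        rows.append((rank, model, score, percent))
    return rows


# First seven rows of the table above.
top_scores = {
    "claude-3.7-sonnet": 45,
    "grok-4-fast": 44,
    "gemini-2.5-pro": 42,
    "gpt-4.1-mini": 42,
    "qwen3-vl-8b-instruct": 42,
    "grok-4": 42,
    "claude-sonnet-4": 38,
}

for rank, model, score, percent in rank_with_ties(top_scores):
    print(f"#{rank} {model} {score}/80 {percent}%")
```

Using `Decimal` with an explicit rounding mode is a design choice here: values such as 56.25 sit exactly on a rounding boundary, and the built-in `round` would round them to the nearest even digit instead.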