Methodology
Full transparency about what we measure, how we score, and where our boundaries are. No black boxes.
What this test measures
This test measures Fluid Intelligence (Gf) — the ability to reason with novel, abstract stimuli independent of prior knowledge. Items are procedurally generated Raven-style progressive matrices, presented adaptively based on your response accuracy.
The ability assessed:
- Fluid reasoning — identifying patterns in novel, abstract stimuli without relying on prior knowledge
Not measured: crystallized knowledge, working memory, spatial rotation, processing speed, creativity, or personality. These require different instruments.
Your score reflects performance on fluid reasoning tasks under self-administered conditions. It is not a clinical IQ diagnosis and should not be treated as one.
Theoretical foundation
CHC Model (Cattell–Horn–Carroll)
The test is grounded in the Cattell–Horn–Carroll (CHC) theory — the most widely accepted and empirically supported framework for understanding cognitive abilities. The CHC model organizes intelligence into three hierarchical strata: a general factor (g), broad abilities (Gf, Gc, Gv, Gwm, Gs, and others), and narrow abilities measured by individual tasks. It emerged from decades of factor-analytic studies and underpins instruments like the Woodcock–Johnson, KABC-II, and the DAS-II.
Broad abilities covered
Gf — Fluid Intelligence
Reasoning with novel stimuli, inductive and deductive logic, pattern completion
How this compares to clinical assessment
Gold-standard instruments like the WAIS-IV are administered one-on-one by a licensed psychologist, take 60–90 minutes, and involve verbal interaction, manipulatives, and adaptive item selection. Our test is self-administered online in roughly 5 minutes and targets a single broad construct (Gf) under less controlled conditions, with fewer items. This means our results are best understood as a screening estimate — not a replacement for clinical evaluation.
| Clinical indices (e.g. WAIS-IV) | Our test |
|---|---|
| VCI — Verbal Comprehension | Not measured |
| PRI — Perceptual Reasoning | Fluid Reasoning (Gf) |
| WMI — Working Memory | Not measured |
| PSI — Processing Speed | Not measured |
Test structure
| Parameter | Value |
|---|---|
| Subscales | 1 (Fluid Reasoning) |
| Item type | Raven-style progressive matrices (procedurally generated) |
| Items per session | 8–20, adaptive (precision-driven stop) |
| Scoring method | Bayesian IRT (two-parameter logistic, 2PL) |
| Duration | approx. 5 minutes |
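To make the adaptive, precision-driven design concrete, here is a minimal sketch of the two ingredients a 2PL adaptive test rests on: the item response function and an information-based stopping rule. The specific thresholds (`se_target=0.35` and the information-maximizing selection it implies) are illustrative assumptions, not the live system's values; only the 8–20 item bounds come from the table above.

```python
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """2PL item response function: probability that a test-taker with
    ability theta answers an item with discrimination a and difficulty b
    correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information the item contributes at ability theta; adaptive
    tests typically administer the most informative remaining item."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def should_stop(total_information: float, items_given: int,
                min_items: int = 8, max_items: int = 20,
                se_target: float = 0.35) -> bool:
    """Precision-driven stop: end the session once the standard error of
    the ability estimate (1 / sqrt(total information)) falls below a
    target, subject to the 8-20 item bounds. se_target is an assumed
    illustrative value."""
    se = float("inf") if total_information == 0 else 1.0 / math.sqrt(total_information)
    if items_given < min_items:
        return False
    return items_given >= max_items or se <= se_target
```

Note how the stopping rule adapts session length to the test-taker: precise response patterns accumulate information quickly and end near 8 items, while noisier patterns run toward the 20-item cap.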
Scoring and interpretation
Raw scores are converted to a standardized IQ-equivalent scale using age-group–specific norms:
- Population mean (M): 100
- Standard deviation (SD): 15
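The conversion itself is a linear rescaling. This sketch assumes the Bayesian ability estimate θ is expressed on a standard-normal scale relative to the age-group norm (a common convention, though the live norming pipeline may differ), and includes a helper for the ±5-point confidence band reported with each result.

```python
def theta_to_iq(theta: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Map a latent ability estimate theta (assumed standard-normal
    relative to the age-group norm) to the IQ-equivalent scale with
    M = 100, SD = 15."""
    return mean + sd * theta

def iq_confidence_interval(iq: float, half_width: float = 5.0) -> tuple:
    """Attach the +/-5-point confidence band this page reports,
    reflecting the standard error of measurement."""
    return (iq - half_width, iq + half_width)
```

For example, an estimate one standard deviation above the age-group mean maps to 115, reported as the interval 110–120.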
Every result is reported with a confidence interval (±5 points) reflecting the standard error of measurement — the range in which your true score most likely falls.
The percentile rank tells you what percentage of same-age test-takers scored below you. A percentile of 75 means you scored higher than 75% of your age group.
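Under the normal-distribution assumption that the M = 100, SD = 15 scale implies, the percentile rank follows directly from the normal CDF. A minimal sketch:

```python
import math

def percentile_rank(iq: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Percentage of same-age test-takers expected to score below the
    given IQ, assuming scores are normally distributed with the stated
    mean and standard deviation."""
    z = (iq - mean) / sd
    # Standard normal CDF via the error function.
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

A score of 100 sits at the 50th percentile by construction; a score of 110 lands near the 75th percentile, matching the example above.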
Scoring uses age-specific norms informed by CHC fluid-intelligence research. Results are expressed relative to your age group (M = 100, SD = 15), and norms are refined continuously as empirical data accumulate for each age bracket.
Reliability and psychometric quality
Psychometric indicators are computed live from all completed sessions and will stabilize as the sample grows. We publish Cronbach's alpha — a widely used measure of internal consistency in psychological testing.
Item development and quality control
Every item goes through a rigorous multi-stage pipeline before entering the live test:
1. Design: items are authored to target specific CHC narrow abilities at defined difficulty levels (1–5 scale)
2. Pilot: items are deployed to an initial sample and flagged if response patterns are anomalous
3. Calibration: item difficulty and discrimination indices are computed; items outside acceptable ranges are revised or discarded
4. Balance: the final pool is checked for an even difficulty distribution — no difficulty band is over- or under-represented
5. Monitoring: live item statistics are tracked continuously; degraded items are flagged for review automatically
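The calibration gate in step 3 can be sketched as a simple rule over the 2PL parameters. The thresholds here (`b_range`, `a_min`) are illustrative assumptions, not the pipeline's actual acceptance criteria:

```python
def flag_item(difficulty: float, discrimination: float,
              b_range: tuple = (-3.0, 3.0), a_min: float = 0.5) -> list:
    """Illustrative calibration gate: flag an item whose 2PL difficulty
    (b) falls outside a plausible range or whose discrimination (a) is
    too weak to contribute useful information. Thresholds are assumed
    values for the sketch."""
    flags = []
    if not (b_range[0] <= difficulty <= b_range[1]):
        flags.append("difficulty out of range")
    if discrimination < a_min:
        flags.append("low discrimination")
    return flags
```

Items returning an empty flag list pass to the balance check; flagged items go back for revision or are discarded.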
Limitations and ethical use
No test is perfect. Online self-administration introduces variables that clinical settings control for.
Factors that can influence results:
- Device and connection quality — latency can affect timed items; a stable desktop is ideal
- Testing environment — distractions, noise, and interruptions reduce accuracy
- State factors — fatigue, stress, medication, and motivation affect cognitive performance
- Practice effects — prior exposure to similar tests can inflate scores by 3–7 points
This test is a screening tool. It provides a reliable estimate of performance on specific cognitive tasks but is not a clinical assessment. It should not be used for: psychiatric diagnosis, employee selection, educational placement, or legal proceedings. If you need a formal evaluation, consult a licensed psychologist.
References
Our methodology is informed by foundational and contemporary research in differential psychology:
- Carroll, J. B. (1993). Human Cognitive Abilities: A Survey of Factor-Analytic Studies. Cambridge University Press.
- Cattell, R. B. (1963). Theory of fluid and crystallized intelligence. Journal of Educational Psychology, 54(1), 1–22.
- Horn, J. L., & Cattell, R. B. (1966). Refinement and test of the theory of fluid and crystallized intelligence. Journal of Educational Psychology, 57(5), 253–270.
- McGrew, K. S. (2009). CHC theory and the human cognitive abilities project. Intelligence, 37(1), 1–10.
- Deary, I. J. (2012). Intelligence. Annual Review of Psychology, 63, 453–482.
Test version: 1.0 | Last updated: 2026-04-01