Building Psychohistory: A Data Science Journey
A Note on Methodology
This project uses polity duration as a proxy for stability. After further reading in the cliodynamics literature, I've come to understand this is a limited approach—duration is often determined arbitrarily and conflates different mechanisms (conquest, fragmentation, succession crises). More rigorous work uses direct instability measures like ruler transition outcomes.
I'm preserving this analysis as an exploratory starting point while working toward more theory-grounded methods. The patterns are real; the interpretation is evolving.
In Isaac Asimov's Foundation, mathematician Hari Seldon develops "psychohistory," a science that predicts the behavior of large civilizations over centuries. It's fiction. But what if I tried to build something like it?
I spent the past few months feeding 10,000 years of civilizational data into machine learning models. The goal wasn't to predict the future — it was to understand patterns in the past. What makes some civilizations last centuries while others collapse within decades?
A note on terminology: I use "short-duration" to describe polities lasting below the median (184 years). This isn't a value judgment — short-lived polities aren't "failures." They're data points that help us understand patterns in civilizational dynamics. This is exploratory analysis, not causal inference.
The Dataset: Seshat Global History Databank
This project uses the Seshat Equinox 2022 dataset — a systematic compilation of historical and archaeological data that I've filtered down to 256 polities across 10,000 years. Each civilization is coded for dozens of variables: administrative hierarchy, military technology, religious practices, infrastructure, and more.
The dataset represents a monumental effort by historians, archaeologists, and data scientists to quantify the qualitative. It's imperfect (all historical data is), but it's the most comprehensive attempt to make civilizational patterns analyzable.
The Model: From Chance to Signal
I started with a simple hypothesis: more complex societies should be more fragile. Bureaucracies calcify, elites extract, coordination costs balloon. This is Joseph Tainter's classic argument from The Collapse of Complex Societies.
The first model, using only complexity features, performed barely better than random chance. An AUC of 0.505 is essentially a coin flip.
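For intuition about what that number means: AUC is the probability that the model ranks a randomly chosen long-duration polity above a randomly chosen short-duration one, so 0.505 really is a coin flip. A minimal numpy sketch of the metric, using made-up labels and random scores rather than the Seshat data:

```python
import numpy as np

def auc_score(y_true, y_score):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive outranks a randomly chosen negative."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos, neg = y_score[y_true], y_score[~y_true]
    # Count pairwise wins; ties count as half a win.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=256).astype(bool)  # long vs. short duration
noise_scores = rng.random(256)                      # an uninformative "model"
print(auc_score(labels, noise_scores))              # hovers near 0.5
```

An uninformative scorer lands near 0.5; a scorer that perfectly separates the classes lands at 1.0. The first complexity-only model sat at the uninformative end.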
Model Evolution
Then something interesting happened. When I added warfare technology variables, performance jumped to 0.648. Adding religion pushed it to ~0.67 (CV mean). The model was learning something real.
The Three-Mechanism Model
The final model combines three mechanism categories. Hover over each to see how it contributes to predictions.
Arrows show correlational patterns, not confirmed causal effects.
Model Performance
A ~0.67 AUC (CV mean: 0.66 ± 0.06) won't predict specific civilizational fates. But it's strong enough to suggest these features capture genuine patterns in historical dynamics.
Critical caveat: When I tested temporal generalization (leave-one-era-out), performance dropped to 0.57 AUC—barely above chance. This means the model learns era-specific patterns, not universal laws. A model trained on Ancient/Classical/Medieval data struggles to predict Early Modern outcomes. The "rules" change across historical periods.
On model variance: Cross-validation shows AUC ranging from 0.51 to 0.76 depending on the data split (mean: 0.66 ± 0.06). With 256 samples, high variance is expected. The signal is real, but the precision isn't.
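The leave-one-era-out check amounts to grouping polities by era and holding out one whole era at a time. A small sketch of that splitter in plain numpy, with hypothetical era labels standing in for the real dataset:

```python
import numpy as np

def leave_one_group_out(groups):
    """Yield (group, train_idx, test_idx), holding out one group at a time.
    Mirrors leave-one-era-out: train on all other eras, test on the held-out one."""
    groups = np.asarray(groups)
    for g in np.unique(groups):
        test = np.flatnonzero(groups == g)
        train = np.flatnonzero(groups != g)
        yield g, train, test

# Hypothetical era labels for a handful of polities.
eras = ["Ancient", "Ancient", "Classical", "Classical", "EarlyModern"]
for era, train, test in leave_one_group_out(eras):
    print(era, train.tolist(), test.tolist())
```

If a model's held-out AUC collapses toward 0.5 under this split, as it did here, it is leaning on era-specific patterns rather than rules that transfer across periods.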
Finding #1: Religion Shows Surprising Associations
The most statistically robust result involves religion (survives FDR correction at p < 0.001). Religious variables collectively account for 27% of model decisions, making them the dominant feature category. But the direction is unexpected.
Feature Importance
How much each feature contributes to model decisions (not direction)
Note: Importance measures how often the model uses a feature for decisions, not whether high values help or hurt stability. Hover over bars for details.
However, feature importance doesn't tell us direction. When I analyzed the actual coefficients, the picture got more nuanced:
- Total religious institutionalization correlates with shorter duration (HR = 1.58)—this is counterintuitive and survives FDR correction. But correlation isn't causation: this could reflect confounding with era, literacy, or other unmeasured variables.
- Ideology sub-scores show context-dependent patterns but don't survive FDR correction individually (exploratory finding).
- The direction needs explanation: Why would more religion associate with shorter duration? Possible confounds: later eras have more documented religion AND shorter-lived polities. Or reverse causality: societies facing instability may elaborate religious frameworks for legitimacy.
The takeaway: religious factors matter enormously, but the relationship is nonlinear. It's not "more religion = more stability." It's conditional and context-dependent.
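For readers unfamiliar with hazard ratios: the HR notation above implies a Cox-style proportional hazards model, where a coefficient beta maps to HR = exp(beta). A short arithmetic sketch of how HR = 1.58 reads (the 1.58 is from the text; everything else is just the standard conversion):

```python
import numpy as np

# In a Cox-style model, coefficient beta gives hazard ratio HR = exp(beta).
# HR = 1.58 therefore corresponds to beta = ln(1.58), and reads as a ~58%
# higher instantaneous "ending" hazard per unit of religious
# institutionalization -- an association, not a causal effect.
hr = 1.58
beta = np.log(hr)                   # implied Cox coefficient
pct_higher_hazard = (hr - 1) * 100  # ~58% higher hazard per unit increase
print(round(beta, 3), round(pct_higher_hazard, 1))
```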
Finding #2: Era Trumps Geography
I expected civilizations to cluster by region — Mediterranean empires with Mediterranean empires, Chinese dynasties with Chinese dynasties. Instead, they cluster by time.
More importantly, the relationship between complexity and duration completely changes across historical periods:
The Era Effect
How complexity affects duration changes dramatically across historical periods
In the Ancient world (pre-500 BCE), each unit of complexity reduced expected duration by ~159 years. By the Early Modern period (1500+ CE), the relationship had reversed — complexity slightly helped.
What changed? Possibly writing, institutional memory, military technology, trade networks — the infrastructure that lets complex societies maintain themselves. The "rules" of civilizational survival aren't fixed; they evolve with humanity's toolkit.
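Era-dependent slopes like these are typically estimated with an interaction term: regress duration on complexity, an era indicator, and their product, so each era gets its own slope. A sketch on synthetic data (the generating slopes below are chosen to echo the reported numbers, not taken from the actual fit):

```python
import numpy as np

# Synthetic illustration of an interaction regression. Ground truth: slope
# -159 years per unit of complexity in the "Ancient" era, +20 in "Early Modern".
rng = np.random.default_rng(42)
n = 200
complexity = rng.uniform(1, 10, n)
early_modern = rng.integers(0, 2, n)   # 0 = Ancient, 1 = Early Modern
duration = (2000 - 159 * complexity
            + (159 + 20) * complexity * early_modern
            + rng.normal(0, 50, n))    # noise

# Design matrix: intercept, complexity, era, complexity x era interaction.
X = np.column_stack([np.ones(n), complexity, early_modern,
                     complexity * early_modern])
beta, *_ = np.linalg.lstsq(X, duration, rcond=None)

ancient_slope = beta[1]                 # recovers roughly -159
early_modern_slope = beta[1] + beta[3]  # recovers roughly +20: sign flips
print(ancient_slope, early_modern_slope)
```

The interaction coefficient (beta[3]) is what carries the "rules change across eras" signal: when it is large enough to flip the sign of the slope, the same variable helps in one period and hurts in another.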
This era-dependence suggests that "complexity" as a single variable is too crude. More sophisticated frameworks distinguish between social scale (population, territory) and institutional capacity (bureaucracy, information systems)—which may have different, even opposing effects on stability depending on historical context.
Finding #3: Warfare Technology Matters
Adding military variables (weapons, fortifications, cavalry, armor) improved model performance by 28%. But the effects are mixed:
- Cavalry and armor show slight stabilizing effects
- Fortifications show slight destabilizing effects (possibly reflecting defensive postures of declining states)
- Total warfare tech slightly destabilizes on average
In the Ancient era, advanced warfare amplified the complexity curse. In the Classical period (500 BCE – 500 CE), it moderated it dramatically — a +0.634 moderation effect. Complex Classical societies with strong militaries outlasted their simpler neighbors.
The Classical era emerges as special across multiple analyses. This was the age of Rome, Han China, Persia — empires that combined bureaucratic sophistication with military innovation. Perhaps that combination, in that historical moment, represented a sweet spot.
Limitations (Honest Assessment)
- Sample size: 256 polities sounds like a lot until you stratify by era. Some subgroups have fewer than 50 cases.
- Selection bias: Seshat skews toward well-documented societies. We know more about Rome than about countless chiefdoms that left no written records.
- Causality unknown: These are correlations. We can't run experiments on civilizations. Reverse causality is always possible.
- Feature definitions: What counts as "ideological cohesion" in 2000 BCE Egypt vs 1500 CE Spain? Coding decisions shape results.
- Survivorship issues: We analyze polities that existed long enough to be recorded. The truly unstable ones may be invisible.
- Model variance: Cross-validation shows high variance (0.51–0.76 AUC, mean 0.66 ± 0.06). Temporal holdout (LOEO) yields only 0.57 AUC.
This isn't predictive science; it's pattern recognition in historical data. The model reveals correlations worth investigating, not laws of civilizational dynamics.
What This Isn't
- Not prediction: This model cannot forecast which modern nations will "collapse." Historical patterns don't transfer to contemporary societies with different institutions, technology, and global connectivity.
- Not causal inference: We found associations, not causes. "Religion correlates with shorter duration" doesn't mean religion causes instability — it could be confounded, reversed, or spurious.
- Not universal laws: The weak temporal holdout (LOEO AUC = 0.57) suggests era-specific patterns, not timeless rules. What mattered in 500 BCE may not matter in 1500 CE.
- Not definitive: After FDR correction, only 7 of 13 "significant" findings survive. Many results are exploratory and need independent replication.
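The FDR correction mentioned above is the Benjamini-Hochberg step-up procedure: sort the p-values, find the largest k where the k-th smallest p-value is at most (k/m)·alpha, and keep those k findings. A self-contained numpy sketch, using illustrative p-values rather than the study's actual ones:

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean mask of p-values surviving FDR correction.
    Step-up rule: largest k with p_(k) <= (k/m) * alpha sets the cutoff."""
    p = np.asarray(pvals, float)
    m = len(p)
    order = np.argsort(p)
    thresholds = (np.arange(1, m + 1) / m) * alpha
    below = p[order] <= thresholds
    survive = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.flatnonzero(below))   # largest rank meeting the bound
        survive[order[: k + 1]] = True      # keep everything up to that rank
    return survive

# Hypothetical p-values for 13 tests (illustrative only).
pvals = [0.001, 0.002, 0.003, 0.004, 0.01, 0.012, 0.02,
         0.04, 0.05, 0.08, 0.2, 0.5, 0.9]
print(benjamini_hochberg(pvals).sum())  # 7 of 13 survive here
```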
Think of this as hypothesis generation, not hypothesis confirmation. The patterns here suggest directions for future research, not conclusions to act on.
Try It Yourself
The Polity Simulator lets you configure a hypothetical civilization and find historically similar societies. Pick an era, adjust complexity, warfare, and religion parameters, and see which polities from the database most closely match your configuration.
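The matching idea behind the simulator can be sketched as nearest-neighbor search on standardized features: z-score each variable so no single scale dominates, then return the polity closest in Euclidean distance. The polity names and feature values below are placeholders, not real Seshat codings:

```python
import numpy as np

# Toy database: three polities, columns = complexity, warfare, religion scores.
polities = ["Roman Principate", "Han China", "Old Kingdom Egypt"]
features = np.array([
    [8.0, 9.0, 6.0],
    [9.0, 8.0, 7.0],
    [6.0, 4.0, 8.0],
])

def nearest_polity(query):
    """Standardize features, then return the Euclidean-nearest polity."""
    mu, sigma = features.mean(axis=0), features.std(axis=0)
    z = (features - mu) / sigma                # z-score the database
    q = (np.asarray(query, float) - mu) / sigma
    dists = np.linalg.norm(z - q, axis=1)
    return polities[int(np.argmin(dists))]

print(nearest_polity([8.2, 8.8, 6.2]))  # a hypothetical configuration
```

Standardizing first matters: without it, whichever feature happens to span the widest numeric range would dominate the distance.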
Is this true psychohistory? No. Asimov's fictional science could predict specific futures. This project can only identify patterns in the past—and as the weak temporal holdout shows, those patterns may not even generalize across eras.
Where This Goes Next
The limitations above aren't just caveats—they're research directions. Better target variables (direct instability measures rather than duration), better feature decomposition (separating scale from institutional capacity), and better validation (testing whether patterns hold outside the training period) would all improve this work. I'm actively learning from the cliodynamics literature to get there.