Correlation Matrix: Which States Have the Most Similar Lottery Patterns

April 10, 2026  ·  5 min read  ·  Tool Guides

What Correlation Means Here

The Correlation Matrix measures the statistical similarity between pairs of US states based on their digit frequency distributions over the last 90 days (adjustable). Specifically, it computes the Pearson correlation coefficient between two states' frequency vectors — a list of how often each digit (0–9) appeared in each state's draws over the selected window. A correlation of +1.0 means both states have identical relative frequency patterns. A correlation of 0 means no linear relationship. A correlation of −1.0 means the patterns are perfectly inverse (when one digit is high in State A, it tends to be low in State B).
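As a rough illustration of the calculation described above, here is a minimal Python sketch. The function names (`digit_frequencies`, `pearson`) and the sample draws are hypothetical and not taken from the tool itself. It assumes each draw is stored as a string of digits.

```python
from collections import Counter
from math import sqrt

DIGITS = "0123456789"

def digit_frequencies(draws):
    """Return a 10-element list: relative frequency of digits 0-9 across all draws."""
    counts = Counter(ch for draw in draws for ch in draw if ch in DIGITS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no digits found in draws")
    return [counts.get(d, 0) / total for d in DIGITS]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return numerator / denominator

# Made-up draws for two states, for illustration only.
state_a = ["123", "774", "019", "377"]
state_b = ["127", "745", "903", "771"]
r = pearson(digit_frequencies(state_a), digit_frequencies(state_b))
print(round(r, 3))
```

Because Pearson correlation subtracts each vector's mean and divides by its spread, it makes no difference whether the inputs are raw counts or relative frequencies.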

In practice, most state pairs will score between 0.3 and 0.9. Perfect +1.0 correlations only occur with very small samples. Negative correlations are theoretically possible but rare.

Reading the Matrix Heatmap

The Matrix tab displays a full N×N grid, where N is the number of states with sufficient data. Each cell represents one pair of states, and warmer colors correspond to higher correlation.

Clusters of warm cells in the matrix — a group of states forming a warm-colored sub-square — indicate a regional or structural similarity in how those states' lotteries have been producing digits over the selected window.
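The grid behind such a heatmap can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's actual code: the state labels, the digit counts, the 0.8 "warm" threshold, and the function names `correlation_matrix` and `warm_pairs` are all hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sum(a * b for a, b in zip(dx, dy)) / denominator

def correlation_matrix(freqs):
    """freqs maps state -> 10-element digit count vector. Returns (states, N x N matrix)."""
    states = sorted(freqs)
    n = len(states)
    matrix = [[1.0] * n for _ in range(n)]  # every state correlates perfectly with itself
    for i, j in combinations(range(n), 2):
        r = pearson(freqs[states[i]], freqs[states[j]])
        matrix[i][j] = matrix[j][i] = r  # the matrix is symmetric
    return states, matrix

def warm_pairs(states, matrix, threshold=0.8):
    """List the state pairs whose cell would be 'warm' at the given threshold."""
    return [
        (states[i], states[j], matrix[i][j])
        for i, j in combinations(range(len(states)), 2)
        if matrix[i][j] >= threshold
    ]

# Made-up digit counts (digits 0-9) for three states.
freqs = {
    "NY": [12, 9, 11, 8, 10, 13, 7, 10, 9, 11],
    "NJ": [11, 9, 12, 8, 10, 12, 8, 10, 9, 11],
    "TX": [8, 12, 9, 11, 10, 7, 13, 10, 11, 9],
}
states, matrix = correlation_matrix(freqs)
warm = warm_pairs(states, matrix)
for a, b, r in warm:
    print(a, b, round(r, 3))
```

A warm sub-square in the heatmap corresponds to a group of states in which every pair appears in the `warm_pairs` output.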

Similarity Ranker Tab

The Similarity Ranker tab provides a more practical interface for researching an individual state. Select any state from the dropdown and the page renders a ranked list of all other states, sorted by cosine similarity score. Cosine similarity is closely related to Pearson correlation (Pearson is the cosine similarity of the two vectors after each has had its mean subtracted) and is the measure used for this ranking. The states at the top of the list are the ones whose digit frequency profiles most closely resemble your selected state's profile over the time window. The states at the bottom are the most dissimilar.

Because frequency vectors contain no negative values, cosine similarity scores range from 0 to 1 in this context. Scores above 0.85 indicate strongly similar distributions; scores below 0.5 indicate weak similarity.
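A ranking of this kind could look roughly like the following Python sketch. The function names (`cosine_similarity`, `rank_similar`), the state labels, and the digit counts are hypothetical and used only for illustration.

```python
from math import sqrt

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)

def rank_similar(target, freqs):
    """Rank every other state by cosine similarity to the target, most similar first."""
    base = freqs[target]
    scores = [
        (state, cosine_similarity(base, vector))
        for state, vector in freqs.items()
        if state != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Made-up digit counts (digits 0-9) for three states.
freqs = {
    "NY": [12, 9, 11, 8, 10, 13, 7, 10, 9, 11],
    "NJ": [11, 9, 12, 8, 10, 12, 8, 10, 9, 11],
    "TX": [8, 12, 9, 11, 10, 7, 13, 10, 11, 9],
}
ranked = rank_similar("NY", freqs)
for state, score in ranked:
    print(state, round(score, 4))
```

Unlike Pearson correlation, cosine similarity does not subtract the mean first, so raw count vectors that are all roughly uniform tend to score close to 1. Whether the tool rescales or centers the vectors before ranking is not stated here.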

High vs Low Correlation

High correlation between two states means their digit frequency patterns have been similar over the window: when digit 7 was high in State A, it tended to be elevated in State B as well. Low correlation means the two patterns are unrelated, and negative correlation means they run in opposite directions. Neither is inherently better for prediction; the score is purely descriptive of what the data shows over the selected window.

What Causes State Similarities

State lottery equipment, number generation methods, and draw schedules all operate independently. High correlation is most likely statistical coincidence over a finite sample, particularly for short 30-day windows. Over 180-day windows, persistent high correlation between two states is more noteworthy but still does not imply a causal connection. Possible non-causal explanations include shared equipment vendors (some states use the same RNG or ball machine manufacturer), similar game configurations, or simply random sampling variation producing temporarily similar samples.
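To get a feel for how often coincidence alone produces a strong score, one can simulate two sources that are independent by construction. The sketch below is only an illustration: it assumes uniformly random digits, three digits per draw, and a cutoff of 0.5 for "strong", none of which come from the tool, and the function names are hypothetical.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation, or None when either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        return None
    return sum(a * b for a, b in zip(dx, dy)) / denominator

def simulate_counts(rng, n_draws, digits_per_draw=3):
    """Digit counts (0-9) for n_draws draws of uniformly random digits."""
    counts = [0] * 10
    for _ in range(n_draws * digits_per_draw):
        counts[rng.randrange(10)] += 1
    return counts

def share_of_strong_pairs(n_draws, trials=2000, cutoff=0.5, seed=0):
    """Fraction of simulated independent state pairs with |r| >= cutoff."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    strong = valid = 0
    for _ in range(trials):
        r = pearson(simulate_counts(rng, n_draws), simulate_counts(rng, n_draws))
        if r is None:
            continue  # skip the rare case where the correlation is undefined
        valid += 1
        if abs(r) >= cutoff:
            strong += 1
    return strong / valid if valid else 0.0

for n_draws in (30, 90, 180):
    print(n_draws, "draws:", round(share_of_strong_pairs(n_draws), 3))
```

Any pair that clears the cutoff in this simulation does so by chance, since the two simulated sources share nothing.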

Limitations

Correlation is a snapshot over the selected time window. States that appear highly correlated over 30 days may be uncorrelated over 180 days. Always check multiple time windows before drawing conclusions. The matrix also only reflects the games and draws in the database — states with sparse data will produce less reliable correlation scores. The Similarity Ranker flags states with low draw counts so you know when to treat the score with extra caution.
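The advice above can be sketched in Python as a check across several windows that also flags sparse data. Everything here is an assumption made for illustration: the `MIN_DRAWS` threshold of 30, the function names, the as-of date, and the randomly generated draws are not taken from the tool.

```python
import random
from collections import Counter
from datetime import date, timedelta
from math import sqrt

MIN_DRAWS = 30  # hypothetical cutoff for "low draw count"
DIGITS = "0123456789"

def digit_counts(draws, start, end):
    """Count digits in draws dated after start and up to end. Returns (counts, n_draws)."""
    window = [digits for day, digits in draws if start < day <= end]
    counter = Counter(ch for digits in window for ch in digits if ch in DIGITS)
    return [counter.get(d, 0) for d in DIGITS], len(window)

def pearson(x, y):
    """Pearson correlation, or None when either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        return None
    return sum(a * b for a, b in zip(dx, dy)) / denominator

def correlation_by_window(draws_a, draws_b, as_of, windows=(30, 90, 180)):
    """Correlation per window length (in days), with a low-data flag for each."""
    results = {}
    for days in windows:
        start = as_of - timedelta(days=days)
        counts_a, n_a = digit_counts(draws_a, start, as_of)
        counts_b, n_b = digit_counts(draws_b, start, as_of)
        results[days] = {
            "r": pearson(counts_a, counts_b),
            "draws": (n_a, n_b),
            "low_data": min(n_a, n_b) < MIN_DRAWS,
        }
    return results

# Synthetic data: state A draws daily, state B draws every third day.
rng = random.Random(1)
def fake_draw():
    return "".join(str(rng.randrange(10)) for _ in range(3))

as_of = date(2026, 4, 10)
draws_a = [(as_of - timedelta(days=i), fake_draw()) for i in range(180)]
draws_b = [(as_of - timedelta(days=i), fake_draw()) for i in range(0, 180, 3)]

results = correlation_by_window(draws_a, draws_b, as_of)
for days, info in results.items():
    print(days, info)
```

A score that changes sign or size sharply between windows, or that carries the low-data flag, is a reason to treat the pairing as noise.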

Disclaimer: This tool is for informational and entertainment purposes only. Lottery draws are random events and past results do not predict future outcomes. Play responsibly.
