A dataset of over 167 thousand surnames giving, for each surname, the probability that a person with that surname belongs to each race.

`surnames_race`

A data frame with 167,408 rows and 8 variables:

- name
Surname

- likely_race
The race with the highest probability for this surname

- probability_american_indian
Probability that a person with this surname is American Indian

- probability_asian
Probability that a person with this surname is Asian

- probability_black
Probability that a person with this surname is Black

- probability_hispanic
Probability that a person with this surname is Hispanic

- probability_white
Probability that a person with this surname is White

- probability_2races
Probability that a person with this surname is of two or more races

...

https://www.census.gov/topics/population/genealogy/data/2010_surnames.html

https://www.census.gov/topics/population/genealogy/data/2000_surnames.html
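
As an illustration of how `likely_race` relates to the six probability columns, here is a minimal sketch in plain Python. It assumes that `likely_race` is simply the column with the largest probability, with the `probability_` prefix removed. The helper `most_likely_race`, the example rows, and the tie-breaking rule (first column in the listed order wins) are all hypothetical and not taken from the dataset.

```python
# Column names as documented for surnames_race.
PROBABILITY_COLUMNS = [
    "probability_american_indian",
    "probability_asian",
    "probability_black",
    "probability_hispanic",
    "probability_white",
    "probability_2races",
]

PREFIX = "probability_"


def most_likely_race(row):
    """Return the label of the probability column with the largest value.

    Assumption: ties go to the column listed first in PROBABILITY_COLUMNS;
    the real dataset may treat ties differently.
    """
    best_column = max(PROBABILITY_COLUMNS, key=lambda col: row[col])
    return best_column[len(PREFIX):]


# Hypothetical rows for illustration only; these values are made up
# and do NOT come from surnames_race.
example_rows = [
    {
        "name": "example_a",
        "probability_american_indian": 0.01,
        "probability_asian": 0.02,
        "probability_black": 0.10,
        "probability_hispanic": 0.05,
        "probability_white": 0.80,
        "probability_2races": 0.02,
    },
    {
        "name": "example_b",
        "probability_american_indian": 0.00,
        "probability_asian": 0.90,
        "probability_black": 0.01,
        "probability_hispanic": 0.02,
        "probability_white": 0.05,
        "probability_2races": 0.02,
    },
]

labels = [most_likely_race(row) for row in example_rows]
```

With these made-up rows, the first surname is labeled by its `probability_white` column and the second by its `probability_asian` column, since those hold the largest values in each row.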