A dataset containing more than 167,000 surnames and, for each surname, the probability that a person with that surname belongs to each race.
surnames_race
A data frame with 167,408 rows and 8 variables:
Surname
The most likely race based on the probability of each race
Probability that a person with the surname is American Indian
Probability that a person with the surname is Asian
Probability that a person with the surname is Black
Probability that a person with the surname is Hispanic
Probability that a person with the surname is White
Probability that a person with the surname is two or more races
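The relationship between the "most likely race" variable and the six probability variables can be sketched as follows. This is a minimal illustration in Python, assuming the most likely race is simply the category with the highest probability. The column names and the example values are hypothetical and are not taken from the dataset.

```python
# Hypothetical column names; the names in the actual data frame may differ.
RACE_COLUMNS = [
    "american_indian",
    "asian",
    "black",
    "hispanic",
    "white",
    "two_or_more",
]


def most_likely_race(row):
    """Return the race column holding the highest probability in one row.

    On a tie, the column listed first in RACE_COLUMNS is returned.
    """
    return max(RACE_COLUMNS, key=lambda col: row[col])


# Made-up probabilities for illustration only, not values from the dataset.
example_row = {
    "surname": "EXAMPLE",
    "american_indian": 0.01,
    "asian": 0.02,
    "black": 0.10,
    "hispanic": 0.05,
    "white": 0.80,
    "two_or_more": 0.02,
}

likely = most_likely_race(example_row)
```

How the dataset itself resolves ties or missing probabilities is not stated here, so the tie-breaking rule in this sketch is an assumption.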
https://www.census.gov/topics/population/genealogy/data/2010_surnames.html
https://www.census.gov/topics/population/genealogy/data/2000_surnames.html