R/data.R
first_names_race.Rd
A dataset of first names and, for each name, the probability that a person with that name belongs to each race. Citation for this data: Tzioumis, Konstantinos (2018). Demographic aspects of first names. Scientific Data, 5:180025 [dx.doi.org/10.1038/sdata.2018.25].
first_names_race
A data frame with 4,251 rows and 8 variables:
First name
The most likely race for the first name, based on the probability of each race
Probability that a person with the first name is American Indian
Probability that a person with the first name is Asian
Probability that a person with the first name is Black
Probability that a person with the first name is Hispanic
Probability that a person with the first name is White
Probability that a person with the first name is two or more races
https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/TYJKEZ
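
Examples

One plausible reading of the "most likely race" variable is that it names whichever probability column is largest for that first name. The sketch below illustrates that idea on a toy data frame. The column names (`name`, `probability_asian`, and so on) and all values are invented for illustration and are not taken from the dataset; whether the package derives the variable exactly this way, including how it breaks ties, is an assumption.

```r
# Toy rows with made-up probabilities; column names here are hypothetical
# and only mimic the layout described above.
toy <- data.frame(
  name              = c("example_a", "example_b"),
  probability_asian = c(0.10, 0.05),
  probability_black = c(0.20, 0.70),
  probability_white = c(0.70, 0.25),
  stringsAsFactors  = FALSE
)

# Select every probability column by its (assumed) prefix.
prob_cols <- grep("^probability_", names(toy), value = TRUE)

# max.col() gives, for each row, the index of the column with the largest
# value; ties.method = "first" makes ties resolve to the leftmost column.
best <- max.col(as.matrix(toy[prob_cols]), ties.method = "first")

# Turn the winning column name into a race label.
toy$likely_race <- sub("^probability_", "", prob_cols[best])

print(toy[c("name", "likely_race")])
```

The same steps could be applied to the real data frame once its actual column names are checked with `names(first_names_race)`.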