Predict the race of a surname

predict_race(name, probability = TRUE)

Arguments

name

A string or vector of strings containing the surname(s) whose race you want to predict.

probability

If TRUE (default), returns a column for each race with the probability that the surname is of that race. If FALSE, returns only the name, the matched name from the Census data, and the most likely race.

Value

A data.frame with three or nine columns. The first column has the name as entered, the second has the cleaned-up name (no spaces or punctuation, all lowercase), and the third gives the most likely race of the surname. If probability is FALSE, only these three columns are returned. Otherwise, columns 4-9 give the probability that the surname is of each race.
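The cleaning step described for the second column can be approximated in base R. This is a minimal sketch, assuming the cleanup amounts to removing whitespace and punctuation and lowercasing; `clean_surname` is a hypothetical helper for illustration, not a function from the package, and the package's actual rules may differ.

```r
# Hypothetical helper: approximates the cleanup described above
# (no spaces or punctuation, all lowercase). Not the package's own code.
clean_surname <- function(name) {
  # Remove whitespace and punctuation characters, then lowercase.
  tolower(gsub("[[:space:][:punct:]]", "", name))
}

cleaned <- clean_surname(c("Washington", "O'Brien", "De La Cruz"))
cleaned
# returns c("washington", "obrien", "delacruz")
```

A sketch like this can help when checking why an input name did or did not match a Census surname.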

Examples

predict_race("franklin")
#>       name match_name likely_race probability_american_indian probability_asian
#> 1 franklin   franklin       white                      0.0083            0.0054
#>   probability_black probability_hispanic probability_white probability_2races
#> 1            0.3876                0.027            0.5438             0.0278
predict_race(c("franklin", "Washington", "Jefferson", "Sotomayor", "Liu"))
#>         name match_name likely_race probability_american_indian
#> 1   franklin   franklin       white                      0.0083
#> 2 Washington washington       black                      0.0068
#> 3  Jefferson  jefferson       black                      0.0190
#> 4  Sotomayor  sotomayor       white                      0.0000
#> 5        Liu        liu       asian                      0.0002
#>   probability_asian probability_black probability_hispanic probability_white
#> 1            0.0054            0.3876               0.0270            0.5438
#> 2            0.0030            0.8753               0.0254            0.0517
#> 3            0.0040            0.7424               0.0247            0.1745
#> 4            0.0067            0.0062               0.9071            0.0774
#> 5            0.9562            0.0017               0.0054            0.0179
#>   probability_2races
#> 1             0.0278
#> 2             0.0378
#> 3             0.0354
#> 4             0.0026
#> 5             0.0186
predict_race("franklin", probability = FALSE)
#>       name match_name likely_race
#> 1 franklin   franklin       white
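To show how the probability columns relate to likely_race, the sketch below copies the six probabilities from the predict_race("franklin") example into a plain named vector and picks the largest. It runs without the package. The vector names are shorthand chosen here (two_races stands in for probability_2races), and whether the package always reports the highest-probability race is an assumption this sketch does not confirm; it only checks the franklin row.

```r
# Probabilities copied from the predict_race("franklin") example output.
# Names are illustrative shorthand, not the package's column names.
probs <- c(
  american_indian = 0.0083,
  asian           = 0.0054,
  black           = 0.3876,
  hispanic        = 0.0270,
  white           = 0.5438,
  two_races       = 0.0278
)

# Label with the highest probability; for this row it matches likely_race.
top_race <- names(probs)[which.max(probs)]
top_race

# The six probabilities should sum to roughly 1, allowing for rounding.
total <- sum(probs)
total
```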