## The bias of the unbiased

A hilarious paper from Stanford shows the bias of the unbiased [ Proc. Natl. Acad. Sci. vol. 115 pp. E3635 – E3644 ’18 ].  No one wants to be considered biased or to use stereotypes, but this paper indicts all of us.  The authors use a technique called word embedding to look at a large body of printed material (Wikipedia, Google News articles, etc.) over the past 100 years, looking for word associations, e.g. male with trustworthy, female with submissive, and the like.  In word embedding models, each word in a given language is associated with a high-dimensional vector (it’s not clear to me how the dimensions are chosen), and the distance between words is measured with a metric.  A metric is simply a mathematical device that takes two objects and associates a number with them.  The distance between cities is a good example.

The vector for France is close to the vectors for Austria and Italy.  The difference between London and England (obtained by subtracting their vectors) is parallel to the difference between Paris and France.  This allows embeddings to capture analogy relationships such as London is to England as Paris is to France.
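The vector arithmetic is easier to see with numbers.  What follows is a minimal sketch, not the paper's code: the three-dimensional vectors are made up for illustration (real embeddings are learned from text and have far more dimensions), and the helper names (`distance`, `cosine_similarity`, `complete_analogy`) are hypothetical.  It uses ordinary Euclidean distance as the metric, which is an assumption on my part.

```python
import math

# Hypothetical 3-dimensional toy vectors, invented for illustration only.
# Real embeddings are learned from text and have far more dimensions.
VECTORS = {
    "London":  [1.0, 3.0, 0.5],
    "England": [1.0, 1.0, 0.5],
    "Paris":   [2.0, 3.0, 1.0],
    "France":  [2.0, 1.0, 1.0],
    "Italy":   [2.5, 1.0, 1.2],
    "banana":  [-1.0, 0.2, 4.0],
}


def subtract(a, b):
    """Component-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def add(a, b):
    """Component-wise sum a + b."""
    return [x + y for x, y in zip(a, b)]


def norm(a):
    """Length of a vector."""
    return math.sqrt(sum(x * x for x in a))


def distance(a, b):
    """Euclidean distance: one possible metric between two word vectors."""
    return norm(subtract(a, b))


def cosine_similarity(a, b):
    """1.0 means the two vectors point in exactly the same direction."""
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))


def complete_analogy(a, b, c):
    """Return the word d that best fits 'a is to b as c is to d'."""
    target = add(subtract(VECTORS[b], VECTORS[a]), VECTORS[c])
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return min(candidates, key=lambda w: distance(VECTORS[w], target))


# How parallel are (London - England) and (Paris - France)?
offset_uk = subtract(VECTORS["London"], VECTORS["England"])
offset_fr = subtract(VECTORS["Paris"], VECTORS["France"])
parallelism = cosine_similarity(offset_uk, offset_fr)

# London is to England as Paris is to ...?
answer = complete_analogy("London", "England", "Paris")
```

With these toy numbers the two offsets point the same way, so the analogy resolves to France, and France sits much nearer to Italy than to an unrelated word.  Real embeddings are noisier, so the offsets are only approximately parallel.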

So word embeddings were used as a way to study gender and ethnic stereotypes in the 20th and 21st centuries in the USA.  Not only that but they plotted how the biases changed over time.

So in your mind the metric is clear: bias == bad, stereotype == worse.

So just as women’s occupations have changed, so have the descriptors of women.  Back in the day women, if they worked outside the home at all, were teachers or nurses.  A descendant of Jonathan Edwards was a grade school teacher in the town of my small rural high school.

As women moved into the wider workforce, the descriptors of them changed.  The following is a pair of direct quotes from the article.

“More importantly, these correlations are very similar over the decades, suggesting that the relationship between embedding bias score and “reality,” as measured by occupation participation, is consistent over time” … “This consistency makes the interpretation of embedding bias more reliable; i.e., a given bias score corresponds to approximately the same percentage of the workforce in that occupation being women, regardless of the embedding decade.”

English translation:  As women’s percentage of workers in a given occupation changed the ‘bias score’ changed with it.
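To make that concrete, here is a rough sketch of what correlating a bias score with workforce numbers might look like.  Everything in it is an assumption: the quotes above do not say how the score is computed, so I use a simple difference of distances (how much nearer an occupation word sits to "woman" than to "man"), and both the two-dimensional vectors and the percentages are made up.

```python
import math

# Hypothetical 2-dimensional vectors and made-up workforce percentages,
# invented purely for illustration; none of these numbers come from the paper.
WOMAN = [1.0, 0.0]
MAN = [0.0, 1.0]
OCCUPATIONS = {
    # word: (toy vector, made-up percent of workers who are women)
    "nurse":    ([0.9, 0.1], 90.0),
    "teacher":  ([0.7, 0.3], 70.0),
    "lawyer":   ([0.4, 0.6], 40.0),
    "engineer": ([0.2, 0.8], 15.0),
}


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bias_score(vector):
    """Assumed score: positive when the word is nearer to WOMAN than to MAN."""
    return distance(vector, MAN) - distance(vector, WOMAN)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


scores = [bias_score(vec) for vec, _ in OCCUPATIONS.values()]
percent_women = [pct for _, pct in OCCUPATIONS.values()]
correlation = pearson(scores, percent_women)
```

In this toy setup the score rises and falls with the percentage of women in the occupation, so the correlation comes out strongly positive.  That is the pattern the quoted passage describes, here built in by construction rather than measured.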

So what the authors describe and, worse, define as bias and stereotyping is actually an accurate perception of reality.  We’re all guilty.

The authors are following Humpty Dumpty in Through the Looking-Glass: “When I use a word,” Humpty Dumpty said, in rather a scornful tone, “it means just what I choose it to mean—neither more nor less.” “The question is,” said Alice, “whether you can make words mean so many different things.” “The question is,” said Humpty Dumpty, “which is to be master—that’s all.”

I find the paper hilarious and an example of the bias of the supposedly unbiased.