Back to the labs, boys

My new job involves quite a bit of work with natural language tools which try to group, summarise or classify text fragments. I am aware that these tools sometimes produce odd results because, in the end, they’re not really intelligent – they’re driven by statistics.

Still, it made me laugh when I had a quick play with Google Sets, a tool which tries to predict additional entries for a set based on a few entries that the user supplies. I decided I would go easy on it the first time, so I gave it the following innocuous set: “hat”, “handbag”, “wallet”, “keys”.

What could possibly go wrong?

Predicted Items
Wallet
Keys
Handbag
Hat
US History book
coke can empty
smokes
tits hairy
cell
shades
balls
History notebook
cell phone
lipgloss watermelon

“tits hairy”?

“lipgloss watermelon”??