QT:{{”
…second, they thought that they had devised a method for communicating with such “locked-in” people by detecting their unspoken thoughts.
…
Osgood became known not for the results of his surveys but for the method he invented to analyze them. He began by arranging his data in an imaginary space with fifty dimensions—one for fair-unfair, a second for hot-cold, a third for fragrant-foul, and so on. Any given concept, like tornado, had a rating on each dimension—and, therefore, was situated in what was known as high-dimensional space. Many concepts had similar locations on multiple axes: kind-cruel and honest-dishonest, for instance. Osgood combined these dimensions. Then he looked for new similarities, and combined dimensions again, in a process called “factor analysis.”
When you reduce a sauce, you meld and deepen the essential flavors. Osgood did something similar with factor analysis. Eventually, he was able to map all the concepts onto a space with just three dimensions. The first dimension was “evaluative”—a blend of scales like good-bad, beautiful-ugly, and kind-cruel. The second had to do with “potency”: it consolidated scales like large-small and strong-weak. The third measured how “active” or “passive” a concept was. Osgood could use these three key factors to locate any concept in an abstract space. Ideas with similar coördinates, he argued, were neighbors in meaning.
For decades, Osgood’s technique found modest use in a kind of personality test. Its true potential didn’t emerge until the nineteen-eighties, when researchers at Bell Labs were trying to solve what they called the “vocabulary problem.” People tend to employ lots of names for the same thing. This was an obstacle for computer users, who accessed programs by typing words on a command line.
…
They updated Osgood’s approach. Instead of surveying undergraduates, they used computers to analyze the words in about two thousand technical reports. The reports themselves—on topics ranging from graph theory to user-interface design—suggested the dimensions of the space; when multiple reports used similar groups of words, their dimensions could be combined. In the end, the Bell Labs researchers made a space that was more complex than Osgood’s. It had a few hundred dimensions. Many of these dimensions described abstract or “latent” qualities that the words had in common—connections that wouldn’t be apparent to most English speakers. The researchers called their technique “latent semantic analysis,” or L.S.A.
…
In the following years, scientists applied L.S.A. to ever-larger data sets. In 2013, researchers at Google unleashed a descendant of it onto the text of the whole World Wide Web. Google’s algorithm turned each word into a “vector,” or point, in high-dimensional space. The vectors generated by the researchers’ program, word2vec, are eerily accurate: if you take the vector for “king” and subtract the vector for “man,” then add the vector for “woman,” the closest nearby vector is “queen.” Word vectors became the basis of a much improved Google Translate, and enabled the auto-completion of sentences in Gmail. Other companies, including Apple and Amazon, built similar systems. Eventually, researchers realized that the “vectorization” made popular by L.S.A. and word2vec could be used to map all sorts of things.
“}}
I was also very impressed with how the article explained the concepts behind LSA and word2vec. I thought it was interesting that they were derived, in a sense, from Charles Osgood’s seminal work.
https://www.newyorker.com/magazine/2021/12/06/the-science-of-mind-reading
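
To make the "king" minus "man" plus "woman" example from the quote concrete, here is a minimal sketch in Python. The three-dimensional vectors and their axes (royalty, maleness, femaleness) are made up for illustration; real word2vec vectors are learned from text and have far more dimensions. The names `TOY_VECTORS`, `cosine`, and `analogy` are hypothetical helpers, not part of word2vec itself.

```python
import math

# Made-up vectors for illustration only (NOT real word2vec output).
# Hypothetical axes: (royalty, maleness, femaleness).
TOY_VECTORS = {
    "king":     [0.9, 0.9, 0.1],
    "queen":    [0.9, 0.1, 0.9],
    "man":      [0.1, 0.9, 0.1],
    "woman":    [0.1, 0.1, 0.9],
    "princess": [0.7, 0.1, 0.9],
    "apple":    [0.0, 0.1, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors=TOY_VECTORS):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c).

    The three input words are excluded from the candidates, so the
    answer is always a different word.
    """
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


print(analogy("king", "man", "woman"))
```

With these toy values the nearest remaining vector is "queen", which mirrors the behavior the article describes. The result depends entirely on the hand-picked numbers, so it illustrates the arithmetic rather than demonstrating anything about language.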