CogWorks Laboratory
I am now working at the CogWorks Laboratory research facility in the Department of Cognitive Science at Rensselaer.
The CogWorks Laboratory conducts basic and applied research to study the complex interplay of cognition, perception, and action in routine interactive behavior. To further our understanding of integrated cognitive systems we combine methods from experimental psychology, computational cognitive modeling, and cognitive engineering.
My research involves measuring the semantic relatedness of two or more terms. Each term is compared against a gauntlet of context-terms, where each comparison yields the probability that the term is related to that context-term.
AIBO’s Identity Crisis
For example, given the arbitrary context-terms animal and friend, the scores below would be read as saying that robots are the least animal-like and dogs are the most friend-like.

| Term | Animal | Friend |
|---|---|---|
| dog | 0.81 | 0.84 |
| cat | 0.81 | 0.67 |
| tiger | 0.79 | 0.13 |
| robot | 0.02 | 0.60 |
If we then want to compare dog to cat, we can represent each term as a vector in a multi-dimensional space whose dimensions are the context-terms animal and friend. The angle between the two vectors indicates relatedness: the wider the angle, the less related the terms are.
(By the way, this is all stolen from the paper by Grintsvayg et al.)
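The angle comparison described above can be sketched in a few lines of Python. This is only an illustration using the four rows of the table; the names `VECTORS` and `angle_degrees` are my own and are not taken from the paper.

```python
import math

# (animal, friend) scores for each term, copied from the table above.
VECTORS = {
    "dog": (0.81, 0.84),
    "cat": (0.81, 0.67),
    "tiger": (0.79, 0.13),
    "robot": (0.02, 0.60),
}


def angle_degrees(a, b):
    """Return the angle between two score vectors, in degrees."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] so floating-point rounding cannot break acos.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))


if __name__ == "__main__":
    for other in ("cat", "tiger", "robot"):
        angle = angle_degrees(VECTORS["dog"], VECTORS[other])
        print(f"dog vs {other}: {angle:.1f} degrees")
```

With these numbers, dog and cat end up separated by a narrower angle than dog and tiger, and dog and robot by the widest of the three, which matches the intuition that a wider angle means less related.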
Survival of the Most Fitting Word
The obvious shortcoming of this algorithm is that for a term such as apple, context-terms like animal and friend won’t yield any meaningful results. On the other hand, if we try to be all-inclusive and choose thousands of context-terms, the algorithm will run too slowly. (Did you know that Google Universal Search has a required response time of 1/4 second?) Therefore, a minimal set of context-terms should be chosen to represent a breadth of meanings.
My objective is to find an ideal set of context-terms for performing relatedness calculations on common terms. A genetic algorithm will try millions of combinations of context-terms to find the set whose output most closely matches human-generated data.
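To make the idea concrete, here is a toy sketch of such a search. It is not my actual implementation, and nearly everything in it is an assumption made up for illustration: the extra context-terms (machine, pet, wild, toy) and their scores, the "human" ratings in `HUMAN`, the fitness measure (mean absolute error against those ratings), and the selection, crossover, and mutation settings. Only the animal and friend scores come from the table above.

```python
import math
import random

# Candidate context-terms. Only "animal" and "friend" appear in the post;
# the rest are hypothetical.
POOL = ["animal", "friend", "machine", "pet", "wild", "toy"]

# One row of scores per term, in POOL order. The last four columns are invented.
_ROWS = {
    "dog":   [0.81, 0.84, 0.05, 0.90, 0.20, 0.30],
    "cat":   [0.81, 0.67, 0.04, 0.88, 0.35, 0.25],
    "tiger": [0.79, 0.13, 0.02, 0.10, 0.95, 0.15],
    "robot": [0.02, 0.60, 0.97, 0.20, 0.01, 0.70],
}
SCORES = {term: dict(zip(POOL, row)) for term, row in _ROWS.items()}

# Made-up "human" relatedness ratings (0 = unrelated, 1 = identical).
HUMAN = {
    ("dog", "cat"): 0.90,
    ("cat", "tiger"): 0.70,
    ("dog", "robot"): 0.30,
    ("tiger", "robot"): 0.10,
}


def cosine(a, b, contexts):
    """Cosine similarity of two terms using only the given context-terms."""
    dot = sum(SCORES[a][c] * SCORES[b][c] for c in contexts)
    norm_a = math.sqrt(sum(SCORES[a][c] ** 2 for c in contexts))
    norm_b = math.sqrt(sum(SCORES[b][c] ** 2 for c in contexts))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fitness(contexts):
    """Negative mean absolute error against HUMAN; higher is better."""
    error = sum(abs(cosine(a, b, contexts) - r) for (a, b), r in HUMAN.items())
    return -error / len(HUMAN)


def evolve(pool, k, generations=50, pop_size=20, seed=0):
    """Search for a k-term subset of pool that best matches HUMAN."""
    rng = random.Random(seed)
    population = [tuple(sorted(rng.sample(pool, k))) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]  # keep the fitter half
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            genes = sorted(set(p1) | set(p2))
            child = rng.sample(genes, k)  # crossover: mix both parents' terms
            if rng.random() < 0.3:  # mutation: swap in an unused term
                unused = [c for c in pool if c not in child]
                if unused:
                    child[rng.randrange(k)] = rng.choice(unused)
            children.append(tuple(sorted(child)))
        population = survivors + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve(POOL, k=3)
    print(best, fitness(best))
```

With only six candidates a brute-force search would be trivial; the point of the genetic approach is that it still works when the pool holds thousands of words and the target set holds a few hundred.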
My brain is Freud
I look forward to seeing which words the winning context-term set contains. What do these words suggest about human psychology? Can our thoughts be neatly sorted into just a few hundred categories?
September 16th, 2007 at 6:47 pm
So how do you use this in order to figure out how to own words on Google? That’s what you should be looking into.
December 26th, 2007 at 8:50 pm
Sounds kinda like you’re dealing with some 1984 type things. Simplifying language? Can it be done? o_O