Labeling Images for FUN!!!
Yan Cao, Chris Hinrichs
How do you improve learning systems?
Get more processing power. (Faster computers, more memory, more parallelism.)
Find a more sophisticated algorithm.
Get lots and lots of quality data.
Why manually label images?
Russell et al., MIT CSAIL
25th, 50th, and 75th percentile by polygon count of some common object types.
We can learn something about the way people take pictures from the distribution of where objects are located. Generally, people are standing when they take pictures.
Torralba et al. http://people.csail.mit.edu/torralba/tinyimages/
The humans did much better at 32x32 resolution than the best recognition algorithms did at full resolution.
Note that for color images, the humans’ accuracy levels off at 32x32. For grayscale, the same happens at 64x64.
Color images at 32x32 have 32x32x3 = 3072 dimensions; grayscale images at 64x64 have 64x64 = 32x32x4 = 4096 dimensions. Both give very nearly the same accuracy, so roughly 3000 dimensions are needed for recognition.
Shift: Allow each pixel to shift in a 5x5 window, and take the best SSD from that. (Crude approximation of general warping.)
Warp & Shift
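The shift idea can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names `ssd` and `shift_ssd` are made up here, borders are handled by edge replication, and the global warping step that the slide pairs with the shift is left out.

```python
import numpy as np

def ssd(a, b):
    """Plain sum of squared differences between two same-shaped images."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))

def shift_ssd(a, b, window=5):
    """SSD where each pixel of `a` is matched to its best pixel of `b`
    inside a window x window neighborhood (assumed: edge-replicated borders)."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    r = window // 2
    h, w = a.shape[:2]
    # Pad only the two spatial axes so border pixels get a full window.
    pad = [(r, r), (r, r)] + [(0, 0)] * (a.ndim - 2)
    bp = np.pad(b, pad, mode="edge")
    best = np.full((h, w), np.inf)
    for dy in range(window):
        for dx in range(window):
            shifted = bp[dy:dy + h, dx:dx + w]
            err = (a - shifted) ** 2
            if err.ndim == 3:
                err = err.sum(axis=2)  # sum the error over color channels
            # Keep the smallest per-pixel error seen over all shifts so far.
            best = np.minimum(best, err)
    return float(best.sum())
```

Because the zero shift is one of the candidates, `shift_ssd(a, b)` can never exceed `ssd(a, b)`; it only forgives small misalignments.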
Do an image search on, say, “person”, on any image retrieval engine. Then, for each image returned, measure how strongly its neighbor set correlates with the search term, and re-rank the results by the strength of that correlation.
Images matched with the WordNet node “person” and their nearest neighbors. Note that the neighbors match the part of the person shown in the query image, as well as the pose and the color of the clothing.
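A toy version of this re-ranking, to make the idea concrete. Everything here is an assumption made for illustration: the names `neighbor_score` and `rerank` are hypothetical, each neighbor is represented by a set of label strings, and the "correlation" is simplified to the fraction of neighbors that carry the search term.

```python
def neighbor_score(neighbor_labels, term):
    """Fraction of neighbors whose label set contains `term`
    (a simple stand-in for the correlation mentioned on the slide)."""
    if not neighbor_labels:
        return 0.0
    hits = sum(1 for labels in neighbor_labels if term in labels)
    return hits / len(neighbor_labels)

def rerank(results, neighbors_of, term):
    """Sort search results by descending neighbor score.
    sorted() is stable, so ties keep the engine's original order."""
    return sorted(results, key=lambda img: -neighbor_score(neighbors_of[img], term))

# Hypothetical data: image ids mapped to the label sets of their neighbors.
neighbors_of = {
    "a": [{"dog"}, {"person"}],
    "b": [{"person"}, {"person", "hat"}],
    "c": [],
}
ranked = rerank(["a", "b", "c"], neighbors_of, "person")
```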
Here, the system only returns whether the best match passes through the “person” internal node.
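The "passes through" test amounts to walking up the hierarchy from the best match and checking whether “person” appears on the way to the root. A minimal sketch, assuming each node has a single parent; `TOY_PARENT` is a made-up toy hierarchy for illustration, not actual WordNet data, and `passes_through` is a hypothetical helper name.

```python
# Toy child -> parent map (invented for this example, not real WordNet).
TOY_PARENT = {
    "politician": "leader",
    "leader": "person",
    "person": "organism",
    "dog": "animal",
    "animal": "organism",
    "organism": "entity",
}

def passes_through(label, target, parent_of):
    """True if `target` is `label` itself or one of its ancestors."""
    seen = set()
    node = label
    # Walk upward until we run off the root; `seen` guards against cycles.
    while node is not None and node not in seen:
        if node == target:
            return True
        seen.add(node)
        node = parent_of.get(node)
    return False
```

With this toy map, `passes_through("politician", "person", TOY_PARENT)` is true and `passes_through("dog", "person", TOY_PARENT)` is false.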
The internet has a large bias towards images with people in them, so not all applications of this method will work with things that are not people.
Given a portion of an image, we can find its neighbors, and measure the correlation with “person” in that set.
Extending this, we can find the portion of a query image whose neighbor set has the highest correlation with “person”. This region is very likely to have a person in it.
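One simple way to search over portions of an image is a sliding window. The sketch below assumes fixed-size square windows and takes the scoring function as a parameter; `best_region` and `score_region` are hypothetical names, and in the real system the score would be the correlation of the window's neighbor set with “person”.

```python
import numpy as np

def best_region(image, score_region, size=16, stride=4):
    """Slide a size x size window over `image` and return the window
    (top, left, height, width) with the highest score, plus that score."""
    h, w = image.shape[:2]
    best_score, best_box = -np.inf, None
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            patch = image[top:top + size, left:left + size]
            s = score_region(patch)
            if s > best_score:
                best_score, best_box = s, (top, left, size, size)
    return best_box, best_score
```

For a quick check without an image database, any function of the patch can stand in for the scorer, for example `lambda p: float(p.mean())` on a synthetic image with one bright square.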
Given a grayscale query image, find its neighbor set and take the average color of the set. Then apply that coloring to the grayscale image. Surprisingly, this works, especially given that not all neighbor images are even of the same type of object!
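A minimal sketch of this colorization step. The slide does not say how the average color is applied, so this assumes a per-pixel average over the neighbors and an additive transfer of the color offset onto the query's brightness; `colorize` is a hypothetical name, and all images are assumed to be float arrays in [0, 1] of the same size.

```python
import numpy as np

def colorize(gray, neighbors):
    """Color a grayscale image (H, W) using the mean color of its
    neighbor images (each H, W, 3), keeping the query's brightness."""
    # Per-pixel average color over the neighbor set.
    avg = np.mean(np.stack(neighbors).astype(np.float64), axis=0)
    # Color offset relative to the average's own brightness (channel mean).
    chroma = avg - avg.mean(axis=2, keepdims=True)
    # Add the offset to the query's brightness and clamp to the valid range.
    out = gray.astype(np.float64)[..., None] + chroma
    return np.clip(out, 0.0, 1.0)
```

As long as no value is clipped, the channel mean of the output equals the original grayscale image, so only the color changes.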