One of the most challenging problems in data retrieval is searching movies and extracting object information to find exactly what I asked for: “Get me all the movie scenes that have Brad Pitt on a horse”. While we are at it, make it a “brown horse”.
I had the opportunity to attend a colloquium this month, where Dr. Avi Silberschatz, Professor and Chair of the Computer Science department at Yale University, gave a very entertaining and interesting talk on some research issues in computer science. The question I pose here is one he asked during the talk. It is one of the most interesting unsolved problems in data retrieval. I am sure Google, among others, is working on it.
Let us itemize these issues and see how we could solve them. I am not researching in this area, so some of this may be naive or already tried.
The challenges of course are:
- Shortlisting the movies: This is challenging because each movie is so large and there are so many movies. For starters, if we say “Brad Pitt”, we can look up the metadata (like IMDB) and shortlist the movies that he acted in. This part is easy.
- Then comes the question of how to scan the movies and identify Brad Pitt. I think one way is to create an index of the key frames of each movie and use some AI to do image recognition on each frame (expensive and very difficult). We could use social software to provide the data set that trains the AI program analyzing the frames.
- Some of the analysis can be done offline, i.e. we can analyze the key frames, extract known objects (car, horse, cat, man, woman, etc.), and index them. Further analysis of each object can associate additional attributes with these known objects (white car, brown horse, etc.).
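To make the indexing idea a little more concrete, here is a minimal sketch in Python of what the offline index and the final lookup might look like. It assumes the hard part (recognizing objects in key frames) has already been done by some recognizer, and it only checks that the actor and the object appear in the same key frame, not that he is actually "on" the horse. The names `Detection`, `build_index` and `find_frames`, the movie titles, and the sample data are all made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One object found in one key frame (hypothetical recognizer output)."""
    movie: str
    frame: int                           # key-frame number within the movie
    label: str                           # e.g. "horse" or "Brad Pitt"
    attributes: frozenset = frozenset()  # e.g. {"brown"}


def build_index(detections):
    """Inverted index: label -> {(movie, frame): attributes seen for that label}."""
    index = defaultdict(dict)
    for d in detections:
        attrs = index[d.label].setdefault((d.movie, d.frame), set())
        attrs.update(d.attributes)
    return index


def find_frames(index, filmography, actor, label, attribute=None):
    """Key frames where `actor` and `label` co-occur, limited to the actor's movies."""
    shortlist = filmography.get(actor, set())   # step 1: shortlist via metadata
    actor_frames = index.get(actor, {})
    hits = []
    for key, attrs in index.get(label, {}).items():
        movie, _frame = key
        if movie not in shortlist:
            continue                            # not one of the actor's movies
        if key not in actor_frames:
            continue                            # actor not seen in this frame
        if attribute is not None and attribute not in attrs:
            continue                            # e.g. the horse is not brown
        hits.append(key)
    return sorted(hits)


# Made-up metadata and recognizer output, purely for illustration.
filmography = {"Brad Pitt": {"Movie A", "Movie B"}}
detections = [
    Detection("Movie A", 12, "Brad Pitt"),
    Detection("Movie A", 12, "horse", frozenset({"brown"})),
    Detection("Movie A", 40, "horse", frozenset({"white"})),
    Detection("Movie B", 7, "Brad Pitt"),
    Detection("Movie B", 7, "horse", frozenset({"white"})),
    Detection("Movie C", 3, "Brad Pitt"),   # a look-alike in a movie he is not in
    Detection("Movie C", 3, "horse", frozenset({"brown"})),
]
index = build_index(detections)

print(find_frames(index, filmography, "Brad Pitt", "horse", "brown"))
```

The metadata shortlist does double duty here: it cuts down the search, and it throws away the false match in "Movie C". Note that this sketch merges the attributes of all objects with the same label in a frame, so a frame with one brown and one white horse would match both colors. A real index would have to keep objects separate.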
Well, really, there are many issues and, as Dr. Avi said, if you do not want to graduate, this is a good problem to solve. How can I do it any justice in a small blog post! But it is a very interesting question, and really fun to work on.