Large Scale Concept Ontology for Multimedia
The Large-Scale Concept Ontology for Multimedia project was a series of workshops held from April 2004 to September 2006[1] for the purpose of defining a standard formal vocabulary for the annotation and retrieval of video.
Mandate
The Large-Scale Concept Ontology for Multimedia project was sponsored by the Disruptive Technology Office and brought together representatives from a variety of research communities, such as multimedia learning, information retrieval, computational linguistics, library science, and knowledge representation, as well as "user" communities such as intelligence agencies and broadcasters, to work collaboratively towards defining a set of 1,000 concepts.[2] Individually, each concept was to meet the following criteria:[3]
- Utility: the concepts must support realistic video retrieval problems
- Feasibility: the concepts can be detected, or will become detectable, given the near-term (five-year projected) state of technology
- Observability: the concepts occur with relatively high frequency in actual video data sets
Jointly, these concepts were to meet the additional criterion of providing broad (domain independent) coverage.[3] High-level target areas for coverage included physical objects, including animate objects (such as people, mobs, and animals), and inanimate objects, ranging from large-scale (such as buildings and highways) to small-scale (such as telephones and appliances); actions and events; locations and settings; and graphics. The effort was led by Dr. Milind Naphade, who was the principal investigator along with researchers from Carnegie Mellon University, Columbia University, and IBM.[1]
Development tracks
The project had two main "tracks": the development and deployment of keyframe annotation tools (performed by CMU and Columbia), and the development of the Large-Scale Concept Ontology for Multimedia concept hierarchy itself. The second track was executed in two phases. The first phase, the manual construction of an 884-concept hierarchy, was performed collaboratively by the research and user community representatives.
The second phase, performed by knowledge representation experts at Cycorp, Inc., involved the mapping of the concepts into the Cyc knowledge base and the use of the Cyc inference engine to semi-automatically refine, correct, and expand the concept hierarchy. The mapping/expansion phase of the project was motivated by two aims: to increase breadth (the mapping had the effect of moving from 884 concepts to well past the initial goal of 1,000), and to move Large-Scale Concept Ontology for Multimedia from a one-dimensional hierarchy of concepts to a full-blown ontology of rich semantic connections.[3]
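As a rough illustration of why a concept hierarchy is useful for retrieval, the following minimal sketch propagates keyframe annotations up a small parent/child hierarchy, so that a query for a general concept also returns keyframes labelled with more specific ones. The concept names, parent links, keyframe identifiers, and function names are all hypothetical and chosen only for illustration; they are not taken from the actual hierarchy or from the Cyc mappings.

```python
from collections import defaultdict

# Hypothetical child -> parent links; NOT the actual project hierarchy.
PARENT = {
    "Person": "AnimateObject",
    "Animal": "AnimateObject",
    "Building": "InanimateObject",
    "Highway": "InanimateObject",
    "Telephone": "InanimateObject",
    "AnimateObject": "PhysicalObject",
    "InanimateObject": "PhysicalObject",
}


def ancestors(concept):
    """Return the ancestors of a concept, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def build_index(annotations):
    """Map each concept, and each of its ancestors, to the keyframes labelled with it."""
    index = defaultdict(set)
    for keyframe, concepts in annotations.items():
        for concept in concepts:
            index[concept].add(keyframe)
            for parent in ancestors(concept):
                index[parent].add(keyframe)
    return index


# Hypothetical keyframe annotations (keyframe id -> set of concept labels).
ANNOTATIONS = {
    "kf_001": {"Person", "Building"},
    "kf_002": {"Highway"},
    "kf_003": {"Animal"},
}

INDEX = build_index(ANNOTATIONS)


def search(concept):
    """Return the sorted keyframe ids annotated with the concept or any descendant."""
    return sorted(INDEX.get(concept, set()))
```

In this sketch, searching for "AnimateObject" returns the keyframes labelled "Person" or "Animal" even though no keyframe carries the general label directly. A full ontology would add relations beyond this single parent link, which the sketch does not attempt to model.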
Project results
The outputs of the effort included:[1]
- A "lite" version of the Large-Scale Concept Ontology for Multimedia concept hierarchy consisting of a subset of 449 concepts.
- A corpus of 61,901 video keyframes, taken from the 2006 TRECVID data set, annotated using Large-Scale Concept Ontology for Multimedia "lite."
- The full taxonomy of 2,638 concepts, built semi-automatically by mapping 884 concepts, manually identified by collaborators, into the Cyc knowledge base, and querying the Cyc inference engine for useful additions.
- The full ontology, in the form of a 2006 ResearchCyc release that contained the Large-Scale Concept Ontology for Multimedia mappings into the Cyc ontology.
Public detectors
Several sets of concept detectors were developed and released for public use:
- VIREO-374, 374 detectors developed by City University of Hong Kong.
- Columbia374, 374 detectors developed by Columbia University.
- Mediamill101, 101 detectors developed by The University of Amsterdam.
Use in the larger research community
Since its release, Large-Scale Concept Ontology for Multimedia has begun to be used successfully in visual recognition research. Apart from research done by project participants, it has been used by independent researchers in concept extraction from images,[4][5] and has served as the basis for a video annotation tool.[6]
References
- Naphade, et al., "Large Scale Concept Ontology for Multimedia: VACE Workshop Report,"
- Naphade, et al., "A Large Scale Concept Ontology for Multimedia Understanding," ppt presentation published by MITRE Archived 2006-05-06 at the Wayback Machine
- Naphade, et al., "Large-Scale Concept Ontology for Multimedia," IEEE MultiMedia, vol. 13, no. 3, pp. 86-91, July-September 2006.
- Snoek, et al., "Adding Semantics to Detectors for Video Retrieval," forthcoming in IEEE Transactions on Multimedia, 2007
- Worring, et al., "The MediaMill Large-lexicon Concept Suggestion Engine," forthcoming, in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Honolulu, Hawaii, USA, April 2007.
- Emilie Garanaud, Smeaton, A., and Koskela, M., "Evaluation of a Video Annotation Tool Based on the LSCOM Ontology," in Proceedings of the First International Conference on Semantics and Digital Media Technology, Athens, Greece, 6-8 December 2006. Archived 20 July 2011 at the Wayback Machine