
Visual descriptor

In computer vision, visual descriptors (or image descriptors) are descriptions of the visual features of the contents of images or videos. The term also covers the algorithms and applications that produce such descriptions. They describe elementary characteristics such as shape, color, texture, and motion, among others.

Introduction

As a result of new communication technologies and the widespread use of the Internet, the amount of audio-visual information available in digital format is growing considerably. This has made it necessary to design systems that describe the content of several types of multimedia information so that it can be searched and classified.

Audio-visual descriptors are responsible for describing this content. They capture information about the objects and events found in a video, image, or audio recording, and they enable quick and efficient searches of audio-visual content.

Such a system can be compared to search engines for textual content. Although it is relatively easy to find text with a computer, it is much more difficult to find specific audio and video segments. For instance, imagine somebody searching for a scene of a happy person. Happiness is a feeling, and it is not evident how to describe it in terms of shape, color, and texture in images.

Describing audio-visual content is not a trivial task, and it is essential for the effective use of this type of archive. The standard that deals with audio-visual descriptors is MPEG-7 (Moving Picture Experts Group - 7).

Types

Descriptors are the first step toward finding the connection between the pixels contained in a digital image and what humans recall some minutes after having observed an image or a group of images.

Visual descriptors are divided into two main groups:

  • General information descriptors: low-level descriptors that describe color, shape, regions, textures, and motion.
  • Specific domain information descriptors: descriptors that give information about objects and events in the scene. A concrete example is face recognition.

General information descriptors

General information descriptors consist of a set of descriptors that cover basic and elementary features such as color, texture, shape, motion, and location, among others. This description is generated automatically by means of signal processing.

Color

Color is the most basic quality of visual content. Five tools are defined to describe color. The first three represent the color distribution within an image, while the last two describe how color is arranged spatially and how it relates across a sequence or group of images:

  • Dominant color descriptor (DCD)
  • Scalable color descriptor (SCD)
  • Color structure descriptor (CSD)
  • Color layout descriptor (CLD)
  • Group of frame (GoF) or group-of-pictures (GoP)
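
As a rough illustration of what a color descriptor computes, the following is a minimal sketch of a coarse color histogram and a dominant-color estimate. It is not the MPEG-7 definition of the DCD or SCD. The function names, the choice of four bins per channel, and the toy pixel list are all assumptions made for this example.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Quantize 8-bit RGB pixels into a coarse, normalized histogram.

    Simplified stand-in for a color-distribution descriptor
    (an assumption for illustration, not the MPEG-7 algorithm).
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}

def dominant_color(pixels, bins_per_channel=4):
    """Return the center of the most populated bin as an (r, g, b) tuple."""
    step = 256 // bins_per_channel
    hist = color_histogram(pixels, bins_per_channel)
    best = max(hist, key=hist.get)
    return tuple(c * step + step // 2 for c in best)

# Toy "image": three reddish pixels and one blue pixel.
image = [(250, 10, 10), (240, 20, 5), (255, 0, 0), (0, 0, 255)]
hist = color_histogram(image)
top = dominant_color(image)
print(hist)  # share of pixels falling in each occupied color bin
print(top)   # representative color of the largest bin
```

Quantizing before counting keeps the descriptor compact and tolerant of small color variations, which is the general motivation behind histogram-style color descriptors.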

Texture

Texture is an important quality for describing an image. The texture descriptors characterize image textures or regions. They observe the homogeneity of a region and the histograms of the edges in that region. The set of descriptors is formed by:

  • Homogeneous texture descriptor (HTD)
  • Texture browsing descriptor (TBD)
  • Edge histogram descriptor (EHD)
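
The idea behind an edge histogram can be sketched as follows. This is a deliberately simplified version that only distinguishes vertical edges, horizontal edges, and flat areas over the whole image. The threshold value, the function name, and the test image are assumptions for illustration, and the sketch does not reproduce the block structure or edge categories of the MPEG-7 EHD.

```python
def edge_histogram(gray, threshold=32):
    """Return the share of vertical, horizontal, and non-edge pixels.

    `gray` is a 2D list of grayscale values. This is a simplified
    illustration, not the MPEG-7 edge histogram descriptor.
    """
    counts = {"vertical": 0, "horizontal": 0, "none": 0}
    rows, cols = len(gray), len(gray[0])
    for y in range(rows - 1):
        for x in range(cols - 1):
            dx = abs(gray[y][x + 1] - gray[y][x])  # change across columns
            dy = abs(gray[y + 1][x] - gray[y][x])  # change across rows
            if max(dx, dy) < threshold:
                counts["none"] += 1
            elif dx >= dy:
                counts["vertical"] += 1    # intensity jumps left to right
            else:
                counts["horizontal"] += 1  # intensity jumps top to bottom
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}

# Toy image: dark left half, bright right half (one vertical edge).
stripes = [[0, 0, 255, 255] for _ in range(4)]
h = edge_histogram(stripes)
print(h)
```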

Shape

Shape carries important semantic information because humans are able to recognize objects by their shape. However, this information can only be extracted by means of a segmentation similar to the one that the human visual system performs. Such a segmentation system is not yet available, but there is a series of algorithms that are considered to be a good approximation. These descriptors describe regions, contours, and shapes for 2D images and 3D volumes. The shape descriptors are the following:

  • Region-based shape descriptor (RSD)
  • Contour-based shape descriptor (CSD)
  • 3-D shape descriptor (3-D SD)
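
To give a feel for region-based shape description, here is a minimal sketch that summarizes an already segmented binary mask by its area, centroid, and extent (the fraction of the bounding box that the region fills). These three measures are assumptions chosen for simplicity. They are not the features used by the MPEG-7 region-based shape descriptor, and the segmentation step is taken as given.

```python
def region_shape(mask):
    """Summarize a binary region by area, centroid, and extent.

    `mask` is a 2D list where truthy cells belong to the region.
    Returns None for an empty region. Simplified illustration only.
    """
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    area = len(points)
    if area == 0:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    bbox_area = (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)
    return {
        "area": area,
        "centroid": (sum(xs) / area, sum(ys) / area),
        "extent": area / bbox_area,  # 1.0 means the region fills its box
    }

# Toy mask: a 2x2 square region inside a 3x4 image.
mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
shape = region_shape(mask)
print(shape)
```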

Motion

Motion is defined by four different descriptors that describe motion in a video sequence. It relates both to the motion of objects in the sequence and to the motion of the camera. The camera motion information is provided by the capture device, whereas the rest is obtained by means of image processing. The descriptor set is the following:

  • Motion activity descriptor (MAD)
  • Camera motion descriptor (CMD)
  • Motion trajectory descriptor (MTD)
  • Warping and parametric motion descriptor (WMD and PMD)
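
A very rough proxy for how much motion a sequence contains is the average pixel change between consecutive frames. The sketch below uses this frame-differencing approach as a stand-in. It is an assumption made for illustration and not how the MPEG-7 motion activity descriptor is defined. The function name and the toy frames are likewise hypothetical.

```python
def motion_activity(frames):
    """Return the mean absolute pixel change for each consecutive frame pair.

    `frames` is a list of equally sized 2D lists of grayscale values.
    Frame differencing is a simplified proxy for motion activity.
    """
    scores = []
    for prev, curr in zip(frames, frames[1:]):
        diffs = [abs(a - b)
                 for row_prev, row_curr in zip(prev, curr)
                 for a, b in zip(row_prev, row_curr)]
        scores.append(sum(diffs) / len(diffs))
    return scores

# Toy sequence: nothing changes, then one pixel brightens.
frames = [
    [[0, 0], [0, 0]],
    [[0, 0], [0, 0]],
    [[100, 0], [0, 0]],
]
scores = motion_activity(frames)
print(scores)  # one score per pair of consecutive frames
```

Frame differencing cannot tell object motion from camera motion, which is one reason the standard treats the two separately.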

Location

The location of elements in the image is used to describe elements in the spatial domain. In addition, elements can also be located in the temporal domain:

  • Region locator descriptor (RLD)
  • Spatio-temporal locator descriptor (STLD)

Specific domain information descriptors

These descriptors, which give information about objects and events in the scene, are not easy to extract, especially when the extraction is to be done automatically. Nevertheless, they can be produced manually.

As mentioned above, face recognition is a concrete example of an application that tries to obtain this information automatically.

Descriptor applications

The most important applications include:

  • Search engines and classifiers for multimedia documents.
  • Digital libraries: visual descriptors allow a very detailed and specific search of any video or image by means of different search parameters, for instance a search for films in which a known actor appears or for videos that show Mount Everest.
  • Personalized electronic news services.
  • Automatic connection to a TV channel, for example one broadcasting a soccer match, whenever a player approaches the goal area.
  • Control and filtering of specific audiovisual content, such as violent or pornographic material, as well as authorization for access to some multimedia content.
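
The search and classification applications above generally come down to comparing descriptors. The following is a minimal sketch of descriptor-based retrieval that ranks a small library by the L1 distance between histogram-style descriptors. The file names, the three-bin descriptors, and the choice of L1 distance are all hypothetical and chosen for illustration only.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two sparse histograms (dicts)."""
    keys = set(a) | set(b)
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in keys)

def rank_by_similarity(query, library):
    """Return library item names ordered from most to least similar."""
    return sorted(library, key=lambda name: l1_distance(query, library[name]))

# Hypothetical precomputed color descriptors for three images.
library = {
    "sunset.jpg": {"red": 0.7, "green": 0.2, "blue": 0.1},
    "ocean.jpg": {"red": 0.05, "green": 0.15, "blue": 0.8},
    "forest.jpg": {"red": 0.2, "green": 0.7, "blue": 0.1},
}
query = {"red": 0.6, "green": 0.2, "blue": 0.2}
ranking = rank_by_similarity(query, library)
print(ranking)  # best match first
```

In practice the descriptors would be extracted once and indexed, so that a query only requires distance computations rather than reprocessing every image or video.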

See also

  • DSpace
  • Feature detection
  • Motion graphics
  • MPEG-7
  • Scale-invariant feature transform

References

  • B.S. Manjunath (Editor), Philippe Salembier (Editor), and Thomas Sikora (Editor): Introduction to MPEG-7: Multimedia Content Description Interface. Wiley & Sons, April 2002 - ISBN 0-471-48678-7
