object_recognition_tod: Textured Object Detection

Textured Object Detection (TOD) is based on a standard bag-of-features technique.


In the config file you need to specify the feature/descriptor to use as well as the search parameters.

The DB parameters are standard ObjectDbParameters parameters. A typical training config file looks like this:

# info about the db
  type: TodTrainer
  module: object_recognition_tod
      type: 'ORB'
      type: ORB
      module: ecto_opencv.features2d
      n_features: 1000
      n_levels: 3
      scale_factor: 1.2
      type: ORB
      module: ecto_opencv.features2d
      key_size: 24
      multi_probe_level: 2
      n_tables: 8
      radius: 55
      ratio: 0.8
      type: 'LSH'
      type: 'CouchDB'
      root: 'http://localhost:5984'
      collection: 'object_recognition'

    # The list of object_ids to analyze
    object_ids: "all"
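To make the grouping of these parameters easier to follow, here is a minimal Python sketch of how they might nest. The section names `feature`, `descriptor`, `search` and `db` are assumptions inferred from the description above (feature/descriptor, search parameters, DB parameters); they are not confirmed keys of the actual config format, and `flatten` is a hypothetical helper used only for display.

```python
# Hypothetical nesting of the training parameters. The section names
# "feature", "descriptor", "search" and "db" are assumptions, not confirmed keys.
training_parameters = {
    "feature": {
        "type": "ORB",
        "module": "ecto_opencv.features2d",
        "n_features": 1000,
        "n_levels": 3,
        "scale_factor": 1.2,
    },
    "descriptor": {"type": "ORB", "module": "ecto_opencv.features2d"},
    "search": {
        "type": "LSH",
        "key_size": 24,
        "multi_probe_level": 2,
        "n_tables": 8,
        "radius": 55,
        "ratio": 0.8,
    },
    "db": {
        "type": "CouchDB",
        "root": "http://localhost:5984",
        "collection": "object_recognition",
    },
    "object_ids": "all",
}


def flatten(params, prefix=""):
    """Return the nested parameters as a flat {dotted.path: value} dict."""
    flat = {}
    for key, value in params.items():
        path = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


if __name__ == "__main__":
    for path, value in sorted(flatten(training_parameters).items()):
        print(path, "=", value)
```

Flattening to dotted paths such as `search.radius` makes it easy to compare the training and detection settings side by side.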

During training, features and descriptors are extracted from the different views of the object. For each of them, if depth was also captured (currently the only supported method, and highly recommended anyway), the 3D position is stored as well.
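Storing a 3D position for a feature amounts to back-projecting its pixel through the camera model using the measured depth. The following is a minimal sketch under a pinhole-camera assumption; the function names and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative and do not come from the TOD code.

```python
def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to a 3D camera-frame point."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def keypoints_to_3d(keypoints, depth_image, fx, fy, cx, cy):
    """Pair each keypoint with its 3D position; skip keypoints without valid depth."""
    points = []
    for (u, v) in keypoints:
        # Look up the depth at the nearest pixel (row = v, column = u).
        z = depth_image[int(round(v))][int(round(u))]
        if z is None or z != z or z <= 0.0:  # missing, NaN or non-positive depth
            continue
        points.append(((u, v), backproject(u, v, z, fx, fy, cx, cy)))
    return points


if __name__ == "__main__":
    # Tiny 2x2 depth image in metres; 0.0 marks a pixel with no depth reading.
    depth = [[1.0, 0.0],
             [2.0, 1.5]]
    print(keypoints_to_3d([(0, 0), (1, 0), (1, 1)], depth, 1.0, 1.0, 0.0, 0.0))
```

Keypoints that fall on pixels without a depth reading are dropped here, since no 3D position can be computed for them.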

You can also view the point cloud of the features by launching the apps/feature_viewer application:

$ apps/feature_viewer --help
usage: feature_viewer [-h] [--db_type DB_TYPE] [--db_root DB_ROOT_URL]
                      [--db_collection DB_COLLECTION] [--commit]
                      [--niter ITERATIONS] [--shell] [--gui]
                      [--logfile LOGFILE] [--graphviz] [--dotfile DOTFILE]

positional arguments:
  object_id             The id of the object for which the TOD model will be

optional arguments:
  -h, --help            show this help message and exit

Database Parameters:
  --db_type DB_TYPE     The type of database used: one of [CouchDB]. Default:
  --db_root DB_ROOT_URL
                        The database root URL to connect to. Default:
  --db_collection DB_COLLECTION
                        The database root URL to connect to. Default:
  --commit              Commit the data to the database.

Ecto runtime parameters:
  --niter ITERATIONS    Run the graph for niter iterations. 0 means run until
                        stopped by a cell or external forces. (default: 0)
  --shell               'Bring up an ipython prompt, and execute
                        asynchronously.(default: False)
  --gui                 Bring up a gui to help execute the plasm.
  --logfile LOGFILE     Log to the given file, use tail -f LOGFILE to see the
                        live output. May be useful in combination with --shell
  --graphviz            Show the graphviz of the plasm. (default: False)
  --dotfile DOTFILE     Output a graph in dot format to the given file. If no
                        file is given, no output will be generated. (default:
  --stats               Show the runtime statistics of the plasm.


A typical detection config file looks like this:

  type: 'OpenNI'
  module: ''
    image_mode: 'SXGA_RES'
    depth_mode: 'VGA_RES'
    image_fps: 'FPS_15'
    depth_fps: 'FPS_30'

#Use this instead to receive images via ROS
#  type: ros_kinect
#  rgb_frame_id: '/camera_rgb_optical_frame'

  type: 'TodDetector'
  module: 'object_recognition_tod'
    type: 'ORB'
  inputs: [source1]
    object_ids: "all"
      type: ORB
      module: ecto_opencv.features2d
      n_features: 5000
      n_levels: 3
      scale_factor: 1.2
      type: ORB
      module: ecto_opencv.features2d
      type: LSH
      module: ecto_opencv.features2d
      key_size: 16
      multi_probe_level: 1
      n_tables: 10
      radius: 35
      ratio: 0.8
    n_ransac_iterations: 2500
    min_inliers: 8
    sensor_error: 0.01
      type: CouchDB
      root: http://localhost:5984
      collection: object_recognition
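The `radius` and `ratio` search parameters can be read as two filters on descriptor matches. The sketch below assumes that `radius` is a maximum Hamming distance and that `ratio` is a nearest-to-second-nearest distance ratio test; both interpretations are assumptions, not confirmed by the configuration itself. It also swaps the LSH index for a brute-force search to stay self-contained.

```python
def hamming(a, b):
    """Hamming distance between two equal-length binary descriptors (bytes)."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match(query, train, radius=35, ratio=0.8):
    """Brute-force 2-nearest-neighbour matching with a radius and a ratio filter.

    Returns (query_index, train_index, distance) for each accepted match.
    Brute force stands in for LSH here; the filters are assumed semantics.
    """
    matches = []
    for qi, q in enumerate(query):
        ranked = sorted((hamming(q, t), ti) for ti, t in enumerate(train))
        if not ranked:
            continue
        best_dist, best_idx = ranked[0]
        if best_dist > radius:
            continue  # too far from every stored descriptor
        if len(ranked) > 1 and best_dist > ratio * ranked[1][0]:
            continue  # ambiguous: the second-best candidate is nearly as close
        matches.append((qi, best_idx, best_dist))
    return matches


if __name__ == "__main__":
    # ORB descriptors are 256-bit binary strings, i.e. 32 bytes each.
    train = [bytes([0x00] * 32), bytes([0xFF] * 32)]
    query = [bytes([0x01] + [0x00] * 31), bytes([0x0F] * 32)]
    print(match(query, train))
```

The second query descriptor is equally far from both stored descriptors and outside the radius, so only the first one yields a match.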

During detection, features and descriptors are computed on the current image and compared to the database. Each observed descriptor is matched to its nearest neighbors in descriptor space, and sets of matches are then checked for an analogous 3D configuration. With 3D input data this is simply a 3D-to-3D comparison; with 2D-only input it becomes a PnP problem, for which OpenCV's solvePnP has not been plugged in yet.
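One way to picture the 3D-to-3D comparison: a rigid motion preserves distances between points, so matches that belong to the same object should have the same pairwise distances in the scene and in the model, up to sensor noise. The sketch below checks that necessary condition only. It is not the TOD implementation, it tries every match as a seed instead of sampling randomly as RANSAC would, and `consistent_subset` is a hypothetical name; the default tolerance reuses the `sensor_error` value from the config.

```python
import math


def consistent_subset(scene_pts, model_pts, sensor_error=0.01):
    """Largest seed-anchored set of matches whose distances to the seed agree.

    scene_pts[i] and model_pts[i] are the 3D points of the i-th match.
    A match j is kept for a seed i if the scene distance |s_i - s_j| and the
    model distance |m_i - m_j| differ by at most sensor_error.
    """
    best = []
    n = len(scene_pts)
    for seed in range(n):
        inliers = [
            j for j in range(n)
            if abs(math.dist(scene_pts[seed], scene_pts[j])
                   - math.dist(model_pts[seed], model_pts[j])) <= sensor_error
        ]
        if len(inliers) > len(best):
            best = inliers
    return best


if __name__ == "__main__":
    model = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
    # The first four scene points are the model translated by (2, 3, 4);
    # the last one is a wrong match.
    scene = [(2, 3, 4), (3, 3, 4), (2, 4, 4), (2, 3, 5), (10, 10, 10)]
    inliers = consistent_subset(scene, model)
    print(inliers, "accepted" if len(inliers) >= 4 else "rejected")
```

A full pipeline would go on to estimate the rigid transform from the surviving matches and accept the detection only if enough inliers remain (compare `min_inliers` in the config).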

In practice, this means the pose of an object can currently only be obtained from RGB-D input.