Salih Burak Göktürk and Carlo Tomasi, 2004
This paper describes a head-tracking algorithm that is based on recognition and correlation-based weighted interpolation. The input is a sequence of 3D depth images generated by a novel time-of-flight depth sensor. These are processed to segment the background and foreground, and the latter is used as the input to the head tracking algorithm, which is composed of three major modules: First, a depth signature is created out of the depth images. Next, the signature is compared against signatures that are collected in a training set of depth images. Finally, a correlation metric is calculated for the most likely signature hits. The head location is calculated by interpolating among stored depth values, using the correlation metrics as the weights. This combination of depth sensing and recognition-based head tracking provides a success rate of more than 90 percent. Even if the track is temporarily lost, it is easily recovered when a good match is obtained from the training set. The use of depth images and recognition-based head tracking achieves robust real-time tracking results under extreme conditions such as 180-degree rotation, temporary occlusions, and complex
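The interpolation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the signature format, the use of a mean-centred normalized correlation, the choice of the top `k` matches, and the names `normalized_correlation` and `interpolate_head_location` are all assumptions made for illustration.

```python
import numpy as np

def normalized_correlation(a, b):
    """Mean-centred normalized correlation between two 1-D signatures."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def interpolate_head_location(signature, train_signatures, train_locations, k=3):
    """Weighted average of the stored values of the k best-matching
    training signatures, using the correlation scores as weights.
    (Hypothetical helper; returns None when no match correlates positively.)"""
    scores = np.array([normalized_correlation(signature, s)
                       for s in train_signatures])
    top = np.argsort(scores)[-k:]               # indices of the k highest scores
    weights = np.clip(scores[top], 0.0, None)   # ignore negative correlations
    if weights.sum() == 0:
        return None                             # track lost: no usable match
    stored = np.asarray(train_locations, dtype=float)[top]
    return weights @ stored / weights.sum()
```

Returning `None` when nothing correlates positively mirrors the abstract's remark that a lost track is recovered once a good match reappears in the training set.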
Li Guan, Marc Pollefeys, 2008
In this paper, we propose a unified calibration technique for a heterogeneous sensor network of video camcorders and Time-of-Flight (ToF) cameras. By moving a spherical calibration target around the commonly observed scene, we can robustly and conveniently extract the sphere centers in the observed images and recover the geometric extrinsics for both types of sensors. The approach is then evaluated with a real dataset of two HD camcorders and two ToF cameras, and 3D shapes are reconstructed from this calibrated system. The main contributions are: (1) We show that the sphere surface point closest to the ToF camera center is always highlighted, and use this idea to extract sphere centers in the ToF camera images; (2) We propose a unified calibration scheme in spite of the heterogeneity of the sensors. After calibration, this multi-modal sensor network can generate high-quality 3D shapes efficiently.
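The geometric idea behind contribution (1) can be sketched as follows. The point on a sphere closest to the camera center lies on the ray from the camera center through the sphere center, so the center sits one radius further along that ray. This is a minimal sketch under assumptions not stated in the abstract: the sphere radius is known, the frontmost point is already expressed in 3D camera coordinates with the camera center at the origin, and the function name `sphere_center_from_frontmost_point` is hypothetical.

```python
import numpy as np

def sphere_center_from_frontmost_point(frontmost_point, radius):
    """Recover the sphere center from the surface point closest to the camera.

    Assumes the camera center is the origin of the coordinate frame and that
    `radius` is the known radius of the calibration sphere.
    """
    p = np.asarray(frontmost_point, dtype=float)
    distance = np.linalg.norm(p)
    if distance == 0:
        raise ValueError("frontmost point coincides with the camera center")
    # Step one radius further along the viewing ray through the point.
    return p + radius * p / distance
```

For example, a sphere of radius 0.5 whose frontmost point is observed at (0, 0, 1.5) has its center at (0, 0, 2).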
Huan Du, Thierry Oggier, Felix Lustenberger, Edoardo Charbon, 2005
In this paper, a complete system is presented which mimics a QWERTY keyboard on an arbitrary surface. The system consists of a pattern projector and a true-3D range camera for detecting the typing events. We exploit depth information acquired with the 3D range camera and detect the hand region using a pre-computed reference frame. The fingertips are found by analyzing the hands’ contour and fitting the depth curve with different feature models. To detect a keystroke, we analyze the feature of the depth curve and map it back to a global coordinate system to find which key was pressed. These steps are fully automated and do not require human intervention. The system can be used in any application requiring zero form factor and minimized or no contact with a medium, as in a large number of cases in human-to-computer interaction, virtual reality, game control, 3D designs, etc.
Jochen Teizer, Frederic Bosche, Carlos H. Caldas, Carl T. Haas, and Katherine A. Liapi, 2005
This paper describes a research effort to develop methods for modeling three-dimensional scenes of construction field objects in real time, adding valuable data to construction information management systems as well as equipment navigation systems. For efficiency reasons, typical construction objects are modeled by bounding surfaces using a high-frame-rate range sensor called Flash LADAR. The sensor provides a dense cloud of range points, which are segmented and grouped into objects. Algorithms are being developed to accurately detect these objects and model characteristics such as volume, speed, and direction. Initial experiments show the feasibility of this method. The advantages, limitations, and potential solutions to the limitations are summarized in this paper.
Philipp Michel, Joel Chestnutt, Satoshi Kagami, Koichi Nishiwaki, James Kuffner and Takeo Kanade, 2006
As navigation autonomy becomes an increasingly important research topic for biped humanoid robots, efficient approaches to perception and mapping that are suited to the unique characteristics of humanoids and their typical operating environments will be required. This paper presents a system for online environment reconstruction that utilizes both external sensors for global localization, and on-body sensors for detailed local mapping. An external optical motion capture system is used to accurately localize on-board sensors that integrate successive 2D views of a calibrated camera and range measurements from a SwissRanger SR-2 time-of-flight sensor to construct global environment maps in real-time. Environment obstacle geometry is encoded in 2D occupancy grids and 2.5D height maps for navigation planning. We present an on-body implementation for the HRP-2 humanoid robot that, combined with a footstep planner, enables the robot to autonomously traverse dynamic environments containing unpredictably moving obstacles.
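The two map representations mentioned in this abstract can be illustrated with a small sketch. It is not the system described in the paper: it assumes the range points are already expressed in a global frame with the grid origin at (0, 0), and the cell size, the obstacle height threshold, and the name `build_height_map` are illustrative choices.

```python
import numpy as np

def build_height_map(points, cell_size, grid_shape, obstacle_height):
    """Bin 3D points (x, y, z) into a 2.5D height map and a 2D occupancy grid.

    Each cell stores the maximum z of the points falling into it; cells with
    no points keep -inf. A cell is marked occupied when its height exceeds
    `obstacle_height` (an assumed threshold).
    """
    pts = np.asarray(points, dtype=float)
    heights = np.full(grid_shape, -np.inf)
    ij = np.floor(pts[:, :2] / cell_size).astype(int)
    inside = ((ij[:, 0] >= 0) & (ij[:, 0] < grid_shape[0]) &
              (ij[:, 1] >= 0) & (ij[:, 1] < grid_shape[1]))
    ij, z = ij[inside], pts[inside, 2]
    # Unbuffered maximum so several points in one cell are all considered.
    np.maximum.at(heights, (ij[:, 0], ij[:, 1]), z)
    occupancy = heights > obstacle_height
    return heights, occupancy
```

Keeping the maximum height per cell is a conservative choice for obstacle avoidance; a planner that needs to step onto surfaces might prefer a different statistic.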
Li Guan, Jean-Sebastien Franco, Marc Pollefeys, 2008
In this paper, we reconstruct 3D objects with a heterogeneous sensor network of Time of Flight (ToF) Range Imaging (RIM) sensors and high-res camcorders. With this setup, we first carry out a simple but effective depth calibration for the RIM cameras. We then combine the camcorder silhouette cues and RIM camera depth information for the reconstruction. Our main contribution is the proposal of a sensor fusion framework so that the computation is general, simple and scalable. Although we only discuss the fusion of conventional cameras and RIM cameras in this paper, the proposed framework can be applied to any vision sensors. This framework uses a space occupancy grid as a probabilistic 3D representation of scene contents. After defining sensing models for each type of sensor, the reconstruction is simply a Bayesian inference problem, and can be solved robustly. The experiments show that the quality of the reconstruction is substantially improved from the noisy depth sensor measurement.
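The Bayesian view of occupancy-grid fusion can be sketched for a single time step. This is a generic illustration, not the sensing models defined in the paper: it assumes each sensor supplies, per voxel, the likelihood of its observation given that the voxel is occupied or empty, and that sensors are conditionally independent given occupancy. The name `fuse_occupancy` is hypothetical.

```python
import numpy as np

def fuse_occupancy(prior, lik_occupied, lik_empty):
    """Posterior occupancy probability per voxel from several sensors.

    prior:        array of shape (num_voxels,), P(occupied)
    lik_occupied: array of shape (num_sensors, num_voxels), P(z_i | occupied)
    lik_empty:    array of shape (num_sensors, num_voxels), P(z_i | empty)
    Sensors are assumed conditionally independent given the voxel state.
    """
    prior = np.asarray(prior, dtype=float)
    lik_occupied = np.asarray(lik_occupied, dtype=float)
    lik_empty = np.asarray(lik_empty, dtype=float)
    # Work in log-odds so many sensors can be combined without underflow.
    log_odds = np.log(prior) - np.log1p(-prior)
    log_odds = log_odds + np.sum(np.log(lik_occupied) - np.log(lik_empty), axis=0)
    return 1.0 / (1.0 + np.exp(-log_odds))
```

Adding another sensor type only requires another row of likelihoods, which is one way to read the abstract's claim that the framework is general and scalable.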
T. Hong, R. Bostelman, and R. Madhavan, 2004
The performance evaluation of an obstacle detection and segmentation algorithm for Automated Guided Vehicle (AGV) navigation in factory-like environments using a 3D real-time range camera is the subject of this paper. Our approach has been tested successfully on British safety standard recommended object sizes and materials placed on the vehicle path. The segmented (mapped) obstacles are then verified using absolute measurements obtained using a relatively accurate 2D scanning laser rangefinder.