Scene Reconstruction, Pose Estimation and Tracking

Scene Reconstruction, Pose Estimation and Tracking PDF Author: Rustam Stolkin
Publisher: BoD – Books on Demand
ISBN: 3902613068
Category : Computers
Languages : en
Pages : 544

Book Description
This book reports recent advances in the use of pattern recognition techniques for computer and robot vision. The sciences of pattern recognition and computational vision have been inextricably intertwined since their early days, some four decades ago with the emergence of fast digital computing. All computer vision techniques could be regarded as a form of pattern recognition, in the broadest sense of the term. Conversely, if one looks through the contents of a typical international pattern recognition conference proceedings, it appears that the large majority (perhaps 70-80%) of all pattern recognition papers are concerned with the analysis of images. In particular, these sciences overlap in areas of low level vision such as segmentation, edge detection and other kinds of feature extraction and region identification, which are the focus of this book.

Scene Reconstruction, Pose Estimation and Tracking

Scene Reconstruction, Pose Estimation and Tracking PDF Author: Rustam Stolkin
Publisher: IntechOpen
ISBN: 9783902613066
Category : Computers
Languages : en
Pages : 542

Book Description
This book reports recent advances in the use of pattern recognition techniques for computer and robot vision. The sciences of pattern recognition and computational vision have been inextricably intertwined since their early days, some four decades ago with the emergence of fast digital computing. All computer vision techniques could be regarded as a form of pattern recognition, in the broadest sense of the term. Conversely, if one looks through the contents of a typical international pattern recognition conference proceedings, it appears that the large majority (perhaps 70-80%) of all pattern recognition papers are concerned with the analysis of images. In particular, these sciences overlap in areas of low level vision such as segmentation, edge detection and other kinds of feature extraction and region identification, which are the focus of this book.

Visual-Inertial Odometry for 3D Pose Estimation and Scene Reconstruction Using Unmanned Aerial Vehicles

Visual-Inertial Odometry for 3D Pose Estimation and Scene Reconstruction Using Unmanned Aerial Vehicles PDF Author: Dylan Gareau
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
As Unmanned Aerial Vehicles (UAVs) become increasingly available, pose estimation remains critical for navigation. Pose estimation is also useful for scene reconstruction in certain surveillance applications, such as surveillance in the event of a natural disaster. This thesis presents a Direct Sparse Visual-Inertial Odometry with Loop Closure (VIL-DSO) algorithm design as a pose estimation solution, combining several existing algorithms to fuse inertial and visual information to improve pose estimation and provide metric scale, as initially implemented in Direct Sparse Odometry (DSO) and Direct Sparse Visual-Inertial Odometry (VI-DSO). VIL-DSO adopts the point selection and loop closure method of the Direct Sparse Odometry with Loop Closure (LDSO) approach. This point selection method improves repeatability by calculating the Shi-Tomasi score to favor corners as point candidates, and allows matches to be generated for loop closure between keyframes. The proposed VIL-DSO then uses the Kabsch-Umeyama algorithm to reduce the effects of scale drift caused by loop closure. The algorithm is composed of three main computation threads: a coarse tracking thread that assists with keyframe selection and initial pose estimation, a local window optimization thread that fuses Inertial Measurement Unit (IMU) and visual information to refine the scale and pose estimates, and a global optimization thread that identifies loop closures and improves pose estimates. The loop closure thread also includes the modification that mitigates scale drift using the Kabsch-Umeyama algorithm. Trajectory analysis of the estimates shows that loop closure improves the pose estimation but causes the scale estimate to drift. The scale-drift mitigation method successfully improves the scale estimate after loop closure. However, the estimation error does not improve upon that of other state-of-the-art methods, namely VI-DSO and VI-ORB SLAM. The results were evaluated on the EuRoC MAV dataset, which contains fairly short sequences. VIL-DSO is expected to show more of an advantage on longer datasets, where loop closure is more useful. Lastly, using the odometry as a feed, scene reconstruction and the effects of various factors on mapping are discussed, including the use of a monocular camera, camera angle, and resolution in outdoor settings.
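The Kabsch-Umeyama step used for scale-drift mitigation aligns two corresponding point sets (for example, trajectory segments before and after loop closure) with a closed-form similarity transform. A minimal sketch of that estimator, not the thesis's implementation; function and variable names are illustrative:

```python
import numpy as np

def umeyama_alignment(P, Q):
    """Closed-form similarity transform (s, R, t) minimizing
    sum_i || s * R @ P[i] + t - Q[i] ||^2, after Umeyama (1991).
    P, Q: (N, 3) arrays of corresponding points."""
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - mu_p, Q - mu_q
    cov = Qc.T @ Pc / len(P)                 # cross-covariance matrix
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                       # guard against reflections
    R = U @ S @ Vt                           # optimal rotation
    var_p = (Pc ** 2).sum() / len(P)         # variance of centered P
    s = np.trace(np.diag(D) @ S) / var_p     # optimal scale
    t = mu_q - s * R @ mu_p                  # optimal translation
    return s, R, t
```

Applying the recovered (s, R, t) to the drifted trajectory re-anchors its scale to the loop-closure correspondences.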

3D Computer Vision

3D Computer Vision PDF Author: Christian Wöhler
Publisher: Springer Science & Business Media
ISBN: 1447141504
Category : Computers
Languages : en
Pages : 390

Book Description
This indispensable text introduces the foundations of three-dimensional computer vision and describes recent contributions to the field. Fully revised and updated, this much-anticipated new edition reviews a range of triangulation-based methods, including linear and bundle adjustment based approaches to scene reconstruction and camera calibration, stereo vision, point cloud segmentation, and pose estimation of rigid, articulated, and flexible objects. Also covered are intensity-based techniques that evaluate the pixel grey values in the image to infer three-dimensional scene structure, and point spread function based approaches that exploit the effect of the optical system. The text shows how methods which integrate these concepts are able to increase reconstruction accuracy and robustness, describing applications in industrial quality inspection and metrology, human-robot interaction, and remote sensing.
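The triangulation-based methods the book reviews start from a simple building block: recovering a 3D point from its projections in two calibrated views. A minimal linear (DLT) sketch, with illustrative matrices and names, is:

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of a single 3D point from two views.
    P1, P2: 3x4 camera projection matrices; x1, x2: (u, v) pixel coords."""
    # Each view contributes two rows of the homogeneous system A @ X = 0.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)     # null vector of A = homogeneous point
    X = Vt[-1]
    return X[:3] / X[3]             # dehomogenize to Euclidean coordinates
```

Bundle adjustment, also covered in the book, then refines such initial points (and the camera parameters) by minimizing reprojection error over all views jointly.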

Robust Video Object Tracking Via Camera Self-calibration

Robust Video Object Tracking Via Camera Self-calibration PDF Author: Zheng Tang
Publisher:
ISBN:
Category :
Languages : en
Pages : 116

Book Description
In this dissertation, a framework for 3D scene reconstruction based on robust video object tracking assisted by camera self-calibration is proposed, which includes several algorithmic components. (1) An algorithm for joint camera self-calibration and automatic radial distortion correction based on the tracking of walking persons is designed to convert multiple object tracking into 3D space. (2) An adaptive model that learns online the relatively long-term appearance change of each target is proposed for robust 3D tracking. (3) We also develop an iterative two-step evolutionary optimization scheme to estimate the 3D pose of each human target, which can jointly compute the camera trajectory for a moving camera as well. (4) With 3D tracking results and human pose information from multiple views, we propose multi-view 3D scene reconstruction based on data association with visual and semantic attributes. Camera calibration and radial distortion correction are crucial prerequisites for 3D scene understanding. Many existing works rely on the Manhattan world assumption to estimate camera parameters automatically; however, they may perform poorly when man-made structure is lacking in the scene. As walking humans are common objects in video analytics, they have also been used for camera calibration, but the main challenges include noise reduction for the estimation of vanishing points, the relaxation of assumptions on unknown camera parameters, and radial distortion correction. We propose a novel framework for camera self-calibration and automatic radial distortion correction. Our approach starts with a multi-kernel-based adaptive segmentation and tracking scheme that dynamically controls the decision thresholds of background subtraction and shadow removal around the adaptive kernel regions based on the preliminary tracking results.
With the head/foot points collected from tracking and segmentation results, mean shift clustering and Laplace linear regression are introduced for the estimation of the vertical vanishing point and the horizon line, respectively. The estimation of distribution algorithm (EDA), an evolutionary optimization scheme, is then utilized to optimize the camera parameters and distortion coefficients, in which all the unknowns in camera projection can be fine-tuned simultaneously. Experiments on three public benchmarks and our own captured dataset demonstrate the robustness of the proposed method. The superiority of this algorithm is also verified by its capability of reliably converting 2D object tracking into 3D space. Multiple object tracking has been a challenging field, mainly due to noisy detection sets and identity switches caused by occlusion and similar appearance among nearby targets. Previous works rely on appearance models built on individual or several selected frames for the comparison of features, but they cannot encode long-term appearance change caused by pose, viewing angle, and lighting conditions. We propose an adaptive model that learns online the relatively long-term appearance change of each target. The proposed model is compatible with any features of fixed dimension, or their combinations, whose learning rates are dynamically controlled by adaptive update and spatial weighting schemes. To handle occlusion and nearby objects sharing similar appearance, we also design cross-matching and re-identification schemes based on the proposed adaptive appearance models. Additionally, 3D geometry information is effectively incorporated in our formulation for data association. The proposed method outperforms all state-of-the-art methods on the MOTChallenge 3D benchmark and achieves real-time computation with only a standard desktop CPU. It has also shown superior performance over the state of the art on the MOTChallenge 2D benchmark.
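The vertical vanishing point estimation described above admits a simple closed-form illustration: each head/foot pair defines an image line, and all such lines should meet at the vertical vanishing point. A least-squares version (a deliberate simplification of the clustering-and-regression pipeline in the dissertation; names are illustrative) can be sketched as:

```python
import numpy as np

def vertical_vanishing_point(heads, feet):
    """Least-squares vertical vanishing point from head/foot pixel pairs.
    heads, feet: (N, 2) arrays; each pair defines one image line."""
    ones = np.ones((len(heads), 1))
    H = np.hstack([heads, ones])     # homogeneous head points
    F = np.hstack([feet, ones])      # homogeneous foot points
    L = np.cross(H, F)               # line through each head/foot pair
    # The point v minimizing ||L @ v|| with ||v|| = 1 is the smallest
    # right singular vector of the stacked line coefficients.
    _, _, Vt = np.linalg.svd(L)
    v = Vt[-1]
    return v[:2] / v[2]              # dehomogenize to pixel coordinates
```

In practice the head/foot detections are noisy, which is why the dissertation adds mean shift clustering and robust regression on top of this basic geometry.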
For more comprehensive 3D scene reconstruction, we develop a monocular 3D human pose estimation algorithm based on a two-step EDA that can simultaneously estimate the camera motion for a moving camera. We first derive reliable 2D joint points through deep-learning-based 2D pose estimation and feature tracking. If the camera is moving, the initial camera poses can be estimated from visual odometry, where the feature points extracted on the human bodies are removed using segmentation masks dilated from 2D skeletons. Then the 3D joint points and camera parameters are iteratively optimized through a two-step evolutionary algorithm. The cost function for human pose optimization consists of loss terms defined by spatial and temporal constancy, the "flatness" of human bodies, and joint angle constraints. The optimization of camera movement, on the other hand, is based on the minimization of the reprojection error of skeleton joint points. Extensive experiments have been conducted on various video data, which verify the robustness of the proposed method. The final goal of our work is to fully understand and reconstruct the 3D scene, i.e., to recover the trajectory and action of each object. The above methods can be extended to a system with a camera array of overlapping views. We propose a novel video scene reconstruction framework to collaboratively track multiple human objects and estimate their 3D poses across multiple camera views. First, tracklets are extracted from each single view following the tracking-by-detection paradigm. We propose an effective integration of visual and semantic object attributes, including appearance models, geometry information, and poses/actions, to associate tracklets across different views. Based on the optimum viewing perspectives derived from tracking, we generate the 3D skeleton of each object. The estimated body joint points are fed back to the tracking stage to enhance tracklet association.
Experiments on a multi-view tracking benchmark validate the effectiveness of our approach.
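The camera-movement optimization described above minimizes the reprojection error of skeleton joint points. A minimal sketch of that cost for a pinhole camera with known intrinsics (illustrative only, not the dissertation's code):

```python
import numpy as np

def mean_reprojection_error(R, t, K, X, x_obs):
    """Mean pixel reprojection error of 3D joints X (N, 3) observed at
    x_obs (N, 2), for camera rotation R (3x3), translation t (3,), and
    intrinsic matrix K (3x3)."""
    Xc = X @ R.T + t                    # world -> camera coordinates
    x_h = Xc @ K.T                      # homogeneous pixel coordinates
    x_proj = x_h[:, :2] / x_h[:, 2:3]   # perspective divide
    return np.linalg.norm(x_proj - x_obs, axis=1).mean()
```

An evolutionary optimizer such as the EDA mentioned above treats this quantity as the fitness to minimize over candidate camera poses.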

On Pose Estimation in Room-Scaled Environments

On Pose Estimation in Room-Scaled Environments PDF Author: Hanna E. Nyqvist
Publisher: Linköping University Electronic Press
ISBN: 9176856283
Category :
Languages : en
Pages : 92

Book Description
Pose (position and orientation) tracking in room-scaled environments is an enabling technique for many applications. Today, virtual reality (VR) and augmented reality (AR) are two examples of such applications, receiving high interest both from the public and the research community. Accurate pose tracking of the VR or AR equipment, often a camera or a headset, or of different body parts, is crucial to trick the human brain and make the virtual experience realistic. Pose tracking in room-scaled environments is also needed for reference tracking and metrology. This thesis focuses on an application to metrology. In this application, photometric models of a photo studio are needed to perform realistic scene reconstruction and image synthesis. Pose tracking of a dedicated sensor enables the creation of these photometric models. The demands on the tracking system used in this application are high. It must be able to provide sub-centimeter and sub-degree accuracy and at the same time be easy to move and install in new photo studios. The focus of this thesis is to investigate and develop methods for a pose tracking system that satisfies the requirements of the intended metrology application. The Bayesian filtering framework is suggested because of its firm theoretical foundation in informatics and because it enables straightforward fusion of measurements from several sensors. Sensor fusion is in this thesis seen as a way to exploit complementary characteristics of different sensors to increase tracking accuracy and robustness. Four different types of measurements are considered: inertial measurements, images from a camera, range (time-of-flight) measurements from ultra-wideband (UWB) radio signals, and range and velocity measurements from echoes of transmitted acoustic signals. A simulation study and a study of the Cramér-Rao lower filtering bound (CRLB) show that an inertial-camera system has the potential to reach the required tracking accuracy.
It is, however, assumed that known fiducial markers, which can be detected and recognized in images, are deployed in the environment. The study shows that many markers are required. This makes the solution more of a stationary one, and the mobility requirement is not fulfilled. A simultaneous localization and mapping (SLAM) solution, where naturally occurring features are used instead of known markers, is suggested to solve this problem. Evaluation using real data shows that the provided inertial-camera SLAM filter suffers from drift but that support from UWB range measurements eliminates this drift. The SLAM solution is then only dependent on knowing the positions of a few stationary UWB transmitters rather than a large number of known fiducial markers. As a last step, to increase the accuracy of the SLAM filter, it is investigated if and how range measurements can be complemented with velocity measurements obtained as a result of the Doppler effect. In particular, focus is put on analyzing the correlation between the range and velocity measurements and the implications this correlation has for filtering. The investigation is done in a theoretical study of reflected known signals (compare with radar and sonar), where the CRLB is used as an analysis tool. The theory is validated on real data from acoustic echoes in an indoor environment.
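The inertial-plus-range fusion idea can be illustrated with a toy one-dimensional Kalman filter, in which acceleration drives the prediction step and a range measurement to a single anchor at the origin corrects it. This is a drastic simplification of the Bayesian filtering framework discussed above; all names and noise values are illustrative:

```python
import numpy as np

def fuse_inertial_range(accels, ranges, dt, q=0.01, r=0.05):
    """Toy 1-D Kalman filter: state [position, velocity], acceleration as
    control input (prediction), range to an anchor at the origin as the
    measurement (update). Assumes the target stays on the positive axis,
    so range equals position."""
    x = np.zeros(2)                            # state estimate
    P = np.eye(2)                              # state covariance
    F = np.array([[1.0, dt], [0.0, 1.0]])      # constant-velocity model
    B = np.array([0.5 * dt ** 2, dt])          # control (acceleration) input
    H = np.array([[1.0, 0.0]])                 # measure position only
    Q = q * np.eye(2)                          # process noise
    track = []
    for a, z in zip(accels, ranges):
        x = F @ x + B * a                      # predict with inertial input
        P = F @ P @ F.T + Q
        y = z - (H @ x)[0]                     # range innovation
        S = (H @ P @ H.T)[0, 0] + r            # innovation variance
        K = (P @ H.T)[:, 0] / S                # Kalman gain
        x = x + K * y                          # correct with range
        P = (np.eye(2) - np.outer(K, H[0])) @ P
        track.append(x[0])
    return np.array(track)
```

The same predict/correct structure generalizes to the full 6-DOF inertial-camera-UWB filters the thesis studies; only the state, models, and measurement equations grow.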

Multimedia Communications, Services and Security

Multimedia Communications, Services and Security PDF Author: Andrzej Dziech
Publisher: Springer
ISBN: 3642385591
Category : Computers
Languages : en
Pages : 335

Book Description
This volume constitutes the refereed proceedings of the 6th International Conference on Multimedia Communications, Services and Security, MCSS 2013, held in Krakow, Poland, in June 2013. The 27 full papers included in the volume were selected from numerous submissions. The papers cover various topics related to multimedia technology and its application to public safety problems.

Image Processing and Analysis with Graphs

Image Processing and Analysis with Graphs PDF Author: Olivier Lezoray
Publisher: CRC Press
ISBN: 1439855080
Category : Computers
Languages : en
Pages : 570

Book Description
Covering the theoretical aspects of image processing and analysis through the use of graphs in the representation and analysis of objects, Image Processing and Analysis with Graphs: Theory and Practice also demonstrates how these concepts are indispensable for the design of cutting-edge solutions for real-world applications, exploring new applications in computational photography, image and video processing, computer graphics, recognition, and medical and biomedical imaging. With the explosive growth in image production, in everything from digital photographs to medical scans, there has been a drastic increase in the number of applications based on digital images. This book explores how graphs, which are suitable for representing any discrete data by modeling neighborhood relationships, have emerged as a unified tool to represent, process, and analyze images. It also explains why graphs are ideal for defining graph-theoretical algorithms that enable the processing of functions, making it possible to draw on the rich literature of combinatorial optimization to produce highly efficient solutions. Some key subjects covered in the book include: the definition of graph-theoretical algorithms that enable denoising and image enhancement; energy minimization and the modeling of pixel-labeling problems with graph cuts and Markov random fields; image processing with graphs, including targeted segmentation, partial differential equations, mathematical morphology, and wavelets; analysis of the similarity between objects with graph matching; and the adaptation of graph-theoretical algorithms for specific imaging applications in computational photography, computer vision, and medical and biomedical imaging. The use of graphs has become very influential in computer science and has led to many applications in denoising, enhancement, restoration, and object extraction.
Accounting for the wide variety of problems being solved with graphs in image processing and computer vision, this book is a contributed volume of chapters written by renowned experts who address specific techniques or applications. This state-of-the-art overview provides application examples that illustrate the practical application of theoretical algorithms. Useful as a support for graduate courses in image processing and computer vision, it is also a perfect reference for practicing engineers working on the development and implementation of image processing and analysis algorithms.
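The neighborhood-graph idea at the heart of the book can be illustrated with a minimal sketch: a grayscale image becomes a 4-connected graph whose edge weights encode pixel similarity. The Gaussian weighting below is one common choice; the function name and sigma value are illustrative:

```python
import numpy as np

def pixel_graph_edges(img, sigma=0.1):
    """4-connected pixel graph for a grayscale image. Returns a list of
    edges ((r1, c1), (r2, c2), w) linking neighbouring pixels with
    similarity weight exp(-(I_p - I_q)^2 / (2 * sigma^2))."""
    h, w = img.shape
    edges = []
    for r in range(h):
        for c in range(w):
            for dr, dc in ((0, 1), (1, 0)):        # right and down neighbours
                r2, c2 = r + dr, c + dc
                if r2 < h and c2 < w:
                    wt = np.exp(-(img[r, c] - img[r2, c2]) ** 2
                                / (2 * sigma ** 2))
                    edges.append(((r, c), (r2, c2), wt))
    return edges
```

Graph-cut segmentation and the other algorithms surveyed in the book operate on exactly this kind of weighted neighborhood structure: similar neighbours get strong edges, and cuts prefer to pass through weak ones.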

Time-of-Flight and Structured Light Depth Cameras

Time-of-Flight and Structured Light Depth Cameras PDF Author: Pietro Zanuttigh
Publisher: Springer
ISBN: 3319309730
Category : Computers
Languages : en
Pages : 360

Book Description
This book provides a comprehensive overview of the key technologies and applications related to new cameras that have brought 3D data acquisition to the mass market. It covers both the theoretical principles behind the acquisition devices and the practical implementation aspects of the computer vision algorithms needed for the various applications. Real data examples are used in order to show the performance of the various algorithms. The performance and limitations of the depth camera technology are explored, along with an extensive review of the most effective methods for addressing challenges in common applications. Applications covered in specific detail include scene segmentation, 3D scene reconstruction, human pose estimation and tracking, and gesture recognition. This book offers students, practitioners and researchers the tools necessary to explore the potential uses of depth data in light of the expanding number of devices available for sale. It explores the impact of these devices on the rapidly growing field of depth-based computer vision.