This paper presents a model-based approach for 3D pose estimation of a single RGB image, which keeps the 3D scene model up-to-date using a low-cost camera. A prelearned image model of the target scene is first reconstructed using a training RGB-D video. Next, the model is analyzed using the proposed multiple principal analysis to label the viewpoint class of each training RGB image and to construct a training dataset for a deep learning viewpoint classification neural network (DVCNN). For all training images in a viewpoint class, the DVCNN estimates their membership probabilities and defines the template of the class as the image with the highest probability. To achieve scene reconstruction in 3D space with a camera, a pose estimation algorithm then uses the template information to estimate the pose parameters and depth map of a single RGB image captured by navigating the camera to a specific viewpoint. The pose estimation algorithm is thus the key to keeping the status of the 3D scene up-to-date. Compared with conventional pose estimation algorithms, which rely on sparse features, our approach enhances the quality of the reconstructed 3D scene point cloud through template-to-frame registration. Finally, we verify the established reconstruction system on publicly available benchmark datasets and compare it with state-of-the-art pose estimation algorithms. The results indicate that our approach outperforms the compared methods in terms of pose estimation accuracy.
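To make the template-selection step above concrete, here is a minimal sketch of choosing one template per viewpoint class as the training image with the highest membership probability. The `classify` callable is a hypothetical stand-in for the trained DVCNN (its architecture is not detailed here); only the argmax-style selection logic reflects the text.

```python
def select_templates(images, labels, classify):
    """Pick one template per viewpoint class: the training image whose
    membership probability for its own class is highest.

    images   -- list of RGB images (e.g., HxWx3 arrays)
    labels   -- list of viewpoint-class indices, one per image
    classify -- hypothetical stand-in for the trained DVCNN; maps an
                image to a vector of per-class membership probabilities
    """
    templates, best_prob = {}, {}
    for img, cls in zip(images, labels):
        prob = classify(img)[cls]  # membership probability for its own class
        if prob > best_prob.get(cls, -1.0):
            best_prob[cls] = prob
            templates[cls] = img   # current best candidate for this class
    return templates
```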
In the field of three-dimensional (3D) computer vision, researchers aim at quickly reconstructing 3D models from an image sequence, owing to potential applications in robotics, augmented reality (AR), geodesy, remote sensing, 3D face recognition, drone or vehicle navigation, and 3D printing. Researchers in remote sensing rely on two traditional 3D reconstruction techniques: airborne image photogrammetry and light detection and ranging (LiDAR). Both approaches reconstruct high-quality 3D models; however, their acquisition cost is very high. With the advances in low-cost 3D data acquisition devices, researchers in computer vision have introduced many image-based 3D modeling techniques, such as simultaneous localization and mapping (SLAM), multiview stereo (MVS), photo tourism, virtual reality modeling, and RGB-D video-based methods. These methods represent a real-world scene as a point cloud, which consists of a large number of points distributed in 3D space that faithfully capture the intrinsic structure of the scene. A point cloud representation can be used for 3D inspection, as it renders detailed 3D environments accurately.

Image-based 3D scene modeling is predicated on the assumption that the observed imagery is generated by joining multiple views of a 3D scene. Recently, with depth or combined color and depth imagery from a low-cost Kinect-like sensor as input data, state-of-the-art approaches have focused on developing novel methods that can accurately reconstruct a 3D scene model from a multiview RGB-D video. The template-based approach is an efficient method that transforms both depth and color information in a limited set of view-specific templates into a 3D point cloud, which renders the appearance and geometric structure of a real-world scene. Because the pose parameters of each template are given in the model reconstruction phase, they provide a coarse estimate of the pose parameters, i.e., the depth map and external camera parameters, of the input image in the surveillance phase. Once the viewpoint class of the input image is determined, this coarse estimate of the pose parameters and depth map must be fine-tuned, since location error exists in the viewpoint at which the image is captured under a conventional GPS-based drone navigation scheme. The fine-tuned camera parameters and depth map of the current input RGB image are then used to compute the current view-specific point cloud, which keeps the 3D model up-to-date if the alignment error between the current point cloud and the model point cloud is low. Conversely, under the assumption that the target scene is stationary, an abnormal event in the scene under surveillance is detected if the alignment error is large.

In computer vision, many approaches have been studied to recover the spatial layout of a scene by acquiring surrounding images from different viewpoints. For each pixel in an image, accurate depth information plays an important role in calculating the corresponding coordinates in 3D space. However, the cost of acquiring high-quality depth information is often very high.
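The last paragraph is the standard pinhole back-projection: given a depth value per pixel and the camera intrinsics, each pixel maps to a 3D point. The sketch below assumes intrinsics `fx, fy, cx, cy` and a metric depth map; it is generic camera geometry, not the paper's specific formulation. The `to_world` helper shows how the external camera parameters mentioned above would move the points out of the camera frame.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project an HxW metric depth map into an Nx3 point cloud in
    the camera frame: X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column, row grids
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # keep only pixels with valid (positive) depth

def to_world(points, R, t):
    """Apply external camera parameters (rotation R, translation t) to
    express camera-frame points in the world/model frame."""
    return points @ R.T + t
```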
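The surveillance logic described in the second paragraph reduces to a threshold test on the alignment error between the current view-specific point cloud and the model point cloud. A sketch of that test follows, using the mean nearest-neighbor distance as the registration residual; `scipy.spatial.cKDTree` is used here as a practical stand-in, and the threshold value is an illustrative assumption, not a figure from the paper.

```python
import numpy as np
from scipy.spatial import cKDTree

def alignment_error(current_pts, model_pts):
    """Mean nearest-neighbor distance from the current view-specific
    point cloud to the model point cloud (a simple alignment residual)."""
    tree = cKDTree(model_pts)
    dists, _ = tree.query(current_pts)
    return dists.mean()

def update_or_alarm(current_pts, model_pts, threshold=0.05):
    """Low error: merge the current cloud to keep the model up-to-date
    (the scene is assumed stationary). High error: flag an abnormal
    event, as described in the text. The 0.05 threshold is assumed."""
    err = alignment_error(current_pts, model_pts)
    if err < threshold:
        return np.vstack([model_pts, current_pts]), None  # model updated
    return model_pts, err  # abnormal event detected in the scene
```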