Large-Scale Structure from Motion: A Survey
Abstract: Structure from motion (SfM) computes absolute camera poses in a unified global coordinate system from local feature matches between image pairs, and it is a key problem in image-based 3D reconstruction. In recent years, with advances in acquisition devices, computational resources, and theoretical methods, SfM research has expanded from small-scale, controlled laboratory scenes to large-scale, real-world indoor and outdoor scenes, with notable progress in both theoretical methods and practical applications. Taking a practical perspective, this survey reviews the latest advances in SfM for large-scale 3D scene reconstruction. Compared with existing surveys in related fields, it concentrates on the core problem of camera pose estimation in SfM and gives a comprehensive overview of recent developments, organized into analytical and learning-based approaches. On this basis, to support the community, it also discusses and analyzes the current progress and future trends of SfM research.
[163] Moran D, Koslowsky H, Kasten Y, et al. Deep permutation equivariant structure from motion[C] //Proceedings of the IEEE/CVF International Conference on Computer Vision. Los Alamitos: IEEE Computer Society Press, 2021: 5956-5966
[164] Xiao Y X, Xue N, Wu T F, et al. Level-S2fM: structure from motion on neural level set of implicit surfaces[C] //Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society Press, 2023: 17205-17214
[165] Brynte L, Iglesias J P, Olsson C, et al. Learning structure-from-motion with graph attention networks[C] //Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society Press, 2024: 4808-4817
[166] Brody S, Alon U, Yahav E. How attentive are graph attention networks?[OL]. [2024-03-19]. https://openreview.net/forum?id=F72ximsx7C1
[167] He X Y, Sun J M, Wang Y F, et al. Detector-free structure from motion[C] //Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society Press, 2024: 21594-21603
[168] Sun J M, Shen Z H, Wang Y A, et al. LoFTR: detector-free local feature matching with transformers[C] //Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society Press, 2021: 8918-8927
[169] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C] //Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010
[170] Wang J Y, Karaev N, Rupprecht C, et al. VGGSfM: visual geometry grounded deep structure from motion[C] //Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society Press, 2024: 21686-21697
[171] Harley A W, Fang Z Y, Fragkiadaki K. Particle video revisited: tracking through occlusions using point trajectories[C] //Proceedings of the 17th European Conference on Computer Vision. Heidelberg: Springer, 2022: 59-75