SIFT SLAM Vision Details
MIT 16.412J Spring 2004
Vikash K. Mansinghka
Outline
• Lightning Summary
• Black Box Model of SIFT SLAM Vision System
• Challenges in Computer Vision
• What these challenges mean for visual SLAM
• How SIFT extracts candidate landmarks
• How landmarks are tracked in SIFT SLAM
• Alternative vision-based SLAM systems
• Open questions
Lightning Summary
• Motivation: SLAM without modifying the environment
• Landmark candidates are extracted by the SIFT process
• Candidates matched between cameras to get 3D positions
• Candidates pruned according to consistency with the robot’s expectations
• Survivors sent off for statistical processing
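The pipeline above can be sketched in a few lines. This is a minimal illustrative skeleton, not the actual SIFT SLAM implementation: the function names, the Euclidean consistency test, and the tolerance `tol` are all assumptions made for the sake of the example.

```python
def process_frame(candidates, stereo_match, expected_pos, tol=0.5):
    """Toy sketch of the per-frame landmark pipeline:
    match candidates between cameras for 3D positions, prune those
    inconsistent with the robot's expectations, return survivors."""
    survivors = []
    for feat in candidates:
        pos3d = stereo_match(feat)       # 3D position from inter-camera matching
        if pos3d is None:
            continue                     # no stereo correspondence found
        expected = expected_pos(feat)    # robot's predicted 3D position, if any
        if expected is not None:
            # prune candidates whose measured position disagrees with expectation
            err = sum((a - b) ** 2 for a, b in zip(pos3d, expected)) ** 0.5
            if err > tol:
                continue
        survivors.append((feat, pos3d))  # sent off for statistical processing
    return survivors
```

Here `stereo_match` and `expected_pos` stand in for the stereo-matching and prediction machinery described in the rest of the slides.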
Review of Robot Specifications
• Triclops 3-camera “stereo” vision system
• Odometry system which produces [p, q, θ]
• Center camera is “reference”
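Composing the odometry deltas [p, q, θ] into a running pose might look like the sketch below. The convention that p and q are the frame-local x and z translation deltas (per the next slide) and θ the bearing delta is taken from these slides; the rotate-then-translate composition itself is a standard assumption, not something the slides spell out.

```python
from math import sin, cos

def apply_odometry(pose, p, q, dtheta):
    """Compose a 2D pose (x, z, theta) with frame-local odometry
    deltas [p, q, dtheta]: rotate the local translation into the
    world frame, then add the bearing delta."""
    x, z, theta = pose
    x_new = x + p * cos(theta) - q * sin(theta)
    z_new = z + p * sin(theta) + q * cos(theta)
    return (x_new, z_new, theta + dtheta)
```

With theta = 0 the local and world frames coincide, so a forward delta p moves the pose straight along x.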
Black Box Model of Vision System
• For now, based on black magic (SIFT). Produces landmarks.
• Assume landmarks are globally indexed by i.
• Per-frame inputs:
  – [p, q, θ]: odometry input (x, z, bearing deltas)
  – List of (i, xi): new landmark positions (from SLAM)
• Per-frame output is a list of (i, x̂i, xi, ri, ci) for each visible landmark i, where:
  – x̂i is its measured 3D pos (w.r.t. camera pos)
  – xi is its map 3D pos (w.r.t. initial robot pos), if it isn’t new
  – (ri, ci) is its pixel coordinates in the center camera
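The per-landmark output record could be captured as a small data type. This is a hedged sketch of the interface only: the class and field names are invented here, and only the tuple contents (i, x̂i, xi, ri, ci) come from the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class LandmarkObservation:
    """One visible landmark in the vision system's per-frame output."""
    i: int                  # global landmark index
    x_meas: Vec3            # x̂i: measured 3D pos, w.r.t. camera pos
    x_map: Optional[Vec3]   # xi: map 3D pos w.r.t. initial robot pos; None if new
    r: int                  # pixel row in the center (reference) camera
    c: int                  # pixel column in the center (reference) camera
```

A frame's output is then simply a list of these records, with `x_map` left as `None` for landmarks the SLAM back end has not yet placed in the map.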