Reproject Disparity Images to 3D Point Clouds
Disparity images can be converted to 3D Point Clouds using the factory camera calibration from the MultiSense using the Q matrix.
The Q matrix can be assembled from the stereo calibration’s P Matrix. The below equation represents a general definition for a Q Matrix for non-isotropic pixels.
When the rectified focal lengths for \(f_x\) and \(f_y\) are equal (\(f_x = f_y = f\)) the Q matrix can be simplified to the following representation by dividing each element in the non-isotropic representation by \(f * T_x\)
The Q matrix can be used directly to convert each pixel of the disparity image to a corresponding 3D point in the left rectified camera’s coordinate frame. This can be done using the following expression
Note
Disparity images transmitted from the MultiSense are quantized to 1/16th of a pixel and stored as unsigned 16bit integers. Before converting a disparity image to a 3D point cloud, the disparity image need to be converted into a floating point pixel representation.
C++
// assume we have our disparity image stored in a standard C++ container in row-major order
std::vector<uint16_t> disparity = {...};
// assume this is loaded from the extrinsics calibration
Eigen::Matrix4d Q = {...};
std::vector<Eigen::Vector3d> points;
for (size_t v = 0 ; v < height ; ++v)
{
for (size_t u = 0 ; u < width ; ++u)
{
const uint16_t raw_disparity = disparity[v * width + u];
// Skip invalid disparity values
if (raw_disparity == 0)
{
continue;
}
const double d = static_cast<double>(raw_disparity) / 16.0;
const Eigen::Vector3d point = (Q * Eigen::Vector4d{static_cast<double>(u), static_cast<double>(v), d, 1.0}).hnormalized();
points.emplace_back(point);
}
}
Python
# assume we have our disparity image stored in a numpy array of type np.uint16 in row-major order
disparity = np.array([...])
# assume this is loaded from the extrinsics calibration
Q = np.matrix([...])
points = []
for v in range(height):
for u in range(width):
d = disparity[v * width + u]
if d == 0:
continue
scaled_point = Q @ np.array([float(u), float(v), float(d) / 16.0, 1.0])
point = scaled_point[3]
points.append(point[0:3])
The OpenCV function cv::reprojectImageTo3d can also be used to convert Disparity images to 3D points. This is done using the 4x4 perspective transformation Q matrix.
OpenCV C++
// multisense_disparity is a unsigned 16bit disparity image directly transmitted from the MultiSense. It's assumed
// this is loaded into the program as a cv::Mat with a datatype of CV_16UC1. reprojectImageTo3D takes
// floating point disparity images as input, so ensure we convert to the floating point pixel representation for
// the disparity before calling reprojectImageTo3D
cv::Mat multisense_disparity = ...;
cv::Mat disparity;
multisense_disparity.convertTo(disparity, CV_32FC1, 1.0/16.0);
cv::Mat point_cloud;
cv::reprojectImageTo3D(disparity, point_cloud, Q);
OpenCV Python
# multisense_disparity is a unsigned 16bit disparity image directly transmitted from the MultiSense. It's assumed
# this is loaded into the program as a numpy array with a datatype of np.uint16. reprojectImageTo3D takes
# floating point disparity images as input, so ensure we convert to the floating point pixel representation for
# the disparity before calling reprojectImageTo3D
multisense_disparity = np.array([...])
disparity = multisense_disparity.astype(np.float) * (1.0 / 16.0)
point_cloud = cv2.reprojectImageTo3D(disparity, Q)
Point Clouds with Intensity
The disparity image maps 1-1 with the left luma rectified image. The PointCloudUtility provides and example of how to use the left rectified image to overlay intensity data from the left luma rectified image data onto a 3D point cloud without using any 3rd party dependencies like OpenCV.
Convert Disparity Images to Depth Images
A disparity image can be converted to a depth image using the relationship between the MultiSense camera’s baseline and rectified focal length.
The following depth formula can be evaluated for every disparity pixel using \(f_x\) and \(T_x\) from the stereo calibration’s extrinsic P Matrix.
Note
Disparity images from the MultiSense are transmitted quantized to 1/16th of a pixel and stored with 16bit integers. Before converting a disparity image to a depth image, the disparity image need to be converted into a floating point pixel representation
This operation is equivalent to taking only the z values from a reprojected 3D point cloud
\[z = \frac {f_x * f_y * T_x}{-f_y * d}\]
Warning
Invalid disparity values of 0 should be ignored when converting a disparity image to a depth image. Depth for disparity values of 0 is undefined.
Create Color Images
The chroma image contains the CbCr channels of the planar YCbCr420 compression scheme. A RGB color image can be created using the corresponding luma image.
The OpenCV function cv::cvtColorTwoPlane can be used to convert corresponding luma and chroma images to BGR images:
C++
// Convert from YUV to BGR color space
cv::Mat bgr;
cv::cvtColorTwoPlane(luma, chroma, bgr, cv::COLOR_YUV2BGR_NV12);
Python
# Convert from YUV to BGR color space
bgr = cv2.cvtColorTwoPlane(luma, chroma, cv2.COLOR_YUV2BGR_NV12)
You can reference the LibMultiSense ColorImageUtility for an example of how to convert a luma and chroma image pair into a RGB color image without any 3rd party dependencies.
Create a Color 3D Point Cloud
Each MultiSense’s main left/right stereo pair consists of two mono camera. This configuration was chosen to optimize stereo performance over a variety of environmental conditions. Since many applications require color images, certain models of Stereo Cameras (MultiSense S27, MultiSense S30) include a third wide angle aux color camera. All 3 camera have their intrinsics and extrinsics calibrated which allows for 3D points computed from the main stereo disparity can be colorized using the data from the color aux camera.
To colorize a point cloud computed from the MultiSense disparity image, each 3D point must be projected into the aux rectified color image using the aux extrinsic projection matrix.
The LibMultiSense DepthImageUtility provides a dependency-free example of how to project 3D points computed from the disparity image into the aux rectified image for colorization.
Approximation For Execution Speed
In practice both \(T_y\) and \(T_z\) translations in the aux camera extrinsic calibration are very small (less than 1mm). By assuming both values are zero, the corresponding aux color pixel can be determined using the ratio of the aux stereo baseline and the primary stereo baseline.
Decode a Compressed Image to RGB
The following sections provide examples of how to convert compressed I-Frames streamed via compressed image topics into RGB images.
Rectify Images Using OpenCV
Note
S27, S30, and KS21 cameras support onboard rectification of all image streams including aux camera sources. It’s recommended you use these rectified streams instead of rectifying with a userspace application
Raw distorted images from the MultiSense can be unrectified using the stereo calibration parameters stored on the camera. The following examples use built-in OpenCV functions to generate rectification lookup tables used to remap raw distorted images to rectified images.
Note
The fx, fy, cx, and cy members in the M and P calibration matrices will need to be scaled based on the camera’s operating resolution before generating OpenCV rectification lookup tables