Auto Alignment for Product Photography
Aligning the bottom of product images is crucial for virtual staging tools like https://stylized.ai. We use a combination of affine and rotation transformations to achieve this alignment. The process involves extracting points of interest from the input image, computing the affine transform matrix with OpenCV's cv2.getAffineTransform() function, and applying the transform to the image with cv2.warpAffine(). To fine-tune the alignment, we also compute a rotation matrix using cv2.getRotationMatrix2D().
Figure 1: The original product image. Figure 2: The aligned product image.
As part of our work on Stylized, a product photography tool that virtually stages product images in 3D scenes, we encountered a challenge in ensuring that the bottom of the product was flush with the virtual platform.
To solve this problem, we needed a way to automatically align the images so that our users wouldn't have to take perfect input photos.
In this post, we'll walk through the process we followed: extracting points of interest from the image, applying an affine transformation to align the bottom of the product, and fine-tuning the result with a rotation. Let's get started!
Before we can apply the affine and rotation transformations, we need three things:

- The input image: the original photograph of the product that we want to align.
- The foreground mask: a binary image that segments the foreground (i.e., the product) from the background. To obtain the foreground mask, we applied a simple threshold to the input image, setting all pixels above a certain value to 1 and all others to 0.
- The points of interest to align: the coordinates of the non-zero elements in the foreground mask.

Extraction of Points of Interest
To extract the necessary points to align, we wrote a function called closest_to_corners(). This function takes in an array of coordinates and the coordinates of the left and right corners. It then finds the point in coords that is closest to the left corner and the point that is closest to the right corner. These two points represent the points of interest that we want to align. To find the closest points, we used the np.argmin() function along with the Euclidean distance between the points. Finally, the function returns the closest points as a tuple. This process allows us to easily extract the necessary points to align and use them as input to the affine and rotation transformations.
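A minimal sketch of what closest_to_corners() might look like; the exact signature, the sample points, and the corner coordinates here are assumptions for illustration:

```python
import numpy as np

def closest_to_corners(coords, left_corner, right_corner):
    # coords is an (N, 2) array of points; the corners are 2-element tuples.
    coords = np.asarray(coords, dtype=float)
    # Euclidean distance from every point to each corner.
    left_dists = np.linalg.norm(coords - np.asarray(left_corner, dtype=float), axis=1)
    right_dists = np.linalg.norm(coords - np.asarray(right_corner, dtype=float), axis=1)
    # np.argmin picks the index of the smallest distance for each corner.
    return coords[np.argmin(left_dists)], coords[np.argmin(right_dists)]

# Toy usage with a few foreground points and two corners.
pts = np.array([[0, 1], [4, 1], [2, 5]])
left_point, right_point = closest_to_corners(pts, (0, 0), (5, 0))
```

Returning the pair as a tuple keeps the call site simple: the two points feed straight into the affine transform as source coordinates.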
To extract the points of interest from the image, we first applied a binary mask to the image to segment the foreground from the background. We then used the np.argwhere() function to find the coordinates of the non-zero elements in the mask. These coordinates were stored in a 2D Numpy array, which we transposed to obtain the desired shape for the input to the affine transformation.
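This extraction step can be sketched with a tiny hypothetical mask (the mask values below are made up for illustration):

```python
import numpy as np

# A tiny hypothetical binary mask (1 = foreground, 0 = background).
mask = np.array([
    [0, 0, 0],
    [0, 1, 1],
    [0, 1, 0],
], dtype=np.uint8)

# np.argwhere returns an (N, 2) array of (row, col) indices of the
# non-zero elements -- one row per foreground pixel.
coords = np.argwhere(mask == 1)

# Transpose to a (2, N) layout when the downstream step expects
# one row of row-indices and one row of column-indices instead.
coords_t = coords.transpose()
```

Note that np.argwhere yields (row, col) pairs, so the coordinates may need reordering before being treated as (x, y) points.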
To further fine-tune the alignment, we applied a rotation transformation using OpenCV's cv2.getRotationMatrix2D() function. This function takes in the center of rotation, the angle of rotation, and the scale of the transformation as input and returns the rotation matrix. To calculate the angle of rotation, we used the np.arctan2() function and passed in the elements from the transform matrix that correspond to the rotation.
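As a self-contained sketch of the angle recovery, assuming a pure rotation written in the standard mathematical [[cos, -sin], [sin, cos]] convention (the matrix produced by cv2.getRotationMatrix2D() uses image coordinates, where the sine terms carry opposite signs, so check which element holds the sine in your matrix):

```python
import numpy as np

# Hypothetical 2x3 affine matrix encoding a pure 30-degree rotation,
# in the standard [[cos, -sin], [sin, cos]] convention.
theta = np.deg2rad(30)
transform_matrix = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
])

# The top-left 2x2 block holds the rotation terms; arctan2 of the
# sine and cosine elements recovers the angle.
a, b = transform_matrix[0, 0], transform_matrix[1, 0]
angle = np.rad2deg(np.arctan2(b, a))  # ~30.0 degrees
```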
Here's a toy example that ties these steps together, including the rotation transformation:
import cv2
import numpy as np

# Load an image and build a binary mask to segment the foreground
# from the background (threshold on the grayscale image)
img = cv2.imread('image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
mask = (gray > 128).astype(np.uint8)

# Find the (row, col) coordinates of the non-zero elements in the mask
coords = np.argwhere(mask == 1)

# The corners we align against are the bottom-left and bottom-right
# of the image, in (row, col) order to match coords
rows, cols = img.shape[:2]
left_corner = (rows - 1, 0)
right_corner = (rows - 1, cols - 1)

# Extract points of interest using the closest_to_corners() function
left_point, right_point = closest_to_corners(coords, left_corner, right_corner)

# Build the affine transform: map the two bottom points of the product
# onto a common horizontal line (the lower of the two), keeping a third
# anchor point fixed. getAffineTransform() expects float32 (x, y) points.
bottom = max(left_point[0], right_point[0])
src = np.float32([
    [left_point[1], left_point[0]],
    [right_point[1], right_point[0]],
    [0, 0],
])
dst = np.float32([
    [left_point[1], bottom],
    [right_point[1], bottom],
    [0, 0],
])
transform_matrix = cv2.getAffineTransform(src, dst)
aligned = cv2.warpAffine(img, transform_matrix, (cols, rows))

# Calculate the center of rotation
center = (cols // 2, rows // 2)

# Extract the rotation elements from the transform matrix and
# compute the angle of rotation
a, b = transform_matrix[0, 0], transform_matrix[0, 1]
angle = np.rad2deg(np.arctan2(b, a))

# Apply the rotation transformation to fine-tune the alignment
rotation_matrix = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(aligned, rotation_matrix, (cols, rows))
Aligning images is a crucial step in product photography and can be done automatically using affine and rotation transformations. By extracting the points of interest and finding the closest points to the corners, we were able to apply the transformations and fine-tune the alignment. This process allowed us to create a seamless virtual staging experience for our users on https://stylized.ai. Visit us if you want to try it out and see the example in action! If you're interested in learning more about image processing techniques, keep an eye on this page; we'll continue to update it.