Soft Segmentation of Images
Realistic editing of photographs requires careful treatment of the color mixtures that commonly occur in natural scenes. These color mixtures are typically modeled using soft selections of objects or scene colors. Hence, accurate representation of the soft transitions between image regions is essential for high-quality image editing and compositing. Current techniques for generating such representations depend heavily on interaction by a skilled visual artist, and creating accurate object selections in this way is a tedious task. In this thesis, we approach the soft segmentation problem through two complementary properties of a photograph: color and semantics. Our first focus is representing an image as a mixture of the main colors in the scene by estimating soft segments of homogeneous colors. We present a robust per-pixel nonlinear optimization formulation that targets both computational efficiency and high accuracy. We then turn our attention to semantics in a photograph and present our work on soft segmentation of particular objects in a given scene. This work features graph-based formulations that specifically target the accurate representation of soft transitions in linear systems. Each part first presents an interactive segmentation scheme that targets applications popular in professional compositing and movie post-production. The interactive formulations are then generalized to the automatic estimation of generic image representations that can be used to perform a number of otherwise complex image editing tasks effortlessly.
The first problem studied is green-screen keying: the interactive estimation of a clean foreground layer with accurate opacities in a studio setup with a controlled background, typically green. We present a simple two-step interaction scheme to determine the main scene colors and their locations. The soft segmentation of the foreground layer is then computed with our novel color unmixing formulation, which can effectively represent a pixel color as a mixture of many colors characterized by statistical distributions. We show that our formulation is robust against many challenges in green-screen keying and can be used to achieve production-quality keying results in a fraction of the time required by commercial software.
We then study soft color segmentation: the estimation of layers with homogeneous colors and their corresponding opacities. The soft color segments can be overlaid to reproduce the original image, providing an effective intermediate representation of the image. We decompose the global energy optimization that typically models the soft color segmentation task into three sub-problems that can be implemented efficiently and scalably. Our formulation draws its strength from the color unmixing energy, which is essential for ensuring homogeneous layer colors and accurate opacities. We show that our method achieves a segmentation quality that allows realistic manipulation of colors in natural photographs.
Natural image matting is the generalized version of green-screen keying, where accurate estimation of foreground opacities is targeted in an unconstrained setting. We take a graph-based approach to this problem, modeling the connections in the graph as forms of information flow that distribute the information from the user input across the whole image. By carefully defining information flows that target challenging regions in complex foreground structures, we show that high-quality soft segmentation of objects can be estimated through the closed-form solution of a linear system. We extend our approach to related problems in natural image matting, such as matte refinement and layer color estimation, and demonstrate the effectiveness of our formulation through quantitative, qualitative, and theoretical analysis.
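The idea of estimating opacities through the closed-form solution of a linear system can be illustrated on a toy problem. The sketch below is not the formulation developed in this thesis: it assumes a single row of grayscale pixels, only nearest-neighbor connections with a Gaussian color affinity, and a quadratic penalty that ties user-marked pixels to their given opacities. The names `propagate_alpha`, `solve_linear_system`, `lam`, and `sigma` are hypothetical and chosen for illustration only.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def propagate_alpha(intensities, known, lam=100.0, sigma=0.1):
    """Spread user-given opacities along a row of pixels.

    Minimizes sum_ij w_ij (a_i - a_j)^2 + lam * sum_known (a_i - b_i)^2,
    whose minimizer solves the linear system (L + lam * K) a = lam * K b,
    where L is the graph Laplacian and K selects the known pixels.
    """
    n = len(intensities)
    system = [[0.0] * n for _ in range(n)]
    # Assumed affinity: neighbors with similar intensity are strongly linked.
    for i in range(n - 1):
        diff = intensities[i] - intensities[i + 1]
        w = math.exp(-(diff * diff) / (2.0 * sigma * sigma))
        system[i][i] += w
        system[i + 1][i + 1] += w
        system[i][i + 1] -= w
        system[i + 1][i] -= w
    rhs = [0.0] * n
    for index, value in known.items():
        system[index][index] += lam
        rhs[index] = lam * value
    return solve_linear_system(system, rhs)


# Toy row: dark background, one mixed pixel, bright foreground.
row = [0.0, 0.0, 0.5, 1.0, 1.0]
# User input: first pixel is background (0), last pixel is foreground (1).
alphas = propagate_alpha(row, known={0: 0.0, 4: 1.0})
```

In this toy setting, the unmarked pixels inherit opacities from the marked ones through the graph connections, and the mixed pixel in the middle receives an intermediate opacity. A practical system would use a sparse solver and far richer connections than the nearest-neighbor links assumed here.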
Finally, we introduce semantic soft segments, a set of layers that correspond to semantically meaningful regions in an image with accurate soft transitions between different objects. We approach this problem from a spectral segmentation angle and propose a graph structure that embeds texture and color features from the image as well as higher-level semantic information generated by a neural network. The soft segments are generated fully automatically via eigendecomposition of the carefully constructed Laplacian matrix. We demonstrate that compositing and targeted image editing tasks can be done with little effort using semantic soft segments.
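The spectral view can be sketched with a minimal example: nodes that are strongly connected in a graph take similar values in the low eigenvectors of its Laplacian. The code below is a simplified illustration under stated assumptions, not the method of this thesis. It uses six hand-made nodes, each with one color value and one made-up "semantic" value standing in for neural network features, a fully connected Gaussian affinity, and plain power iteration to approximate the eigenvector of the second-smallest Laplacian eigenvalue. The names `affinity_matrix`, `fiedler_vector`, and `sigma` are hypothetical.

```python
import math


def affinity_matrix(features, sigma=0.5):
    """Gaussian affinity between every pair of feature vectors."""
    n = len(features)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            w[i][j] = w[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return w


def fiedler_vector(w, iterations=500):
    """Approximate the eigenvector of the second-smallest eigenvalue of L = D - W.

    Runs power iteration on (shift * I - L) while removing the constant
    component, which is the eigenvector of L with eigenvalue zero.
    """
    n = len(w)
    degree = [sum(row) for row in w]
    # Twice the maximum degree bounds the largest eigenvalue of L (Gershgorin).
    shift = 2.0 * max(degree)
    v = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]
        # Multiply by (shift * I - D + W).
        v = [
            (shift - degree[i]) * v[i] + sum(w[i][j] * v[j] for j in range(n))
            for i in range(n)
        ]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v


# Hypothetical per-node features: (color value, semantic value).
features = [
    (0.10, 0.0), (0.15, 0.0), (0.20, 0.0),  # first object
    (0.80, 1.0), (0.85, 1.0), (0.90, 1.0),  # second object
]
vector = fiedler_vector(affinity_matrix(features))
# Nodes of the same object share a sign in the resulting vector.
groups = [value > 0.0 for value in vector]
```

Thresholding one eigenvector gives only a hard two-way split. Obtaining several layers with soft transitions requires further processing of multiple eigenvectors, which this sketch does not attempt.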