Dev, Kapil and Lau, Manfred (2018) Data-driven modelling of perceptual properties of 3D shapes. PhD thesis, Lancaster University.
Thesis_Final.pdf - Published Version
Available under License Creative Commons Attribution-NoDerivs.
Abstract
The recent surge in 3D content generation has produced massive online libraries of 3D visual content that are difficult to search, organise and re-use. We explore crowdsourcing and machine learning techniques to help alleviate these difficulties by focusing on the visual perceptual properties of 3D shapes. We study “style similarity” and “aesthetics” as two fundamental perceptual properties of 3D shapes and build data-driven models of them. We rely on crowdsourcing platforms to collect a large number of human judgements on the style matching and aesthetics of 3D shapes. The judgement data collected directly from humans is used to learn metrics of style matching and aesthetics.
Our style similarity measure can be used to compute the style distance between a pair of input 3D shapes. In contrast to previous work, we incorporate colour and texture in addition to geometric features to build a colour- and texture-aware style similarity metric. We also experiment with learning objective and personalised style metrics for 3D shapes. The application prototypes we build demonstrate the use of style-based search and scene composition. Further, our style distance metric is built iteratively so that it requires less human style judgement data than previous methods.
We study the problem of building a data-driven model of 3D shape aesthetics in two steps. We first design a study to crowdsource human aesthetics judgement data. We then formulate a deep learning based strategy to learn a measure of 3D shape aesthetics from the collected data. The results of the study in the first step helped us choose an appropriate shape representation, namely voxels, as the input to deep neural networks for learning a measure of visual aesthetics. In the same crowdsourcing study, we experiment with polygonal, volumetric, and point-based shape representations to create the shape stimuli, and we collect and compare human shape aesthetics judgements across them. On analysing the collected data, we found that humans can reliably distinguish the more aesthetic shape in a pair even from coarser shape representations such as voxels. This observation implies that detailed shape representations are not needed to compare aesthetics in pairs.
The aesthetic value of a 3D shape has traditionally been explored in terms of specific visual features (or handcrafted features) such as curvature and symmetry. For example, more symmetric and curved shapes are considered more aesthetic than less symmetric and curved shapes. We call such properties pre-existing notions (or rules) of aesthetics. In order to develop a measure of the perceptual aesthetics of 3D shapes that is independent of any pre-existing notions or shape features, we train deep neural networks directly on human aesthetics judgement data. We demonstrate the usefulness of the learned measure by designing applications that rank a collection of shapes by their aesthetics scores and interactively build scenes using shapes with high aesthetics scores.
The overarching goal of this thesis is to demonstrate the use of machine learning and crowdsourcing approaches to build data-driven models of visual perceptual properties of 3D shapes for applications in search, organisation, scene composition, and visualisation of the 3D shape data present in ever-increasing online 3D shape content libraries. We believe that our exploration of perceptual properties of 3D shapes will motivate further research into other important perceptual properties related to our vision system, and will also fuel the development of techniques to automatically enhance such properties of a given 3D shape.