cvpr cvpr2013 cvpr2013-228 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Julien Weissenberg, Hayko Riemenschneider, Mukta Prasad, Luc Van Gool
Abstract: Urban models are key to navigation, architecture and entertainment. Apart from visualizing façades, a number of tedious tasks remain largely manual (e.g. compression, generating new façade designs and structurally comparing façades for classification, retrieval and clustering). We propose a novel procedural modelling method to automatically learn a grammar from a set of façades, generate new façade instances and compare façades. To deal with the difficulty of grammatical inference, we reformulate the problem. Instead of inferring a compromising, one-size-fits-all, single grammar for all tasks, we infer a model whose successive refinements are production rules tailored for each task. We demonstrate our automatic rule inference on datasets of two different architectural styles. Our method supersedes manual expert work and cuts the time required to build a procedural model of a façade from several days to a few milliseconds.
Reference: text
sentIndex sentText sentNum sentScore
1 compression, generating new façade designs and structurally comparing façades for classification, retrieval and clustering). [sent-8, score-0.764]
2 We propose a novel procedural modelling method to automatically learn a grammar from a set of façades, generate new façade instances and compare façades. [sent-9, score-0.592]
3 Instead of inferring a compromising, one-size-fits-all, single grammar for all tasks, we infer a model whose successive refinements are production rules tailored for each task. [sent-11, score-0.432]
4 Our method supersedes manual expert work and cuts the time required to build a procedural model of a façade from several days to a few milliseconds. [sent-13, score-0.353]
5 The introduction of procedural modelling has cut down the required amount of work to synthesize convincing models of cities, which used to take several man-years [17]. [sent-17, score-0.352]
6 The main features of procedural models are that they are compact, editable, readable, semantic and advantageous for retrieval and fast graphics generation [10]. [sent-19, score-0.351]
7 While whole virtual cities can be generated in minutes thanks to procedural modelling, it takes a lot more effort when it comes to constructing a model of an existing city. [sent-20, score-0.38]
8 Existing inverse procedural modelling pipelines for exam- a limited field of uses. [sent-21, score-0.379]
9 In this work we propose methods to automatically build procedural façade models in milliseconds for compression, comparison and new virtual façade creation. [sent-22, score-1.757]
10 However, the possibilities given by a set of labelled façades are limited compared to the set of applications offered by a procedural model (see Fig. [sent-30, score-1.009]
11 In this work, we transform a set of labelled façades into a procedural model, automatically. [sent-32, score-1.009]
12 Additionally, we use the procedural models to generate new façade instances and compare façade layouts. [sent-33, score-1.688]
13 In a procedural model, rules are the backbone of the semantic information. [sent-34, score-0.488]
14 The rules created for inverse procedural modelling should not only be able to generate a set of exemplar buildings in a style, but specifically only the possible buildings from a style. [sent-36, score-0.642]
15 To circumvent these problems, we propose a new formulation where the grammar is directly inferred from the labelling of a façade or a set of façades (see Fig. [sent-40, score-1.652]
16 Our goal is to use the procedural structure of façades for compression, comparison and virtual layout generation by exploiting shape grammars. [sent-46, score-1.109]
17 Next we detail two main approaches that have been pursued to tackle architectural inverse procedural modelling using shape grammars, and then present related work in the field of grammatical inference. [sent-53, score-0.574]
18 The first inverse procedural modelling approaches assume that a grammar is given as input and infer the appropriate parameters to represent a given building [21, 24, 25]. [sent-54, score-0.569]
19 Inverse procedural modelling of L-systems has been tackled by Šťava et al. [sent-58, score-0.352]
20 First, labellings of rectified façade pictures are the input to our method. [sent-67, score-0.726]
21 Those trees are combined and constitute a set of rules that describe the façade style. [sent-69, score-0.92]
22 These inferred rules are then used for comparison, editing, virtual façade synthesis, and rendering. [sent-70, score-1.011]
23 However, to the best of our knowledge, grammatical inference has never been performed for more than one façade at a time and without any architectural style or grid layout restrictions. [sent-74, score-0.994]
24 Approach Our goal is to automatically derive a set of procedural rules to describe a set of façades of the same style, generate new ones and compare them for classification and retrieval. [sent-76, score-1.179]
25 One or several Manhattan-world segmentations of façades are given as input, which can be obtained manually or automatically [3, 8, 13, 20]. [sent-77, score-0.706]
26 Our weakly supervised algorithm for inverse procedural modelling provides as output sets of style rules and, for each façade, a set of parameters. [sent-78, score-1.338]
27 Insert operations insert an asset in the current scope, while split operations divide a shape into n newly created shapes. [sent-89, score-0.323]
28 With shape grammars, a façade is described as a recursive splitting procedure from the root node (corresponding to the axiom S) to the leaf nodes (each a single asset). [sent-92, score-0.768]
29 Hence, a rule set plus a set of parameters generate a single façade. [sent-99, score-0.813]
30 Altering one or the other will result in a different façade. [sent-100, score-0.706]
31 We refer to a grammar instance as a set of shape grammar production rules plus a set of associated parameters. [sent-101, score-0.62]
32 The same façade can be represented by many different sequences of operations, and therefore by many different rule sets. [sent-105, score-0.813]
33 In the case of façade modelling, we consider visualization of an existing city, novel building generation, comparison between façades, reporting and compression. [sent-108, score-1.431]
34 Here, consistency means that two similar façade labellings will be described by two similar rule sets. [sent-111, score-0.833]
35 By online algorithm, we mean that new façade instances can be added iteratively. [sent-112, score-0.706]
36 The analytic model, which can be used to compare buildings, should generalize such that it encompasses all façades from the style. [sent-118, score-0.706]
37 Façade parse tree generation In this section, we describe the parsing algorithm which, starting from a labelling, encodes a façade as a binary split tree whose nodes correspond to façade regions, operations and parameters. [sent-121, score-1.753]
38 The parsing algorithm recursively splits the façade into smaller façade regions until all of them consist of a single asset. [sent-123, score-1.438]
39 This can be seen as a top-down tree clustering of the façade. [sent-124, score-0.74]
40 For each scope s, we start from a set of horizontal and vertical split line proposals P, defined as $P = P_h \cup P_v$ (2), where $P_h$ is the set of horizontal split line proposals and $P_v$ the set of vertical split line proposals. [sent-135, score-0.406]
41 vertical bias, $b(p)$ is a parental bias, $\alpha \in [0, 1]$ is a weight between the edge support term and the affinity term, $W$ is the size of the façade (the height or the width, respectively, depending on the direction of $p_y$), and $e_x = 0$ if a pixel $y$ is an edge, 1 otherwise. [sent-145, score-0.768]
42 The second term sums over the affinity $v_d(x, y)$ between the asset at the edge pixel $(x, y)$ and the asset at the nearest facing edge pixel $(x, \bar{y}^*)$, which is located according to $\bar{y}^* = \arg\min_{\bar{y} \in \bar{P}} \Delta(y, \bar{y})$ (6), where $\bar{P}$ is the set of split proposals which do not belong to the same asset. [sent-150, score-0.393]
43 The co-occurrence $c_d$ between asset pairs, where $d$ is the direction, is computed across all façades and stored in two affinity matrices $C_h$ and $C_v$. [sent-155, score-0.86]
44 $= \frac{1}{\nu} \frac{|y_i - \bar{y}_i^*|}{W}$ (8), where $\nu$ is the total number of pixels belonging to split proposal lines, and the proximity measure is the normalized distance between a pixel belonging to a split proposal $p_y$ and its nearest facing asset edge located along $p_{\bar{y}^*}$. [sent-158, score-0.391]
45 In order to reduce the number of computations, the affinity coefficients are pre-computed for each pixel over the façade in both horizontal and vertical directions. [sent-159, score-0.778]
46 At the end of this parsing step, we obtain a binary tree of split operations that describes the façade. [sent-160, score-0.891]
47 This tree can be represented as a set of binary split rules (see Eq. [sent-161, score-0.314]
48 In the next section we show how to use these parse trees to optimize for compression, retrieval and virtual generation of new façades. [sent-166, score-0.906]
49 Optimization of Shape Grammars Given the general method to construct a parse tree of the façade layouts, it is our goal to optimize the grammar with respect to its production rules for each of the following specific applications, separately. [sent-168, score-1.226]
50 Comparison defines structural features to create a retrieval metric capturing differences in façade layout. [sent-172, score-0.741]
51 Virtual façade synthesis analyses examples of façade types and creates a new set of consistent parameters to instantiate a façade of the same style. [sent-174, score-2.148]
52 Grammatical inference In the previous section, we presented a way to infer a parse tree depicting a given façade. [sent-177, score-0.832]
53 The grammatical inference phase is a succession of steps to group those single operation rules to infer a shorter representation. [sent-181, score-0.334]
54 In addition, rendering performance benefits from using a compact grammar as fewer rules need to be evaluated at render time. [sent-184, score-0.368]
55 Then, more complex production rules are inferred by comparing n-ary split nodes over all parse trees. [sent-187, score-0.462]
56 2 Production rule inference The production rule inference is based on a similar principle as [5, 19]. [sent-196, score-0.354]
57 11, we note that the more similar rules exist, the more efficient the rule inference will be at reducing the total number of rules. [sent-217, score-0.342]
58 Also, the more child rules ψ, the smaller the total number of rules $|R|_t$ with respect to $|R|_{t-1}$. [sent-219, score-0.394]
59 First, only nodes whose children are insert operations are compared and replaced by production rules if possible. [sent-222, score-0.332]
60 Later, nodes whose children are insert or production rules are considered until it is not possible to create any new production rule. [sent-223, score-0.396]
61 Comparing façades Retrieval and clustering are two application examples for comparing façades. [sent-234, score-1.412]
62 More formally, comparing façades means finding an adequate distance function δ as $\delta : R \times R \to \mathbb{R}^+$ (13). As all façades of the same style are similar and share visual features, using a feature-based distance would be inappropriate. [sent-235, score-1.468]
63 The MDL-based approach measures the similarity of façades in terms of their common rules. [sent-238, score-0.706]
64 As in [7], we define that two façades A and B are more similar than A and C if $\frac{|R_A \cup R_B|}{|R_A| + |R_B|} < \frac{|R_A \cup R_C|}{|R_A| + |R_C|}$ (14), where $|R_A|$ is the number of rules used to describe façade A after compression. [sent-239, score-1.43]
65 We create a histogram of the frequency of rules shared by two façades and use a $\chi^2$ distance as an appropriate distance measure. [sent-240, score-0.903]
66 For each façade parse tree, we draw a histogram which ref. [sent-245, score-0.76]
67 Virtual façade synthesis In this section, we show how to create new, non-existent façades from a set of real building façades in the same style. [sent-250, score-2.167]
68 Each façade of the input set is identified by a starting rule and a parameter vector. [sent-254, score-0.831]
69 The starting rules correspond to rules that split the whole façade, generally into floors and balcony layout. [sent-255, score-1.24]
70 Each of these starting rules will then call the hierarchy of rules describing the structure of the façade. [sent-256, score-1.118]
71 To generate a new façade, we instantiate the starting rule with a new set of parameters. [sent-258, score-0.831]
72 Evaluation In this section, we evaluate the compression, façade retrieval and virtual façade generation. [sent-267, score-1.516]
73 First, the ECP2011 façades dataset [24] comprises 104 labelled images taken in rue Monge, a street in Paris. [sent-272, score-0.733]
74 Second, a subset of Graz2012 [20] is selected, which consists of 30 annotated façades in Gruenderzeit style, which is widespread in Germany and Austria. [sent-274, score-0.706]
75 parsing, n-ary split compression, rule inference and data collection for statistics) is linear with respect to the number of input façades. [sent-282, score-0.934]
76 Our implementation of the inference algorithm takes about 32 ms per façade on a single core of an Intel Core i7 930. [sent-284, score-0.744]
77 A new façade can be added to the dataset at a linear cost. [sent-286, score-0.706]
78 Cumulative Match Characteristics (CMC) for ECP2011 for semantic façade retrieval. [sent-289, score-0.721]
79 The growth of the number of inferred rules with respect to the number of input façades is shown in Fig. [sent-308, score-0.942]
80 This shows that the core logic principles of the Haussmannian style can be explained after examining about 20 façades. [sent-312, score-0.809]
81 Façade comparison The goal of façade retrieval is to compare a query façade to the set of known façades and determine the most similar ones. [sent-320, score-1.472]
82 In our scenario we are not comparing appearance or the sizes of architectural elements, but the procedural layout of the façade. [sent-321, score-1.077]
83 It is our goal to group façades which have the same layout in terms of floors, window columns, balconies and door placement. [sent-322, score-0.782]
84 3, we evaluated a façade retrieval and clustering on the datasets1. [sent-325, score-0.741]
85 We manually annotated each façade with its related data2. [sent-331, score-0.706]
86 For retrieval, we evaluate the distance measures δcommon and δpowersets and count if the ground truth façades with no architectural changes (deemed identical) are retrieved in the top-K ranking. [sent-332, score-0.785]
87 Using the powerset method, we see that retrieving the exact instance within k = 1 has a mean expectation accuracy of 87% whereas within k = 2 all the correct façades are retrieved. [sent-343, score-0.724]
88 For clustering, we use the distance measure δpowersets and can show distinct groups between and within each of the façade datasets for Haussmannian and Gruenderzeit styles, as indicated by the dendrogram and the linked heat maps as shown in Fig. [sent-346, score-0.726]
89 Virtual façade synthesis For virtual façade synthesis, the quality of the sampling improves when the number of parameter vector instances 2The ground truth annotation is available on the author’s website. [sent-354, score-1.511]
90 Rows correspond to different façade structure type (i. [sent-356, score-0.706]
91 Hence, the optimization of the grammatical inference for virtual façade synthesis follows the same objective function as for compression (see Eq. [sent-360, score-1.027]
92 The most frequent rules correspond to: 7 floors (including the ground and roof floor), 4 window columns, running balconies on the 2nd and 5th floors and shops on the ground floor. [sent-370, score-0.329]
93 Conclusion In this work we show that procedural models provide a much larger flexibility than pure mesh-based or semantic labelled representations by enabling compression, façade comparison and new virtual façade synthesis. [sent-372, score-1.799]
94 Our method starts with a binary split procedure on a labelled image to create parse trees and consequent procedural rule sets. [sent-373, score-0.564]
95 The final grammar models are optimized on the requirements for compression and virtual synthesis (minimum number of rules inspired by MDL) and retrieval (best ranking performance inspired by bag-of-words models). [sent-374, score-0.587]
96 In all, our method removes the need for manual expert work and cuts time to build a procedural façade model from days to milliseconds. [sent-378, score-1.036]
97 The benefits of our procedural knowledge can be used to highlight the atypical parts in a façade and automatically complete occluded areas. [sent-379, score-0.982]
98 Also the generated grammar rules could be translated to human language to teach architectural principles to humans. [sent-380, score-0.464]
99 We will also investigate improving noisy semantic image labelling methods with the inferred grammar rules and build joint labelling and grammar inference methods. [sent-382, score-0.691]
100 Reconstruction of façade structures using a formal grammar and RjMCMC. [sent-534, score-0.895]
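The recursive parsing described in sentences 37 to 46 above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the labelling is a small grid of asset labels, scores each split line proposal by edge support only (the affinity term and the bias terms are omitted), and treats a region as a single asset when it contains one label. The names `edge_support` and `parse` are hypothetical.

```python
from typing import List, Tuple, Union

Grid = List[List[str]]
# A tree node is either an asset label (leaf) or
# ("split", axis, position, first_child, second_child).
Tree = Union[str, Tuple[str, str, int, "Tree", "Tree"]]


def edge_support(grid: Grid, axis: str, pos: int) -> float:
    """Fraction of cells along a candidate split line whose labels differ across it."""
    if axis == "h":  # horizontal line between rows pos-1 and pos
        pairs = list(zip(grid[pos - 1], grid[pos]))
    else:  # vertical line between columns pos-1 and pos
        pairs = [(row[pos - 1], row[pos]) for row in grid]
    return sum(a != b for a, b in pairs) / len(pairs)


def parse(grid: Grid) -> Tree:
    """Recursively split a labelled region until every region holds one label."""
    labels = {cell for row in grid for cell in row}
    if len(labels) == 1:  # assumption: one label means one asset
        return grid[0][0]
    # P = P_h union P_v: every line between two rows or two columns is a proposal.
    proposals = [("h", p) for p in range(1, len(grid))]
    proposals += [("v", p) for p in range(1, len(grid[0]))]
    # Simplified score: edge support only; ties go to the first proposal listed.
    axis, pos = max(proposals, key=lambda ap: edge_support(grid, ap[0], ap[1]))
    if axis == "h":
        first, second = grid[:pos], grid[pos:]
    else:
        first = [row[:pos] for row in grid]
        second = [row[pos:] for row in grid]
    return ("split", axis, pos, parse(first), parse(second))


if __name__ == "__main__":
    toy = [["roof", "roof"], ["window", "wall"]]
    print(parse(toy))
```

Each split produces two non-empty, strictly smaller regions, so the recursion terminates. The result is a binary tree of split operations, which is the form the source describes before rules are grouped during grammatical inference.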
wordName wordTfidf (topN-words)
[('fac', 0.706), ('ade', 0.353), ('procedural', 0.276), ('ades', 0.268), ('rules', 0.197), ('grammar', 0.171), ('asset', 0.108), ('rule', 0.107), ('grammatical', 0.099), ('compression', 0.085), ('split', 0.083), ('architectural', 0.079), ('modelling', 0.076), ('grammars', 0.075), ('virtual', 0.069), ('production', 0.064), ('style', 0.056), ('haussmannian', 0.054), ('parse', 0.054), ('wonka', 0.05), ('operations', 0.042), ('powersets', 0.041), ('floors', 0.039), ('inferred', 0.039), ('inference', 0.038), ('py', 0.036), ('balconies', 0.036), ('retrieval', 0.035), ('cities', 0.035), ('tree', 0.034), ('buildings', 0.033), ('vd', 0.033), ('insert', 0.031), ('assets', 0.03), ('gruenderzeit', 0.03), ('stiny', 0.03), ('city', 0.03), ('logic', 0.03), ('cmc', 0.03), ('labelling', 0.03), ('synthesis', 0.03), ('haegler', 0.027), ('weissenberg', 0.027), ('inverse', 0.027), ('labelled', 0.027), ('parsing', 0.026), ('architecture', 0.026), ('facade', 0.025), ('nodes', 0.025), ('proposal', 0.025), ('generation', 0.025), ('horizontal', 0.025), ('affinity', 0.025), ('door', 0.024), ('teboul', 0.024), ('proposals', 0.023), ('fa', 0.023), ('mdl', 0.023), ('days', 0.022), ('vertical', 0.022), ('cd', 0.021), ('urban', 0.02), ('architects', 0.02), ('dendrogram', 0.02), ('halatsch', 0.02), ('hayko', 0.02), ('losslessness', 0.02), ('uller', 0.02), ('splitting', 0.02), ('labellings', 0.02), ('building', 0.019), ('ofrules', 0.018), ('cade', 0.018), ('martinovi', 0.018), ('designates', 0.018), ('egsr', 0.018), ('powerset', 0.018), ('shops', 0.018), ('bp', 0.018), ('starting', 0.018), ('formal', 0.018), ('transformation', 0.017), ('principles', 0.017), ('shape', 0.017), ('trees', 0.017), ('scope', 0.017), ('cga', 0.017), ('ava', 0.017), ('koutsourakis', 0.017), ('riemenschneider', 0.017), ('manual', 0.017), ('layout', 0.016), ('bokeloh', 0.016), ('charikar', 0.016), ('schmitt', 0.016), ('facing', 0.016), ('edge', 0.015), ('children', 0.015), ('expert', 0.015), ('semantic', 0.015)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 228 cvpr-2013-Is There a Procedural Logic to Architecture?
Author: Julien Weissenberg, Hayko Riemenschneider, Mukta Prasad, Luc Van Gool
Abstract: Urban models are key to navigation, architecture and entertainment. Apart from visualizing façades, a number of tedious tasks remain largely manual (e.g. compression, generating new façade designs and structurally comparing façades for classification, retrieval and clustering). We propose a novel procedural modelling method to automatically learn a grammar from a set of façades, generate new façade instances and compare façades. To deal with the difficulty of grammatical inference, we reformulate the problem. Instead of inferring a compromising, one-size-fits-all, single grammar for all tasks, we infer a model whose successive refinements are production rules tailored for each task. We demonstrate our automatic rule inference on datasets of two different architectural styles. Our method supersedes manual expert work and cuts the time required to build a procedural model of a façade from several days to a few milliseconds.
2 0.30110765 57 cvpr-2013-Bayesian Grammar Learning for Inverse Procedural Modeling
Author: Andelo Martinovic, Luc Van Gool
Abstract: Within the fields of urban reconstruction and city modeling, shape grammars have emerged as a powerful tool for both synthesizing novel designs and reconstructing buildings. Traditionally, a human expert was required to write grammars for specific building styles, which limited the scope of method applicability. We present an approach to automatically learn two-dimensional attributed stochastic context-free grammars (2D-ASCFGs) from a set of labeled building facades. To this end, we use Bayesian Model Merging, a technique originally developed in the field of natural language processing, which we extend to the domain of two-dimensional languages. Given a set of labeled positive examples, we induce a grammar which can be sampled to create novel instances of the same building style. In addition, we demonstrate that our learned grammar can be used for parsing existing facade imagery. Experiments conducted on the dataset of Haussmannian buildings in Paris show that our parsing with learned grammars not only outperforms bottom-up classifiers but is also on par with approaches that use a manually designed style grammar.
3 0.11889122 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation
Author: Brandon Rothrock, Seyoung Park, Song-Chun Zhu
Abstract: In this paper we present a compositional and-or graph grammar model for human pose estimation. Our model has three distinguishing features: (i) large appearance differences between people are handled compositionally by allowing parts or collections of parts to be substituted with alternative variants, (ii) each variant is a sub-model that can define its own articulated geometry and context-sensitive compatibility with neighboring part variants, and (iii) background region segmentation is incorporated into the part appearance models to better estimate the contrast of a part region from its surroundings, and improve resilience to background clutter. The resulting integrated framework is trained discriminatively in a max-margin framework using an efficient and exact inference algorithm. We present experimental evaluation of our model on two popular datasets, and show performance improvements over the state-of-art on both benchmarks.
4 0.051601619 98 cvpr-2013-Cross-View Action Recognition via a Continuous Virtual Path
Author: Zhong Zhang, Chunheng Wang, Baihua Xiao, Wen Zhou, Shuang Liu, Cunzhao Shi
Abstract: In this paper, we propose a novel method for cross-view action recognition via a continuous virtual path which connects the source view and the target view. Each point on this virtual path is a virtual view which is obtained by a linear transformation of the action descriptor. All the virtual views are concatenated into an infinite-dimensional feature to characterize continuous changes from the source to the target view. However, these infinite-dimensional features cannot be used directly. Thus, we propose a virtual view kernel to compute the value of similarity between two infinite-dimensional features, which can be readily used to construct any kernelized classifiers. In addition, there are a lot of unlabeled samples from the target view, which can be utilized to improve the performance of classifiers. Thus, we present a constraint strategy to explore the information contained in the unlabeled samples. The rationality behind the constraint is that any action video belongs to only one class. Our method is verified on the IXMAS dataset, and the experimental results demonstrate that our method achieves better performance than the state-of-the-art methods.
5 0.048516639 381 cvpr-2013-Scene Parsing by Integrating Function, Geometry and Appearance Models
Author: Yibiao Zhao, Song-Chun Zhu
Abstract: Indoor functional objects exhibit large view and appearance variations, thus are difficult to be recognized by the traditional appearance-based classification paradigm. In this paper, we present an algorithm to parse indoor images based on two observations: i) The functionality is the most essential property to define an indoor object, e.g. “a chair to sit on”; ii) The geometry (3D shape) of an object is designed to serve its function. We formulate the nature of the object function into a stochastic grammar model. This model characterizes a joint distribution over the function-geometry-appearance (FGA) hierarchy. The hierarchical structure includes a scene category, functional groups, functional objects, functional parts and 3D geometric shapes. We use a simulated annealing MCMC algorithm to find the maximum a posteriori (MAP) solution, i.e. a parse tree. We design four data-driven steps to accelerate the search in the FGA space: i) group the line segments into 3D primitive shapes, ii) assign functional labels to these 3D primitive shapes, iii) fill in missing objects/parts according to the functional labels, and iv) synthesize 2D segmentation maps and verify the current parse tree by the Metropolis-Hastings acceptance probability. The experimental results on several challenging indoor datasets demonstrate the proposed approach not only significantly widens the scope of indoor scene parsing algorithm from the segmentation and the 3D recovery to the functional object recognition, but also yields improved overall performance.
6 0.046658095 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context
7 0.043200154 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
8 0.042389885 456 cvpr-2013-Visual Place Recognition with Repetitive Structures
9 0.041607469 461 cvpr-2013-Weakly Supervised Learning for Attribute Localization in Outdoor Scenes
10 0.039639916 340 cvpr-2013-Probabilistic Label Trees for Efficient Large Scale Image Classification
11 0.038470175 207 cvpr-2013-Human Pose Estimation Using a Joint Pixel-wise and Part-wise Formulation
12 0.037871338 166 cvpr-2013-Fast Image Super-Resolution Based on In-Place Example Regression
13 0.036701828 284 cvpr-2013-Mesh Based Semantic Modelling for Indoor and Outdoor Scenes
14 0.036403999 260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition
15 0.03341794 60 cvpr-2013-Beyond Physical Connections: Tree Models in Human Pose Estimation
16 0.032026011 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories
17 0.031347599 81 cvpr-2013-City-Scale Change Detection in Cadastral 3D Models Using Images
18 0.030241748 455 cvpr-2013-Video Object Segmentation through Spatially Accurate and Temporally Dense Extraction of Primary Object Regions
19 0.029018307 351 cvpr-2013-Recovering Line-Networks in Images by Junction-Point Processes
20 0.028975939 22 cvpr-2013-A Non-parametric Framework for Document Bleed-through Removal
topicId topicWeight
[(0, 0.075), (1, 0.001), (2, 0.007), (3, -0.009), (4, 0.036), (5, 0.005), (6, 0.001), (7, 0.024), (8, -0.014), (9, -0.016), (10, -0.004), (11, 0.038), (12, -0.012), (13, 0.002), (14, -0.015), (15, -0.011), (16, 0.024), (17, 0.017), (18, 0.006), (19, -0.056), (20, 0.004), (21, 0.036), (22, 0.009), (23, -0.005), (24, -0.001), (25, 0.033), (26, 0.084), (27, -0.047), (28, -0.046), (29, 0.115), (30, -0.03), (31, 0.026), (32, 0.041), (33, 0.017), (34, -0.108), (35, -0.082), (36, -0.074), (37, -0.026), (38, -0.042), (39, -0.125), (40, 0.055), (41, 0.063), (42, -0.037), (43, -0.166), (44, -0.046), (45, 0.007), (46, 0.033), (47, 0.081), (48, 0.116), (49, 0.087)]
simIndex simValue paperId paperTitle
same-paper 1 0.9400537 228 cvpr-2013-Is There a Procedural Logic to Architecture?
Author: Julien Weissenberg, Hayko Riemenschneider, Mukta Prasad, Luc Van Gool
Abstract: Urban models are key to navigation, architecture and entertainment. Apart from visualizing façades, a number of tedious tasks remain largely manual (e.g. compression, generating new façade designs and structurally comparing façades for classification, retrieval and clustering). We propose a novel procedural modelling method to automatically learn a grammar from a set of façades, generate new façade instances and compare façades. To deal with the difficulty of grammatical inference, we reformulate the problem. Instead of inferring a compromising, one-size-fits-all, single grammar for all tasks, we infer a model whose successive refinements are production rules tailored for each task. We demonstrate our automatic rule inference on datasets of two different architectural styles. Our method supersedes manual expert work and cuts the time required to build a procedural model of a façade from several days to a few milliseconds.
2 0.92393571 57 cvpr-2013-Bayesian Grammar Learning for Inverse Procedural Modeling
Author: Andelo Martinovic, Luc Van Gool
Abstract: Within the fields of urban reconstruction and city modeling, shape grammars have emerged as a powerful tool for both synthesizing novel designs and reconstructing buildings. Traditionally, a human expert was required to write grammars for specific building styles, which limited the scope of method applicability. We present an approach to automatically learn two-dimensional attributed stochastic context-free grammars (2D-ASCFGs) from a set of labeled building facades. To this end, we use Bayesian Model Merging, a technique originally developed in the field of natural language processing, which we extend to the domain of two-dimensional languages. Given a set of labeled positive examples, we induce a grammar which can be sampled to create novel instances of the same building style. In addition, we demonstrate that our learned grammar can be used for parsing existing facade imagery. Experiments conducted on the dataset of Haussmannian buildings in Paris show that our parsing with learned grammars not only outperforms bottom-up classifiers but is also on par with approaches that use a manually designed style grammar.
3 0.61841691 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation
Author: Brandon Rothrock, Seyoung Park, Song-Chun Zhu
Abstract: In this paper we present a compositional and-or graph grammar model for human pose estimation. Our model has three distinguishing features: (i) large appearance differences between people are handled compositionally by allowing parts or collections of parts to be substituted with alternative variants, (ii) each variant is a sub-model that can define its own articulated geometry and context-sensitive compatibility with neighboring part variants, and (iii) background region segmentation is incorporated into the part appearance models to better estimate the contrast of a part region from its surroundings, and improve resilience to background clutter. The resulting integrated framework is trained discriminatively in a max-margin framework using an efficient and exact inference algorithm. We present experimental evaluation of our model on two popular datasets, and show performance improvements over the state-of-art on both benchmarks.
4 0.55255198 381 cvpr-2013-Scene Parsing by Integrating Function, Geometry and Appearance Models
Author: Yibiao Zhao, Song-Chun Zhu
Abstract: Indoor functional objects exhibit large view and appearance variations, and are thus difficult to recognize with the traditional appearance-based classification paradigm. In this paper, we present an algorithm to parse indoor images based on two observations: i) The functionality is the most essential property to define an indoor object, e.g. “a chair to sit on”; ii) The geometry (3D shape) of an object is designed to serve its function. We formulate the nature of the object function into a stochastic grammar model. This model characterizes a joint distribution over the function-geometry-appearance (FGA) hierarchy. The hierarchical structure includes a scene category, functional groups, functional objects, functional parts and 3D geometric shapes. We use a simulated annealing MCMC algorithm to find the maximum a posteriori (MAP) solution, i.e. a parse tree. We design four data-driven steps to accelerate the search in the FGA space: i) group the line segments into 3D primitive shapes, ii) assign functional labels to these 3D primitive shapes, iii) fill in missing objects/parts according to the functional labels, and iv) synthesize 2D segmentation maps and verify the current parse tree by the Metropolis-Hastings acceptance probability. The experimental results on several challenging indoor datasets demonstrate that the proposed approach not only significantly widens the scope of indoor scene parsing algorithms from segmentation and 3D recovery to functional object recognition, but also yields improved overall performance.
5 0.4513773 136 cvpr-2013-Discriminatively Trained And-Or Tree Models for Object Detection
Author: Xi Song, Tianfu Wu, Yunde Jia, Song-Chun Zhu
Abstract: This paper presents a method of learning reconfigurable And-Or Tree (AOT) models discriminatively from weakly annotated data for object detection. To explore the appearance and geometry space of latent structures effectively, we first quantize the image lattice using an overcomplete set of shape primitives, and then organize them into a directed acyclic And-Or Graph (AOG) by exploiting their compositional relations. We allow overlaps between child nodes when combining them into a parent node, which is equivalent to introducing an appearance Or-node implicitly for the overlapped portion. The learning of an AOT model consists of three components: (i) Unsupervised sub-category learning (i.e., branches of an object Or-node) with the latent structures in AOG being integrated out. (ii) Weakly supervised part configuration learning (i.e., seeking the globally optimal parse trees in AOG for each sub-category). To search the globally optimal parse tree in AOG efficiently, we propose a dynamic programming (DP) algorithm. (iii) Joint appearance and structural parameters training under latent structural SVM framework. In experiments, our method is tested on PASCAL VOC 2007 and 2010 detection benchmarks of 20 object classes and outperforms comparable state-of-the-art methods.
6 0.34995207 461 cvpr-2013-Weakly Supervised Learning for Attribute Localization in Outdoor Scenes
7 0.33353111 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
8 0.32354805 55 cvpr-2013-Background Modeling Based on Bidirectional Analysis
9 0.31353936 274 cvpr-2013-Lost! Leveraging the Crowd for Probabilistic Visual Self-Localization
10 0.30403277 340 cvpr-2013-Probabilistic Label Trees for Efficient Large Scale Image Classification
11 0.28794393 22 cvpr-2013-A Non-parametric Framework for Document Bleed-through Removal
12 0.28494179 173 cvpr-2013-Finding Things: Image Parsing with Regions and Per-Exemplar Detectors
13 0.28233439 406 cvpr-2013-Spatial Inference Machines
14 0.27371445 39 cvpr-2013-Alternating Decision Forests
15 0.27181068 26 cvpr-2013-A Statistical Model for Recreational Trails in Aerial Images
16 0.2578406 190 cvpr-2013-Graph-Based Optimization with Tubularity Markov Tree for 3D Vessel Segmentation
17 0.25718835 20 cvpr-2013-A New Model and Simple Algorithms for Multi-label Mumford-Shah Problems
18 0.24718022 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context
19 0.23651065 275 cvpr-2013-Lp-Norm IDF for Large Scale Image Search
20 0.23532493 320 cvpr-2013-Optimizing 1-Nearest Prototype Classifiers
topicId topicWeight
[(10, 0.065), (16, 0.019), (26, 0.039), (28, 0.019), (33, 0.186), (45, 0.079), (59, 0.015), (67, 0.043), (69, 0.056), (80, 0.018), (87, 0.08), (96, 0.251)]
simIndex simValue paperId paperTitle
Author: Stefan Harmeling, Michael Hirsch, Bernhard Schölkopf
Abstract: We establish a link between Fourier optics and a recent construction from the machine learning community termed the kernel mean map. Using the Fraunhofer approximation, it identifies the kernel with the squared Fourier transform of the aperture. This allows us to use results about the invertibility of the kernel mean map to provide a statement about the invertibility of Fraunhofer diffraction, showing that imaging processes with arbitrarily small apertures can in principle be invertible, i.e., do not lose information, provided the objects to be imaged satisfy a generic condition. A real world experiment shows that we can super-resolve beyond the Rayleigh limit.
same-paper 2 0.7868191 228 cvpr-2013-Is There a Procedural Logic to Architecture?
Author: Julien Weissenberg, Hayko Riemenschneider, Mukta Prasad, Luc Van_Gool
Abstract: Urban models are key to navigation, architecture and entertainment. Apart from visualizing façades, a number of tedious tasks remain largely manual (e.g. compression, generating new façade designs and structurally comparing façades for classification, retrieval and clustering). We propose a novel procedural modelling method to automatically learn a grammar from a set of façades, generate new façade instances and compare façades. To deal with the difficulty of grammatical inference, we reformulate the problem. Instead of inferring a compromising, one-size-fits-all, single grammar for all tasks, we infer a model whose successive refinements are production rules tailored for each task. We demonstrate our automatic rule inference on datasets of two different architectural styles. Our method supersedes manual expert work and cuts the time required to build a procedural model of a façade from several days to a few milliseconds.
3 0.73091531 218 cvpr-2013-Improving the Visual Comprehension of Point Sets
Author: Sagi Katz, Ayellet Tal
Abstract: Point sets are the standard output of many 3D scanning systems and depth cameras. Presenting the set of points as is might “hide” the prominent features of the object from which the points are sampled. Our goal is to reduce the number of points in a point set, for improving the visual comprehension from a given viewpoint. This is done by controlling the density of the reduced point set, so as to create bright regions (low density) and dark regions (high density), producing an effect of shading. This data reduction is achieved by leveraging a limitation of a solution to the classical problem of determining visibility from a viewpoint. In addition, we introduce a new dual problem, for determining visibility of a point from infinity, and show how a limitation of its solution can be leveraged in a similar way.
4 0.72686094 431 cvpr-2013-The Variational Structure of Disparity and Regularization of 4D Light Fields
Author: Bastian Goldluecke, Sven Wanner
Abstract: Unlike traditional images which do not offer information for different directions of incident light, a light field is defined on ray space, and implicitly encodes scene geometry data in a rich structure which becomes visible on its epipolar plane images. In this work, we analyze regularization of light fields in variational frameworks and show that their variational structure is induced by disparity, which is in this context best understood as a vector field on epipolar plane image space. We derive differential constraints on this vector field to enable consistent disparity map regularization. Furthermore, we show how the disparity field is related to the regularization of more general vector-valued functions on the 4D ray space of the light field. This way, we derive an efficient variational framework with convex priors, which can serve as a fundament for a large class of inverse problems on ray space.
5 0.68870884 239 cvpr-2013-Kernel Null Space Methods for Novelty Detection
Author: Paul Bodesheim, Alexander Freytag, Erik Rodner, Michael Kemmler, Joachim Denzler
Abstract: Detecting samples from previously unknown classes is a crucial task in object recognition, especially when dealing with real-world applications where the closed-world assumption does not hold. We present how to apply a null space method for novelty detection, which maps all training samples of one class to a single point. Besides the possibility of modeling a single class, we are able to treat multiple known classes jointly and to detect novelties for a set of classes with a single model. In contrast to modeling the support of each known class individually, our approach makes use of a projection in a joint subspace where training samples of all known classes have zero intra-class variance. This subspace is called the null space of the training data. To decide about novelty of a test sample, our null space approach allows for solely relying on a distance measure instead of performing density estimation directly. Therefore, we derive a simple yet powerful method for multi-class novelty detection, an important problem not studied sufficiently so far. Our novelty detection approach is assessed in comprehensive multi-class experiments using the publicly available datasets Caltech-256 and ImageNet. The analysis reveals that our null space approach is perfectly suited for multi-class novelty detection since it outperforms all other methods.
7 0.66624206 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields
8 0.65791708 57 cvpr-2013-Bayesian Grammar Learning for Inverse Procedural Modeling
9 0.6570155 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People
10 0.644274 293 cvpr-2013-Multi-attribute Queries: To Merge or Not to Merge?
11 0.64283544 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
12 0.63946301 12 cvpr-2013-A Global Approach for the Detection of Vanishing Points and Mutually Orthogonal Vanishing Directions
13 0.63931078 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
14 0.63793057 448 cvpr-2013-Universality of the Local Marginal Polytope
15 0.63748407 298 cvpr-2013-Multi-scale Curve Detection on Surfaces
16 0.63630092 181 cvpr-2013-Fusing Depth from Defocus and Stereo with Coded Apertures
17 0.63600004 155 cvpr-2013-Exploiting the Power of Stereo Confidences
18 0.63547391 71 cvpr-2013-Boundary Cues for 3D Object Shape Recovery
19 0.63473547 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
20 0.63408953 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds