jmlr jmlr2012 jmlr2012-3 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Benjamin I.P. Rubinstein, J. Hyam Rubinstein
Abstract: The Sample Compression Conjecture of Littlestone & Warmuth has remained unsolved for a quarter century. While maximum classes (concept classes meeting Sauer’s Lemma with equality) can be compressed, the compression of general concept classes reduces to compressing maximal classes (classes that cannot be expanded without increasing VC dimension). Two promising ways forward are: embedding maximal classes into maximum classes with at most a polynomial increase to VC dimension, and compression via operating on geometric representations. This paper presents positive results on the latter approach and a first negative result on the former, through a systematic investigation of finite maximum classes. Simple arrangements of hyperplanes in hyperbolic space are shown to represent maximum classes, generalizing the corresponding Euclidean result. We show that sweeping a generic hyperplane across such arrangements forms an unlabeled compression scheme of size VC dimension and corresponds to a special case of peeling the one-inclusion graph, resolving a recent conjecture of Kuzmin & Warmuth. A bijection between finite maximum classes and certain arrangements of piecewise-linear (PL) hyperplanes in either a ball or Euclidean space is established. Finally we show that d-maximum classes corresponding to PL-hyperplane arrangements in Rd have cubical complexes homeomorphic to a d-ball, or equivalently complexes that are manifolds with boundary. A main result is that PL arrangements can be swept by a moving hyperplane to unlabeled d-compress any finite maximum class, forming a peeling scheme as conjectured by Kuzmin & Warmuth. A corollary is that some d-maximal classes cannot be embedded into any maximum class of VC-dimension d + k, for any constant k. The construction of the PL sweeping involves Pachner moves on the one-inclusion graph, corresponding to moves of a hyperplane across the intersection of d other hyperplanes. 
This extends the well-known Pachner moves for triangulations to cubical complexes.
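The abstract's central objects can be made concrete with a small sketch: a brute-force VC-dimension computation and a check against Sauer's bound. The class `C` below is a made-up illustrative example (all concepts with at most one 1), not one taken from the paper.

```python
from itertools import combinations
from math import comb

def shatters(C, coords):
    """C shatters coords if it realizes all 2^|coords| bit patterns there."""
    return len({tuple(c[i] for i in coords) for c in C}) == 2 ** len(coords)

def vc_dimension(C, n):
    """Brute-force VC dimension of a concept class C subset of {0,1}^n."""
    d = 0
    for k in range(1, n + 1):
        if any(shatters(C, S) for S in combinations(range(n), k)):
            d = k
    return d

def sauer_bound(n, d):
    """Sauer's Lemma bound: sum_{i<=d} C(n, i)."""
    return sum(comb(n, i) for i in range(d + 1))

# Illustrative class: all concepts in {0,1}^3 with at most one 1.
C = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
d = vc_dimension(C, 3)
print(d, len(C), sauer_bound(3, d))  # 1 4 4 -> meets Sauer's Lemma with equality
```

Since `len(C)` equals the Sauer bound for its VC dimension, this class is maximum in the sense used throughout the paper.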
Reference: text
sentIndex sentText sentNum sentScore
1 Simple arrangements of hyperplanes in hyperbolic space are shown to represent maximum classes, generalizing the corresponding Euclidean result. [sent-14, score-0.752]
2 We show that sweeping a generic hyperplane across such arrangements forms an unlabeled compression scheme of size VC dimension and corresponds to a special case of peeling the one-inclusion graph, resolving a recent conjecture of Kuzmin & Warmuth. [sent-15, score-1.204]
3 A bijection between finite maximum classes and certain arrangements of piecewise-linear (PL) hyperplanes in either a ball or Euclidean space is established. [sent-16, score-0.663]
4 Finally we show that d-maximum classes corresponding to PL-hyperplane arrangements in Rd have cubical complexes homeomorphic to a d-ball, or equivalently complexes that are manifolds with boundary. [sent-17, score-0.941]
5 A main result is that PL arrangements can be swept by a moving hyperplane to unlabeled d-compress any finite maximum class, forming a peeling scheme as conjectured by Kuzmin & Warmuth. [sent-18, score-0.868]
6 In this paper we approach peeling from the direction of simple hyperplane arrangement representations of maximum classes. [sent-46, score-0.873]
7 Kuzmin and Warmuth (2007, Conjecture 1) predicted that d-maximum classes corresponding to simple linear-hyperplane arrangements could be unlabeled d-compressed by sweeping a generic hyperplane across the arrangement, and that concepts are min peeled as their corresponding cell is swept away. [sent-47, score-1.048]
8 We positively resolve the first part of the conjecture and show that sweeping such arrangements corresponds to a new form of corner peeling, which we prove is distinct from min peeling. [sent-48, score-0.713]
9 While min peeling removes minimum degree concepts from a one-inclusion graph, corner peeling peels vertices that are contained in unique cubes of maximum dimension. [sent-49, score-0.904]
10 We explore simple hyperplane arrangements in hyperbolic geometry, which we show correspond to a set of maximum classes, properly containing those represented by simple linear Euclidean arrangements. [sent-50, score-0.693]
11 Resolving the main problem left open in the preliminary version of this paper (Rubinstein and Rubinstein, 2008), we show that sweeping of d-contractible PL arrangements does compress all finite maximum classes by corner peeling, completing (Kuzmin and Warmuth, 2007, Conjecture 1). [sent-57, score-0.795]
12 We show that a one-inclusion graph Γ can be represented by a d-contractible PL-hyperplane arrangement if and only if Γ is a strongly contractible cubical complex. [sent-58, score-0.882]
13 Figure 1: (a) An example linear-hyperplane arrangement P and (b) the result of a Pachner move of hyperplane P4 on P. [sent-122, score-0.641]
14 In Figure 1 a change in a hyperplane arrangement is shown, which corresponds to a Pachner move on the corresponding one-inclusion graph (considered as a cubical complex). [sent-129, score-0.929]
15 Finally the cells of a simple linear arrangement of n hyperplanes in Rd form a VC-d maximum class in the n-cube (Edelsbrunner, 1987), but not all finite maximum classes correspond to such Euclidean arrangements (Floyd, 1989). [sent-177, score-1.21]
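The cell count behind this correspondence is classical: a simple arrangement of n hyperplanes in R^d has exactly sum_{i<=d} C(n, i) cells, which is the Sauer bound for VC dimension d. A minimal sketch of that count:

```python
from math import comb

def num_cells(n, d):
    """Cells of a simple arrangement of n hyperplanes in R^d.

    This classical count equals the Sauer bound for VC dimension d,
    which is why the cells form a d-maximum class in the n-cube.
    """
    return sum(comb(n, i) for i in range(d + 1))

print(num_cells(5, 2))  # 16: five generic lines cut the plane into 16 cells
```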
16 The authors also conjectured that sweeping a hyperplane in general position across a simple linear arrangement forms a compression scheme that corresponds to min peeling the associated maximum class (Kuzmin and Warmuth, 2007, Conjecture 1). [sent-191, score-1.288]
17 Beginning with a concept class C0 = C ⊆ {0, 1}n, min peeling operates by iteratively removing a vertex vt of minimum degree in G(Ct) to produce the peeled class Ct+1 = Ct \ {vt}. [sent-259, score-0.645]
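The min-peeling procedure just described, and the corner test it is contrasted with (a corner vertex lies in a unique cube of maximum dimension), can be sketched directly on a one-inclusion graph. The tiny 1-maximum class below is illustrative, not from the paper, and the lexicographic tie-break is an arbitrary choice.

```python
from itertools import combinations, product

def degree(C, v):
    """Degree of v in the one-inclusion graph G(C): neighbours differ in one bit."""
    return sum(1 for u in C if sum(a != b for a, b in zip(u, v)) == 1)

def min_peel(C):
    """Min peeling: repeatedly remove a minimum-degree vertex (lexicographic tie-break)."""
    live, order = set(C), []
    while live:
        v = min(live, key=lambda v: (degree(live, v), v))
        order.append(v)
        live.remove(v)
    return order

def in_cube(C, v, S):
    """Does C contain the full cube through v whose free coordinates are S?"""
    for bits in product((0, 1), repeat=len(S)):
        u = list(v)
        for i, b in zip(S, bits):
            u[i] = b
        if tuple(u) not in C:
            return False
    return True

def is_corner(C, v, d):
    """Corner vertex: v lies in exactly one cube of the maximum dimension d."""
    Cset = set(C)
    return sum(1 for S in combinations(range(len(v)), d) if in_cube(Cset, v, S)) == 1

# A 1-maximum class in the 2-cube (illustrative).
C = [(0, 0), (1, 0), (0, 1)]
print(min_peel(C))                                        # [(0, 1), (0, 0), (1, 0)]
print(is_corner(C, (1, 0), 1), is_corner(C, (0, 0), 1))   # True False
```

Note the difference the paper emphasizes: (0, 0) has maximum degree and is peeled late by min peeling, and it is not a corner since it lies in two 1-cubes.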
18 Euclidean Arrangements Definition 20 A linear arrangement is a collection of n ≥ d oriented hyperplanes in Rd . [sent-340, score-0.66]
19 Each region or cell in the complement of the arrangement is naturally associated with a concept in {0, 1}n ; the side of the ith hyperplane on which a cell falls determines the concept’s ith component. [sent-341, score-0.789]
20 A simple arrangement is a linear arrangement in which any subset of d planes has a unique point in common and all subsets of d + 1 planes have an empty mutual intersection. [sent-342, score-1.07]
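The cell-to-concept correspondence can be sketched by sampling sign vectors: each sampled point reports, per hyperplane, which side it falls on, and the distinct sign vectors are the cells. The three lines below are an arbitrary generic choice, not taken from the paper.

```python
import random

def concept_of(point, hyperplanes):
    """Sign vector of a point: the i-th bit records which side of the
    i-th oriented hyperplane (given as (normal, offset)) the point lies on."""
    return tuple(1 if sum(n * x for n, x in zip(normal, point)) > offset else 0
                 for normal, offset in hyperplanes)

# Three generic lines in R^2: x = 0, y = 0, x + y = 0.5.
lines = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 1.0), 0.5)]

random.seed(0)
sign_vectors = {concept_of((random.uniform(-3, 3), random.uniform(-3, 3)), lines)
                for _ in range(20000)}
print(len(sign_vectors))  # expect 7 = C(3,0) + C(3,1) + C(3,2) cells
```

The seven sign vectors form a 2-maximum class in {0, 1}^3, in line with the Edelsbrunner result cited above.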
21 • Projection on [n]\{i} corresponds to removing the ith plane; • The reduction Ci is the new arrangement given by the intersection of C’s arrangement with the ith plane; and • The corresponding lifted reduction is the collection of cells in the arrangement that adjoin the ith plane. [sent-346, score-1.552]
22 Corollary 22 Let A be a simple linear arrangement of n hyperplanes in Rd with corresponding d-maximum C ⊆ {0, 1}n . [sent-357, score-0.66]
23 Proof The intersection of A with a generic hyperplane is again a simple arrangement of n hyperplanes but now in Rd−1 . [sent-360, score-0.911]
24 Corollary 23 Let A be a simple linear arrangement of n hyperplanes in Rd and let C ⊆ {0, 1}n be the corresponding d-maximum class. [sent-379, score-0.66]
25 By induction, on each hyperplane, the induced arrangement has a Voronoi cell decomposition which is a (d − 1)-cubical complex with edges and vertices matching the one-inclusion graph for the tail of C corresponding to the label associated with the hyperplane. [sent-383, score-0.769]
26 As Kuzmin and Warmuth (2007) did previously, consider a generic hyperplane h sweeping across a simple linear arrangement A. [sent-406, score-0.817]
27 At any step in the process, the result of peeling j vertices from C to reach Cj is captured by the arrangement H+ ∩ A for the appropriate h. [sent-409, score-0.798]
28 Figures 8 and 5(a) display a hyperplane arrangement in Euclidean space and its Voronoi cell decomposition, corresponding to this maximum class. [sent-411, score-0.704]
29 In this case, sweeping the vertical dashed line across the arrangement corresponds to a partial corner peeling of the concept class with peeling sequence v7, v8, v5, v9, v2, v0. [sent-412, score-1.374]
30 Theorem 24 Any d-maximum class C ⊆ {0, 1}n corresponding to a simple linear arrangement A can be corner peeled by sweeping A, and this process is a valid unlabeled compression scheme for C of size d. [sent-415, score-1.241]
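Validity of an unlabeled compression scheme of size d can be checked via the non-clashing condition of Kuzmin & Warmuth: representatives have at most d indices, and any two distinct concepts differ somewhere on the union of their representative sets. A minimal sketch, with a hand-chosen illustrative class and representative map (not the ones produced by the sweeping construction):

```python
from itertools import combinations

def non_clashing(C, rep, d):
    """Check the non-clashing condition for an unlabeled compression scheme:
    every representative has size <= d, and any two distinct concepts differ
    somewhere on the union of their representative index sets."""
    if any(len(rep[c]) > d for c in C):
        return False
    for c1, c2 in combinations(C, 2):
        idx = set(rep[c1]) | set(rep[c2])
        if all(c1[i] == c2[i] for i in idx):
            return False  # c1 and c2 clash: indistinguishable on their representatives
    return True

# A 1-maximum class in {0,1}^3 with hand-chosen representatives (illustrative).
C = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
rep = {(0, 0, 0): (), (1, 0, 0): (0,), (0, 1, 0): (1,), (0, 0, 1): (2,)}
print(non_clashing(C, rep, d=1))  # True
```

Replacing, say, the representative of (0, 0, 1) by the empty set makes it clash with (0, 0, 0), and the check fails.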
31 It then follows that sweeping a generic hyperplane h across A corresponds to corner peeling C to a (d − 1)-maximum sub-class C′ ⊆ ∂C by Corollary 22. [sent-417, score-0.789]
32 Moreover C′ corresponds to a simple linear arrangement of n hyperplanes in Rd−1 . [sent-418, score-0.66]
33 The d planes defining this point intersect h in a simple arrangement of hyperplanes on h. [sent-421, score-0.74]
34 We claim that the cell c of the arrangement A whose intersection with h is ∆ is a corner vertex vj of Cj−1. [sent-423, score-0.873]
35 Thus, there are no edges in Cj−1 starting at the vertex corresponding to pj, except for those in the cube C′j−1, which consists of all cells adjacent to pj in the arrangement A. [sent-427, score-0.74]
36 While corner peeling and min peeling share some properties in common, they are distinct procedures. [sent-436, score-0.66]
37 Corollary 26 There is no constant c so that all maximal classes of VC-dimension d can be embedded into maximum classes corresponding to simple hyperplane arrangements of dimension d + c. [sent-450, score-0.747]
38 Hyperbolic Arrangements To motivate the introduction of hyperbolic arrangements, note that linear-hyperplane arrangements can be efficiently described, since each hyperplane is determined by its unit normal and distance from the origin. [sent-452, score-0.654]
39 4 However the family of hyperbolic hyperplanes has more flexibility than linear hyperplanes since there are many disjoint hyperbolic hyperplanes, whereas in the linear case only parallel hyperplanes do not meet. [sent-455, score-1.045]
40 Thus we turn to hyperbolic arrangements to represent a larger collection of concept classes than those represented by simple linear arrangements. [sent-456, score-0.652]
41 We can now see immediately that a simple hyperplane arrangement in Hk can be described by taking a simple hyperplane arrangement in Rk and intersecting it with the unit ball. [sent-464, score-1.202]
42 Definition 27 A simple hyperbolic d-arrangement is a collection of n hyperplanes in Hk with the property that every sub-collection of d hyperplanes mutually intersects in a (k − d)-dimensional hyperbolic plane, and every sub-collection of d + 1 hyperplanes has empty intersection. [sent-468, score-1.069]
43 Note that the new maximum classes are embedded in maximum classes induced by arrangements of linear hyperplanes in Euclidean space. [sent-476, score-0.793]
44 Note first that intersections of the hyperplanes of the arrangement with the moving hyperplane appear precisely when there is a first intersection at the ideal boundary. [sent-482, score-0.944]
45 Note also that new intersections of the sweeping hyperplane with the various lower dimensional planes of intersection between the hyperplanes appear similarly at the ideal boundary. [sent-484, score-0.744]
46 We next observe that sweeping by generic hyperbolic hyperplanes induces corner peeling of the corresponding maximum class, extending Theorem 24. [sent-493, score-1.092]
47 Moreover, the order of the dimensions of the cubes which are corner peeled can be arbitrary—lower dimensional cubes may be corner peeled before all the higher dimensional cubes are corner peeled. [sent-495, score-1.175]
48 Theorem 29 Any d-maximum class C ⊆ {0, 1}n corresponding to a simple hyperbolic d-arrangement A can be corner peeled by sweeping A with a generic hyperbolic hyperplane. [sent-499, score-0.998]
49 For the case k = d + 1, as for the simple linear arrangements just prior to the corner peeling of c, H+ ∩ c is homeomorphic to a (d + 1)-simplex with a missing face on the ideal boundary. [sent-508, score-0.891]
50 Although swept corners in hyperbolic arrangements can be of cubes of differing dimensions, these dimensions never exceed d and so the proof that sweeping simple linear arrangements induces d-compression schemes is still valid. [sent-513, score-1.169]
51 Figures 11(a) and 11(b) display the sweeping of a general hyperplane across the former arrangement and the corresponding corner peeling. [sent-516, score-0.97]
52 If the boundary of the ball is removed, then we obtain an arrangement of PL hyperplanes in Euclidean space. [sent-527, score-0.765]
53 Infinite Euclidean and Hyperbolic Arrangements We consider a simple example of an infinite maximum class which admits corner peeling and a compression scheme analogous to those of previous sections. [sent-529, score-0.684]
54 Then the cubical complex C, with vertices vmn , can be corner peeled and hence compressed, using a sweepout by the lines {(x, y) | x + (1 + ε)y = t} for t ≥ 0 and any small fixed irrational ε > 0. [sent-532, score-0.757]
55 To verify the properties of this example, notice that sweeping as specified corresponds to corner peeling the vertex v00 , then the vertices v10 , v01 , then the remaining vertices vmn . [sent-536, score-0.918]
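The sweep order in this infinite example is easy to compute explicitly: vertex v_mn is swept when t reaches m + (1 + ε)n, and an irrational ε guarantees the values are distinct, so vertices are peeled one at a time. A small sketch on a 3 × 3 patch of the grid (the particular ε is an arbitrary stand-in):

```python
# Sweep the lines x + (1 + eps) * y = t across the grid vertices v_mn;
# irrationality of eps breaks all ties, so vertices are swept one at a time.
eps = 0.23606797  # stands in for a small fixed irrational epsilon
verts = [(m, n) for m in range(3) for n in range(3)]
order = sorted(verts, key=lambda v: v[0] + (1 + eps) * v[1])
print(order)  # [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2), (2, 1), (1, 2), (2, 2)]
```

The order begins v00, v10, v01, matching the corner-peeling sequence described above.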
56 Note that the compression scheme is associated with sweeping across the arrangement in the direction of decreasing t. [sent-539, score-0.824]
57 In concluding this brief discussion, we note that many infinite collections of simple hyperbolic hyperplanes and Euclidean hyperplanes can also be corner peeled and compressed, even if intersection points and cells accumulate. [sent-543, score-1.088]
58 This arrangement has corner peeling and compression schemes given by sweeping across L using the generic line {y = t}. [sent-552, score-1.3]
59 A simple PL d-arrangement is an arrangement of n PL hyperplanes such that every subcollection of j hyperplanes meets transversely in a (k − j)-dimensional PL plane for 2 ≤ j ≤ d, and every subcollection of d + 1 hyperplanes is disjoint. [sent-560, score-1.258]
60 Theorem 32 Every d-maximum class C ⊆ {0, 1}n can be represented by a simple arrangement of PL hyperplanes in an n-ball. [sent-566, score-0.684]
61 It remains to show this arrangement of hyperspheres gives the same cubical complex as C, unless n = d + 1. [sent-609, score-0.78]
62 Then it is easy to see that each cell c in the complement of the PLhyperplane arrangement in N has part of its boundary on the ideal boundary ∂N . [sent-611, score-0.697]
63 Note that this class cannot be realized as an arrangement of PL hyperplanes in the 3-ball; our method gives C̃ as an arrangement in B5, and this example shows that B4 is the best one might hope for in terms of the dimension of the hyperplane arrangement. [sent-626, score-1.287]
64 2 Maximum Classes with Manifold Cubical Complexes We prove a partial converse to Corollary 23: if a d-maximum class has a ball as cubical complex, then it can always be realized by a simple PL-hyperplane arrangement in Rd . [sent-634, score-0.752]
65 Then the following properties of C, considered as a cubical complex, are equivalent: (i) There is a simple arrangement A of n PL hyperplanes in Rd which represents C. [sent-636, score-0.901]
66 By induction on n, it follows that there is a PL-hyperplane arrangement A, consisting of n − 1 PL hyperplanes in Bd , which represents p(C). [sent-652, score-0.686]
67 Moreover, after peeling, we still have a family of PL hyperspheres which give an arrangement corresponding to the new peeled class. [sent-670, score-0.655]
68 Definition 35 Suppose that a finite arrangement P of PL hyperplanes {Pα}, each properly embedded in an n-ball Bn, satisfies the following conditions: i. [sent-689, score-0.699]
69 The arrangements in Definition 35 are called d-contractible because we prove later that their corresponding one-inclusion graphs are strongly contractible cubical complexes of dimension d. [sent-693, score-0.817]
70 Definition 37 A one-inclusion graph Γ is strongly contractible if it is contractible as a cubical complex and moreover, all reductions and multiple reductions of Γ are also contractible. [sent-704, score-0.662]
71 So a hyperplane P may start sweeping across an arrangement P . [sent-707, score-0.776]
72 Then a second generic hyperplane P′ can start sweeping across this new arrangement P+ . [sent-709, score-0.817]
73 Below we show that a suitable multiple sweeping of a PL-hyperplane arrangement P gives a corner-peeling sequence of all finite maximum classes. [sent-712, score-0.669]
74 Figure 14: (a) An example PL-hyperplane arrangement P and (b) the result of a Pachner move of hyperplane P4 on P. [sent-717, score-0.641]
75 This gives an arrangement with fewer hyperplanes and clearly the complexity has decreased to (r − 1, s) for some s. [sent-743, score-0.66]
76 Select a hyperplane P1 which splits the arrangement into two smaller arrangements P+ , P− in the balls B+ , B− . [sent-747, score-0.924]
77 Figure 16: Partial corner-peeling sequence for the (B+, P+) arrangement split from the arrangement of Figure 15, in the proof of Theorem 39. [sent-757, score-0.91]
78 The corresponding effect on the one-inclusion graph is peeling of a vertex which is a corner of a d ′ -cube in the binary class corresponding to the arrangement P+ , where d ′ ≤ d. [sent-762, score-1.049]
79 Moreover, since we chose the hyperplane P1 so that P+ has a minimum number s of complementary regions, it follows that the move pushing P1 across R+ produces a new arrangement P⋆ with smaller complexity (r, s − 1) than the original arrangement P. [sent-765, score-1.096]
80 Note that a vertex which is a corner of a single cube in U ′ remains so after corner peeling at v1 . [sent-792, score-0.763]
81 ) The key to understanding this is that firstly, when we initially peel only vertices in ∂U ′ , these are not adjacent to any vertices of the one-inclusion graph outside U ′ and so cannot produce any new opportunities for corner peeling of vertices not in U ′ . [sent-796, score-0.867]
82 Now assume that an initial sequence of corner peeling of vertices in ∂U ′ allows the next step to be corner peeling of the unique interior vertex v. [sent-809, score-1.06]
83 Definition 41 A linear or hyperbolic-hyperplane arrangement P in Rn or Hn, respectively, is called generic if every subcollection of k hyperplanes of P, for 2 ≤ k ≤ n, either has no intersection points or intersects transversely in a plane of dimension n − k. [sent-833, score-0.932]
84 Remark 43 The proof of Corollary 42 is immediate since it is obvious that any generic linear or hyperbolic-hyperplane arrangement is a d-contractible PL-hyperplane arrangement, where d is the cardinality of the largest subcollection of hyperplanes which mutually intersect. [sent-836, score-0.769]
85 Note that many generic linear, hyperbolic or d-contractible PL-hyperplane arrangements do not embed in any simple linear, hyperbolic or PL-hyperplane arrangement. [sent-837, score-0.754]
86 For if there are two hyperplanes in P which are disjoint, then this is an obstruction to enlarging the arrangement by adding additional hyperplanes to obtain a simple arrangement. [sent-838, score-0.865]
87 the proof of Theorem 39) is that a hyperplane Pα in P can be found which splits Bn into pieces B+ , B− so that one, say B+ gives a new arrangement for which the maximum number of mutually intersecting hyperplanes is strictly less than that for P . [sent-847, score-0.867]
88 There is an ordering of the planes in P so that if we split Bn successively along the planes, then at each stage, at least one of the two resulting balls has an arrangement with a smaller maximum number of planes which mutually intersect. [sent-854, score-0.696]
89 Finally, to show that (ii) implies (i): by Theorem 39, a d-contractible PL-hyperplane arrangement P has a peeling sequence, and so the corresponding one-inclusion graph Γ is contractible. [sent-891, score-0.735]
90 Finally, for the second example, there is a non-simple hyperplane arrangement consisting of lines in the hyperbolic plane which represents the class. [sent-910, score-0.866]
91 Moreover any hyperplane arrangement represents a connected complex so there cannot be such an arrangement for this example. [sent-917, score-1.094]
92 It is easy to form a hyperbolic-line arrangement consisting of three lines meeting in three points forming a triangle and three further lines near the boundary of the hyperbolic plane which do not meet any other line. [sent-923, score-0.843]
93 Conclusions and Open Problems We saw in Corollary 23 that d-maximum classes represented by simple linear-hyperplane arrangements in Rd have underlying cubical complexes that are homeomorphic to a d-ball. [sent-932, score-0.833]
94 Moreover in Theorem 33, we proved that d-maximum classes represented by PL-hyperplane arrangements in Rd are those whose underlying cubical complexes are manifolds or equivalently d-balls. [sent-934, score-0.736]
95 Question 47 Does every simple PL-hyperplane arrangement in Bd, where every subcollection of d planes transversely meets in a point, represent the same concept class as some simple linear-hyperplane arrangement? [sent-935, score-0.721]
96 Question 48 What is the connection between the VC dimension of a maximum class induced by a simple hyperbolic-hyperplane arrangement and the smallest dimension of hyperbolic space containing such an arrangement? [sent-936, score-0.775]
97 Can such an embedding be used to construct a hyperbolic arrangement in H2d or a PL arrangement in R2d? [sent-940, score-1.115]
98 Question 51 Can any d-maximum class in {0, 1}n be represented by a simple arrangement of hyperplanes in Hn ? [sent-947, score-0.684]
99 So compression schemes arising from sweeping across simple arrangements of hyperplanes in Euclidean or hyperbolic space are also acyclic. [sent-951, score-1.09]
100 We have established peeling of all finite maximum and a family of infinite maximum classes by representing them as PL-hyperplane arrangements and sweeping by multiple generic hyperplanes. [sent-953, score-0.914]
wordName wordTfidf (topN-words)
[('arrangement', 0.455), ('arrangements', 0.303), ('rubinstein', 0.244), ('cubical', 0.241), ('peeling', 0.233), ('hyperbolic', 0.205), ('hyperplanes', 0.205), ('corner', 0.194), ('sweeping', 0.175), ('pl', 0.155), ('peeled', 0.154), ('compression', 0.15), ('hyperplane', 0.146), ('contractible', 0.139), ('kuzmin', 0.115), ('vertices', 0.11), ('complexes', 0.108), ('ompression', 0.108), ('homeomorphic', 0.097), ('vertex', 0.096), ('cubes', 0.095), ('pachner', 0.092), ('eometric', 0.088), ('classes', 0.084), ('planes', 0.08), ('pproach', 0.079), ('warmuth', 0.076), ('boundary', 0.073), ('ample', 0.068), ('cell', 0.064), ('vc', 0.064), ('intersection', 0.064), ('lifted', 0.062), ('cells', 0.061), ('concept', 0.06), ('edges', 0.055), ('vt', 0.054), ('schemes', 0.052), ('homotopy', 0.051), ('graph', 0.047), ('homeomorphism', 0.046), ('hyperspheres', 0.046), ('subcollection', 0.046), ('cube', 0.046), ('unlabeled', 0.045), ('ct', 0.044), ('voronoi', 0.044), ('scheme', 0.044), ('intersections', 0.042), ('conjecture', 0.041), ('faces', 0.041), ('generic', 0.041), ('move', 0.04), ('plane', 0.04), ('simplicial', 0.04), ('lifting', 0.04), ('maximum', 0.039), ('embedded', 0.039), ('complex', 0.038), ('ci', 0.037), ('peel', 0.036), ('swept', 0.036), ('ideal', 0.032), ('ball', 0.032), ('face', 0.032), ('aximum', 0.032), ('incident', 0.032), ('floyd', 0.031), ('innermost', 0.031), ('spheres', 0.031), ('meet', 0.03), ('reductions', 0.029), ('topological', 0.029), ('hk', 0.029), ('lasses', 0.028), ('adjacent', 0.027), ('dimension', 0.026), ('sk', 0.026), ('maximal', 0.026), ('transversely', 0.026), ('induction', 0.026), ('moves', 0.026), ('compressing', 0.025), ('class', 0.024), ('intersects', 0.024), ('sauer', 0.024), ('bd', 0.023), ('conjectured', 0.022), ('topology', 0.022), ('simplices', 0.022), ('shifting', 0.022), ('mutually', 0.022), ('colors', 0.021), ('edge', 0.021), ('contractibility', 0.021), ('litman', 0.021), ('disjoint', 0.02), ('lines', 0.02), ('rourke', 0.02), 
('balls', 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000015 3 jmlr-2012-A Geometric Approach to Sample Compression
Author: Benjamin I.P. Rubinstein, J. Hyam Rubinstein
Abstract: The Sample Compression Conjecture of Littlestone & Warmuth has remained unsolved for a quarter century. While maximum classes (concept classes meeting Sauer’s Lemma with equality) can be compressed, the compression of general concept classes reduces to compressing maximal classes (classes that cannot be expanded without increasing VC dimension). Two promising ways forward are: embedding maximal classes into maximum classes with at most a polynomial increase to VC dimension, and compression via operating on geometric representations. This paper presents positive results on the latter approach and a first negative result on the former, through a systematic investigation of finite maximum classes. Simple arrangements of hyperplanes in hyperbolic space are shown to represent maximum classes, generalizing the corresponding Euclidean result. We show that sweeping a generic hyperplane across such arrangements forms an unlabeled compression scheme of size VC dimension and corresponds to a special case of peeling the one-inclusion graph, resolving a recent conjecture of Kuzmin & Warmuth. A bijection between finite maximum classes and certain arrangements of piecewise-linear (PL) hyperplanes in either a ball or Euclidean space is established. Finally we show that d-maximum classes corresponding to PL-hyperplane arrangements in Rd have cubical complexes homeomorphic to a d-ball, or equivalently complexes that are manifolds with boundary. A main result is that PL arrangements can be swept by a moving hyperplane to unlabeled d-compress any finite maximum class, forming a peeling scheme as conjectured by Kuzmin & Warmuth. A corollary is that some d-maximal classes cannot be embedded into any maximum class of VC-dimension d + k, for any constant k. The construction of the PL sweeping involves Pachner moves on the one-inclusion graph, corresponding to moves of a hyperplane across the intersection of d other hyperplanes. 
This extends the well-known Pachner moves for triangulations to cubical complexes.
2 0.070249639 94 jmlr-2012-Query Strategies for Evading Convex-Inducing Classifiers
Author: Blaine Nelson, Benjamin I. P. Rubinstein, Ling Huang, Anthony D. Joseph, Steven J. Lee, Satish Rao, J. D. Tygar
Abstract: Classifiers are often used to detect miscreant activities. We study how an adversary can systematically query a classifier to elicit information that allows the attacker to evade detection while incurring a near-minimal cost of modifying their intended malfeasance. We generalize the theory of Lowd and Meek (2005) to the family of convex-inducing classifiers that partition their feature space into two sets, one of which is convex. We present query algorithms for this family that construct undetected instances of approximately minimal cost using only polynomially-many queries in the dimension of the space and in the level of approximation. Our results demonstrate that nearoptimal evasion can be accomplished for this family without reverse engineering the classifier’s decision boundary. We also consider general ℓ p costs and show that near-optimal evasion on the family of convex-inducing classifiers is generally efficient for both positive and negative convexity for all levels of approximation if p = 1. Keywords: query algorithms, evasion, reverse engineering, adversarial learning
3 0.049547795 15 jmlr-2012-Algebraic Geometric Comparison of Probability Distributions
Author: Franz J. Király, Paul von Bünau, Frank C. Meinecke, Duncan A.J. Blythe, Klaus-Robert Müller
Abstract: We propose a novel algebraic algorithmic framework for dealing with probability distributions represented by their cumulants such as the mean and covariance matrix. As an example, we consider the unsupervised learning problem of finding the subspace on which several probability distributions agree. Instead of minimizing an objective function involving the estimated cumulants, we show that by treating the cumulants as elements of the polynomial ring we can directly solve the problem, at a lower computational cost and with higher accuracy. Moreover, the algebraic viewpoint on probability distributions allows us to invoke the theory of algebraic geometry, which we demonstrate in a compact proof for an identifiability criterion. Keywords: computational algebraic geometry, approximate algebra, unsupervised Learning
4 0.048358034 25 jmlr-2012-Characterization and Greedy Learning of Interventional Markov Equivalence Classes of Directed Acyclic Graphs
Author: Alain Hauser, Peter Bühlmann
Abstract: The investigation of directed acyclic graphs (DAGs) encoding the same Markov property, that is the same conditional independence relations of multivariate observational distributions, has a long tradition; many algorithms exist for model selection and structure learning in Markov equivalence classes. In this paper, we extend the notion of Markov equivalence of DAGs to the case of interventional distributions arising from multiple intervention experiments. We show that under reasonable assumptions on the intervention experiments, interventional Markov equivalence defines a finer partitioning of DAGs than observational Markov equivalence and hence improves the identifiability of causal models. We give a graph theoretic criterion for two DAGs being Markov equivalent under interventions and show that each interventional Markov equivalence class can, analogously to the observational case, be uniquely represented by a chain graph called interventional essential graph (also known as CPDAG in the observational case). These are key insights for deriving a generalization of the Greedy Equivalence Search algorithm aimed at structure learning from interventional data. This new algorithm is evaluated in a simulation study. Keywords: causal inference, interventions, graphical model, Markov equivalence, greedy equivalence search
5 0.046566557 68 jmlr-2012-Minimax Manifold Estimation
Author: Christopher Genovese, Marco Perone-Pacifico, Isabella Verdinelli, Larry Wasserman
Abstract: We find the minimax rate of convergence in Hausdorff distance for estimating a manifold M of dimension d embedded in RD given a noisy sample from the manifold. Under certain conditions, we show that the optimal rate of convergence is n−2/(2+d) . Thus, the minimax rate depends only on the dimension of the manifold, not on the dimension of the space in which M is embedded. Keywords: manifold learning, minimax estimation
6 0.042235322 91 jmlr-2012-Plug-in Approach to Active Learning
7 0.039017271 40 jmlr-2012-Exact Covariance Thresholding into Connected Components for Large-Scale Graphical Lasso
8 0.033238236 13 jmlr-2012-Active Learning via Perfect Selective Classification
9 0.032923635 35 jmlr-2012-EP-GIG Priors and Applications in Bayesian Sparse Learning
10 0.032190379 97 jmlr-2012-Regularization Techniques for Learning with Matrices
11 0.031668469 108 jmlr-2012-Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing
12 0.030001765 48 jmlr-2012-High-Dimensional Gaussian Graphical Model Selection: Walk Summability and Local Separation Criterion
13 0.029167239 67 jmlr-2012-Minimax-Optimal Rates For Sparse Additive Models Over Kernel Classes Via Convex Programming
14 0.027803233 58 jmlr-2012-Linear Fitted-Q Iteration with Multiple Reward Functions
15 0.027528008 98 jmlr-2012-Regularized Bundle Methods for Convex and Non-Convex Risks
16 0.027260596 8 jmlr-2012-A Primal-Dual Convergence Analysis of Boosting
17 0.027014052 17 jmlr-2012-An Active Learning Algorithm for Ranking from Pairwise Preferences with an Almost Optimal Query Complexity
18 0.025745785 12 jmlr-2012-Active Clustering of Biological Sequences
19 0.022944856 14 jmlr-2012-Activized Learning: Transforming Passive to Active with Improved Label Complexity
20 0.02277115 24 jmlr-2012-Causal Bounds and Observable Constraints for Non-deterministic Models
topicId topicWeight
[(0, -0.112), (1, 0.027), (2, 0.004), (3, -0.044), (4, -0.02), (5, 0.025), (6, 0.05), (7, -0.02), (8, 0.125), (9, 0.028), (10, -0.07), (11, -0.037), (12, -0.13), (13, -0.012), (14, -0.035), (15, 0.038), (16, -0.118), (17, -0.055), (18, 0.095), (19, 0.007), (20, 0.098), (21, 0.029), (22, -0.168), (23, 0.137), (24, -0.097), (25, 0.014), (26, 0.023), (27, -0.14), (28, -0.108), (29, -0.055), (30, 0.05), (31, -0.019), (32, 0.123), (33, 0.006), (34, -0.031), (35, 0.189), (36, 0.065), (37, -0.088), (38, -0.108), (39, 0.041), (40, -0.157), (41, 0.032), (42, -0.035), (43, 0.103), (44, 0.054), (45, 0.304), (46, 0.071), (47, -0.154), (48, -0.251), (49, 0.346)]
simIndex simValue paperId paperTitle
same-paper 1 0.97573394 3 jmlr-2012-A Geometric Approach to Sample Compression
Author: Benjamin I.P. Rubinstein, J. Hyam Rubinstein
Abstract: The Sample Compression Conjecture of Littlestone & Warmuth has remained unsolved for a quarter century. While maximum classes (concept classes meeting Sauer’s Lemma with equality) can be compressed, the compression of general concept classes reduces to compressing maximal classes (classes that cannot be expanded without increasing VC dimension). Two promising ways forward are: embedding maximal classes into maximum classes with at most a polynomial increase to VC dimension, and compression via operating on geometric representations. This paper presents positive results on the latter approach and a first negative result on the former, through a systematic investigation of finite maximum classes. Simple arrangements of hyperplanes in hyperbolic space are shown to represent maximum classes, generalizing the corresponding Euclidean result. We show that sweeping a generic hyperplane across such arrangements forms an unlabeled compression scheme of size VC dimension and corresponds to a special case of peeling the one-inclusion graph, resolving a recent conjecture of Kuzmin & Warmuth. A bijection between finite maximum classes and certain arrangements of piecewise-linear (PL) hyperplanes in either a ball or Euclidean space is established. Finally we show that d-maximum classes corresponding to PL-hyperplane arrangements in R^d have cubical complexes homeomorphic to a d-ball, or equivalently complexes that are manifolds with boundary. A main result is that PL arrangements can be swept by a moving hyperplane to unlabeled d-compress any finite maximum class, forming a peeling scheme as conjectured by Kuzmin & Warmuth. A corollary is that some d-maximal classes cannot be embedded into any maximum class of VC-dimension d + k, for any constant k. The construction of the PL sweeping involves Pachner moves on the one-inclusion graph, corresponding to moves of a hyperplane across the intersection of d other hyperplanes.
This extends the well-known Pachner moves for triangulations to cubical complexes.
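The defining property of a maximum class (meeting Sauer's Lemma with equality at its VC dimension) can be verified by brute force on small classes. A minimal stdlib-only sketch, with concepts represented as binary tuples over n points (illustrative only, not the paper's geometric machinery):

```python
from itertools import combinations
from math import comb

def vc_dimension(concepts, n):
    # Largest d such that some d-subset of the n points is shattered,
    # i.e., the projections of the class onto it realize all 2**d patterns.
    best = 0
    for d in range(1, n + 1):
        for S in combinations(range(n), d):
            projections = {tuple(c[i] for i in S) for c in concepts}
            if len(projections) == 2 ** d:
                best = d
                break
        else:
            break  # no d-subset shattered, so no larger one is either
    return best

def is_maximum(concepts, n):
    # Maximum class: |C| equals the Sauer bound sum_{i<=d} C(n, i) exactly.
    d = vc_dimension(concepts, n)
    return len(set(concepts)) == sum(comb(n, i) for i in range(d + 1))
```

For instance, the Hamming ball of radius 1 around the all-zeros concept on n = 3 points has VC dimension 1 and exactly C(3,0) + C(3,1) = 4 concepts, so it is 1-maximum.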
2 0.35486084 15 jmlr-2012-Algebraic Geometric Comparison of Probability Distributions
Author: Franz J. Király, Paul von Bünau, Frank C. Meinecke, Duncan A.J. Blythe, Klaus-Robert Müller
Abstract: We propose a novel algebraic algorithmic framework for dealing with probability distributions represented by their cumulants such as the mean and covariance matrix. As an example, we consider the unsupervised learning problem of finding the subspace on which several probability distributions agree. Instead of minimizing an objective function involving the estimated cumulants, we show that by treating the cumulants as elements of the polynomial ring we can directly solve the problem, at a lower computational cost and with higher accuracy. Moreover, the algebraic viewpoint on probability distributions allows us to invoke the theory of algebraic geometry, which we demonstrate in a compact proof for an identifiability criterion. Keywords: computational algebraic geometry, approximate algebra, unsupervised learning
Author: Alain Hauser, Peter Bühlmann
Abstract: The investigation of directed acyclic graphs (DAGs) encoding the same Markov property, that is the same conditional independence relations of multivariate observational distributions, has a long tradition; many algorithms exist for model selection and structure learning in Markov equivalence classes. In this paper, we extend the notion of Markov equivalence of DAGs to the case of interventional distributions arising from multiple intervention experiments. We show that under reasonable assumptions on the intervention experiments, interventional Markov equivalence defines a finer partitioning of DAGs than observational Markov equivalence and hence improves the identifiability of causal models. We give a graph theoretic criterion for two DAGs being Markov equivalent under interventions and show that each interventional Markov equivalence class can, analogously to the observational case, be uniquely represented by a chain graph called interventional essential graph (also known as CPDAG in the observational case). These are key insights for deriving a generalization of the Greedy Equivalence Search algorithm aimed at structure learning from interventional data. This new algorithm is evaluated in a simulation study. Keywords: causal inference, interventions, graphical model, Markov equivalence, greedy equivalence search
4 0.27200162 104 jmlr-2012-Security Analysis of Online Centroid Anomaly Detection
Author: Marius Kloft, Pavel Laskov
Abstract: Security issues are crucial in a number of machine learning applications, especially in scenarios dealing with human activity rather than natural phenomena (e.g., information ranking, spam detection, malware detection, etc.). In such cases, learning algorithms may have to cope with manipulated data aimed at hampering decision making. Although some previous work addressed the issue of handling malicious data in the context of supervised learning, very little is known about the behavior of anomaly detection methods in such scenarios. In this contribution, we analyze the performance of a particular method—online centroid anomaly detection—in the presence of adversarial noise. Our analysis addresses the following security-related issues: formalization of learning and attack processes, derivation of an optimal attack, and analysis of attack efficiency and limitations. We derive bounds on the effectiveness of a poisoning attack against centroid anomaly detection under different conditions: attacker’s full or limited control over the traffic and bounded false positive rate. Our bounds show that whereas a poisoning attack can be effectively staged in the unconstrained case, it can be made arbitrarily difficult (a strict upper bound on the attacker’s gain) if external constraints are properly used. Our experimental evaluation, carried out on real traces of HTTP and exploit traffic, confirms the tightness of our theoretical bounds and the practicality of our protection mechanisms. Keywords: anomaly detection, adversarial, security analysis, support vector data description, computer security, network intrusion detection
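The scheme being attacked (an online centroid that absorbs the points it accepts) can be sketched in a few lines. This is an illustrative simplification with assumed parameters (a fixed rejection radius and an equal-weight running mean), not Kloft & Laskov's exact formulation:

```python
class OnlineCentroid:
    """Minimal online centroid anomaly detector (illustrative sketch)."""

    def __init__(self, dim, radius):
        self.center = [0.0] * dim  # running mean of accepted points
        self.radius = radius       # points farther than this are anomalous
        self.count = 0

    def _dist(self, x):
        # Euclidean distance from x to the current centroid.
        return sum((a - b) ** 2 for a, b in zip(x, self.center)) ** 0.5

    def observe(self, x):
        # Returns True if x is flagged anomalous; accepted (normal)
        # points are folded into the centroid via an incremental mean.
        if self.count > 0 and self._dist(x) > self.radius:
            return True
        self.count += 1
        step = 1.0 / self.count
        self.center = [c + step * (xi - c) for c, xi in zip(self.center, x)]
        return False
```

The poisoning attack analyzed in the paper exploits exactly this update: each accepted point drags the centroid, so a stream of borderline points can walk it toward an attack target unless external constraints cap the displacement.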
5 0.264382 68 jmlr-2012-Minimax Manifold Estimation
Author: Christopher Genovese, Marco Perone-Pacifico, Isabella Verdinelli, Larry Wasserman
Abstract: We find the minimax rate of convergence in Hausdorff distance for estimating a manifold M of dimension d embedded in R^D given a noisy sample from the manifold. Under certain conditions, we show that the optimal rate of convergence is n^{-2/(2+d)}. Thus, the minimax rate depends only on the dimension of the manifold, not on the dimension of the space in which M is embedded. Keywords: manifold learning, minimax estimation
7 0.21743055 12 jmlr-2012-Active Clustering of Biological Sequences
8 0.20571017 94 jmlr-2012-Query Strategies for Evading Convex-Inducing Classifiers
9 0.19679078 35 jmlr-2012-EP-GIG Priors and Applications in Bayesian Sparse Learning
10 0.1849544 8 jmlr-2012-A Primal-Dual Convergence Analysis of Boosting
11 0.16614431 91 jmlr-2012-Plug-in Approach to Active Learning
12 0.16587876 6 jmlr-2012-A Model of the Perception of Facial Expressions of Emotion by Humans: Research Overview and Perspectives
13 0.15695465 55 jmlr-2012-Learning Algorithms for the Classification Restricted Boltzmann Machine
14 0.14399873 116 jmlr-2012-Transfer in Reinforcement Learning via Shared Features
15 0.14130968 48 jmlr-2012-High-Dimensional Gaussian Graphical Model Selection: Walk Summability and Local Separation Criterion
16 0.14090909 13 jmlr-2012-Active Learning via Perfect Selective Classification
17 0.13532108 67 jmlr-2012-Minimax-Optimal Rates For Sparse Additive Models Over Kernel Classes Via Convex Programming
18 0.13299866 108 jmlr-2012-Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing
19 0.12960237 71 jmlr-2012-Multi-Instance Learning with Any Hypothesis Class
20 0.1284533 40 jmlr-2012-Exact Covariance Thresholding into Connected Components for Large-Scale Graphical Lasso
topicId topicWeight
[(21, 0.045), (26, 0.023), (27, 0.504), (29, 0.028), (35, 0.019), (56, 0.01), (69, 0.017), (75, 0.034), (77, 0.011), (79, 0.029), (92, 0.082), (96, 0.079)]
simIndex simValue paperId paperTitle
1 0.97545218 90 jmlr-2012-Pattern for Python
Author: Tom De Smedt, Walter Daelemans
Abstract: Pattern is a package for Python 2.4+ with functionality for web mining (Google + Twitter + Wikipedia, web spider, HTML DOM parser), natural language processing (tagger/chunker, n-gram search, sentiment analysis, WordNet), machine learning (vector space model, k-means clustering, Naive Bayes + k-NN + SVM classifiers) and network analysis (graph centrality and visualization). It is well documented and bundled with 30+ examples and 350+ unit tests. The source code is licensed under BSD and available from http://www.clips.ua.ac.be/pages/pattern. Keywords: Python, data mining, natural language processing, machine learning, graph networks
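The vector-space-model functionality Pattern bundles can be illustrated without the package itself. The following stdlib-only cosine-similarity sketch is a hypothetical mini-example of that kind of machinery, not Pattern's actual API:

```python
from collections import Counter
from math import sqrt

def bow(text):
    # Bag-of-words vector: a term -> count mapping.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

On top of such vectors one can build the k-NN and k-means components the abstract lists; documents sharing more vocabulary score closer to 1, disjoint documents score 0.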
same-paper 2 0.88528317 3 jmlr-2012-A Geometric Approach to Sample Compression
Author: Benjamin I.P. Rubinstein, J. Hyam Rubinstein
Abstract: The Sample Compression Conjecture of Littlestone & Warmuth has remained unsolved for a quarter century. While maximum classes (concept classes meeting Sauer’s Lemma with equality) can be compressed, the compression of general concept classes reduces to compressing maximal classes (classes that cannot be expanded without increasing VC dimension). Two promising ways forward are: embedding maximal classes into maximum classes with at most a polynomial increase to VC dimension, and compression via operating on geometric representations. This paper presents positive results on the latter approach and a first negative result on the former, through a systematic investigation of finite maximum classes. Simple arrangements of hyperplanes in hyperbolic space are shown to represent maximum classes, generalizing the corresponding Euclidean result. We show that sweeping a generic hyperplane across such arrangements forms an unlabeled compression scheme of size VC dimension and corresponds to a special case of peeling the one-inclusion graph, resolving a recent conjecture of Kuzmin & Warmuth. A bijection between finite maximum classes and certain arrangements of piecewise-linear (PL) hyperplanes in either a ball or Euclidean space is established. Finally we show that d-maximum classes corresponding to PL-hyperplane arrangements in R^d have cubical complexes homeomorphic to a d-ball, or equivalently complexes that are manifolds with boundary. A main result is that PL arrangements can be swept by a moving hyperplane to unlabeled d-compress any finite maximum class, forming a peeling scheme as conjectured by Kuzmin & Warmuth. A corollary is that some d-maximal classes cannot be embedded into any maximum class of VC-dimension d + k, for any constant k. The construction of the PL sweeping involves Pachner moves on the one-inclusion graph, corresponding to moves of a hyperplane across the intersection of d other hyperplanes.
This extends the well-known Pachner moves for triangulations to cubical complexes.
3 0.66804338 98 jmlr-2012-Regularized Bundle Methods for Convex and Non-Convex Risks
Author: Trinh Minh Tri Do, Thierry Artières
Abstract: Machine learning is most often cast as an optimization problem. Ideally, one expects a convex objective function to rely on efficient convex optimizers with nice guarantees such as no local optima. Yet, non-convexity is very frequent in practice and it may sometimes be inappropriate to look for convexity at any price. Alternatively one can decide not to limit a priori the modeling expressivity to models whose learning may be solved by convex optimization and rely on non-convex optimization algorithms. The main motivation of this work is to provide efficient and scalable algorithms for non-convex optimization. We focus on regularized unconstrained optimization problems which cover a large number of modern machine learning problems such as logistic regression, conditional random fields, large margin estimation, etc. We propose a novel algorithm for minimizing a regularized objective that is able to handle convex and non-convex, smooth and non-smooth risks. The algorithm is based on the cutting plane technique and on the idea of exploiting the regularization term in the objective function. It may be thought of as a limited memory extension of convex regularized bundle methods for dealing with convex and non-convex risks. In case the risk is convex, the algorithm is proved to converge to a stationary solution with accuracy ε at a rate O(1/λε), where λ is the regularization parameter of the objective function, under the assumption of a Lipschitz empirical risk. In case the risk is not convex, getting such a proof is more difficult and requires a stronger and more disputable assumption. Yet we provide experimental results on artificial test problems, and on five standard and difficult machine learning problems cast as convex and non-convex optimization problems, which show that our algorithm compares well in practice with state-of-the-art optimization algorithms. Keywords: optimization, non-convex, non-smooth, cutting plane, bundle method, regularized risk
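The cutting-plane idea the abstract describes can be sketched in one dimension, assuming a convex risk with a subgradient oracle: each iterate contributes a linear lower bound (cut) on the risk, and the next iterate minimizes the regularizer plus the maximum of the cuts. This is an illustrative toy, not the paper's limited-memory bundle algorithm:

```python
def bundle_min_1d(risk, subgrad, lam, w0=0.0, iters=30, lo=-10.0, hi=10.0):
    """Cutting-plane (bundle) sketch for min_w lam/2 * w**2 + risk(w), in 1-D."""
    planes = []  # cuts (a_i, b_i) with risk(w) >= a_i * w + b_i for all w
    w = w0
    for _ in range(iters):
        a = subgrad(w)
        planes.append((a, risk(w) - a * w))  # cut that is tight at the current w

        def model(v):
            # Convex model: exact regularizer + piecewise-linear bound on the risk.
            return 0.5 * lam * v * v + max(ai * v + bi for ai, bi in planes)

        l, h = lo, hi  # minimize the convex model by ternary search
        for _ in range(200):
            m1, m2 = l + (h - l) / 3.0, h - (h - l) / 3.0
            if model(m1) < model(m2):
                h = m2
            else:
                l = m1
        w = (l + h) / 2.0
    return w
```

For a hinge-like risk such as |w − 1| with λ = 0.1, the subdifferential condition places the optimum at w = 1, and the sketch reaches it in a couple of cuts.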
4 0.31101668 45 jmlr-2012-Finding Recurrent Patterns from Continuous Sign Language Sentences for Automated Extraction of Signs
Author: Sunita Nayak, Kester Duncan, Sudeep Sarkar, Barbara Loeding
Abstract: We present a probabilistic framework to automatically learn models of recurring signs from multiple sign language video sequences containing the vocabulary of interest. We extract the parts of the signs that are present in most occurrences of the sign in context and are robust to the variations produced by adjacent signs. Each sentence video is first transformed into a multidimensional time series representation, capturing the motion and shape aspects of the sign. Skin color blobs are extracted from frames of color video sequences, and a probabilistic relational distribution is formed for each frame using the contour and edge pixels from the skin blobs. Each sentence is represented as a trajectory in a low dimensional space called the space of relational distributions. Given these time series trajectories, we extract signemes from multiple sentences concurrently using iterated conditional modes (ICM). We show results by learning single signs from a collection of sentences with one common pervading sign, multiple signs from a collection of sentences with more than one common sign, and single signs from a mixed collection of sentences. The extracted signemes demonstrate that our approach is robust to some extent to the variations produced within a sign due to different contexts. We also show results whereby these learned sign models are used for spotting signs in test sequences. Keywords: pattern extraction, sign language recognition, signeme extraction, sign modeling, iterated conditional modes
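The iterated-conditional-modes step can be illustrated on scalar sequences: choose one start index per sequence so the fixed-width windows are mutually close, updating one index at a time until no update helps. A toy sketch under those assumptions (plain numeric series, not the paper's relational-distribution features):

```python
def icm_common_window(seqs, width, sweeps=10):
    # One start index per sequence; ICM greedily minimizes the total
    # pairwise squared distance between the selected windows.
    starts = [0] * len(seqs)

    def window(i, s):
        return seqs[i][s:s + width]

    def cost(i, s):
        # Cost of placing sequence i's window at s, others held fixed.
        w = window(i, s)
        return sum(
            sum((a - b) ** 2 for a, b in zip(w, window(j, starts[j])))
            for j in range(len(seqs)) if j != i
        )

    for _ in range(sweeps):
        changed = False
        for i, seq in enumerate(seqs):
            best = min(range(len(seq) - width + 1), key=lambda s: cost(i, s))
            if best != starts[i]:
                starts[i] = best
                changed = True
        if not changed:
            break  # ICM has reached a local mode
    return starts
```

For sequences that each embed the motif [1, 2, 3] at different offsets, the sweep aligns all windows on the motif after one pass.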
5 0.30734828 30 jmlr-2012-DARWIN: A Framework for Machine Learning and Computer Vision Research and Development
Author: Stephen Gould
Abstract: We present an open-source platform-independent C++ framework for machine learning and computer vision research. The framework includes a wide range of standard machine learning and graphical models algorithms as well as reference implementations for many machine learning and computer vision applications. The framework contains Matlab wrappers for core components of the library and an experimental graphical user interface for developing and visualizing machine learning data flows. Keywords: machine learning, graphical models, computer vision, open-source software
6 0.29332858 106 jmlr-2012-Sign Language Recognition using Sub-Units
7 0.29236576 11 jmlr-2012-A Unifying Probabilistic Perspective for Spectral Dimensionality Reduction: Insights and New Models
8 0.28667858 49 jmlr-2012-Hope and Fear for Discriminative Training of Statistical Translation Models
9 0.27920306 75 jmlr-2012-NIMFA : A Python Library for Nonnegative Matrix Factorization
10 0.27417204 8 jmlr-2012-A Primal-Dual Convergence Analysis of Boosting
11 0.27267927 13 jmlr-2012-Active Learning via Perfect Selective Classification
12 0.26830828 72 jmlr-2012-Multi-Target Regression with Rule Ensembles
13 0.26645717 68 jmlr-2012-Minimax Manifold Estimation
14 0.26561803 17 jmlr-2012-An Active Learning Algorithm for Ranking from Pairwise Preferences with an Almost Optimal Query Complexity
15 0.26471353 65 jmlr-2012-MedLDA: Maximum Margin Supervised Topic Models
16 0.26412901 42 jmlr-2012-Facilitating Score and Causal Inference Trees for Large Observational Studies
17 0.26256117 116 jmlr-2012-Transfer in Reinforcement Learning via Shared Features
18 0.2624585 55 jmlr-2012-Learning Algorithms for the Classification Restricted Boltzmann Machine
19 0.26227167 84 jmlr-2012-Online Submodular Minimization
20 0.26179206 94 jmlr-2012-Query Strategies for Evading Convex-Inducing Classifiers